Abstract
We present an in-depth analysis of Curriculum Vitae documents consisting of unstructured text. We provide a collection of Curriculum Vitae topics with descriptions, and we introduce an ontology that gives a formal description of the Curriculum Vitae domain. We compare the performance of two PDF extractors, TIKA and PDFExtract. Finally, we introduce a topic boundary detection algorithm that detects topic boundaries in unstructured Curriculum Vitae documents.