dc.contributor.author: Bjerkeland, Daniel Clemeth
dc.date.accessioned: 2022-09-09T22:00:53Z
dc.date.available: 2022-09-09T22:00:53Z
dc.date.issued: 2022
dc.identifier.citation: Bjerkeland, Daniel Clemeth. Tagging and Parsing Old Texts with New Techniques. Master thesis, University of Oslo, 2022
dc.identifier.uri: http://hdl.handle.net/10852/96445
dc.description.abstract: Although substantial, language-specific treebanks have been available for many years, the Ancient Greek collections consistently land in the bottom third for parsing accuracy, even with the best-performing parsers (such as in the CoNLL shared tasks). With recent developments such as the release of a monolingual BERT model for Ancient Greek, we see the potential to push the performance numbers up by combining contextual word embeddings with a number of high-performing strategies such as biaffine attention, joint modeling, and language-specific enhancements. We look at both dependency parsing and grammatical tagging. We evaluate on several datasets and show that our approach is successful, achieving state-of-the-art results on all metrics and datasets that we test. We further probe the implications of our findings by identifying specific linguistic traits that the various methods have allowed us to accommodate, and we contextualize the results with regard to multi-task learning and domain adaptation. Specifically, we show that domain-specific training is crucial for performance on certain datasets and that diacritics are essential to tagging accuracy in Ancient Greek in general, and we propose a novel way of incorporating them into tokenizable text. We release our code and training details for reproducibility. [eng]
dc.language.iso: eng
dc.subject: dependency parsing
dc.subject: transformers
dc.subject: NLP
dc.subject: POS tagging
dc.subject: digital humanities
dc.subject: BERT
dc.subject: grammatical tagging
dc.subject: Ancient Greek
dc.title: Tagging and Parsing Old Texts with New Techniques [eng]
dc.type: Master thesis
dc.date.updated: 2022-09-09T22:00:53Z
dc.creator.author: Bjerkeland, Daniel Clemeth
dc.identifier.urn: URN:NBN:no-98954
dc.type.document: Masteroppgave
dc.identifier.fulltext: Fulltext https://www.duo.uio.no/bitstream/handle/10852/96445/8/thesis.pdf