Joint UD Parsing of Norwegian Bokmål and Nynorsk

This paper investigates interactions in parser performance for the two official standards for written Norwegian: Bokmål and Nynorsk. We demonstrate that while applying models across standards yields poor performance, combining the training data for both standards gives better results than previously achieved for each of them in isolation. This has immediate practical value for processing Norwegian, as it means that a single parsing pipeline is sufficient to cover both varieties, with no loss in accuracy. Based on the Norwegian Universal Dependencies treebank, we present results for multiple taggers and parsers, experimenting with different ways of varying the training data given to the learners, including the use of machine translation.

This work is licensed under: Attribution-NonCommercial 4.0 International
