The Nordic Dialect Corpus

Johannessen, Janne Bondi; Priestley, Joel; Hagen, Kristin; Nøklestad, Anders; Lynum, Andre

Chapter; PublishedVersion; Peer reviewed

Open

773_Paper.pdf (632.7Kb)

Year

2012

Original version

Proceedings of the Eighth International Conference on Language Resources and Evaluation. 2012, 3387-3391

Abstract

In this paper, we describe the Nordic Dialect Corpus, which has recently been completed. The corpus has a variety of features that, combined, make it an advanced tool for language researchers. These features include: linguistic content (dialects from five closely related languages); annotation (tagging and two types of transcription); a search interface (advanced possibilities for combining a large array of search criteria and results presentation in an intuitive and simple interface); many search variables (linguistics-based, informant-based, time-based); multimedia display (linking of sound and video to transcriptions); display of results in maps; display of informant details (number of words and other information on informants); and advanced results handling (concordances, collocations, counts and statistics shown in a variety of graphical modes, plus further processing). Finally, and importantly, the corpus is freely available for research on the web. We give examples of various kinds of searches, of displays of results, and of results handling.

Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3388-3391. Published with the permission of ELRA. This paper was published within the proceedings of the LREC 2008 and LREC 2012 Conferences. © 2008-2012 ELRA - European Language Resources Association. All rights reserved.