  • Honkapohja, Alpo; Thaisen, Jacob; Nøklestad, Anders (Journal article / Tidsskriftartikkel / PublishedVersion; Peer reviewed, 2023)
    Non-standardised early vernaculars present a problem for search tools due to the high degree of variation. The challenge lies in the variation found in orthography, syntax, and lexicon between titles, incipits, and explicits ...
  • Kåsen, Andre; Hagen, Kristin; Johannessen, Janne Bondi; Nøklestad, Anders; Priestley, Joel (Chapter / Bokkapittel / PublishedVersion; Peer reviewed, 2020)
  • Lane, Pia; Hagen, Kristin; Nøklestad, Anders; Priestley, Joel (Journal article / Tidsskriftartikkel / PublishedVersion; Peer reviewed, 2022)
    Language documentation, including the development and use of corpora, is frequently linked to revitalisation. This is also the case for the Kven language, a Finnic minoritised language, traditionally spoken in the two ...
  • Johannessen, Janne Bondi; Hagen, Kristin; Nøklestad, Anders; Priestley, Joel (Chapter / Bokkapittel / PublishedVersion; Peer reviewed, 2010)
    We will look at how maps can be integrated in research resources, such as language databases and language corpora. By using maps, search results can be illustrated in a way that immediately gives the user information that ...
  • Kapociute-Dzikiene, Jurgita; Nøklestad, Anders; Johannessen, Janne Bondi; Krupavicius, Algis (Chapter / Bokkapittel / PublishedVersion; Peer reviewed, 2013)
    Despite the existence of effective methods that solve named entity recognition tasks for such widely used languages as English, there is no clear answer as to which methods are the most suitable for languages that are substantially ...
  • Borthen, Kaja; Søfteland, Åshild; Kveen, Perlaug Marie; Karagjosova, Elena; Nøklestad, Anders (Journal article / Tidsskriftartikkel / PublishedVersion; Peer reviewed, 2021)
    This article reports on a study of geographical and demographic features of 46 postposed expressions in spoken Norwegian, among them gitt, sant, and kan du skjønne. In the first part of the study, a questionnaire is used as the method. ...
  • Søfteland, Åshild; Nøklestad, Anders; Priestley, Joel; Hagen, Kristin (Journal article / Tidsskriftartikkel / PublishedVersion; Peer reviewed, 2020)
    In this article we show how the search interface Glossa has been developed in step with the various corpora that have been built at the Text Laboratory. Furthermore, we present statistics on what kind of searches people ...
  • Johannessen, Janne Bondi; Nygaard, Lars; Priestley, Joel; Nøklestad, Anders (Chapter / Bokkapittel / PublishedVersion; Peer reviewed, 2008)
    We describe a web-based corpus query system, Glossa, which combines the expressiveness of regular query languages with the user-friendliness of a graphical interface. Since corpus users are usually linguists with little ...
  • Hagen, Kristin; Wangensteen, Boye; Sköldberg, Emma; Trap-Jensen, Lars; Vikør, Lars; Vindenes, Urd; Enger, Hans-Olav; Malmgren, Sven-Göran; Wetås, Åse; Fjeld, Ruth Vatvedt; Nøklestad, Anders (Book / Bok / PublishedVersion; Peer reviewed, 2020)
  • Fjeld, Ruth E. Vatvedt; Nøklestad, Anders; Hagen, Kristin (Journal article / Tidsskriftartikkel / PublishedVersion; Peer reviewed, 2020)
    This article is an introduction to Leksikografisk bokmålskorpus (LBK). We start with a historical overview of the dictionary work that has been carried out for the Norwegian language, and explain the background for why LBK was built up in the way ...
  • Nøklestad, Anders; Hagen, Kristin; Johannessen, Janne Bondi; Kosek, Michał; Priestley, Joel (Chapter / Bokkapittel / PublishedVersion; Peer reviewed, 2017)
    This paper presents and describes a modernised version of Glossa, a corpus search and results visualisation system with a user-friendly interface. The system is open source and can be easily installed on servers or even ...
  • Johannessen, Janne Bondi; Priestley, Joel; Nøklestad, Anders (Chapter / PublishedVersion / Bokkapittel, 2010)
    Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation, pages 749-758
  • Mæhlum, Petter; Haug, Dag Trygve Truslew; Jørgensen, Tollef Emil; Kåsen, Andre; Nøklestad, Anders; Rønningstad, Egil; Solberg, Per Erik; Velldal, Erik; Øvrelid, Lilja (Journal article / Tidsskriftartikkel / AcceptedVersion; Peer reviewed, 2022)
    Published in: Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC): We present the Norwegian Anaphora Resolution Corpus (NARC), ...
  • Johannessen, Janne Bondi; Priestley, Joel; Hagen, Kristin; Nøklestad, Anders; Lynum, Andre (Chapter / Bokkapittel / PublishedVersion; Peer reviewed, 2012)
    In this paper, we describe the Nordic Dialect Corpus, which has recently been completed. The corpus has a variety of features that, combined, make it an advanced tool for language researchers. These features include: ...
  • Lundquist, Bjørn; Larsson, Ida; Westendorp, Maud; Tengesdal, Eirik; Nøklestad, Anders (Journal article / Tidsskriftartikkel / PublishedVersion; Peer reviewed, 2019)
    In this article, we present the Nordic Word Order Database (NWD), with a focus on the rationale behind it, the methods used in data elicitation, data analysis and the empirical scope of the database. NWD is an online ...
  • Haug, Dag Trygve Truslew; Yildirim, Ahmet; Hagen, Kristin; Nøklestad, Anders (Chapter / Bokkapittel / PublishedVersion; Peer reviewed, 2023)
    This paper reports on efforts to improve the Oslo-Bergen Tagger for Norwegian morphological tagging. We train two deep neural network-based taggers using the recently introduced Norwegian pre-trained encoder (a BERT model ...
  • Kåsen, Andre; Hagen, Kristin; Nøklestad, Anders; Priestley, Joel (Chapter / Bokkapittel / PublishedVersion; Peer reviewed, 2019)
    This paper describes an evaluation of five data-driven part-of-speech (PoS) taggers for spoken Norwegian. The taggers all rely on different machine learning mechanisms: decision trees, hidden Markov models (HMMs), conditional ...
  • Øvrelid, Lilja; Kåsen, Andre; Hagen, Kristin; Nøklestad, Anders; Solberg, Per Erik; Johannessen, Janne Bondi (Chapter / Bokkapittel / PublishedVersion; Peer reviewed, 2018)
    This article presents the LIA treebank of transcribed spoken Norwegian dialects. It consists of dialect recordings made between 1950 and 1990, which have been digitised, transcribed, and subsequently annotated ...
  • Kåsen, Andre; Hagen, Kristin; Nøklestad, Anders; Priestley, Joel; Solberg, Per Erik; Haug, Dag Trygve Truslew (Chapter / Bokkapittel / PublishedVersion; Peer reviewed, 2022)
    This paper presents the NDC Treebank of spoken Norwegian dialects in the Bokmål variety of Norwegian. It consists of dialect recordings made between 2006 and 2012 which have been digitised, segmented, transcribed and ...
  • Kosek, Michał; Nøklestad, Anders; Priestley, Joel; Hagen, Kristin; Johannessen, Janne Bondi (Chapter / Bokkapittel / PublishedVersion; Peer reviewed, 2015)
    We present the Glossa web-based system for corpus search and results handling, focussing on two modes of visualisation implemented in the system. First, we describe the use of maps to show the geographical distribution of ...