dc.date.accessioned: 2024-02-06T18:11:23Z
dc.date.available: 2024-02-06T18:11:23Z
dc.date.created: 2023-06-13T16:19:40Z
dc.date.issued: 2023
dc.identifier.citation: Kolesnichenko, Larisa; Velldal, Erik; Øvrelid, Lilja. Word Substitution with Masked Language Models as Data Augmentation for Sentiment Analysis. Proceedings of the Second Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2023). 2023. Association for Computational Linguistics
dc.identifier.uri: http://hdl.handle.net/10852/107608
dc.description.abstract: This paper explores the use of masked language modeling (MLM) for data augmentation (DA), targeting structured sentiment analysis (SSA) for Norwegian based on a dataset of annotated reviews. Considering the limited resources for Norwegian language and the complexity of the annotation task, the aim is to investigate whether this approach to data augmentation can help boost the performance. We report on experiments with substituting words both inside and outside of sentiment annotations, and we also present an error analysis, discussing some of the potential pitfalls of using MLM-based DA for SSA, and suggest directions for future work.
dc.language: EN
dc.publisher: Association for Computational Linguistics
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: Word Substitution with Masked Language Models as Data Augmentation for Sentiment Analysis
dc.title.alternative (EN): Word Substitution with Masked Language Models as Data Augmentation for Sentiment Analysis
dc.type: Chapter
dc.creator.author: Kolesnichenko, Larisa
dc.creator.author: Velldal, Erik
dc.creator.author: Øvrelid, Lilja
cristin.unitcode: 185,15,5,48
cristin.unitname: Forskningsgruppen for språkteknologi
cristin.ispublished: true
cristin.fulltext: original
dc.identifier.cristin: 2154216
dc.identifier.bibliographiccitation: info:ofi/fmt:kev:mtx:ctx&ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.btitle=Proceedings of the Second Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2023)&rft.spage=&rft.date=2023
dc.identifier.startpage: 42
dc.identifier.endpage: 47
dc.identifier.pagecount: 150
dc.subject.nvi: VDP::Annen informasjonsteknologi: 559
dc.type.document: Bokkapittel
dc.type.peerreviewed: Peer reviewed
dc.source.isbn: 978-1-959429-73-9
dc.type.version: PublishedVersion
cristin.btitle: Proceedings of the Second Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2023)
dc.relation.project: NFR/270908



This item's license is: Attribution 4.0 International