dc.date.accessioned: 2023-07-31T10:37:26Z
dc.date.available: 2023-07-31T10:37:26Z
dc.date.issued: 2023
dc.identifier.uri: http://hdl.handle.net/10852/102833
dc.description.abstract: The field of natural language processing has seen numerous advancements since its inception during the Cold War, when attempts were made to automate the translation of Russian text into English. The 2010s were a particularly fertile decade, going by sheer research volume: the use of a class of machine learning models known as neural networks has led to substantial improvements in performance on a broad range of language-related tasks, from classification to summarisation to machine translation. In addition, some of these models have demonstrated an excellent multilingual capacity: trained to process text in English, they can, with only a small amount of extra data, show solid results across other languages. Yet these improvements, rapid as they have been, are not without their drawbacks. One such drawback is that we lack the tools to rationalise how and why these models function, particularly given the many (often arbitrary) decisions that went into their design. This dissertation adopts multiple analytical frameworks in an effort to answer questions surrounding the behaviour of multilingual neural models. How does the data they were trained on affect their performance? What linguistic capabilities can they exhibit? And finally, how do different model components help enable multilinguality? [en_US]
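To illustrate the zero-shot transfer setting the abstract alludes to (and which Paper I studies), the sketch below fine-tunes a multilingual encoder on English labels only and then applies it directly to another language. This is a minimal illustration using the Hugging Face transformers API; the model name, toy sentences, and single optimisation step are illustrative assumptions, not the experimental setup of the thesis.

# Minimal sketch of zero-shot cross-lingual transfer: fine-tune on English,
# evaluate on a language the classifier never saw labelled data for.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "bert-base-multilingual-cased"  # assumed stand-in for any multilingual encoder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Fine-tune on English labelled data (a single toy batch here; real training
# loops over a task dataset for several epochs).
batch = tokenizer(["a great film", "a dull film"], return_tensors="pt", padding=True)
loss = model(**batch, labels=torch.tensor([1, 0])).loss
loss.backward()
optimizer.step()

# Evaluate zero-shot on a target language: no labelled German data was used.
model.eval()
with torch.no_grad():
    german = tokenizer(["ein großartiger Film"], return_tensors="pt")
    print(model(**german).logits.argmax(dim=-1))  # predicted class index

The papers collected below vary the ingredients of exactly this recipe: the training corpora, the linguistic properties probed, and the architectural components involved.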
dc.language.iso: en [en_US]
dc.relation.haspart: Paper I. From Zero to Hero: On the Limitations of Zero-Shot Language Transfer with Multilingual Transformers. Anne Lauscher, Vinit Ravishankar, Goran Glavaš, Ivan Vulić. Appears in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, November 2020, pp. 4483–4499. DOI: 10.18653/v1/2020.emnlp-main.363. The article is included in the thesis. Also available at: https://doi.org/10.18653/v1/2020.emnlp-main.363
dc.relation.haspart: Paper II. Multilingual ELMo and the Effects of Corpus Sampling. Vinit Ravishankar, Andrey Kutuzov, Lilja Øvrelid, Erik Velldal. Appears in Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 2021, pp. 378–384. The article is included in the thesis.
dc.relation.haspart: Paper III. The Effects of Corpus Choice and Morphosyntax on Multilingual Space Induction. Vinit Ravishankar, Joakim Nivre. Appears in Findings of the Association for Computational Linguistics: EMNLP 2022, December 2022, pp. 4130–4139. The article is included in the thesis.
dc.relation.haspart: Paper IV. Probing Multilingual Sentence Representations with X-Probe. Vinit Ravishankar, Lilja Øvrelid, Erik Velldal. Appears in Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019), August 2019, pp. 156–168. DOI: 10.18653/v1/W19-4318. The article is included in the thesis. Also available at: https://doi.org/10.18653/v1/W19-4318
dc.relation.haspart: Paper V. Multilingual Probing of Deep Pre-Trained Contextual Encoders. Vinit Ravishankar, Memduh Gökırmak, Lilja Øvrelid, Erik Velldal. Appears in Proceedings of the First NLPL Workshop on Deep Learning for Natural Language Processing, September 2019, pp. 37–47. The article is included in the thesis.
dc.relation.haspart: Paper VI. Do Neural Language Models Show Preferences for Syntactic Formalisms? Artur Kulmizev, Vinit Ravishankar, Mostafa Abdou, Joakim Nivre. Appears in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, July 2020, pp. 4077–4091. DOI: 10.18653/v1/2020.acl-main.375. The article is included in the thesis. Also available at: https://doi.org/10.18653/v1/2020.acl-main.375
dc.relation.haspart: Paper VII. Attention Can Reflect Syntactic Structure If You Let It. Vinit Ravishankar, Artur Kulmizev, Mostafa Abdou, Anders Søgaard, Joakim Nivre. Appears in Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, April 2021, pp. 3031–3045. DOI: 10.18653/v1/2021.eacl-main.264. The article is included in the thesis. Also available at: https://doi.org/10.18653/v1/2021.eacl-main.264
dc.relation.haspart: Paper VIII. The Impact of Positional Encodings on Multilingual Compression. Vinit Ravishankar, Anders Søgaard. Appears in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, November 2021, pp. 763–777. DOI: 10.18653/v1/2021.emnlp-main.59. The article is included in the thesis. Also available at: https://doi.org/10.18653/v1/2021.emnlp-main.59
dc.relation.haspart: Paper IX. Word Order Does Matter And Shuffled Language Models Know It. Mostafa Abdou, Vinit Ravishankar, Artur Kulmizev, Anders Søgaard. Appears in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), May 2022, pp. 6907–6919. DOI: 10.18653/v1/2022.acl-long.476. The article is included in the thesis. Also available at: https://doi.org/10.18653/v1/2022.acl-long.476
dc.relation.uri: https://doi.org/10.18653/v1/2020.emnlp-main.363
dc.relation.uri: https://doi.org/10.18653/v1/W19-4318
dc.relation.uri: https://doi.org/10.18653/v1/2020.acl-main.375
dc.relation.uri: https://doi.org/10.18653/v1/2021.eacl-main.264
dc.relation.uri: https://doi.org/10.18653/v1/2021.emnlp-main.59
dc.relation.uri: https://doi.org/10.18653/v1/2022.acl-long.476
dc.title: Understanding Multilingual Language Models: Training, Representation and Architecture [en_US]
dc.type: Doctoral thesis [en_US]
dc.creator.author: Ravishankar, Vinit
dc.type.document: Doktoravhandling (Doctoral thesis) [en_US]

