
dc.contributor.author: Nybø, Olav
dc.date.accessioned: 2023-08-24T22:02:41Z
dc.date.available: 2023-08-24T22:02:41Z
dc.date.issued: 2023
dc.identifier.citation: Nybø, Olav. Generation probability of immune receptors and full sequence implanting in adaptive immune receptor repertoires. Master thesis, University of Oslo, 2023
dc.identifier.uri: http://hdl.handle.net/10852/103899
dc.description.abstract: The adaptive immune system protects the body by remembering previously encountered antigens so that it can react more efficiently when it encounters the same antigen again. The adaptive immune receptors on T-cells and B-cells, collectively called the adaptive immune receptor repertoire, play a key role in recognizing antigens. Analyzing these immune repertoires deepens our understanding of them and aids the development of diagnostic technologies. The immune signal is the set of features in the adaptive immune receptor repertoire that are associated with antigen binding or disease status. Simulating immune signals gives precise control over the ground truth when the simulated data are used to assess machine learning models. One approach to simulating an immune signal is to assume that it takes the form of full sequences. Full sequence implanting simulates the effect of an immune event on an immune repertoire dataset by implanting one or more sequences many times into immune repertoires. Because the natural process that generates immune receptors is biased, receptors differ greatly in their probability of generation. This generation probability can be computed. However, if a full sequence that is unlikely to be generated naturally is implanted many times in a dataset, it can become an easily detectable outlier. This can produce unrealistic simulated data and, in turn, misleading benchmarking results. The full sequence implanting in immuneML, an open-source machine learning platform for immune repertoires, can produce such generation probability outliers. This thesis presents two implementations of signal implanting strategies that take different approaches to solving this generation probability outlier problem and that extend the full sequence simulation in immuneML.
The relationship between the generation probability of sequences and how often the sequences appear was analyzed in synthetic and experimental datasets to examine how the signal implanting strategies should behave and which parameters the user should control. Finally, a method that detects candidate generation probability outliers was used to assess the new signal implanting strategies. Both new strategies implanted the signal in such a way that the generation probability outlier problem could not be reliably exploited. The two strategies have different strengths and weaknesses, and both can be used to simulate full sequence immune signals for different types of machine learning models. [eng]
dc.language.iso: eng
dc.subject: immune repertoire
dc.subject: immuneML
dc.subject: AIRR
dc.subject: V(D)J recombination
dc.subject: adaptive immune receptor
dc.subject: generation probability
dc.subject: AIR
dc.title: Generation probability of immune receptors and full sequence implanting in adaptive immune receptor repertoires [eng]
dc.type: Master thesis
dc.date.updated: 2023-08-25T22:04:03Z
dc.creator.author: Nybø, Olav
dc.type.document: Masteroppgave


