dc.contributor.author: Kujime, Yoshimi
dc.date.accessioned: 2022-08-21T22:00:27Z
dc.date.available: 2022-08-21T22:00:27Z
dc.date.issued: 2022
dc.identifier.citation: Kujime, Yoshimi. Priority boosting and Lasso-based block boosting: two novel approaches for multi-omics analysis. Master thesis, University of Oslo, 2022
dc.identifier.uri: http://hdl.handle.net/10852/95364
dc.description.abstract: Developments in high-throughput technology have made multi-omics data available on a large scale. Multi-omics data are datasets consisting of different types of high-dimensional molecular variables, such as transcriptomic, proteomic, and methylation data. In recent decades, predictive modeling incorporating different types of data has attracted much attention. This thesis presents two novel boosting approaches to build a regression model for high-dimensional data consisting of multiple groups of variables, such as multi-omics data. One method is priority boosting and the other is Lasso-based block boosting. Priority boosting processes data in a hierarchical manner by setting a priority order among groups, which builds a model incorporating prior knowledge and/or practical constraints. Lasso-based block boosting (hereinafter called LBboost), on the other hand, does not have a hierarchical structure. In this method, fitting is performed separately in each group at each boosting round, and the model is updated iteratively by comparing the estimates from each group and selecting the group that gives the best update, which we call the subset-updating approach. Both priority boosting and LBboost have several desirable properties, especially in the high-dimensional setting, such as automated variable selection, shrinkage of estimates, and interpretability of the resulting models. We applied these two methods to simulation data and a real multi-omics dataset, and compared their prediction performance with that of three other methods: priority-Lasso, Lasso, and componentwise gradient boosting (glmboost). Priority boosting tended to provide sparser prediction models that favor predictors in blocks with higher priorities over predictors in blocks with lower priorities. These results suggest that priority boosting can be regarded as a practical method that is easy to apply and interpret.
The resulting models of LBboost, on the other hand, tended to be less sparse than those of the other methods. In terms of prediction accuracy, however, it showed relatively good results. It could often reach better or similar prediction accuracy compared to priority boosting and priority-Lasso in our datasets. Furthermore, the results show that LBboost works well in situations where glmboost and Lasso are prone to overfitting. [eng]
dc.language.iso: eng
dc.subject: boosting
dc.subject: multi-omics
dc.title: Priority boosting and Lasso-based block boosting: two novel approaches for multi-omics analysis [eng]
dc.type: Master thesis
dc.date.updated: 2022-08-21T22:00:27Z
dc.creator.author: Kujime, Yoshimi
dc.identifier.urn: URN:NBN:no-97884
dc.type.document: Masteroppgave
dc.identifier.fulltext: Fulltext https://www.duo.uio.no/bitstream/handle/10852/95364/47/Yoshimi-Kujime-master-s-thesis.pdf

