
dc.contributor.author: Halsteinslid, Eirik Lødøen
dc.date.accessioned: 2019-08-22T23:46:43Z
dc.date.available: 2019-08-22T23:46:43Z
dc.date.issued: 2019
dc.identifier.citation: Halsteinslid, Eirik Lødøen. Addressing collinearity and class imbalance in logistic regression for statistical fraud detection. Master thesis, University of Oslo, 2019
dc.identifier.uri: http://hdl.handle.net/10852/69491
dc.description.abstract: We study how one can improve upon a logistic regression model for statistical fraud detection. Fraud data are often characterized by uneven class distributions as well as high dependence among covariates. With a focus on recreating such dependence structures found in fraud data, we propose a stochastic model from which we can generate data. The model utilizes copulas to create a highly flexible framework for generating dependent covariates. This allows for a wide range of dependence structures among the covariates, and does not put any restrictions on the marginal distributions for the covariates themselves. We use this data generation scheme to conduct a simulation study of which regularization methods for logistic regression are best suited when covariates are highly dependent. We evaluate this in terms of both prediction and variable selection. The second problem, namely an uneven class distribution, introduces challenges as well. First, selection of an appropriate measure of predictive performance is important. Second, it has been demonstrated that some methods may struggle with poor predictive performance on the under-represented class. We study how such a class imbalance affects the predictive performance and variable selection capabilities of the penalized logistic regression methods. In the last part of this thesis we model tax fraud on a real-life data set provided by The Norwegian Tax Administration. Our results show that penalized logistic regression can be a helpful tool for detecting tax fraud. [eng]
dc.language.iso: eng
dc.subject:
dc.title: Addressing collinearity and class imbalance in logistic regression for statistical fraud detection [eng]
dc.type: Master thesis
dc.date.updated: 2019-08-23T23:45:45Z
dc.creator.author: Halsteinslid, Eirik Lødøen
dc.identifier.urn: URN:NBN:no-72642
dc.type.document: Master's thesis
dc.identifier.fulltext: Fulltext https://www.duo.uio.no/bitstream/handle/10852/69491/1/Eirik_Lodoen_Halsteinslid_thesis.pdf

