dc.contributor.author: Riiser, Ingvild
dc.date.accessioned: 2024-02-22T00:32:22Z
dc.date.available: 2024-02-22T00:32:22Z
dc.date.issued: 2023
dc.identifier.citation: Riiser, Ingvild. Privacy and Utility Evaluation of Synthetic Data for Multi-State Time-to-Event Applications. Master thesis, University of Oslo, 2023
dc.identifier.uri: http://hdl.handle.net/10852/108497
dc.description.abstract: Synthetic data has gained attention in recent years because of its ability to safeguard the privacy of real data points while still ensuring data utility. These properties are beneficial in many domains and sectors working with sensitive data, particularly for public agencies, which govern large amounts of data on individuals. Most previous work on synthetic data centres on tabular data, and while some research has been done on synthetic survival data, the topic of synthetic multi-state time-to-event (MS-TTE) data has yet to be considered. In this thesis, we develop a novel semi-parametric approach to synthesising MS-TTE data, which combines a non-parametric tabular synthesiser with a parametric multi-state survival regression model. We use Weibull regression and both clock-reset and clock-forward models. Moreover, we extend our approach into an MS-TTE model with a differential privacy guarantee. We also introduce a novel differentially private Weibull regression model. We review selected methods for evaluating the privacy and utility of synthetic data. The standard approach evaluates synthetic data based on a single data set, which does not account for the variance between synthetic data sets generated from the same synthesiser. We propose a distance-based evaluation framework which adjusts for this variance. Using an open-access data set, we demonstrate our proposed synthesisers for MS-TTE data with and without differential privacy. Furthermore, we exemplify the evaluation of these synthesisers and their synthetic data by adapting reviewed methods to an MS-TTE setting and utilising our proposed evaluation framework. [eng]
dc.language.iso: eng
dc.subject: multi-state time-to-event data
dc.subject: survival analysis
dc.subject: synthetic data
dc.subject: survival regression
dc.subject: data privacy
dc.subject: data utility
dc.subject: Weibull regression
dc.subject: membership inference attack
dc.subject: differential privacy
dc.title: Privacy and Utility Evaluation of Synthetic Data for Multi-State Time-to-Event Applications [eng]
dc.type: Master thesis
dc.date.updated: 2024-02-23T00:31:08Z
dc.creator.author: Riiser, Ingvild
dc.type.document: Masteroppgave

