Causal discovery with Bayesian networks

dc.contributor.author	Syed, Rayyan Ahmad Shah
dc.date.accessioned	2023-08-22T22:03:19Z
dc.date.available	2023-08-22T22:03:19Z
dc.date.issued	2023
dc.identifier.citation	Syed, Rayyan Ahmad Shah. Causal discovery with Bayesian networks. Master thesis, University of Oslo, 2023
dc.identifier.uri	http://hdl.handle.net/10852/103704
dc.description.abstract	One of the most widely used approaches to causal discovery is based on causal models represented as Bayesian networks (BNs). In the most challenging cases of causal discovery, the underlying BN structure is not known and must be inferred in a way that accounts for the uncertainty in predicting that structure. The structural uncertainty can then be translated into uncertainty about a causal relationship between variables, reflecting how likely that relationship is given data assumed to come from the underlying causal model. Different methods account for such uncertainty. We focus on Bayesian model averaging over structures, implemented through Markov chain Monte Carlo (MCMC) and a state-of-the-art dynamic programming algorithm. The usual way of expressing the parameters of a causal model is through conditional probability tables (CPTs). It has been demonstrated that more expressive models that account for additional structure within each CPT may lead to improved prediction over traditional causal models. We represent the regularities within CPTs through more refined independence relations, defined according to the concept of context-specific independence (CSI), in the form of CSI-trees, which are learned with a greedy algorithm. To identify plausible models, we use a score-equivalent Bayesian score. An optimal combination of these models is found with the help of Bayesian model averaging in order to obtain the posterior distribution over the causal target of interest. These methodologies were tested on synthetic data generated from known benchmark Bayesian networks. A comparison between CPTs and CSI-trees in terms of AUC shows that no significant improvement was made on the tested networks. However, for some data sizes some improvement could be seen.
One reason might be that no exact CSI-tree representation of the conditional distributions exists for these networks, since the true distributions are defined through CPTs. Another reason might be that it was necessary to regularize the model fit with a prior over model structures in order to avoid overfitting in the learning process. The prior used in this work might have been suboptimal. A comparison between MCMC and the state-of-the-art dynamic programming algorithm shows that the results in terms of AUC are similar; however, the convergence of the MCMC over structures is slow for some of the tested networks.	eng
dc.language.iso	eng
dc.subject
dc.title	Causal discovery with Bayesian networks	eng
dc.type	Master thesis
dc.date.updated	2023-08-23T22:00:47Z
dc.creator.author	Syed, Rayyan Ahmad Shah
dc.type.document	Masteroppgave
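
The abstract describes turning uncertainty about the network structure into a posterior probability for a single causal relationship. The following is a minimal sketch of that averaging step only, not the thesis implementation: it assumes a small hand-picked set of candidate structures with made-up log scores, whereas the thesis averages over structures via MCMC and dynamic programming. The function name `edge_posterior`, the variable names, and all numbers are hypothetical.

```python
import math


def edge_posterior(log_scores, structures, edge):
    """Posterior probability that `edge` is present, by model averaging.

    log_scores[i] is the unnormalised log posterior score of structures[i];
    each structure is a set of directed edges (parent, child).
    """
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_scores)
    weights = [math.exp(s - m) for s in log_scores]
    total = sum(weights)
    # Sum the normalised weights of the structures that contain the edge.
    return sum(w for w, g in zip(weights, structures) if edge in g) / total


# Hypothetical candidate structures over variables A, B, C.
structures = [
    {("A", "B")},
    {("B", "A")},
    {("A", "B"), ("B", "C")},
    set(),
]
# Made-up log scores; the two single-edge graphs are Markov equivalent,
# so a score-equivalent score assigns them the same value.
log_scores = [-10.0, -10.0, -9.0, -12.0]

p_ab = edge_posterior(log_scores, structures, ("A", "B"))
print(f"P(A -> B | data) is approximately {p_ab:.3f}")
```

With an exhaustive or sampled collection of structures, the same weighted sum can be applied to any structural feature, not only a single edge.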

Files in this item

Name:: Master-thesis-Rayyan-Syed.pdf
Size:: 1.432Mb
Format:: application/

Appears in the following Collection

Matematisk institutt [3781]
