Abstract
An artificial neural network (ANN) is a powerful machine learning method used in many modern big data applications, such as facial recognition, machine translation, and cancer diagnostics. A common issue with ANNs is that they typically have millions of trainable parameters and therefore tend to overfit the training data. This is especially problematic in applications where reliable uncertainty estimates are important. Bayesian neural networks (BNNs) can improve on this, since they include parameter uncertainty in the model. In addition, latent binary Bayesian neural networks (LBBNNs) are able to sparsify the network to a large degree without losing predictive power. In this thesis, we build on the LBBNN model in two ways: first, by using the local reparameterization trick (LRT) to sample the hidden units directly, and second, by applying normalizing flows to the variational posterior distribution of the LBBNN parameters. Experimental results show that using the LRT significantly improves predictive accuracy while also being more computationally efficient. Using normalizing flows improves accuracy further. In both cases, these results are obtained with a degree of sparsity similar to or higher than that of the original LBBNN model.