dc.description.abstract | The main objective of this thesis is to analyze the demographics of NRK’s digital logged-in users, for whom consumption behaviour data is also available. In particular, we examine NRK’s reach across demographic groups by comparing the logged-in user population to the Norwegian population at large. In addition, we investigate the extent to which user demographics can be predicted from users’ digital content consumption behaviour. We address this by building classification models from known information on users and then predicting on test sets, using the results to evaluate classifier performance. We examine in detail the quality of predictions across classes and seek to determine whether they improve with the quantity of content consumed. Being able to predict user traits, such as gender and age, implies that there is some understanding of viewing patterns across demographic groups. For NRK, this could mean, for example, being able to identify and analyze variation in consumption within the population in more detail than a broad perspective allows. We find that NRK has the most room for improvement in terms of reach among youth. We show that while age classification is challenging in a 6-class setting, improvements can be made by using 4 classes instead, where we can outperform the baseline by 15.2%. For gender classification, we show that we can outperform the baseline by 17.3%. We also find that prediction accuracy tends to increase with the number of unique content items consumed, for both age group and gender prediction. | eng |