dc.contributor.author: Cetinoglu, Ece
dc.date.accessioned: 2023-09-21T22:00:48Z
dc.date.available: 2023-09-21T22:00:48Z
dc.date.issued: 2023
dc.identifier.citation: Cetinoglu, Ece. Text-Based Prediction of Dwelling Condition. Master thesis, University of Oslo, 2023
dc.identifier.uri: http://hdl.handle.net/10852/105216
dc.description.abstract: Text-based regression is understudied compared to other tasks, and the literature on it is limited. This thesis aims to contribute to the topic by solving a real-life problem rather than experimenting on a benchmark. Our objective was to predict the condition score, a value between 0 and 3, of dwellings in the Norwegian real estate market from features extracted from the text of their listing advertisements. Usually, the condition of a dwelling is described in a publicly available condition report written by a certified assessor. We aspired to obtain a benchmark method that can be used to predict the score when the condition report is missing. We approached the regression task by creating progressively more complex models and experimented with them to improve accuracy, for instance through hyperparameter tuning and oversampling. The results show that while the BERT-based regression models performed best, simpler regression methods trained on Bag of Words (BoW) features extracted from the text produced comparable results. Among the models explored in this thesis, the gradient boosting regression model trained on BoW features and the unfrozen NB-BERT-BASE model, both trained on the oversampled data set, stood out, yielding mean absolute errors of 0.1835 and 0.1578, respectively. These results present convincing evidence that text-based regression with BoW-based and BERT-based approaches is a viable and promising downstream task. This thesis can potentially contribute to knowledge about the real estate market and introduces a novel application of Natural Language Processing, a field that traditionally emphasizes classification tasks rather than the prediction of continuous variables. [eng]
dc.language.iso: eng
dc.subject:
dc.title: Text-Based Prediction of Dwelling Condition [eng]
dc.type: Master thesis
dc.date.updated: 2023-09-21T22:00:48Z
dc.creator.author: Cetinoglu, Ece
dc.type.document: Masteroppgave

