Raw Audio End-to-End Deep Learning Architectures for Sound Event Detection

dc.contributor.author	Falch, Arvid Andreas
dc.date.accessioned	2023-08-21T22:03:16Z
dc.date.available	2023-08-21T22:03:16Z
dc.date.issued	2023
dc.identifier.citation	Falch, Arvid Andreas. Raw Audio End-to-End Deep Learning Architectures for Sound Event Detection. Master thesis, University of Oslo, 2023
dc.identifier.uri	http://hdl.handle.net/10852/103569
dc.description.abstract	This thesis proposes deep learning architectures for sound event detection that aim to work fully end-to-end, taking raw audio as input, and that can be directly compared to models using fixed graphical time-frequency representations as input. The primary objective is to assess the effectiveness of raw audio input compared with conventional fixed graphical time-frequency representations. To achieve this, pairs of similar models based on convolutional recurrent neural networks, which are commonly used in sound event detection, are trained on either raw audio or fixed graphical time-frequency representations to enable a direct comparison. The findings show that the proposed architectures, operating on raw audio input, can achieve performance comparable to that of models based on fixed graphical time-frequency representations. Moreover, in specific applications where high temporal resolution is important, the architectures using raw audio input outperform their fixed graphical counterparts. This finding highlights the potential of raw audio end-to-end deep learning architectures for capturing the fine-grained temporal information critical for accurate sound event detection.	eng
dc.language.iso	eng
dc.subject	deep learning
dc.subject	audio analysis
dc.subject	machine learning
dc.subject	raw audio input
dc.subject	Sound event detection
dc.title	Raw Audio End-to-End Deep Learning Architectures for Sound Event Detection	eng
dc.type	Master thesis
dc.date.updated	2023-08-22T22:01:25Z
dc.creator.author	Falch, Arvid Andreas
dc.type.document	Masteroppgave

Files in this item

Name:: Master_thesis_Arvid_Falch.pdf
Size:: 15.14Mb
Format:: application/

Appears in the following Collection

Musikkvitenskap [579]
