dc.date.accessioned: 2021-05-07T17:53:02Z
dc.date.available: 2021-05-07T17:53:02Z
dc.date.created: 2021-05-06T10:23:11Z
dc.date.issued: 2021
dc.identifier.citation: Gramuglia, Emanuele; Storvik, Geir Olve; Stakkeland, Morten. Clustering and automatic labelling within time series of categorical observations - with an application to marine log messages. The Journal of the Royal Statistical Society, Series C (Applied Statistics). 2021
dc.identifier.uri: http://hdl.handle.net/10852/85991
dc.description.abstract: System logs or log files containing textual messages with associated time stamps are generated by many technologies and systems. The clustering technique proposed in this paper provides a tool to discover and identify patterns or macrolevel events in this data. The motivating application is logs generated by frequency converters in the propulsion system on a ship, while the general setting is fault identification and classification in complex industrial systems. The paper introduces an offline approach for dividing a time series of log messages into a series of discrete segments of random lengths. These segments are clustered into a limited set of states. A state is assumed to correspond to a specific operation or condition of the system, and can be a fault mode or a normal operation. Each of the states can be associated with a specific, limited set of messages, where messages appear in a random or semi-structured order within the segments. These structures are in general not defined a priori. We propose a Bayesian hierarchical model where the states are characterised both by the temporal frequency and the type of messages within each segment. An algorithm for inference based on reversible jump MCMC is proposed. The performance of the method is assessed by both simulations and operational data.
dc.language: EN
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: Clustering and automatic labelling within time series of categorical observations - with an application to marine log messages
dc.type: Journal article
dc.creator.author: Gramuglia, Emanuele
dc.creator.author: Storvik, Geir Olve
dc.creator.author: Stakkeland, Morten
cristin.unitcode: 185,15,13,0
cristin.unitname: Matematisk institutt (Department of Mathematics)
cristin.ispublished: true
cristin.fulltext: postprint
cristin.qualitycode: 2
dc.identifier.cristin: 1908388
dc.identifier.bibliographiccitation: info:ofi/fmt:kev:mtx:ctx&ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=The Journal of the Royal Statistical Society, Series C (Applied Statistics)&rft.volume=&rft.spage=&rft.date=2021
dc.identifier.jtitle: The Journal of the Royal Statistical Society, Series C (Applied Statistics)
dc.identifier.doi: https://doi.org/10.1111/rssc.12483
dc.identifier.urn: URN:NBN:no-88652
dc.type.document: Tidsskriftartikkel (Journal article)
dc.type.peerreviewed: Peer reviewed
dc.source.issn: 0035-9254
dc.identifier.fulltext: Fulltext https://www.duo.uio.no/bitstream/handle/10852/85991/4/rssc.12483.pdf
dc.type.version: PublishedVersion
cristin.articleid: rssc.12483
dc.relation.project: NFR/237718


This item's license is: Attribution 4.0 International