dc.date.accessioned: 2020-04-11T18:27:05Z
dc.date.available: 2020-04-11T18:27:05Z
dc.date.created: 2018-11-15T21:17:29Z
dc.date.issued: 2018
dc.identifier.citation: Xiao, Guohui; Hovland, Dag; Bilidas, Dimitris; Rezk, Martin; Giese, Martin; Calvanese, Diego. Efficient Ontology-Based Data Integration with Canonical IRIs. Lecture Notes in Computer Science. 2018, 10843 LNCS, 697-713
dc.identifier.uri: http://hdl.handle.net/10852/74467
dc.description.abstract: In this paper, we study how to efficiently integrate multiple relational databases using an ontology-based approach. In ontology-based data integration (OBDI), an ontology provides a coherent view of multiple databases, and SPARQL queries over the ontology are rewritten into (federated) SQL queries over the underlying databases. Specifically, we address the scenario where records with different identifiers in different databases can represent the same entity. The standard approach in this case is to use sameAs to model the equivalence between entities. However, the standard semantics of sameAs may cause an exponential blow-up of query results, since all possible combinations of equivalent identifiers have to be included in the answers. The large number of answers is not only detrimental to the performance of query evaluation, but also makes the answers difficult to understand due to the redundancy they introduce. This motivates us to propose an alternative approach, which is based on assigning canonical IRIs to entities in order to avoid redundancy. Formally, we present our approach as a new SPARQL entailment regime and compare it with the sameAs approach. We provide a prototype implementation and evaluate it in two experiments: in a real-world data integration scenario at Statoil and in an experiment extending the Wisconsin benchmark. The experimental results show that the canonical IRI approach is significantly more scalable.
dc.language: EN
dc.title: Efficient Ontology-Based Data Integration with Canonical IRIs
dc.type: Journal article
dc.creator.author: Xiao, Guohui
dc.creator.author: Hovland, Dag
dc.creator.author: Bilidas, Dimitris
dc.creator.author: Rezk, Martin
dc.creator.author: Giese, Martin
dc.creator.author: Calvanese, Diego
cristin.unitcode: 185,15,5,80
cristin.unitname: Centre for Scalable Data Access
cristin.ispublished: true
cristin.fulltext: postprint
cristin.qualitycode: 1
dc.identifier.cristin: 1631206
dc.identifier.bibliographiccitation: info:ofi/fmt:kev:mtx:ctx&ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=Lecture Notes in Computer Science&rft.volume=10843 LNCS&rft.spage=697&rft.date=2018
dc.identifier.jtitle: Lecture Notes in Computer Science
dc.identifier.volume: 10843 LNCS
dc.identifier.startpage: 697
dc.identifier.endpage: 713
dc.identifier.doi: https://doi.org/10.1007/978-3-319-93417-4_45
dc.identifier.urn: URN:NBN:no-77572
dc.type.document: Tidsskriftartikkel (journal article)
dc.type.peerreviewed: Peer reviewed
dc.source.issn: 0302-9743
dc.identifier.fulltext: Fulltext https://www.duo.uio.no/bitstream/handle/10852/74467/2/main10494.pdf
dc.type.version: AcceptedVersion

