Towards a New Generation of Ontology Based Data Access

Tracking #: 2189-3402

Oscar Corcho
Freddy Priyatna
David Chaves-Fraga

Responsible editor: 
Guest Editor 10-years SWJ

Submission type: 
Ontology Based Data Access (OBDA) refers to a range of techniques, algorithms and systems that can be used to deal with the heterogeneity of data that is common inside many organisations as well as in inter-organisational settings and more openly on the Web. In OBDA, ontologies are used to provide a global view over multiple local datasets; and mappings are commonly used to describe the relationships between such global and local schemas. Since its inception, this area has evolved in several directions. Initially, the focus was on the translation of original sources into a global schema, and its materialisation, including non-OBDA approaches such as the use of Extract Transform Load (ETL) workflows in data warehouses and, more recently, in data lakes. Then OBDA-based query translation techniques, relying on mappings, were proposed, with the aim of removing the need for materialisation, something especially useful for very dynamic data sources. We think that we are now witnessing the emergence of a new generation of OBDA approaches, driven by the fact that a new set of declarative mapping languages, most of which stem from the W3C Recommendation R2RML for Relational Databases (RDB), are being created. In this paper, we discuss the reasons why new mapping languages are being introduced, and why we think that it may be relevant to work on translations among them, so as to benefit from the engines associated to each of them whenever one language and/or engine is more suitable than another. In this vision paper, we discuss the emerging concept of “mapping translation”, the basis for this new generation of OBDA, together with some of its desirable properties: information preservation and query result preservation. We also discuss several scenarios where mapping translation can be or is being already applied, even though this term has not necessarily been used in existing literature.

Solicited Reviews:
Review #1
Anonymous submitted on 12/Jun/2019
Review Comment:

This is a vision article for the SWJ 10 years special issue.

The topic is Ontology Based Data Access (OBDA). A new generation of OBDA approaches is emerging as new declarative mapping languages are being developed, stemming from the W3C Recommendation R2RML for Relational Databases (RDB). The authors present the vision of "mapping translation", arguing that it is now useful to be able to translate mapping languages into one another.

The paper first presents a review of different approaches to OBDA for integrating data sources using data translation and query translation techniques, and explains why new mapping languages are needed. Then the idea of mapping translation and its characteristic properties are explained. To motivate and support the argument for the importance of translations in the future, Section 3 presents several usage scenarios that already exist, along with challenges to be addressed in research. In conclusion, the vision is briefly summarized.

No new research results are presented in the paper, but rather a vision of the future and a related review of the past, with 24 references. I think the paper is, from a topical and content viewpoint, suitable for the special issue, where this kind of view from the past to the expected future is solicited. The vision is supported by quite a few examples from existing systems that are already employing the idea of mapping translation. The paper is well structured, well written and finished.

Review #2
By Tania Tudorache submitted on 14/Jul/2019
Review Comment:

The paper presents a vision for the future of OBDA that takes advantage of the multitude of existing source-to-RDF mapping languages. The paper proposes the creation of a mapping translation between the different mapping approaches that preserves the information and the query results.

This is a novel idea that will likely spark new research in the OBDA field. The authors have already taken several steps toward it. The paper is very clearly written and well presented.

I only have minor suggestions:

- Please rephrase (maybe split into two sentences), not clear: "so as to benefit from the engines associated to each of them whenever one language and/or engine is more suitable than another."

- In general, the abstract contains several long sentences that make it harder to follow. I suggest using shorter sentences, which would improve readability considerably and make the message even crisper.

- It would be great to add a figure or table that shows the proposed mapping translations for the four use cases. That would help the reader a lot in keeping track of the different proposals.

- Introduce "xR2RML"

- "RDF-based Knowledge graphs" -> RDF-based Knowledge Graphs or RDF-based knowledge graphs

Review #3
By Philippe Cudre-Mauroux submitted on 15/Jul/2019
Minor Revision
Review Comment:

This short paper, written for the 10-years special issue of the Semantic Web Journal, introduces the vision of mapping translation; the authors convincingly argue that there is value in developing algorithms and tools to translate among different mapping languages (i.e., to offer facilities for mapping translation in addition to data or query translation in an OBDA setting). I find their proposition both intriguing and exciting, as this would allow for a more vibrant ecosystem of mapping languages as well as for more meaningful comparisons between existing mapping languages.

The paper is overall well-written and interesting. My main concern relates to the lack of details/context/references for a number of sections. First, I think that it would be good to introduce a few more concepts / prior work in terms of RDBMS mappings in the introduction (e.g., introducing perhaps data exchange or query containment, as this might be useful for Section 2).

The two properties put forward in Section 2 make a lot of sense. However, I could not quite understand the interplay between the two, or why one property would not suffice (or what, intuitively, the difference between weak and strong semantics preservation is). Also, it is unclear how this relates to more classical work, e.g., query containment (which was typically used to qualify query mapping properties), and how semantic mismatch (e.g., translation of mappings targeting attributes with potentially overlapping but different sets of instances) could be handled in practice.

Section 3 is also compelling; however, I think it would be nice to add more explicit information about the challenges that are still open for each scenario (each subsection mostly ends by presenting some recent work in a subdomain, without explicitly mentioning the open challenges in that context). Finally, I think that it might be worth discussing how mapping translation and decentralized integration / PDMSs (peer data management systems) could coexist or build upon each other (it seems to me that the combination of the two could create some interesting opportunities).