PRSC: from PG to RDF and back, using schemas

Tracking #: 3675-4889

Authors: 
Julian Bruyat
Pierre-Antoine Champin
Lionel Médini
Frédérique Laforest

Responsible editor: 
Stefan Schlobach

Submission type: 
Full Paper
Abstract: 
Property graphs (PG) and RDF graphs are two popular database graph models, but they are not interoperable: data modeled in PG cannot be directly integrated with other data modeled in RDF. This lack of interoperability also impedes the use of the tools of one model when data are modeled in the other. In this paper, we propose PRSC, a configurable conversion to transform a PG into an RDF graph. This conversion relies on PG schemas and user-defined mappings called PRSC contexts. We also formally prove that a subset of PRSC contexts, called well-behaved contexts, can be used to reverse back to the original PG, and provide the related algorithm. Algorithms for conversion and reversion are available as open-source implementations.
Tags: 
Reviewed

Decision/Status: 
Accept

Solicited Reviews:
Review #1
Anonymous submitted on 23/Apr/2024
Suggestion:
Accept
Review Comment:

The revised manuscript introducing the PRSC tool for interoperability between Property Graphs (PGs) and RDF Graphs has been significantly improved in response to previous comments from reviewers. The authors have made commendable efforts to address the initial concerns raised, particularly in clarifying the motivation, enhancing the originality section, and expanding discussions on the significance of results and the soundness of their approach.

Motivation
The authors have clearly articulated the motivation behind the PRSC tool, presenting a detailed and compelling narrative that effectively communicates the challenges and limitations in current graph database management systems, specifically regarding interoperability. The revised manuscript includes a clear problem statement delineating the specific issues the PRSC tool aims to solve.

Significance of the results
The revisions have effectively highlighted the significance of the results. The authors have expanded their discussion on how the PRSC tool addresses key interoperability challenges and enables cross-model data utilization, which is crucial for data integration and application development across various domains. The manuscript now also emphasizes the implications of their findings for future research and development, outlining potential areas for further exploration and enhancement of the PRSC tool.

Soundness
The soundness of the research is well-established in the revised manuscript. The formal definitions and proof presented provide a solid foundation for reversibility and data integrity claims.

Clarity
The clarity of the manuscript has been considerably improved. The organizational structure and logical flow make it easier for readers to grasp the research's purpose, methods, and implications.

Review #2
By Olaf Hartig submitted on 20/Jun/2024
Suggestion:
Accept
Review Comment:

I am happy to see that the authors have addressed all comments of my review of the previously submitted version of this manuscript. In particular, I appreciate that the authors added the complexity analysis that I suggested. Therefore, I am okay with accepting the manuscript now. However, there are still a few things that the authors need to fix when preparing the camera-ready version:

* The second-to-last paragraph of Section 1 (lines 22-33 on page 2) first talks about the authors' earlier work on the PREC mapping language and, then, says that this paper introduces a new mapping language, without any motivation. I am expecting an argument why a new language is needed given that the same authors have proposed another such language before.

* Numbers for bibliographic references (i.e., something like "[9]") are *not* words to be used as nouns in sentences. Hence, it is not correct to write something like "..in the ... section of [9]" (page 2) or "In [10], the ..." (also page 2) or "..of Angles in [3].." (page 6). There may be more examples of this issue throughout the paper. These need to be fixed.

* page 3, line 25: ".. through a nested RDF-star triple."

* Def.3 on page 6, first bullet point: N_pg and E_pg are not lists but sets.

* page 7: The notion of a "renaming function" is defined (Def.5) and also used (Def.6 and Def.7) as if it is a standalone concept that is independent of anything, which it is not. Instead, this notion is dependent on four sets, named in Def.5 as N_1, N_2, E_1, and E_2. This dependency needs to be reflected in the name of the notion, both where it is defined and where it is used. So, the definition needs to say: "..a renaming function from N_1 and E_1 to N_2 and E_2 is.." Similarly, wherever the notion is used (such as in Def.6 and Def.7), the naming needs to be expanded in the same way.

* Def.11: If the symbol $rdf$ is meant to denote an RDF graph, then it is incorrect to say "For all RDF graphs rdf, .." Instead, it must be "For every RDF graph rdf, .."

* Def.11: Also, the definition talks about a "list of blank nodes" but actually defines a set.

* The first part of the third sentence of Sec.4.4 ("Hence ... of a PG and ...") is totally unclear to me and should be rephrased.

* Same issue with the text of point (1) in the paragraph below Def.18.

* Page 11, line 49: "..a context are valid.." --> "..a context $ctx$ are valid.."

* Algorithm 1 is stated to be "an algorithmic view of the prsc function presented by Definition 21." That's not entirely true. The \beta function in the definition has a case in which it returns some value called "undefined" (see line 33 on page 13), which is not in the algorithm.

* Def.24: The notion of a "signature" should first be defined separately (i.e., independent of the notion of well-behaved contexts), and then be used in Def.24 (instead of integrating the definition of "signature" implicitly in Def.24).

* Theorem 2: list --> set