RDF Graph Validation Using Rule-Based Reasoning

Tracking #: 2330-3543

Ben De Meester
Pieter Heyvaert
Dörthe Arndt
Anastasia Dimou
Ruben Verborgh

Responsible editor: 
Axel Polleres

Submission type: 
Full Paper
The correct functioning of Semantic Web applications requires that given RDF graphs adhere to an expected shape. This shape depends on the RDF graph and the application’s supported entailments of that graph. During validation, RDF graphs are assessed against sets of constraints, and found violations help refine the RDF graphs. However, existing validation approaches cannot always explain the root causes of violations (inhibiting refinement), and cannot fully match the entailments supported during validation with those supported by the application. These approaches cannot accurately validate RDF graphs, or combine multiple systems, deteriorating the validator’s performance. In this paper, we present an alternative validation approach using rule-based reasoning, capable of fully customizing the used inferencing steps. We compare to existing approaches, and present a formal ground and practical implementation “Validatrr”, based on N3Logic and the EYE reasoner. Our approach – supporting an equivalent number of constraint types compared to the state of the art – better explains the root cause of the violations due to the reasoner’s generated logical proof, and returns an accurate number of violations due to the customizable inferencing rule set. Performance evaluation shows that Validatrr is performant for smaller datasets, and scales linearly w.r.t. the RDF graph size. The detailed root cause explanations can guide future validation report description specifications, and the fine-grained level of configuration can be employed to support different constraint languages. This foundation allows further research into handling recursion, validating RDF graphs based on their generation description, and providing automatic refinement suggestions.

Solicited Reviews:
Review #1
By Ognjen Savkovic submitted on 27/Dec/2019
Review Comment:

The new version nicely addresses the concerns raised in the previous reviews, guided by the answer letter. In particular, the critical Section 5.2 is improved, and new references and evidence on N3 logic are provided. The authors also addressed the other minor language and terminological issues raised in the previous round.

Below I list more comments on further possible improvements.

- “A formal specification is available in related works [3, 17]”. I would suggest that the authors spend more space explaining that work. In fact, for the sake of completeness, I would suggest that they include Fig. 1 from [5 (in paper, or 3 in letter)] and add a couple of examples of N3 logic and its translation to FO logic, as listed in [5]. Otherwise, the paper looks incomplete to the non-N3 expert.

- Among all the listed references, and with all due respect to them, only [5] is of significant importance and quality to support the authors’ claim. Venues such as BioMed Research International and the Federated Conference on Computer Science and Information Systems are hard to take as an authority on the correct usage of a formal Web language.

- Taken from the abstract of [5]: "Notation3 Logic...applied in different reasoning engines like Cwm, EYE, and FuXi. But despite these developments, a clear formal definition of Notation3’s semantics is still missing." and "Notation3 implementations from former research projects and test cases developed for the reasoner EYE. We find that 31% of these files are understood differently by different reasoners.". I would suggest that the authors try to incorporate similar statements in their work, to make clear to the reader (especially a more formally minded one, like me) that N3 still needs to be polished.

- The issue of decidability is still only partially addressed. The point is not whether N3 validation (or proof generation) is undecidable; the point is that we do not even know whether it is undecidable (following the cited literature). Also, you have to be precise about what exactly is undecidable. “Prolog is undecidable” is not a correct statement, but checking properties over a “Turing complete language” is indeed undecidable. Please correct that in the final version.

- “The purpose of RDF-CV is not to invent a new constraint language: it is a concise ontology which is universal enough to describe any constraints expressible by any constraint language” -- this claim is too strong; maybe the authors had some more particular constraint languages in mind.

- Please move web links to footnotes (not in the body of the text).