Review Comment:
# Summary
In this work, the authors argue for using a formal rule-based approach for RDF graph validation. The addressed problem is very relevant, and the paper introduces interesting ideas (such as a single constraint language that consolidates OWA and CWA). I appreciate the authors' effort in collecting all the work on the addressed problem and in trying to treat it under one framework, which is very important for the Semantic Web community (and often neglected by similar work in our community). I also appreciate the effort made to address the concerns I raised in my previous review.
However, I still have serious concerns about the correctness and academic rigor of the work. My comments below build on my previous review.
(2) Taken from my previous review:
[ The biggest concern is that the authors did not clearly state what is the rule language they are using (and this should be the main contribution of the paper!). There is no formal specification (grammar, syntax) nor semantics. ]
This is again my main issue. It is hard to validate claims such as "(b) an accurate number of violations is returned by using a custom set of inferencing rules up to at least OWL-RL complexity and expressiveness;" and "(c) the number of supported constraint types is equivalent to existing validation approaches;". I see the authors' effort in providing several comparison examples, but examples do not prove the claims!
I have now investigated the underlying logic the authors adopted, N3Logic, and I think the main problem is precisely that the authors rely on it. Is there any work at a relevant conference that is based on N3Logic, or any wider adoption of it? It seems to have been proposed almost a decade ago in [7] and [8], but in a rather informal way, and then abandoned. Even the authors of N3Logic state that they were not clear about the expressiveness of the language.
In [8],
"A formal categorization of N3Logic is complicated as it differs from most traditional logics in expressivity. ... However, unlike DL, N3Logic is not decidable, limiting expressivity in other ways motivated by the Web considerations discuss in this paper. As such, developing a formal model theory for N3Logic is quite challenging, and is the focus of current work."
It then seems that the language was not adopted by the community for further investigation (at least the authors provide no further insight into this).
Along the same lines is the comparison between the expressivity of the authors' proposed constraints and that of languages such as SHACL and ShEx, which is based on the PhD work of Hartmann. I had a look at that work, and it also leans towards an informal way of defining things. With all respect, it has not even been published at a peer-reviewed conference (or at least no such reference is given), which makes it hard to verify the claims there as well.
In general, many citations in this work refer to non-peer-reviewed articles, which makes it hard to check the correctness of the claims and to understand their contributions with respect to the rest of the community.
Taken from my previous review:
[Almost half of the paper is consumed on criticizing existing approaches but then there is no clear proposal at the end.]
For my taste, if you claim that the main contribution is a new approach to constraint validation, then you should introduce the constraint language at the beginning, or as soon as possible (not on page 12 of 24). As it stands, you raise large expectations but deliver on them poorly.
(2) Taken from my previous review:
[In particular, my impression is that the authors did not carefully consider the work in [21] and [22] that discusses in detail combining CWA and OWL(OWA), nor more recent work on SHACL and ShEX.]
The authors provided more references on the above, but it does not seem that they put effort into understanding these works and comparing them with their own usage of N3Logic. Again, this is partially a consequence of selecting N3Logic, as commented above.
(3) Taken from my previous review:
[Secondly, only the intro (sec 1) is of a reasonable quality. The style of writing and academic rigor significantly gets worse afterwards.]
The quality of the presentation has improved but is still not at a high level. Often new terminology is used without being introduced beforehand, or is constructed in a way that makes it hard to parse (even after several readings).
E.g.,
- Terms like
"resource r_firstname"
"compound constraint"
I find hard to understand, because they do not fit standard logic terminology (or Semantic Web terminology) in the context in which they are used. E.g., "compound constraint": is this used elsewhere in the literature? I would just say "constraint". As for "resource r_firstname": is this a constraint formula? How can it be a resource?
- "Problem 2 (P2): the number of found violations depends on the supported entailments." This is a known problem already addressed in [54] and [71] (and in many works afterwards).
- "To solve aforementioned observed validation problems, we pose following hypotheses" -- why do you call these hypotheses (hypothesis = a supposition or proposed explanation made on the basis of limited evidence as a starting point for further investigation)? I think they are more like your contributions. Alternatively, just drop part 1.2.
- "declarative logic" -- what is declarative logic? Probably you just meant mathematical logic.
- "In this work, we propose an alternative validation approach using rule-based reasoning. " -- as far as I am aware, almost all approaches to validation are rule-based in some sense (especially in relational and graph databases).
- "Problem 1 (P1):" I get the idea, but the writing needs to be more precise.
- "These requirements are not common for Semantic Web logics": what are Semantic Web logics?
- "Semantic Web rule-based reasoning": I am not sure this is a known term. I find the whole paragraph confusing.
- Section 5 should be the main section in my view, yet it is called "Application". Why? Then 5.2 is called "Technologies".
- RDF-CV: RDF-CV is also important for the overall argument, but it is not introduced either. It is then not clear what exactly Listing 2 (or even Listing 3) specifies, other than a general intuition; that is, what is the semantics of rdfcv:leftProperties, rdfcv:contextClass, etc.?