Review Comment:
The paper was significantly rewritten, and the new version is without a doubt much better than the previous one. The paper now includes a first-order logic formalisation of the ontology they consider, which makes the intended contribution and the correctness of their results much clearer.
The paper still lacks formal proofs. For the first main contribution of validating knowledge graphs against the (axiomatization of the) time ontology, this is probably fine, as the FO formalisation and the characterisation in terms of standard orders are clear enough. But there are a few places where a little more theory would go a long way, and an adequate formal framework would spare the authors from ad hoc discussions and arguments by example. Two concrete aspects are the expressiveness limitations of SHACL and the uniqueness of completions, as discussed below.
But overall, I think the first contribution of validating KGs against the time ontology is now clear and valuable enough.
I am a bit less convinced about the way the second contribution is presented. I could not sympathise more enthusiastically with the cause of making inference a first-class citizen of any validation process. The very foundation of KGs is knowledge and semantics, and implicit tuples are thus a fundamental part of KGs that cannot be ignored during validation.
The authors advocate distinct SHACL dialects analogous to OWL profiles, which is certainly a good idea. But SHACL is for validation, and OWL for inference. Why not create OWL profiles that provide additional expressive power? After all, the goal of OWL is precisely to infer implicit facts, and SHACL was defined for validation, not for inference. If one takes the syntax of SHACL and uses it as a language for inference, is it OWL or SHACL? I am not against using SHACL-like languages for inference, but more in favour of separating the intended use from specific syntactic choices.
I already complained in the first review about the very procedural way that inference is viewed and defined. Unfortunately, this is still present in the paper. While it is not necessarily a reason for rejection, it is a pity that the authors do not provide a definition of completion or its intended consequences. They talk about adding facts and about repeated rule application, but they never define declaratively what an implied fact is or what a completion is. They equate the semantics with the procedure. They also seem to rule out the existence of other procedures once the semantics are in place. Materialisation is the most straightforward way to take inferred facts into account, but not the only one.
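To make the distinction concrete, here is my own minimal sketch (not the paper's formalism; the rule and facts are hypothetical): a completion can be defined declaratively as the least fixed point of the rules' immediate-consequence operator, and naive materialisation is then merely one procedure that happens to compute this semantics.

```python
# Declarative view: the completion of a fact set F under monotone rules R
# is the least fixed point of the immediate-consequence operator T_R,
# i.e. the smallest set S with F included in S and T_R(S) included in S.
def immediate_consequences(facts, rules):
    """T_R(facts): all facts derivable by one application of some rule."""
    derived = set()
    for rule in rules:
        derived |= rule(facts)
    return derived

def completion(facts, rules):
    """Naive materialisation: one possible procedure computing the
    declaratively defined least fixed point containing `facts`."""
    current = set(facts)
    while True:
        new = immediate_consequences(current, rules) - current
        if not new:
            return current
        current |= new

# Hypothetical rule: transitivity of `before` over interval names.
def before_transitive(facts):
    return {("before", a, c)
            for (p, a, b) in facts if p == "before"
            for (q, b2, c) in facts if q == "before" and b == b2}

kg = {("before", "i1", "i2"), ("before", "i2", "i3")}
closed = completion(kg, [before_transitive])
# ("before", "i1", "i3") belongs to the completion by definition,
# regardless of which evaluation strategy computes the fixed point.
```

The point is that the last comment is justified by the semantics alone: any other sound and complete procedure (e.g. goal-directed query rewriting) must agree on the same set of implied facts.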
Overall, the paper in its current form makes an interesting and valuable contribution. It still falls somewhat short of rigour, but the key contributions are now clear enough to be useful to the community.
Some specific aspects:
- The arguments on the expressive limitations of SHACL-Core seem right, but they lack proof. You say, "Such cross-node value comparisons fall outside the expressive power of SHACL Core". Do you have a reference for this? Or a proof?
Similarly, "it is not possible to define a SHACL-SPARQL shape that detects this invalid pattern for an arbitrary number of nodes using SPARQL 1.1 Property Path operators". Has this been proved, here or elsewhere?
I understand what you mean, and I believe the examples you give are not expressible in SHACL, but I would have liked to see a more accurate argument, possibly relying on properties of SHACL that emerge from its logical characterisations; there is quite a bit of work on the expressive power of the logics that underlie SHACL (e.g., invariance under simulations, the expressive power of classes of regular paths, etc.). While the current argument is probably fine for the paper, I encourage you to look into this in the future.
- While I fully agree with their discussion of validity being a "broader concern" than consistency, at times the authors seem to equate logical reasoning with consistency testing. Detecting satisfiability/inconsistency is one of the most common uses of logical axioms, but logical inference is much more than that, and it is a very powerful tool capable of capturing many forms of validity and correctness.
- By the "robust logic framework" for studying when completions terminate, when they are unique, etc., I meant the literature on the chase for (disjunctive) existential rules (and in description logics). There are dozens of papers on *chase termination*, universal models, cores, etc., which are basically about understanding in which cases one can build a unique finite representative completion.
- You say that "It is the responsibility of the knowledge engineer to determine, when such rules are included in the axiomatization, whether they might generate infinite loops and, if so, to include conditions in their WHERE clauses to break the loop."
Well, here I respectfully disagree: a much more robust approach is to properly define the logical framework one is working in, and rely on the vast body of knowledge on ensuring that the chase terminates for the given logical fragment, rather than manually proving that specific rule applications can be "broken".
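The kind of criterion I have in mind is checkable syntactically, once for the whole rule set. As a toy illustration (my own sketch, not tied to the paper's rule language), weak acyclicity builds a dependency graph over predicate positions and rejects rule sets in which a freshly invented value can feed back into its own premises:

```python
from collections import defaultdict

def weakly_acyclic(rules):
    """Rules are (body, head) pairs; atoms are (predicate, vars) tuples.
    Variables occurring in the head but not the body are read as
    existentials.  Standard weak-acyclicity criterion: normal edges for
    copied universal variables, 'special' edges into existential
    positions; the chase is guaranteed to terminate if no cycle goes
    through a special edge."""
    normal, special = defaultdict(set), defaultdict(set)
    for body, head in rules:
        body_vars = {v for _, vs in body for v in vs}
        for p, vs in body:
            for i, x in enumerate(vs):
                for q, ws in head:
                    for j, y in enumerate(ws):
                        if y == x:                # universal var copied
                            normal[(p, i)].add((q, j))
                        elif y not in body_vars:  # existential var created
                            special[(p, i)].add((q, j))

    def reaches(src, dst):                        # BFS over all edges
        seen, todo = set(), [src]
        while todo:
            u = todo.pop()
            if u == dst:
                return True
            if u in seen:
                continue
            seen.add(u)
            todo.extend(normal[u] | special[u])
        return False

    return not any(reaches(v, u)
                   for u, vs in special.items() for v in vs)

# person(x) -> parent(x,y), person(y): the classic non-terminating chase.
looping = [([("person", ("x",))],
            [("parent", ("x", "y")), ("person", ("y",))])]
# person(x), parent(x,z) -> knows(x,z): no existentials, chase terminates.
safe = [([("person", ("x",)), ("parent", ("x", "z"))],
         [("knows", ("x", "z"))])]
```

A knowledge engineer can run such a check mechanically over the whole axiomatization instead of inspecting individual WHERE clauses for loop-breaking conditions.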
- Finally, the authors seem to identify non-deterministic outcomes with non-monotonic constructs. While non-monotonicity is a prominent culprit, it is not the only one. Note that some OWL fragments (in fact, all the non-Horn ones) can express disjunctive information: unions, counting, all-values-from restrictions, etc. The temporal setting is intrinsically disjunctive: e.g., if a rule has overlaps(i1,i2) in the head, one needs a case distinction on whether the beginning of i1 is before, equal to, or after the beginning of i2.
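To spell the disjunction out with a toy sketch of my own (integer endpoints, not the paper's encoding): asserting merely that two intervals share a time point does not determine a single Allen relation; it leaves a genuine disjunction that a reasoner must branch on by comparing endpoints.

```python
# Toy illustration: "i1 and i2 share a time point" is a disjunction of
# Allen relations, resolved only by case distinctions on endpoint order.
def allen_relation(i1, i2):
    """Allen relation between closed intervals (b, e) with b < e."""
    (b1, e1), (b2, e2) = i1, i2
    if e1 < b2: return "before"
    if e2 < b1: return "after"
    if e1 == b2: return "meets"
    if e2 == b1: return "met-by"
    if (b1, e1) == (b2, e2): return "equal"
    if b1 == b2: return "starts" if e1 < e2 else "started-by"
    if e1 == e2: return "finishes" if b1 > b2 else "finished-by"
    if b1 < b2: return "contains" if e2 < e1 else "overlaps"
    return "during" if e1 < e2 else "overlapped-by"

# Enumerate all pairs of small integer intervals that share a point:
rels = {allen_relation((b1, e1), (b2, e2))
        for b1 in range(5) for e1 in range(b1 + 1, 5)
        for b2 in range(5) for e2 in range(b2 + 1, 5)
        if max(b1, b2) <= min(e1, e2)}
# Eleven of the thirteen Allen relations remain possible: the
# "shares a point" assertion is intrinsically disjunctive.
```

A deterministic materialisation procedure has no principled way to pick one disjunct, which is exactly why non-determinism here is not a symptom of non-monotonicity.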