# UCQ-rewritings for Disjunctive Knowledge

### Tracking #: 2197-3410

Authors:
Enrique Matos Alfonso
Alexandros Chortaras
Giorgos Stamou

Responsible editor:
Bernardo Cuenca Grau

Submission type:
Full Paper
Abstract:
We focus on the problem of query rewriting for the framework of Disjunctive Existential Rules. It is a well-known approach for query answering on knowledge bases with incomplete data. We propose a rewriting technique that uses Negative Constraints and Conjunctive Queries to remove components from the disjunction in Disjunctive Existential Rules. This process will eventually generate new Existential Rules. The generated rules can then be used to produce new rewritings considering the existing rewriting approaches for the Existential rules Framework. The algorithm is implemented in COMPLETO in order to provide complete rewritings for unions of conjunctive queries with negation. Additionally, we report some experiments to evaluate the viability of the proposed solution.
Tags:
Reviewed

Decision/Status:
Major Revision

Solicited Reviews:
Click to Expand/Collapse
Review #1
Anonymous submitted on 15/Jun/2019
 Suggestion: Major Revision Review Comment: Review: UCQ-rewritings for Disjunctive Knowledge The authors present a procedure that allows to compute UCQ-rewritings for conjunctive queries that may contain negated atoms over logical theories of disjunctive existential rules. Even though the topic is indeed interesting and, as far as I know has not been considered by any other researcher, I think that the paper needs some major revisions before being accepted to the SWJ. In the following paragraphs, I list some of the major problems that I see with this work. 1. Write-up. Even though the paper is understandable for the most part, it could also be a bit better written (see all the minor comments below). Throughout the paper, I find that the authors use redundant definitions that could be easily avoided. For instance, the authors define (non-disjunctive) existential rules and disjunctive existential rules separately. In turn, this also forces them to have a complicated definition for knowledge bases, and to define "rewriting steps" in two different ways (see definitions 3.4 and 3.5). At times, I find that the paper a bit "chatty". For instance, the authors introduce the notion of a hypergraph and illustrate this notion with several examples just to define connected components in a query. I think that connectedness is a well-known notion that can be explained in a small paragraph. Then, they introduce lemmas 2.1 and 2.2 which are almost self-evident, and do not seem to be used anywhere else in the paper (they are at least not referenced directly). Then, some important notions in the paper are not properly explained. For instance, the function cover used in Figures 3 and 4 lacks a well-defined explanation when the input is a set of rules. Finally, I want to mention that I am somewhat concerned about the definition of a UCQ-rewriting (line 42, page 7, left column) and I would like the authors to clarify some issues here. Is this definition really what you use to define a UCQ-rewriting? AFAIK, rewritings area always sound and complete. That is, a UCQ-rewriting usually satisfies the conditions that you describe in lines 44 and 48. Re. your algorithms and implementations, do they produce UCQ rewritings that are only sound but not complete? Also, I find that the definition of conjunctive queries is also problematic. Normally, CQs have only existential and answer variables. But in your definition you include universally quantified variables. Are this answer variables? Or are you solving First-order queries here? 2. Open questions. As far as I can tell, the authors do not comment on the following question: Is the proposed method complete for knowledge bases that admit finite rewritings for all queries? I think that the authors should clarify if this is the case or not. If they can show that indeed this is the case, I think that this would significantly strengthen the submission. 3. Problems with the contribution. More precisely, I find that the theorems in Section 3 are not very useful. I comment on each of these separately. Theorem 3.7. First of, this theorem can be simplified as follows: Let R1 be a fus and r2 a disconnected existential rule. Then, R1 cup {r2} is also fus. This result can be easily proven with the following statement: 1. Let Q be a query. 2. Compute the rewriting Q' of Q with respect to R1. Note that, since R1 is fus, this rewriting is finite and computable: you can use the algorithm from [11] 3. Compute the rewriting Q'' of Q' with respect to r2. Note that, since this is a disconnected rule, this rewriting is finite and computable. 4. Compute the rewriting Q''' of Q'' with respect to R1. 5. Q''' is a (finite) rewriting for R1 cup {r2}. 6. The rule set R1 cup {r2} is fus. Theorem 3.8 The fact that we can only add disconnected disjunctive rules here, severely limits the usefulness of this procedure. Also, it is quite straightforward to show that this language that you discuss here is decidable. Theorem 3.9 If I understand the theorem correctly, all of the variables that occur in negated atoms are answer variables (please confirm). - You say that "only the variables present in positive literals can appear existentially quantified". Hence, all of the variables that only appear in negated atoms must be answer variables. Furthermore, all of the variables in the frontier must be answer variables. So all variables that appear a negated atom and a positive atom is also an answer variables. If I am right, this is something that should be clarified and that severely limits the usefulness of the result. If not, may be find a more clear way to write the premise of this lemma. Theorem 3.10 This result trivially holds if the only literal in the query is positive. If it is negative, then all of its variables are answer variables. Again, I find that this limits the usefulness of this result. It appears that, at the end, you only consider queries where negated atoms only contain answer variables. If this is the case, it would make more sense to clarify this from the beginning. 4. Problems with the evaluation. LUBM contains transitive roles; e.g., the role "subOrganizationOf". Hence, there are queries, such as $subOrganizationOf(x, y)$ with $x$ an answer variable, that will not admit a finite rewriting. However, your algorithm always seems to produce compute finite rewritings. How is this possible? Is it because all of the queries that you consider in the implementation admit finite rewritings? Or is it because you have modified the ontology LUBM so it is fus? Either way, you should comment on this explicitly. It would be great if you could make the queries and ontologies that you use in the evaluation section available. If possible, please add the translation into existential rules too. Minor comments - Title is not very informative. Please consider writing something that contains all of the following keywords: "disjunctive existential rules" and "negative conjunctive queries". Otherwise nobody will find this via Abstract - I would not use uppercase when writing any of the following: disjunctive existential rules, negative constraints, and conjunctive queries. If you do, please be consistent (e.g., you write "Existential rules Framework" and "conjunctive queries" using lowercase in the abstract). - "It is a well-known approach for query answering on knowledge bases with incomplete data": As far as I know, it is a well known approach for query answering all knowledge bases. Also, I would write "well known" instead of "well-known". 1. Introduction - I am confused by the first sentence: what is a knowledge base? How do rules allow us to "perform query answering over incomplete data and come up with complete answers"? - Line 23, page 1, right column: "The" -> "the" - Line 23, page 1, right column: Despite the fact that I do understand what you are saying here, this sentence feels a bit ad hoc/informal. - Line 30, page 1, right column: You should mention that forward chaining does not always terminate. - "the size of the rewriting can be exponential with respect to the initial size of the query": Do you have a citation for that? Also, you should comment on the fact that a finite rewritings do not necessarily always exist. - Line 39, page 1, right column: I would say that this sentence is somewhat unclear: conjunctive query answering against a set of rules and a set of facts is undecidable even when the query may not feature negated atoms. - Line 47, page 1, right column: I would write a comma before "e.g." - Line 25, page 1, right column: I would write a comma before "i.e." - Please, be consistent in the use of uppercase/lowercase (e.g., with "Conjunctive Query"). - Line 6, page 2, left column: In [8], the authors study an existential rule fragment (the guarded fragment) that does guarantee termination for forward chaining procedures. Also, maybe you should also cite "Restricted Chase (Non)Termination for Existential Rules with Disjunctions" in this context. - Line 45, page 2, left column: Is your algorithm guaranteed to terminate if a rewriting exists? You should mention whether this is this the case or not explicitly here. Or, if you do not know, this should also be discussed. - Line 5, page 2, right column: "method" -> "methods" - Next line: again, you are inconsistent in the use of uppercase and lowercase when referring to some notions (e.g., conjunctive queries). I would suggest that you go for lowercase all throughout the document. Also, what are "disconnected disjunctive existential rules"? And linear "knowledge bases"? I understand that formal definitions should not be included in the introduction, but I would try to add an intuitive explanation for these notions somewhere. - Line 15, page 2, right column: maybe you should clarify here that the results referenced in this paragraph apply to "(non-disjunctive) existential rules". - Line 24, page 2, right column: you should list your evaluation as one of the contributions of this paper. - Line 29, page 2, right column: "Introduces" -> "introduces". Also, in this sentence you write "Disjunctive Existential Rules" in uppercase, when in the previous paragraphs you use lowercase. Just go lowercase all throughout. 2. Preliminaries - I would explicitly define substitutions and unifiers; I have seen many different definitions for these notions across different papers. - Definition 2.1: I would not create a separated equation environment in line 24. - I would denote CSF formulas using some kind of special bracket. As it is, we are supposed to identify a sequence of formulas with a conjunction. - Line 33, page 3, left column: just write "the empty CSF is \top" and "the empty DSF is \bot". - Line 6, page 3, right column: I am a bit puzzled by the definition of flattening. Why do you include this notion? - Line 15, page 3, right column: Why do you restrict the definition of entailment so there's a conjunction in the left and a disjunction on the right? - Line 29, page 3, right column: I would just say that a literal is either positive or negative. - Line 37, page 3, right column: What happens with the constants in the hyper graph? - Line 43, page 3, right column: you should emphasise connected, as you are defining this for the first time. - Example 2.1 is helpful. - Line 38, page 4, left column: you do not need to repeat again that Ui is connected - Maybe it is really not necessary that you introduce so many preliminary notions about resolution... - Line 24, page 6, right column: What you define there is not a conjunctive query, but a boolean conjunctive query. - Next line: so (positive) CQs do not allow for the use of universally quantified variables but negative CQs do? This definition is confusing. Also, the whole definition of CQs is quite non-standard. AFAIK, CQs contain existentially quantified variables and answer variables, but not universally quantified variables. Where do you take this definition from? - Line 49, page 6, right column: Your definition of conjunctive query coincides with the standard Boolean conjunctive query. - Line 18, page 7, left column: simply define disjunctive existential rules, and then define (non-disjunctive) existential rules as a special case (i.e., the disjunctive rules with a single disjunct in the head). - Line 28, page 7, left column: maybe you should specify that Q here is a query? - Line 36, page 7, left column: The definition of a knowledge base seems unnecessary complicated. First of, I would just define it as a set of rules and a set of facts (you can later differentiate between disjunctive and non-disjunctive rules, and constraints). Furthermore, why do define D as a CSF of CSFs of facts? Can you just define it as a CSF? Also, aren't negative constraints also disjunctive existential rules? Does that mean that the set \mathcal{R}^\vee can contain constraints? - Line 31, page 7, right column: "i.e." -> "i.e., the problem of deciding whether" - Line 35, page 7, right column: clarify that you solve this problem in some cases; query entailment is an undecidable problem. - Line 42, page 7, right column: your definition of UCQ-rewritings is plain incorrect. A UCQ-rewriting is sound and complete. - Line 21, page 9, left column: Is it true that the only way to produce a conjunctive query is to produce clauses with a decreasing positive charge? - Lemma 2.1 should be a theorem as it is not a result that you use to prove another result. - What is the point of Lemma 2.2? First of, this result is self-evident. Furthermore, do you use it anywhere in the paper? 3. Backward Chaining with Disjunctive Knowledge - Line 51, page 8, left column: you have no introduced "Skolemised rules". In fact, in your definition of FOL, you explicitly mention that you do not consider variables. - Line 21, page 8, right column: remove that single-word line. - Theorem 3.4: use a corollary environment. - Line 50, page 8, left column: this is a really interesting example. - Definition 3.4 and 3.5: I would not define a "Rewriting Step" and a "Disjunctive Rewriting Step". I would simply define the latter first, and then define discuss the former as a special, simpler case of rewriting step. - Line 21, page 12, right column: I think that here you want to say that $\mathcal{Q}^\lnot$ is a CSF that contains only negated literals. If so, just write that because it would be clearer. - Theorem 3.5: I think that the formulation of the theorem is a bit unclear. For instance, when you say that "there is a q' \in Q'", what is $Q'$? You should define that ahead. Also, do not add a separate math environment for line 39. - Line 9, page 14, left column: add a space between (fus) and [2] - Line 14, page 14, left column: \mathcal{R} is a set of disjunctive existential rules? (Non-Disjunctive) existential rules? How do you apply the procedure in Figure 4 to it? Note that this algorithm requires two inputs. - Line 21, page 14, left column: I think that the notion of a cut is more commonly referred to as a strata. - Theorem 3.6: Is this the first theorem where it is shown that CQ rewriting is decidable for linear disjunctive existential rules? If so, you should "sell" this a bit harder. - Line 21, page 14, right column: no need to add "a" after the opening parenthesis. Just write "A (disjunctive) existential rule..." - Theorem 3.7 could be simplified as follows: Let R be a fus and let r be a disconnected existential rule. Then, $R \cup \{r\}$ is also a fus. Note that this alternative definition entails the one that you provide. 4. Implementation and Experiments - From the beginning of the paper, I was under the impression that your system could deal with disjunctive existential rules. As this is not the case, please clarify this at the introduction. You could explicitly discuss this when you mention your contributions. - Line 48, page 18, left column: What is "Association Rules Mining"? Maybe you should add a citation here. - Not sure if figures 9 and 10 are very useful. Since you only evaluate 2 ontologies, the comparison between is not very meaningful. Problems with the layout: - Please remove all lines that contain a single word (either shorten or lengthen that paragraph). E.g., last line in the description of Example 1.1, line 16 in the left column of page 2, line 14 in the right column of page 2, last line of the introduction... - Are you using the standard equation environment in Latex? It seems to take loads of vertical space. - When you write a logical operator at the end of a line, add a ~ after it so it is separated from whatever comes before. E.g., add ~ at the end of line 47 in the left column of page 1, and line 50 in the right column of page 1. - It seems to me that, if you abbreviate a bit your predicates, you can fit many rules within a single line. E.g., use "parent" instead of "is-parent" in line 50 in the left column of page 1. Also, you can simply ignore universal quantifiers (this is what everybody does when writing existential rules). Clean up your citations: - Do not cite archive papers (e.g., [5]). Look for the conference/journal version of this paper. - Many citations contain weird sequences of characters (e.g., [2--4]). - No need to include the doi URIs. - Write down the name of the conference, not just the acronym (e.g., IJCAI).
Review #2
By Michael Morak submitted on 21/Jun/2019