Sound, Complete and Minimal UCQ-Rewriting for Existential Rules

Tracking #: 569-1775

Authors:
Melanie Konig
Michel Leclere
Marie-Laure Mugnier
Michaël Thomazo

Responsible editor:
Guest Editors RR2013 Special Issue

Submission type:
Full Paper
Abstract:
We address the issue of Ontology-Based Data Access, with ontologies represented in the framework of existential rules, also known as Datalog+/-. A wellknown approach involves rewriting the query using ontological knowledge. We focus here on the basic rewriting technique which consists of rewriting the initial query into a union of conjunctive queries. First, we study a generic breadth-first rewriting algorithm, which takes as input any rewriting operator, and define properties of rewriting operators that ensure the correctness of the algorithm. Then, we focus on piece-unifiers, which provide a rewriting operator with the desired properties. Finally, we propose an implementation of this framework and report some experiments.
Revised Version:
Tags:
Reviewed

Decision/Status:
Major Revision

Solicited Reviews:
Click to Expand/Collapse
Review #1
By Giorgio Orsi submitted on 15/Jan/2014
 Suggestion: Major Revision Review Comment: The paper studies ontology-based data access (OBDA) in a general framework of existential rules (datalog+/-). Both theoretical and practical aspects are addressed. In particular, a generic algorithm based on the notion of a rewriting operator is presented and a particular rewriting operator (most general piece-unifier) is proven to satisfy condition required for sound and complete rewritings. One of the interesting results (which to my knowledge has never been stated before) is that, if a finite UCQ rewriting exists, then there is a unique minimal UCQ rewriting. Turning to practice, the authors then study specialized single-piece unifiers and their aggregations and present some preliminary evaluation of the approach. This problem of OBDA is certainly relevant to the journal (and its special issue). Existential rules generalize light-weight description logic (which have become quite popular in recent years) and the obtained results are novel and interesting. The paper is very well-structured and mostly well-written (detailed comments can be found below). There is, however, a concern that the rewriting operator based on aggregated unifiers may not be prunable. Example 9 has a single rule, $b(x) \to p(x,y)$. Consider now the query from Example 9 and two rules, $b_1(x) \to p(x,y)$ and $b_2(x) to p(x,y)$. Similarly to the argument of Example 9, applying two single-piece unifiers is not enough (because their results will be pruned out), while the simultaneous application of the unifiers for two rules will result in a CQ that cannot be pruned (and thus, should be included in the final rewriting). general remarks =============== The many different counters (examples, properties, theorems, corollaries, lemmas etc.) make it quite difficult to locate referenced fragments of text --- would the authors consider using fewer counters? It is somewhat confusing (in particular, in Introduction) to use the term "rewritings" for individual elements of a UCQ (usually it is reserved for the whole UCQ). This usage of the term "rewriting" quite logically then leads to "a set of rewritings" (on p 2), which is then replaced by a "rewriting set" (for the whole UCQ) starting from Section 3. Would the authors consider consistently using "rewriting set" for the whole UCQ and "elements of the rewriting set" for individual CQs? Notation for substitutions in Example 1 is not explained. And would it not be more intuitive to use a function-like notation, e.g., \{ u \mapsto x, v \mapsto y }, instead? Using A \geq B to say that there is a homomorphism from A to B is a bit unexpected -- one would think that if A can be homomorphically mapped to B then A is smaller. I understand that it is related to \leq used similarly for covers, partitions, etc. (cf. footnote 3 on p 13), but I find it rather counter-intuitive. The notion for sets of queries introduced after Definition 7 could be used to simplify Definitions 6 and 7 (one can always start from a set of queries). The transition from a set of rules in Sections 1-4 to a treatment of a single rule in Section 5 is unclear -- in particular, the piece-based rewriting operator is never defined formally. It might make more sense to define the aggregation of a set of rules (Definition 16) first (which is simple enough to understand) and then compatible piece-unifiers and aggregated unifiers. The purpose of the remark in the last paragraph on p 20 is unclear. The three items on the bottom of p 24 appear to compute the set of all subsets of U_1 -- why not write it then? Reference to [GOP11] as a source of information on the ontologies used in the experiments is a bit strange: they all appeared in H. Pérez-Urbina, B. Motik and I. Horrocks: A Comparison of Query Rewriting Techniques for DL-lite. Description Logics 2009 detailed comments ================= abstract: "takes as input any rewriting operator" -> "takes any rewriting operator as a parameter" p 1, par 1: are represented completely p 1, par 2, facts --- or data --- and an ontology (a set of existential rules) p 2, par 1: synthesis -> summary p 2, par 2: schmematized -> illustrated p 3, par 1: if the answer to Q_2 in F is positive *then* so is the answer to Q_1 p 3, par 1: sticky(-join) (and a missing space before) p 3, par 2: We call rewriting operator ... -> A rewriting operator is a function p 3, last line: unicity -> uniqueness p 4, par 1: halting -> termination p 4, par 2: We call piece a minimal subset -> A piece is a minimal subset p 5, par 1: *An* interesting point p 5, par 5: s.t. -> such that (and elsewhere, e.g., Definitions 2, 9) p 6, Definition 2: no brackets are needed around R,F\models Q (twice) p 6, Definition 2: what is the order of the components in a KB: is it F,R or R,F? p 6, above Definition 3: "covers" -> "cover" p 6, Example 2: move "See also Figure 2" to the end of the first sentence p 7, Example 3, last sentence: than -> that p 7, above Theorem 1: no matter of -> irrespective of p 8, above Corollary 1: By the remark above and Theorem 1, we obtain p 8, Section 4: this operator satisfy p 8, footnote 2: The notion of a finite unification set p 8, footnote 2: , as recalled in Section 5, -> (see Section 5) p 8, footnote 2: the two definitions are equivalent p 9, Definition 7: We denote the set of k-rewritings of Q by rew_k(W,R) p 9, above Algorithm 1: subsequently specified -> specified below (or subsequently) p 10, Proof of Lemma 2: wrong font for rew_i p 12, above Definition 10: we call cutpoints the variables ... -> cut points are the variables ... p 13, par 1: We can thus assign with u a partition -> Thus, u can be associated with a partition p 13, par 1: Conversely, given a partition ..., we can consider a substitution u obtained by p 13, par 1: let { e_1, \dots, e_k } be a -> if { e_1, \dots, e_k } is a class ... and e_i is a selected element then, for all ... p 13, par 2: The sentence ending with "until stability" is unreadable p 13, above Property 5: why is it "immediate"? p 13, Definition 11: the "i.e" bit item 1 is a repetition of the definition (should be removed) p 13, Definition 11, item 2: "note sep(Q'), the variables" -> denote by sep(Q') the variables p 13, Definition 11, item 2: it would be better to place definitions of separating variables above (otherwise, too much to digest in a single definition) p 13, Definition 11, item 3: no need to duplicate the definition of a substitution associated with a partition p 13, Definition 11, after item 3: We call cupoitns, and note cutp(\mu), the variables -> Cutpoints of \mu, denoted by cutp(\mu), are the variables p 15, Theorem 4: basically is too informal p 15, proof of Property 6: not clear what "too" is meant to emphasize in the last sentence p 16, item (ii): wrong font for terms() p 18, par 1: less rewritings -> smaller rewritings (or rewritings with fewer components) p 18, above Example 8: the next *two* examples p 20, Definitions 14 and 15 share a large proportion -- should they not be merged? p 22, above Property 11: remove "it holds that"; corollary of the *following* property p 24, above Algorithm 2: if these atoms are in A *then* Q' p 25, par 1: were led -> were performed p 25, last par: can save much with respect to both -> can considerably reduce both p 26, par 1: are currently lacking -> are currently not available p 27, [LTW09]: el -> EL p 27, [RMC12]: DL-lite -> DL-Lite