Review Comment:
The paper introduces methods for evaluating SPARQL queries with preference criteria, where the goal is to return solutions that are not dominated by other solutions (i.e., solutions for which no other solution is preferred under the criteria indicated). The authors' proposal is based on mapping SPARQL queries to SQL and evaluating them on a relational database system (RDBMS) that supports custom (optimised) physical operators for processing preferences. The paper includes a proposed syntax for specifying preferences in an extension of SPARQL, a definition of the semantics of the extensions, and a mapping from such queries to a similar extension of SQL for specifying preferences. The experiments then compare the proposed approach against a variety of existing systems for evaluating preferences in SPARQL queries. Overall the results show that the proposed SPARQL-to-SQL approach, leveraging an existing RDBMS optimised for such preference-based queries, generally outperforms other state-of-the-art approaches (except when the query returns few results).
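To make the summarised semantics concrete, the following is a minimal sketch of skyline evaluation by pairwise dominance checks. It assumes that every criterion is minimised, and the data and the names `dominates` and `skyline` are hypothetical; it is a naive quadratic illustration, not the paper's algorithm or the physical operator of the underlying RDBMS.

```python
from typing import Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every criterion and strictly
    better on at least one (assumption: lower values are preferred)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: Sequence[Sequence[float]]) -> list:
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical solutions: (price, distance_to_beach)
hotels = [(50, 8), (80, 2), (60, 9), (90, 3), (50, 8)]
result = skyline(hotels)
```

Note that duplicate solutions do not dominate each other under this definition, so both copies of `(50, 8)` remain in the result.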
## Strengths:
S1: The problem is interesting and relevant. Handling preferences in queries is undoubtedly a useful feature in many settings (which is why the topic has received considerable attention in relational databases). More generally, SPARQL is receiving more and more adoption, and thus studies looking into other (useful) features that may be added in future versions are important.
S2: The virtual approach of mapping SPARQL to SQL queries makes a lot of sense, taking into consideration that there already exist RDBMSs optimised to handle preferences. I think that this work can serve as a very useful baseline/benchmark for comparing "pure" SPARQL approaches to what has been achieved already for relational databases (one would expect that similar performance should be achievable in both for analogous data and queries).
S3: The paper provides some formal preliminaries for the syntax and semantics of the proposed query language.
S4: The experiments are quite thorough and consider a variety of configurations, datasets, different scales, queries, etc. Obviously this has implied a lot of work. The experiments generally establish that the RDBMS-based approach outperforms the current state-of-the-art for pure SPARQL/materialised RDF.
S5: Regarding S3, code is provided online.
## Weaknesses:
W1: The contributions are a bit unclear to me in terms of what, precisely, are the novel aspects of this work. The core novelty appears to be rooted in the idea of mapping SPARQL queries with preferences to SQL-Pref queries that can be evaluated in EXASol. There are then various other contributions, like the specification of a syntax and semantics for an extension of SPARQL to handle preferences, a method for mapping such queries to SQL-Pref, a way to handle transitive "dominates" relations using recursion, the experiments, etc. But in these latter contributions, it is not clear to me what parts are new, or how they differ (if at all) from existing work for handling preferences in SPARQL. What I miss is a clear statement of novelty in terms of what is new, and what is not. This should also include discussion of the authors' previous work, cited as [19].
W2: There are certain aspects of the paper that I found unclear, including:
- The concept of "Extended Qualitative Preference Queries" is interesting, but it is sort of "forgotten about" until the end of the paper. There is discussion on the "trans" function, which I thought to mean transitivity, but I guess means translate; the paper refers to another paper, but this should be defined in the current paper to make it self-contained. This concept of "Extended Qualitative Preference Queries" seems to be largely ignored in the syntax and semantics until Section 6.3, which does not directly refer to it again, but rather implies it through the use of a transitive closure.
- The hypotheses are a bit unclear.
* H_1: "A SPARQL-to-SQL preference-based algorithm executed by an OBDA engine does not spend much more time with respect to a physical operator within a relational database engine." But this does not specify any details about the OBDA implementation? Does the underlying RDB implement the same physical operator or not? Is the hypothesis thus that SPARQL-to-SQL does not introduce much overhead versus running the query in SQL over a given RDB?
* H_2: "As the preference criteria increase, the answer cardinality augments" This might be simply a minor language issue regarding "augments", but I read this as follows: "As the preference criteria increase, the number of solutions increases"? The intuition is that with more criteria, more conditions need to be satisfied for solutions to dominate each other? I'm a bit puzzled by this hypothesis in general.
* H_4: "operator executed on top of a database engine, even though the engine included techniques to optimize the query." But the database engine does not have the physical operator? Again the wording seems vague as “techniques to optimize the query” could well include the physical operator.
- The experimental section is hard to follow because it discusses many different configurations in a confusing way. We have SPREFQL with NL, BNL and RW. We have Morph-Skyline++, Morph-Skyline, QualQT, EXASol, etc. We further have SQL and SQL-Pref. We have RW, RW-S and RW-Q. The text often refers to "the methods in [18]". We further have brTPF-MT, TPF-MT, and SkyTPF-MT. There are also contradictions in naming. For example, the paper introduces "SQL" as a variant using NOT EXISTS and "SQL-Pref" as using the preferences-based operator, but later states "SQL query translated with NOT EXISTS (SQL-Pref)", implying that the names have switched. While the paragraph entitled "Engines" does help, hopefully the authors can find a better way to present the systems and variants evaluated (e.g., using a table and more consistent/intuitive naming).
- There are various typos and inaccuracies in the formal content that sometimes leave it unclear what is being defined or what is intended. Though in general most issues are minor, they are frequent. These will be listed in more detail in “Minor Comments”.
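For reference when untangling the "SQL" versus "SQL-Pref" naming: as the paper first introduces it, the "SQL" variant expresses the preference in standard SQL with a NOT EXISTS subquery. A minimal sketch of that formulation follows, run on SQLite through Python's `sqlite3` module. The table `hotel`, its columns and its rows are hypothetical, both criteria are assumed to be minimised, and the queries actually generated by the authors' translation may well differ. The preference-operator variant would instead state the criteria in a dedicated clause of the extended SQL dialect, which is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hotel (name TEXT, price REAL, distance REAL)")
conn.executemany(
    "INSERT INTO hotel VALUES (?, ?, ?)",
    [("A", 50, 8), ("B", 80, 2), ("C", 60, 9), ("D", 90, 3)],
)

# Keep a row only if no other row is at least as good on both criteria
# and strictly better on at least one (lower is preferred for both).
NOT_EXISTS_SKYLINE = """
SELECT h.name
FROM hotel AS h
WHERE NOT EXISTS (
    SELECT 1
    FROM hotel AS o
    WHERE o.price <= h.price
      AND o.distance <= h.distance
      AND (o.price < h.price OR o.distance < h.distance)
)
ORDER BY h.name
"""

skyline_names = [row[0] for row in conn.execute(NOT_EXISTS_SKYLINE)]
conn.close()
```

A table listing each evaluated variant next to the query shape it executes (a nested NOT EXISTS as above, or a preference clause) would remove the ambiguity.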
W3: As a general limitation of the approach, it only works when the base dataset is stored as an RDB. More work is needed to "port" a similar physical operator to the native RDF/SPARQL setting in order for it to be applicable for legacy graphs.
## Other comments:
- "demonstrated that the use of a physical operator during the SPARQL skyline query processing ... or the skyline operator on the top of [the] RDF triplestore". I am not clear on what is being claimed here, and it seems to be a key point. A physical operator I would understand as a specific implementation of the operator in the database. So I understand that a specific implementation is better than expressing the same using the typical operators implemented in a SPARQL engine. But what is the difference between "a physical operator during the SPARQL skyline query processing" and "the skyline operator on the top of [the] RDF triplestore"? Likewise when you state "Although this means ...", it is not clear to me what "this" is. I feel that this discussion is key, and should be made clearer.
- "a set of tuples to \mathcal{S}" It's not very clear to me what this means (a tuple in a solution need not correspond to any single relation in \mathcal{S}, e.g., if there is a join).
- Relating to the previous point, the paper seems to equate (in various places) a schema \mathcal{S} to a graph pattern gp. I find this very confusing. To me a schema here refers to a relational schema, which is a set of defined relations. It is not a query: it does not, for example, use a relational algebra over the schema to produce a single relation with results. It would be good to clarify or modify this aspect of the notation.
- The Theorem and Proof are a bit strange in that the preference operators could simply be defined identically in both cases so that the equivalence is (more or less) by definition. The proof seems to just be saying that the elements of the two distinct definitions mean the same thing. But I guess it's okay (also, maybe I misunderstand something).
- "The generation of results by Morph-Skyline++ is a costly process since it produces the triples in XML format." This is a bit surprising as the process should be more or less linear in the number of results. I would expect this to have a negligible cost (versus evaluating the query) if implemented well. Could you provide more details on why this might be costly?
- While I think that the comparison to *-TPF-* variants is useful, it does require the disclaimer that the purpose of these variants is to reduce server load while receiving queries from many clients. This is not tested by the current experiments.
- "precision, recall and F-measure" This is quite surprising when first mentioned as I would assume that the solutions should be the same. This might not be the case? Or are these measures just used to confirm correctness? (Though it becomes clear later that this rather refers to the correctness of the baselines, I think it would be good to briefly justify the inclusion of these measures when first introduced.)
- Table 4: The data are not consistent. The F-measure cannot be higher than both precision and recall (query 7).
- In the discussion regarding "In consequence, RW-Q performance is the worst.", why is there no corresponding discussion regarding the hypothesis? Does this not tend towards rejecting H_5 in general?
- "State-of-the-art tools would not be able to resolve a query where preferences are expressed in a subquery." I did not understand why this might be the case (assuming that the tools generally support sub-queries)?
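The Table 4 inconsistency noted above follows from the definition of the F-measure as the harmonic mean of precision and recall, which always lies between the smaller and the larger of the two. A minimal sketch of such a sanity check is given below; it assumes the balanced F1 variant, the helper names are hypothetical, and the values are illustrative rather than taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def is_consistent(precision: float, recall: float, reported_f: float,
                  tol: float = 0.005) -> bool:
    """Check a reported F value against the one implied by P and R,
    allowing a small tolerance for rounding in the table."""
    return abs(f_measure(precision, recall) - reported_f) <= tol


# Illustrative values only (not taken from the paper).
p, r = 0.60, 0.90
f1 = f_measure(p, r)  # harmonic mean, bounded by min(p, r) and max(p, r)
```

Running a check of this kind over every row of Tables 4 and 5 would catch reported F values that exceed both precision and recall.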
## Recommendation:
All in all, I think that this work addresses an important topic and proposes a sensible approach. I appreciate in particular the details regarding the syntax and semantics of the query language (though perhaps I wonder which parts are novel and which are not), as well as the detailed experiments. I see the main value of this work as bridging work on preferences in SPARQL to analogous work in the SQL setting, where this work can thus serve as a strong baseline for further efforts towards supporting preferences in the native RDF/SPARQL setting.
On the other hand, I think that there are many parts of the paper I did not manage to understand to my satisfaction, relating in particular to W1 and W2 mentioned previously. There are also many issues that would need to be addressed that, cumulatively, add up to some significant revisions (see Other Comments above, and Minor Comments below).
Hence my recommendation is for a Major Revision.
## Minor comments:
General:
- "[16] avoids either comparing", "[18] presented", etc. In general, the text should read well without the parenthetical reference. Better to write "Keles and Hose [18] presented", etc.
Abstract:
- "top-k queries cannot be the most appropriate approaches due" -> "top-k queries may not be the most appropriate approaches as"
- "provides [] uniform access"
Introduction:
- "on accommodation[]"
- "to be easily pose[d]"
- "that assign a numerical score [to] each"
- "is [a] weighted average"
- "negative impact [on] their evaluation"
- "of such type[s]"
- "This evaluation allows [for] understanding" or even better: "This evaluation allows us to understand" (“Allows” is a very tricky verb. It might be better to rely more on verbs like “enables”, “facilitates”; “supports”; etc.)
- "of [a] preference operator that correctly handle[s]"
- "To the best of our knowledge, [] no [] benchmarks [have been defined] nor experimental studies [] conducted for th[ese] type[s] of queries."
- "Section 6 reports [on] and discusses [] the results"
Motivating Example:
- "are [the] cheapest"
- Figure 1(a): The SPARQL and SQL queries would be much more readable with more newlines, proper indentation, etc. I think it's also important to note that these are not SPARQL and SQL queries, but rather from an extended language.
- "Under an OBDA approach" Again this would not be a standard OBDA approach as the languages go beyond plain SPARQL/SQL.
- "Despite [the fact that] state-of-the-art"
- "SPARQL-to[-]SQL"
- "that allows the use of qualitative preference criteria [within] SQL"
- "algorithms [that] exploit"
- "allows [for] prioritizing"
- "where mapping rules are use[d] to"
- "Many tools support these rules[;]"
- "in the dev[e]lopment" Spell-check!
OBDA-based Qualitative Preference Queries:
- "relations over $\Sigma \times \Sigma$[;] we define"
- "Prioritized" Why not just use =_U? I guess this is because the preference relation is a partial order, which could be clarified.
- "For our motivat[ing] example"
- Definition 4: It would be good to clarify beforehand that preferences are ordered ascending; i.e., that a lower value indicates a higher preference.
- Definition 6: "\mu \in \mathcal{S}" Again I am a bit confused by this use of schema.
- Definition 6: The notation $\mu_{|...}$ used for projection should be defined earlier.
- Definition 6: The symbol for left join could be improved.
- Definition 6: Item 7 is hard to follow. How about defining that separately in an itemised way? (Also "crit is if crit1 then crit2 else crit3" I did not understand.) Arguably \rho is not a great choice as it is used for "renaming" in the classical relational algebra, though I guess it's okay if used elsewhere (also) for preference.
- Definition 7: Item 3 should use UNION on the left, not \cup, to avoid mixing syntax and query operators.
- Definition 8: "algebra relational" -> "relational algebra"
- Definition 8: The grammar could be typeset better (e.g., to avoid the bad box).
- Definition 8: The signature is not defined but used. (I guess it is clear what is meant, but ideally the use of \xi could be clarified.)
- Definition 8: Maybe define the Cartesian product first and then use that to define join simply as $\sigma_{cond}(R_1 \times R_2)$.
- Definition 9: Items 3-7: use subscript for E_1 and E_2.
- "Having already been pro[v]ed"
- "and S A is relational algebra expression" SA is not used previously, making the phrasing a bit confusing. It seems that SA needs to be defined in terms of gp.
- "will be executed [within] the underlying database system"
- Algorithm 4: Given that this is pseudocode and that \beta is not defined elsewhere, it might be easier to follow the algorithm providing human-readable names to the functions and variables (rather than $h$, $\beta$, etc.).
Query Language Syntax
- "allows [for] separat[ing]"
- Grammar 1: It seems that "[]" is used initially for optional elements, but then "()" is used? Also it would be more readable to have this grammar span two columns such that the constructs can be read in one line.
- "In SQL, Preference SQL" At this point it seems that you are no longer discussing query language syntax, but rather mapping. It could be a separate section or maybe sub-section of a section with a more general title
- "[I]n contrast"
- "while compares" -> "which compares"
- Figure 4: Again, though the indentation is better than in previous examples, it is a struggle to see where the sub-query on the right of the NOT EXISTS actually ends due to indentation not being fully respected.
- "as [a] baseline"
Experimental Study
- "and their translat[ion] into nested SQL quer[ies]"
- "SPARQL-based approach[es] on materialized RDF"
- Table 1: "SPARQL preference qualitative quer[ies] and their equivalent SQL preference qualitative quer[ies]"
- Table 1: "One of the tool[s] only support[s] skyline queries[.]"
- Table 1: "A preliminar[y] study [of the] scalability"
- Table 2: "TPC-H. Number of rows per table [for different scales] (left) and [a summary of queries] (right)"
- "uses a nested query with NOT EXISTS {add space before \cite}".
- "are expressed in SPARQL [so that they can be] evaluated"
- "QualQT is up to two orders of [] magnitude less than SQL-Pref." Is it? I don't see a single case where this is true?
- "SQL query translated with NOT EXISTS (SQL-Pref)" Earlier "SQL" was used to denote NOT EXISTS and "SQL-Pref" was used to denote use of the preferring clause?
- "Figures[] 5b-5d"
- "and the query complexity [increase]"
- "significantly worsened [due to]"
- "when the query complexity [increases]"
- "query rewrit[ing] technique"
- "[I]t is important"
- "Table 5: Methods in [18]" Is this SPREFQL?
- "due to [the fact that] skyTPF"
- "quality of methods presented in [18]" What methods are these, specifically?
- "are queries with instantiations" What does this mean? That the basic graph patterns use constants? Does this not apply to almost all queries?
- "Firstly, [w]e have defined QualQT"
- "from such [an] implementation"
Conclusions and Future Work:
- "that best meet criteria"
- "TPF-,[ ]skyTPF- and brTPF-based methods, [which are] state-of-the art"
- "we plan to [incorporate]"
Comments
Comments related to comparison with SkyTPF
The limitations of related work as stated in the article are not comprehensible.
SkyTPF, for instance, only supports MIN and MAX preferences, so although it might not be possible to express the motivating example in SkyTPF, other types of queries are well supported.
In general though, a simple statement such as "skyTPF was not reported because it returned an empty answer" is a bit too bold. What causes this behaviour? Was there an issue when loading the data into the triple stores? Was there a problem with parsing the query? Have the authors of this paper reached out to the authors of skyTPF to maybe receive support in running the evaluation?
The authors state that SPREFQL and SkyTPF produce incomplete and incorrect results and provide values for precision, recall and F-measure in Tables 4 and 5. It is not quite clear how these measurements were produced. How was the ground truth, i.e., the complete and correct result against which all results are compared, computed? Since neither SPREFQL nor SkyTPF introduces any kind of approximation, incomplete results indicate a technical error more than a problem with the approach (maybe not all triples were loaded into the triple store?). Some elaboration, and in particular sharing the queries, would provide a higher degree of transparency and also enable other researchers to repeat and confirm the experimental results.
SkyTPF is reported to time out on 5 queries in Figure 7b. However, as the text describes, the failure is actually a Null Pointer Exception rather than a timeout, which indicates an implementation error; presenting it as a timeout is misleading. As mentioned above, it should be possible to reach out to the SkyTPF authors for help fixing the issue. The authors of this paper are highly encouraged to do so. In fact, there has already been some communication and the offer of assistance - I (one of the SkyTPF authors) am happy to follow up on that.
In summary, the experimental section requires some more work to be acceptable. In particular, queries and results should be shared in more detail so that the experiments can be repeated and verified by other researchers. In addition, the comparison against SkyTPF appears to be incomplete and not sufficiently fair. However, as described above, it should be possible to address these issues in a revision.
Comments related to comparison with SkyTPF
We are thankful for your comments, which will be considered in the next version of the paper. The datasets and queries are available in the "Experiments" directory of https://github.com/marleeng/morph-preferences, so the experiments can be repeated with them. Please feel free to contact us if you have any problems repeating the experiments.