Handling Qualitative Preferences in SPARQL over Virtual Ontology-Based Data Access

Tracking #: 2784-3998

Marlene Goncalves
David Chaves-Fraga
Oscar Corcho

Responsible editor: 
Guest Editors Web of Data 2020

Submission type: 
Full Paper
With the increase of data volume in heterogeneous datasets that are being published following Open Data initiatives, new operators are necessary in order to help users to find the subset of data that best satisfies their preference criteria. Quantitative approaches such as top-k queries cannot be the most appropriate approaches due they require the user to assign weights that may not be known beforehand to a scoring function. Unlike the quantitative approach, under the qualitative approach, which includes the well-known skyline, preference criteria are expressed in a way closer to natural language. In this paper, we address the problem of evaluating SPARQL qualitative preference queries over an Ontology-Based Data Access (OBDA) approach which provides an uniform access over multiple and heterogeneous data sources. Our main contribution is Morph-Pref, a framework for processing SPARQL qualitative preference to directly query relational databases. Our framework implements a technique that translates SPARQL qualitative preference queries directly into queries that can be evaluated by a relational database management system. We evaluate our approach over different scenarios, reporting the effects of data distributions, data size and query complexity on the performance of our proposed technique in comparison with state-of-the-art techniques. Obtained results suggest that the execution time can be reduced by up two orders of magnitude in comparison to current techniques scaling up to larger datasets while identifying precisely the result set.
Full PDF Version: 

Major Revision

Solicited Reviews:
Click to Expand/Collapse
Review #1
By Ruben Taelman submitted on 26/Apr/2021
Minor Revision
Review Comment:

This article covers the topic of preference-based SPARQL queries via a qualitative approach.
The authors identify the need for evaluating such preference-based SPARQL queries over relational databases via an OBDA approach.
They adress this need by introducing a formal technique to that rewrites such queries into SQL queries,
which can then be evaluated by a RDBMS that has built-in support for preference queries.
This technique has been implemented on top of Morph, and its source code is available online.
Based on this implementation, the authors compare their technique to other SOTA solutions,
and show that this OBDA-based approach is significantly faster.

Even though my own knowledge on the domains of OBDA and preference querying is limited,
I consider this work to be very valuable.
The article is well written (aside from a few minor typos here and there) and well understandable.
The experiments are carried out in sufficient depth, and are reproducible.
Other than a couple of minor concerns (see below), this work does –in my opinion– not need much more work to be accepted.

In the introduction, and throughout the paper, the names Morph-Pref, QualQT, and Morph-Skyline++ are used.
The relationship between these things is not entirely clear, which makes certain discussions around them hard to follow.
For instance, Morph-Skyline++ is not mentioned as a contribution in the introduction, while Morph-Pref and QualQT are.
But in the evaluation section, Morph-Skyline++ is used more prominently.
Also, in figure 5, the figure captions refer to Morph-Skyline++, while the red bars refer to QualQT.
I would recommend the authors to make the interaction of these things more clear,
and to clearly name each different algorithm, and their corresponding implementation.
For instance, a diagram could be added in the introduction to show how these things interact with each other.

In the motivating example (figure 2), TPF and brTPF are mentioned as SOTA tools to evaluate preference-based queries (next to SPREFQL and skyTPF).
This is however a bit confusing, as TPF and brTPF do not provide support for preference-based queries.
I assume the authors imply that a query translation layer exists on top of these approaches.
If so, do these execution times include the time of query translation?
I recommend the authors to explicitly mention this (or another explanation if my assumption is incorrect) in section 2 to avoid confusion.

There are some issues with some of the hypotheses introduced in section 6.
Hypothesis 1: Hypotheses need to be objectively testable. However, the use of "much more time" is subjective. This needs to be changed to either "does not spend more time", or "does not spend X% more time".
Also its rejection in section 6.1 is therefore invalid. Since "costly" is subjective, one could certainly claim that QualQT is in some cases even **very** costly. Therefore, I would expect some more details in the hypothesis, such as "QualQT does not spend X% more time in at least Y% of the cases".
Hypothesis 4, 5: It is not clear what "performance" means here, as it could be interpreted in terms of different metrics. I assume it refers to lower execution time? It could also apply to lower CPU/memory usage, or a composition of multiple metrics.

In section 6.4 and 7, the authors mention that their approach outperform the other approaches, such as the TPF-based ones.
Unfortunately, their discussion is not sufficiently nuanced, as this makes the reader believe that their approach
is better then the TPF-based ones across the board, which is not necessarily always the case.
Since TPF-oriented approaches put primary emphasis on low server load and guaranteeing server availability for a large number of concurrent clients,
this inevitably leads to longer query execution times.
Therefore, it is important that the authors clearly state that these experiments were done with always just one concurrent query execution in a closed environment,
for which the authors approach expectedly outperform the TPF-based approaches.
It should be stated that further research is needed to investigate the scalability in terms of number of concurrent clients for preference-based SPARQL query execution,
for which the TPF-based approaches may very well outperform this approach.

Minor comments:
* Page 1: "to be easily pose"
* Page 11: "One of the tool only support"
* In figure 7b, why is the bar missing for SPREFQL for query 8?

Review #2
By Aidan Hogan submitted on 08/May/2021
Major Revision
Review Comment:

The paper introduces methods for evaluating SPARQL queries with preference criteria, where the goal is to return solutions that are not strictly better (dominated by) other solutions for all of the criteria indicated. The author's proposal is based on mapping SPARQL queries to SQL and evaluating them on a relational database system (RDBMS) that supports custom (optimised) physical operators for processing preferences. The paper includes a proposed syntax for specifying preferences in an extension of SPARQL, a definition of the semantics of the extensions, and a mapping from such queries to a similar extension of SQL for specifying preferences. The experiments then compare the proposed approach against a variety of existing systems for evaluating preferences in SPARQL queries. Overall the results show that the proposed SPARQL-to-SQL approach, leveraging an existing RDBMS optimised for such preference-based queries, generally outperforms other state-of-the-art approaches (except when the query returns few results).

## Strengths:

S1: The problem is interesting and relevant. Handling preferences in queries is undoubtedly a useful feature in many settings (which is why the topic has received considerable attention in relational databases). More generally, SPARQL is receiving more and more adoption, and thus studies looking into other (useful) features that may be added in future versions are important.

S2: The virtual approach of mapping SPARQL to SQL queries makes a lot of sense, taking into consideration that there already exist RDBMSs optimised to handle preferences. I think that this work can serve as a very useful baseline/benchmark for comparing "pure" SPARQL approaches to what has been achieved already for relational databases (one would expect that similar performance should be achievable in both for analogous data and queries).

S3: The paper provides some formal preliminaries for the syntax and semantics of the proposed query language.

S4: The experiments are quite thorough and consider a variety of configurations, datasets, different scales, queries, etc. Obviously this has implied a lot of work. The experiments generally establish that the RDBMS-based approach outperforms the current state-of-the-art for pure SPARQL/materialised RDF.

S5: Regarding S3, code is provided online.

## Weaknesses:

W1: The contributions are a bit unclear to me in terms of what, precisely, are the novel aspects of this work. The core novelty appears to be rooted in the idea of mapping SPARQL queries with preferences to SQL-Pref queries that can be evaluated in EXASol. There are then various other contributions, like the specification of a syntax and semantics for an extension of SPARQL to handle preferences, a method for mapping such queries to SQL-Pref, a way to handle transitive "dominates" relations using recursion, the experiments, etc. But in these latter contributions, it is not clear to me what parts are new, or how they differ (if at all) from existing work for handling preferences in SPARQL. What I miss is a clear statement of novelty in terms of what is new, and what is not. This should also include discussion of the authors previous work, cited as [19].

W2: There are certain aspects of the paper that I found unclear, including:

- The concept of "Extended Qualitative Preference Queries" is interesting, but it is sort of “forgotten about” until the end of the paper There is discussion on the "trans" function, which I thought to mean transitivity, but I guess means translate; the paper refers to another paper, but this should be defined in the current paper to make it self-contained. This concept of "Extended Qualitative Preference Queries" seems to be largely ignored in the syntax and semantics until Section 6.3, which does not directly refer to it again, but rather implies it through the use of a transitive closure.

- The hypotheses are a bit unclear.
* H_1: "A SPARQL-to-SQL preference-based algorithm executed by an OBDA engine does not spend much more time with respect to a physical operator within a relational database engine." But this does not specify any details about the OBDA implementation? Does the underlying RDB implement the same physical operator or not? Is the hypothesis thus that SPARQL-to-SQL does not introduce much overhead versus running the query in SQL over a given RDB?
* H_2: "As the preference criteria increase, the answer cardinality augments" This might be simply a minor language issue regarding "augments", but I read this as follows: "As the preference criteria increase, the number of solutions increase"? The intuition is that with more criteria, more conditions need to be satisfied for solutions to dominate each other? I’m a bit puzzled by this hypothesis in general.
* H_4: "operator executed on top of a database engine, even though the engine included techniques to optimize the query." But the database engine does not have the physical operator? Again the wording seems vague as “techniques to optimize the query” could well include the physical operator.

- The experimental section is hard to follow because it speaks about a lot of different configurations in a confusing way. We have SPREFQL with NL, BNL and RW. We have Morph-Skyline++, Morph-Skyline, QualQT, EXASol, etc. We further have SQL and SQL-pref. We have RW, RW-S and RW-Q. The text often refers to "the methods in [18]". We further have brTPF-MT, TPF-MT, and SkyTPF-MT. There are also contradictions in naming. For example, the paper introduces "SQL" as a variant using NOT EXISTS and "SQL-Pref" as using the preferences-based operator, but later states "SQL query translated with NOT EXISTS (SQL-Pref)", implying that the names have switched. While the paragraph entitled "Engines'' does help, hopefully the authors can find a better way to present the systems and variants evaluated (e.g., using a table and more consistent/intuitive naming).

- There are various typos and inaccuracies in the formal content that sometimes leave it unclear what is being defined or what is intended. Though in general most issues are minor, they are frequent. These will be listed in more detail in “Minor Comments”.

W3: As a general limitation of the approach, it only works when the base dataset is stored as an RDB. More work is needed to "port" a similar physical operator to the native RDF/SPARQL setting in order for it to be applicable for legacy graphs.

## Other comments:

- "demonstrated that the use of a physical operator during the SPARQL skyline query processing ... or the skyline operator on the top of [the] RDF triplestore". I am not clear on what is being claimed here, and it seems to be a key point. A physical operator I would understand as a specific implementation of the operator in the database. So I understand that a specific implementation is better than expressing the same using the typical operators implemented in a SPARQL engine. But what is the difference between "a physical operator during the SPARQL skyline query processing" and "the skyline operator on the top of [the] RDF triplestore"? Likewise when you state "Although this means ...", it is not clear to me what "this" is. I feel that this discussion is key, and should be made clearer.

- "a set of tuples to \mathcal{S}" It's not very clear to me what this means (a tuple in a solution need not correspond to any single relation in \mathcal{S}, e.g., if there is a join).

- Relating to the previous point, the paper seems to equate (in various places) a schema \mathcal{S} to a graph pattern gp. I find this very confusing. To me a schema here refers to a relational schema, which is a set of defined relations. It is not a query: it does not, for example, use a relational algebra over the schema to produce a single relation with results. It would be good to clarify or modify this aspect of the notation.

- The Theorem and Proof are a bit strange in that the preference operators could simply be defined identically in both cases so that the equivalence is (more or less) by definition. The proof seems to just be saying that the elements of the two distinct definitions mean the same thing. But I guess it's okay (also, maybe I misunderstand something).

- "The generation of results by Morph-Skyline++ is a costly process since it produces the triples in XML format." This is a bit surprising as the process should be more or less linear in the number of results. I would expect this to have a negligible cost (versus evaluating the query) if implemented well. Could you provide more details on why this might be costly?

- While I think that the comparison to *-TPF-* variants is useful, it does require the disclaimer that the purpose of these variants is to reduce server load while receiving queries from many clients. This is not tested by the current experiments.

- "precision, recall and F-measure" This is quite surprising when first mentioned as I would assume that the solutions should be the same. This might not be the case? Or are these measures just used to confirm correctness? (Though it becomes clear later that this rather refers to the correctness of the baselines, I think it would be good to briefly justify the inclusion of these measures when first introduced.)

- Table 4: The data are not consistent. The F-measure cannot be higher than both precision and recall (query 7).

- In the discussion regarding "In consequence, RW-Q performance is the worst.", why is there no corresponding discussion regarding the hypothesis? Does this not tend towards rejecting H_5 in general?

- "State-of-the-art tools would not be able to resolve a query where preferences are expressed in a subquery." I did not understand why this might be the case (assuming that the tools generally support sub-queries)?

## Recommendation:

All in all, I think that this work addresses an important topic and proposes a sensible approach. I appreciate in particular the details regarding the syntax and semantics of the query language (though perhaps I wonder which parts are novel and which are not), as well as the detailed experiments. I see the main value of this work as bridging work on preferences in SPARQL to analogous work in the SQL setting, where this work can thus serve as a strong baseline for further efforts towards supporting preferences in the native RDF/SPARQL setting.

On the other hand, I think that there are many parts of the paper I did not manage to understand to my satisfaction, relating in particular to W1 and W2 mentioned previously. There are also many issues that would need to be addressed that, cumulatively, add up to some significant revisions (see Other Comments above, and Minor Comments below).

Hence my recommendation is for a Major Revision.

## Minor comments:

- "[16] avoids either comparing", "[18] presented", etc. In general, the text should read well without the parenthetical reference. Better to write "Keles and Hose [18] presented", etc.

- "top-k queries cannot be the most appropriate approaches due" -> "top-k queries may not be the most appropriate approaches as"
- "provides [] uniform access"

- "on accommodation[]"
- "to be easily pose[d]"
- "that assign a numerical score [to] each"
- "is [a] weighted average"
- "negative impact [on] their evaluation"
- "of such type[s]"
- "This evaluation allows [for] understanding" or even better: "This evaluation allows us to understand" (“Allows” is a very tricky verb. It might be better to rely more on verbs like “enables”, “facilitates”; “supports”; etc.)
- "of [a] preference operator that correctly handle[s]"
- "To the best of our knowledge, [] no [] benchmarks [have been defined] nor experimental studies [] conducted for th[ese] type[s] of queries."
- "Section 6 reports [on] and discusses [] the results"

Motivating Example:
- "are [the] cheapest"
- Figure 1(a): The SPARQL and SQL queries would be much more readable with more newlines, proper indentation, etc. I think it's also important to note that these are not SPARQL and SQL queries, but rather from an extended language.
- "Under an OBDA approach" Again this would not be a standard OBDA approach as the languages go beyond plain SPARQL/SQL.
- "Despite [the fact that] state-of-the-art"
- "SPARQL-to[-]SQL"
- "that allows the use of qualitative preference criteria [within] SQL"
- "algorithms [that] exploit"
- "allows [for] prioritizing"
- "where mapping rules are use[d] to"
- "Many tools support these rules[;]"
- "in the dev[e]lopment" Spell-check!

OBDA-based Qualitative Preference Queries:
- "relations over $\Sigma \times \Sigma$[;] we define"
- "Prioritized" Why not just use =_U? I guess this is because the preference relation is a partial order, which could be clarified.
- "For our motivat[ing] example"
- Definition 4: It would be good to clarify beforehand that preferences are ordered ascending; i.e., that a lower value indicates a higher preference.
- Definition 6: "\mu \in \mathcal{S}" Again I am a bit confused by this use of schema.
- Definition 6: The notation $\mu_{|...}$ used for projection should be defined earlier.
- Definition 6: The symbol for left join could be improved.
- Definition 6: Item 7 is hard to follow. How about defining that separately in an itemised way? (Also "crit is if crit1 then crit2 else crit3" I did not understand.) Arguably \rho is not a great choice as it is used for "renaming" in the classical relational algebra, though I guess it's okay if used elsewhere (also) for preference.
- Definition 7: Item 3 should use UNION on the left, not \cup, to avoid mixing syntax and query operators.
- Definition 8: "algebra relational" -> "relational algebra"
- Definition 8: The grammar could be typeset better (e.g., to avoid the bad box).
- Definition 8: The signature is not defined but used. (I guess it is clear what is meant, but ideally the use of \xi could be clarified.)
- Definition 8: Maybe define the Cartesian product first and then use that to define join simply as \sigma_cond(R_1 x R_2).
- Definition 9: Items 3-7: use subscript for E_1 and E_2.
- "Having already been pro[v]ed"
- "and S A is relational algebra expression" SA is not used previously, making the phrasing a bit confusing. It seems that SA needs to be defined in terms of gp.
- "will be executed [within] the underlying database system"
- Algorithm 4: Given that this is pseudocode and that \beta is not defined elsewhere, it might be easier to follow the algorithm providing human-readable names to the functions and variables (rather than $h$, $\beta$, etc.).

Query Language Syntax
- "allows [for] separat[ing]"
- Grammar 1: It seems that "[]" is used initially for optional elements, but then "()" is used? Also it would be more readable to have this grammar span two columns such that the constructs can be read in one line.
- "In SQL, Preference SQL" At this point it seems that you are no longer discussing query language syntax, but rather mapping. It could be a separate section or maybe sub-section of a section with a more general title
- "[I]n contrast"
- "while compares" -> "which compares"
- Figure 4: Again, though the indentation is better than in previous examples, it is a struggle to see where the sub-query on the right of the NOT EXISTS actually ends due to indentation not being fully respected.
- "as [a] baseline"

Experimental Study
- "and their translat[ion] into nested SQL quer[ies]"
- "SPARQL-based approach[es] on materialized RDF"
- Table 1: "SPARQL preference qualitative quer[ies] and their equivalent SQL preference qualitative quer[ies]"
- Table 1: "One of the tool[s] only support[s] skyline queries[.]"
- Table 1: "A preliminar[y] study [of the] scalability"
- Table 2: "TPC-H. Number of rows per table [for different scales] (left) and [a summary of queries] (right)
- "uses a nested query with NOT EXISTS {add space before \cite}".
- "are expressed in SPARQL [so that they can be] evaluated"
- "QualQT is up to two orders of [] magnitude less than SQL-Pref." Is it? I don't see a single case where this is true?
- "SQL query translated with NOT EXISTS (SQL-Pref)" Earlier "SQL" was used to denote NOT EXISTS and "SQL-Pref" was used to denote use of the preferring clause?
- "Figures[] 5b-5d"
- "and the query complexity [increase]"
- "significantly worsened [due to]"
- "when the query complexity [increases]"
- "query rewrit[ing] technique"
- "[I]t is important"
- "Table 5: Methods in [18]" Is this SPREFQL?
- "due to [the fact that] skyTPF"
- "quality of methods presented in [18]" What methods are these, specifically?
- "are queries with instantiations" What does this mean? That the basic graph patterns use constants? Does this not apply to almost all queries?
- "Firstly, [w]e have defined QualQT"
- "from such [an] implementation"

Conclusions and Future Work:
- "that best meet criteria"
- "TPF-,[ ]skyTPF- and brTPF-based methods, [which are] state-of-the art"
- "we plan to [incorporate]"

Review #3
Anonymous submitted on 23/Jun/2021
Major Revision
Review Comment:

This paper addresses the problem of incorporating qualitative preferences in SPARQL queries. The general problem of incorporating different preference models into a variety of querying mechanisms over data and knowledge bases has received a lot of attention given that adequately solving it affects many real-world applications.

The main contributions are the proposal of a translation mechanism (based on work in the literature) and an experimental evaluation. The former is somewhat direct, so at least in my view the central contribution is the latter, showing the effect of pushing down the preference clauses to the data management engines. The main reasons for my recommendation are centered around presentation issues that, even though none of them is too serious, their accumulated effect is more so:

-- There are many presentation issues across the paper. I discuss the more minor ones below, and here focus on the others, which are centered in Section 6. The first issue I found is in the presentation of the hypotheses, which are formulated rather narrowly and in some cases informally. For instance "does not spend much more time". Another example is "performs better"; does this refer to running time, precision/recall, both? These are just two examples -- all 5 hypotheses need to be reformulated. Also related to the hypotheses, why not formulate one for the preliminary experiment in Section 6.3? I think that this would make the presentation more uniform.

The other issue that I found confusing is the discussion of solution quality. The vast majority of results are focused on running time, and solution quality is only briefly mentioned for a subset of the queries and algorithms. Perhaps I missed something, but the conclusion "These results showed that our approach is able to precisely compute the skyline set..." requires a clearer presentation of solution quality results.

-- The proof of Theorem 1, the central theoretical result of the paper, is quite cryptic. Please expand it.

-- There are many small errors, typos, and grammatical errors. Please thoroughly revise the manuscript to address them.

-- Other minor comments:

* "Quantitative approaches such as top-k queries cannot be the most appropriate approaches" --> Perhaps the authors meant "may not be"? I think it's clear that in some cases quantitative preferences work quite well.

* "preference criteria are expressed in a way closer to natural language"  --> This metaphor (I'm not sure the authors meant it to be a metaphor) is not very clear. Perhaps qualitative preference criteria are more intuitive in certain cases and can be expressed more naturally, but simply saying that they are closer to natural language is confusing.

* In this passage:
"In the former, preferences are expressed quantitatively through a scoring function to determine an ordering over query results. Closer to our work, in the latter, preferences are denoted by binary preference relationships yielding a partial order, contrary to the quantitative approach."

the terms "former" and "latter" are swapped.

* At the beginning of Section 3, the authors claim that Chomicki et al. and Kiessling et al. defined foundational theories relating preferences to DB systems and proposed extensions to SQL. However, a (much) older work should be cited here to give credit:
M. Lacroix and P. Lavency. Preferences: Putting more knowledge into queries. In VLDB, volume 87, pages 1–4, 1987.

Also, the paper cited in reference [16] is also older and should be included in that passage.

* "For example, a typical skyline query is to find hotels cheaper and closer to the beach." -->"For example, a typical skyline query is to find hotels that are both cheap and clos to the beach." (?)

* There are several places in which the authors seem to use Latex math mode without using \textit or \mathit for strings. A clear example can be seen on Page 7, in the Problem Definition paragraph, in the string "SQ".

* "where crit is criteria" --> "where crit is a set of criteria" (?)

* "Having already been probed" --> "Having already been proved"

* "Algebra Relational" --> "Relational Algebra"

* The final part before Section 5, which includes a clause and algorithm, could be made much clearer. The algorithm would look better at the top of a column, and the clause would benefit greatly from pretty printing.

* Cosmetic issues in figures and tables: The table headers make it very hard to read (black over dark green), and the bar charts in general are not black and white printer friendly.


The limitations of related work as stated in the article are not comprehensible.
SkyTPF, for instance, only supports MIN and MAX preferences, so although it might not be possible to express the motivating example in SkyTPF, other types of queries are well supported.

In general though, a simple statement, such "skyTPF was not reported because it returned an empty answer" is a bit too bold. What causes this behaviour? Was there an issue when loading the data into the triple stores, was there a problem with parsing the query? Have the authors of this paper reached out to the authors of skyTPF to maybe receive support in running the evaluation?

The authors state that SPREFQL and SkyTPF produce incomplete and incorrect results and provide values for precision, recall and f-measure values in Table 4 and 5. It is not quite clear how these measurements were produced. How was the ground truth, i.e., the complete and correct result computed that all results are compared against? Since neither SPREFQL nor SkyTPF introduce any kind of approximation, incomplete results indicate more a technical error than a problem with the approach (maybe not all triples were loaded into the triple store?). Some elaboration and in particular sharing the queries would provide a higher degree of transparency and also enable other researchers to repeat and confirm the experimental results.

SkyTPF is reported to time out in 5 queries in Figure 7b. However, as the text describes, it is actually not a timeout, it's rather a Null Pointer Exception, which indicates an implementation error - but not a timeout, presenting it as such is misleading. As mentioned above, it should be possible to reach out to the SkyTPF authors for help fixing the issue. The authors of this paper are highly encouraged to do so. In fact, there has already been some communication and the offer of assistance - I (one of the SkyTPF authors) am happy to follow up on that.

In summary, the experimental section requires some more work to be acceptable. In particular, queries and results should be shared in more details so that the experiments can be repeated and verified by other researchers. In addition, the comparison against SkyTPF appears to be incomplete and not sufficiently fair. However, as described above, it should be possible to address these issues in a revision.

We are thankful for your comments which will be considered in the next version of the paper. With respect to datasets and queries, you can find them in the directory "Experiments" from https://github.com/marleeng/morph-preferences. You can repeat the experiments with these datasets. Please, feel free to contact us and to ask if you have any problem in repeating the experiments.