Preference Queries with Ceteris Paribus Semantics for Linked Data

Tracking #: 1588-2800

Authors: 
Jessica Rosati
Tommaso Di Noia
Renato De Leone
Thomas Lukasiewicz

Responsible editor: 
Axel Polleres

Submission type: 
Full Paper
Abstract: 
Preference representation and reasoning play a central role in supporting users with complex and multi-factorial decision processes. In fact, user tastes can be used to filter information and data in a personalized way, thus maximizing their expected utility. Over the years, many frameworks and languages have been proposed to deal with user preferences. Among them, one of the most prominent formalism to represent and reason with (qualitative) conditional preferences (CPs) are conditional preference theories (CP-theories). In this paper, we show how to combine them with Semantic Web technologies in order to encode in a standard SPARQL 1.1 query the semantics of a set of CP statements representing user preferences by means of RDF triples that refer to a “preference” OWL ontology. The framework that we propose allows a standard SPARQL client to query Linked Data datasets, and to order the results of such queries relative to a set of user preferences.
Full PDF Version: 
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Click to Expand/Collapse
Review #1
Anonymous submitted on 04/Sep/2017
Suggestion:
Major Revision
Review Comment:

This paper tackles the problem of expressing user preferences in a query language. Specifically, the paper focuses on three main points:

(i) using conditional preference theories to express complex dependencies/independences of preferences under a ceteris paribus interpretation;

(ii) encoding user preferences by means of an OWL ontology;

(iii) making (i) and (ii) available in the SPARQL language.

As for (i), authors build upon previous theoretical work (ref. [27]); as for (ii) and (iii) authors also enhance their own previous work (ref. [25]).

Summary of the review:

Pros:
(S.1) the paper provides a comprehensive way of dealing with user preferences (from theory to practice)

(S.2) the presentation and the writing are very good, although something can be improved (see comments below)

(S.3) there is a system that puts into practice the machinery described in the paper (even if I could not access the system although trying several times)

Cons:

(W.1) There is no experimental evaluation. In this respect, one can tackle two main problems: (a) is really that simple for users to write preferences? (b) how preferences affect running time;

(W.2) I had a very hard time in grasping the semantics of SPARQL enhanced with preferences. The semantics is buried in the text and I think that the paper will benefit a lot from making this semantics explicit (see comments below);

(W.3) related work can be extended. There is an entire line of research tackling the problem of preferences in graphs (with the important feature of enabling to include preferences in recursive queries) that is not mentioned at all in the paper.

Detailed review:

Generally, I like this paper especially for the fact that it provides a quite comprehensive solution to the problem of expressing preferences in SPARQL. Nevertheless, I think there are a number of issues that need to be addressed:

As for W.1, the whole process of generating SPARQL queries enhanced with preferences (obtained via the rewriting) is almost transparent to the user. Nevertheless, the resulting SPARQL queries can be very large. It would be interesting to investigate the following aspects:

(i) the relationship between the original preference expressed by the user (or model via CP-Nets) and the length of the generated SPARQL queries; this will give an idea of how one expects the query to become more complex as well as of the readability/conciseness of the query;

(ii) authors have described "one" of the possible rewriting; some considerations about the optimality/uniqueness of the rewriting are needed. In other words, is there any other (better) way to translate your preference into SPARQL? why you choose exactly these SPARQL operators?

(iii) from a more practical point of view, it would be interesting to see running times; in particular, comparing running time of a SPARQL query with and without (or an approximation of it) preferences. In particular, I would like to see how the constraint that "the size of the parent sets and the size of the domain sets are bounded by a constant" to have polynomial time impacts the running time.

(iv) comparing your approach with PrefSPARQL; you provide some sort of comparison in the related work section. Nevertheless, having an idea of what is implied by the fact that PrefSPARQL "introduces preferences at the level of filters" and whether this is a good/bad choice is certainly interesting.

As a side note, I tried along several days to access the system, but the server does not respond.

As for W.2, I had to struggle quite a bit to have an idea of what is the semantics of your preference-enhanced SPARQL language. Authors should consider the following aspects:

(i) the semantics of expressing preference via Cp-Nets in SPARQL is buried in the text. As an example, the text below Figure 3 gives the recipe for computing the "conditionallyMoreImportantThan"-closure while in Section 4.2 Query1-Query3 give a high-level overview of how a SPARQL query enhanced with preferences looks like. Having a table that shows for (a subset of) the SPARQL algebra what is returned in terms of mappings will make the paper more accessible. Authors could have a look at the very recent work by Troumpoukis et. al [Ref5] about the semantics of SPARQL enhanced with preferences.

(ii) Th. 1 and Th. 2 consider the case of fully acyclic and cuc-acyclic theories, respectively. I have no objections to the fact that the theoretical part of this work is not original; nevertheless, some more hints about the implications of non-acyclic theories need to be provided. As authors themselves mention, Th. 2 requires a pretty strong condition.

As for W.3, authors mainly focus on approaches that incorporate preferences in SPARQL. Nevertheless, there is an entire line of research tackling a similar problem in the specific case of graph databases. It would be interesting to investigate how recursion (the main feature of graph navigation/query languages) will impact on your framework.

Minor points:
- I noticed that the title of this paper is exactly the same as that of the conference version. Authors should consider changing the title to make clearer the distinction

- Statements like "a user’s preference model is just an approximation of what a user really wants"need to be either removed or better supported (e.g., via concrete examples).

- "hard constraints—used to return only relevant results"; the notion of relevance here needs to be clarified

- It is not completely clear which kind of entailment regime is needed to deal with the OWL ontology modeling user preferences

- typo: "if a user’s requirements"

Some missing references:
[Ref1] Combining Fuzzy Information from Multiple Systems, JCSS 1999
[Ref2] Partially Ordered Regular Languages for Graph Queries, JCSS 2005
[Ref3] Preferential Regular Path Queries, Fundamenta Informaticae, 2008
[Ref4] Querying Graphs with Preferences, CIKM 2015
[Ref5]An extension of SPARQL for expressing qualitative preferences, ISWC 2017

Review #2
By Stefan Woltran submitted on 13/Sep/2017
Suggestion:
Minor Revision
Review Comment:

The paper aims to combine SPARQL queries with preferences
specified in terms of CP-theories in order to rank the query
results in terms of the specified preferences. This is done
via a preference ontology and the semantics of the CP theory
are incorporated via SPARQL itself. In addition, a system is
presented implementing these ideas together with some
interfaces to support the users.

General Evaluation:

+ The tackled problem is definitely relevant. CP theories
are a prominent tool for preferential reasoning within the
AI and Comsoc communities. SPARQL is a well-established standard
in the world of Semantic Web technologies. Thus the paper
is clearly relevant for the journal.

+ The paper is well written. A running example and some figures
nicely illustrate the main concepts (Section 4, however, is not
so easy to follow for people not so familiar with SPARQL,
but this seems acceptable given the audience of the journal).

+ A definite plus is that a tool is provided. Moreover, all
sources are provided on the web.

+ Related work is discussed in a satisfactory way (further
pointers are given below).

- The paper extends previous work by the authors on CP-nets
and SPARQL. CP-theories are a generalization of CP-nets, so
the former approach is subsumed with the new one. Still,
central ideas follow similar patterns as the predecessor
paper does, which makes originality an issue.

- Nothing is said about scalability. It would be interesting
to understand for which amount of data and preference rules
the systems behaves efficiently (in terms of ranking the
outcomes but also concerning the check for cuc-acyclicity),
and where the authors see the limits of their approach.

Further comments:

I find it quite irritating that neither the abstract nor
the introduction makes clear that the presented results only
apply to a certain subclass of consistent CP theories, namely
cuc-acyclic ones. I do not rate this limitation severe, but
it should be stated from the very beginning and there should
be some discussion whether cuc-acyclicty is something which
can be expected for real-world instances. A positive aspect
is that the system includes a test for this form of
acyclicity.

While some complexity results for local consistency are reported
in Section 3, I missed some words about the complexity for
consistency checking in general, which is PSPACE-hard
[Goldsmith et al., 2008]. (Note that this - in some way -
also justifies the restriction to cuc-acyclic CP theories).
Having said this, it might be also an interesting direction
for future work to encode full consistency checks via
SPARQL queries, given the similar complexity, although this
might be only of little practical relevance.

For the related work, I have some suggestions.

First, I think the following paper should be discussed:

Cristina Cornelio, Andrea Loreggia, Vijay Saraswat:
"Logical conditional preference theories”, Proceedings
of the Multidisciplinary Workshop on Advances in Preference
Handling (MPREF-2015).

since it also combines a query language (datalog) with
CP-nets.

Also the paper

Tommaso Di Noia, Thomas Lukasiewicz, Maria Vanina Martínez,
Gerardo I. Simari, Oana Tifrea-Marciuska:
Combining Existential Rules with the Power of CP-Theories.
IJCAI 2015: 2918-2925

appears quite related.

Given that soft constraints in CSP is mentioned, one might
also want to refer to the more query-related world
of Answer-Set Programming, where several attempts for
preferences have been incorporated. A recent and quite
powerful approach is

Gerhard Brewka, James P. Delgrande, Javier Romero, Torsten Schaub:
asprin: Customizing Answer Set Preferences without a Headache.
AAAI 2015: 1467-1474

Concerning what is mentioned as "logic-based languages"
there is quite a tradition of works starting by the work
of von Wright.

A quite prominent AI-related logical approach is

Gerhard Brewka, Salem Benferhat, Daniel Le Berre:
Qualitative choice logic. Artif. Intell. 157(1-2): 203-237 (2004)

And, finally, a recent survey worth referring to is

Gabriella Pigozzi, Alexis Tsoukiàs, Paolo Viappiani:
Preferences in artificial intelligence.
Ann. Math. Artif. Intell. 77(3-4): 361-401 (2016)

Minor Issues:

Table 1: SW_NO in line \varphi_3 is misformed

page5/col2: in (iv) a different notation for a rule is used;
moreover, the rule has to be contained in \Gamma.

page6/col2: it is stated that J_\alpha(\Gamma)\subseteq G(\Gamma)
holds, but from the definitions it appears to me that one has to
use here the trans closure of G(\Gamma) to make the subset
relation hold.

page7/col2: "the directed graph ... exploited in Theorem 2";
in fact, several directed graphs (for different \alpha) are
used in that theorem.

page8/col1: "The properties that it is domain of reflect ..." (?)

page14/col2: "We know describe" --> "now"

References [17] and [26] miss booktitles.

Review #3
Anonymous submitted on 08/Oct/2017
Suggestion:
Major Revision
Review Comment:

The submission proposes to combine Conditional Preference (CP) theories [28] with RDF and SPARQL 1.1. CP theories provides a language for representing conditional preference.

There have been some work about expending SPARQL with preferences, such as [26] and [18]. However, neither of these work can produce a total ranking for query results. The submission aims to address such a limitation. The submission is an extended version of its conference version [25], which deals with a special case of CP theories, called CP-net. A CP theory T is consistent if there exists a strict total order satisfying T.

The submission provides an ongoing example to run through the main notions. However, the significance of the submission seems marginal. The technical discussions in the submissions are mainly related to the example(s); one might expect a more general framework for the proposed solution, with some clear specifications provided. There is an algorithm provided in the Appendix, but no properties of the algorithm are presented. It is well known that more expressive power often brings also high computational complexity. Thus it is unclear what price to pay to integrate CP theory into SPARQL, comparing to existing solutions. Maybe some evaluations should be provided along this line.

Even in the examples, some of the basic details are not yet clearly presented. For instance, although Example 2 refers to Table 1 and explains the variables, the core notion of CP statements are not explained at all. In terms of examples against [26] and [18], it would be helpful if some specific examples can be provided to showcase which examples from the submissions are feasible for [26] and [18] and which are not. For the preferences that all the three approaches can represent, in the situation when the work [26,18] cannot produce a total ranking for query results, it is unclear that if the proposed approach always has an inconsistent CP theory accordingly.