Estimating query rewriting quality over the LOD

Tracking #: 1612-2824

Authors: 
Ana Torre
Jesús Bermúdez
Arantza Illarramendi

Responsible editor: 
Guest Editors IE of Semantic Data 2017

Submission type: 
Full Paper
Abstract: 
Nowadays, users have difficulty querying datasets in the Linked Data environment that use different vocabularies and data structures. For this reason, it is interesting to develop systems that can produce rewritings of queries on demand. However, a semantics-preserving rewriting often cannot be guaranteed by those systems due to the heterogeneity of the vocabularies. It is at this point that the quality estimation of the produced rewriting becomes crucial. Notice that, in a real scenario, there is no reference query. In this paper we present a novel framework that, given a query written in the vocabulary the user is most familiar with, rewrites the query in terms of the vocabulary of a target dataset. Moreover, it also reports the quality of the rewritten query with two scores: first, a similarity factor, which is based on the rewriting process itself and can therefore be considered of an intensional nature; and second, a quality estimation offered by a predictive model, which can be considered of an extensional nature. This model is constructed by a machine learning algorithm that learns from a set of queries and their intended (gold standard) rewritings. The feasibility of the framework has been validated in a real scenario.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
By Martin Rezk submitted on 25/Apr/2017
Suggestion:
Major Revision
Review Comment:

This paper presents a framework that, given a query, a set of mappings between datasets, and a model to score the queries, rewrites the query in terms of the vocabulary of a target dataset and tries to predict the "quality" of the rewriting (F1) compared to a hypothetical human-made query.
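To fix ideas, a hedged sketch of the interface this description implies is given below; all names and types are illustrative, not the authors' actual API.

```python
# A hedged sketch of the framework's interface as described above; the
# names RewritingResult and rewrite are illustrative, not the authors' API.
from typing import NamedTuple

class RewritingResult(NamedTuple):
    target_query: str         # rewritten query over the target vocabulary
    similarity_factor: float  # intensional score from the rewriting process
    predicted_f1: float       # extensional score from the learned model

def rewrite(source_query: str, mappings: dict, scoring_model) -> RewritingResult:
    """Apply the rewriting rules, compute SF, and predict F1 (sketch only)."""
    ...
```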
The idea behind the paper is indeed interesting, but the paper lacks formality, and from the experiments it is not clear how this approach can be applied in practice.
Let me analyze each section in turn:

Section 2:
This section should probably include OBDA, since in that field the goal is also to rewrite queries from one (RDF) data source to another (SQL) using mappings. I will come back to this when discussing Section 3.
At the end of the section the authors claim: "... try to reformulate the query in a different dataset with a priori unknown vocabulary; and this is a distinguishing feature of our use case."
The approach that the authors propose not only assumes that the target vocabulary is known, but also that there is a set of mappings between the terms, knowledge about the ontology containing those terms (cf. rules H4, H5, etc.), an algorithm that depends on these rules and therefore on the target vocabulary, and so on.

Section 3/4:
The sentence "The rule language to express rules in R is similar to Construct" is certainly not enough to define the language. Moreover, I do not see how the rules in table 3 are similar to SPARQL construct queries.
The abstract framework should at least specify the syntax and probably the semantics. For instance, when it says "v sub t:u1", under which entailment regime is the subclass relation defined? How can two classes from different ontologies be subclasses of one another? If the authors construct a meta-ontology, then they could probably apply the techniques enumerated in Section 2. The definition of algorithm A is again too informal; given that entailment is involved, it is not clear to me that the algorithm always terminates (the rule language seems to allow cycles).
Observe that defining a rule language is far from trivial. See, for instance, R2RML (https://www.w3.org/TR/r2rml/).
These sections should be more formal, and examples should be provided to illustrate the different concepts, especially the scoring function used in each rule, and how they are composed to score the rewriting (SF).
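As one illustration of the kind of example that would help, here is a minimal sketch of one plausible reading of a subclass-based rule such as "v sub t:u1". The namespaces s: and t: and the alignment are hypothetical; whether this matches the authors' intended semantics cannot be determined from the current text.

```python
# A minimal sketch, assuming a hypothetical alignment in which source class
# s:Film is asserted to be a subclass of target class t:Movie; the authors'
# actual rule language and entailment regime are not specified in the paper.
SUBCLASS_ALIGNMENT = {"s:Film": "t:Movie"}  # s:Film rdfs:subClassOf t:Movie

def apply_hierarchy_rule(triple_pattern):
    """Rewrite an rdf:type triple pattern using the subclass alignment."""
    subj, pred, obj = triple_pattern
    if pred == "rdf:type" and obj in SUBCLASS_ALIGNMENT:
        return (subj, pred, SUBCLASS_ALIGNMENT[obj])
    return triple_pattern

print(apply_hierarchy_rule(("?x", "rdf:type", "s:Film")))
# -> ('?x', 'rdf:type', 't:Movie')
```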

Section 5:
First, the evaluation is confusing, since the authors are evaluating two things at the same time: the correctness of the rewriting and the prediction of the similarity.
Regarding the prediction, Table 7 shows that in 60% of the cases the approach does not predict F1 correctly. This is not very encouraging: one obtains a query that provides no guarantee regarding the semantics, together with a score assessing its quality that is most likely wrong.
What does "Number of terms not belonging to the vocabulary in the target dataset" mean?
The authors used 3 domain areas; how many queries came from each area? Did any area behave better than the others?
Why are the queries and rewritings not available online?
Do the authors claim that this model can be applied to a new domain with a different vocabulary and ontology? Especially given that a_n, a_d, and a_o have been optimized for these datasets.
If so, then they should adjust the experiments to show that.
If not, does it mean one needs to create a new training set for each new domain? That, of course, implies creating a large number of queries.
Any comments on why PCA has such a negative impact when combined with RF on features1? It is quite surprising that the authors get almost the same R² score when using only 2 features (F8) as when using all the features (F1).
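The question can be probed with a quick experiment of the following shape; the data here is synthetic and the paper's features and table values are not reproduced.

```python
# Synthetic probe of the reviewer's question: compare R2 of Random Forest
# with all features, with PCA preprocessing, and with only 2 features.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))  # 8 hypothetical features
y = 0.7 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.1, size=200)

rf = RandomForestRegressor(n_estimators=200, random_state=0)
for name, model, Xs in [
    ("RF, all features", rf, X),
    ("PCA + RF", make_pipeline(PCA(n_components=4), rf), X),
    ("RF, 2 features", rf, X[:, :2]),
]:
    r2 = cross_val_score(model, Xs, y, scoring="r2", cv=5).mean()
    print(f"{name}: R2 = {r2:.3f}")
```

One plausible explanation, though the authors should confirm it, is that PCA's rotation destroys the axis-aligned structure that decision-tree splits exploit, and that when two features carry most of the signal, a Random Forest restricted to them can match the full model's R².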

Review #2
By Luiz André Portes Pais Leme submitted on 26/Aug/2017
Suggestion:
Minor Revision
Review Comment:

The paper presents a rewriting method for SPARQL queries in order to expand their results with content from datasets that use vocabularies different from those used in the query. The method is based on a set of heuristic rules for rewriting triple patterns subdivided into five different categories: 1) equivalence rules, 2) hierarchy rules, 3) answer-based rules, 4) profile-based rules and 5) feature-based rules.

In order to aid in the interpretation of the expanded query results, the authors propose: 1) the computation of an estimate of the similarity factor (SF) between the expanded queries and the original one, and 2) the computation of an estimate of the F1 measure (the harmonic mean of precision and recall) between the query result and the ideal result. The higher the SF and F1, the greater the confidence that the result is relevant to the original query.
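For reference, the F1 measure mentioned here, with precision P and recall R, is:

```latex
% F1: harmonic mean of precision P and recall R
F_1 = \frac{2 \cdot P \cdot R}{P + R}
```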

Equivalence rules generate new triple patterns by replacing terms of the original query with terms with which they have an explicit equivalence relation (owl:sameAs). Hierarchy rules generate new triple patterns using superclasses of the query terms present in the expansion datasets. Answer-based rules produce new triple patterns involving the same subject or object with an equivalent URI and the same predicate. Profile-based rules generate new triple patterns by replacing terms with equivalent ones inferred from similarity measures. Finally, feature-based rules expand triples with new properties found in the expansion datasets.
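A minimal sketch of the first category, assuming a hypothetical owl:sameAs alignment (the paper's actual rule definitions are in its Table 3):

```python
# Equivalence-rule sketch: replace any term that has an owl:sameAs
# equivalent in the target dataset. The alignment dict is hypothetical.
SAME_AS = {"s:paris": "t:Paris"}  # s:paris owl:sameAs t:Paris

def apply_equivalence_rule(triple_pattern):
    """Substitute terms with their owl:sameAs equivalents, if any."""
    return tuple(SAME_AS.get(term, term) for term in triple_pattern)

print(apply_equivalence_rule(("s:paris", "s:population", "?p")))
# -> ('t:Paris', 's:population', '?p')  (unaligned terms are kept as-is)
```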

The similarity measure between the original query and the expansion queries is estimated with a linear combination of three similarity measures, which is calibrated in a supervised fashion using the Harmony Search (HS) metaheuristic. The estimate for the F1 measure is computed using the Random Forest (RF) algorithm.
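The shape of that combination is sketched below; the weight names a_n, a_d, a_o follow Review #1, while the concrete component measures and the Harmony Search calibration are defined in the paper, not here.

```python
# A minimal sketch of the similarity factor as a linear combination of
# three component similarities; weights and inputs are placeholders.
def similarity_factor(sim_n, sim_d, sim_o, a_n, a_d, a_o):
    """Weighted combination of three similarity measures."""
    return a_n * sim_n + a_d * sim_d + a_o * sim_o

# In the paper, the weights would be tuned with Harmony Search against
# (query, gold-standard rewriting) training pairs.
print(similarity_factor(0.9, 0.6, 0.8, a_n=0.5, a_d=0.2, a_o=0.3))
```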

The paper tackles a very relevant problem regarding queries over the Web of Data and provides valuable means to make it easier to browse and use the Linked Data sources currently available on the Web. The text is very clear and easy to read, and the experiments are very consistent with the proposals of the paper.

Issues and comments:

1) In Section 1 on page 2 the authors say "the Web of Data is explored by traversing interesting links pointing to other datasets", but this strategy is not well clarified in the paper.

2) Table 3 shows the heuristic rules for rewriting SPARQL queries. For a better understanding of the rules, the authors could provide examples for each type of rule.

3) With respect to the experiments concerning the media domain, for example, the datasets DBpedia, MusicBrainz, LinkedMDB, Jamendo, New York Times and BBC were used. One can infer from the paper that a SPARQL query can be expanded with several other queries over an arbitrary number of datasets. However, the expression for the SF does not make it clear how it would be applied to queries over more than one expansion dataset.

4) It seems that the expression maxSim(u; t:o_1; ...; t:o_n) in rule P12/Table 3 should be argMax(sim(u, t:o_z)) with respect to t:o_z in {t:o_1, ..., t:o_n}.
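The distinction matters because the rule presumably needs the best *term*, not the best similarity *value*; the candidate terms and scores below are hypothetical.

```python
# max vs. argmax: maxSim yields a score, while the rewriting rule
# arguably needs the term achieving that score.
candidates = {"t:o_1": 0.4, "t:o_2": 0.9, "t:o_3": 0.7}

max_sim = max(candidates.values())               # maxSim -> 0.9 (a score)
best_term = max(candidates, key=candidates.get)  # argMax -> 't:o_2' (a term)
print(max_sim, best_term)
```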

5) In Section 3 on page 5 the authors say that the precision and recall values were assessed with respect to a gold standard defined by humans. This task can be very hard, laborious and error-prone. It would be useful for readers to understand the limitations of the experiments and results if this section went into more detail on this task.

6) The main idea behind the query rewriting is to use a knowledge base containing ontology and entity alignments to replace triple pattern terms. In a production system, a rich alignment database can be very difficult to obtain. A brief discussion of available alignment databases would be very useful.

Review #3
By Jefferson Santos submitted on 06/Sep/2017
Suggestion:
Accept
Review Comment:

This article presents a framework to translate a query written in the vocabulary of a source dataset into the vocabulary of a target dataset, based on the rewriting technique. The main idea is that users could write a query on a dataset whose vocabulary they are more familiar with and, then, an automatic procedure could translate it into another one, adapted to a different dataset, with the aim of improving the results obtained. The choice of the target dataset follows the existing relationships in the Web of Data.

The translation process uses two similarity measures to help users evaluate the quality of the provided translation: one that is based on the rewriting process itself (intensional) and another that is obtained by a predictive model (extensional). Those measures are essential, since the proposed rewriting process does not necessarily preserve semantics, due to the different vocabularies of the involved datasets. The predictive model is provided by a machine learning algorithm that learns from a predefined set of queries (source queries and their gold standard versions).

The authors divided their work into three parts:
(1) a proposal of a general framework for query rewriting in the Web of Data;
(2) an embodiment of that framework with a selected set of rules, similarity measures, and a quality estimation model composed of a similarity factor function and an F1 score predictive model; and
(3) a validation of the embodied framework in a real scenario.

Originality

The authors did a broad literature review, and the reference citations flow smoothly through the text. From the Related Work section, it is evident that research on similarity in the field of data linking is very concentrated on classes, properties, and individuals, with less research on the subject of query similarity. Moreover, the existing research is restricted to query translations that preserve semantics. Even works that share the same objectives as the authors (to enrich queries' result sets with more answers) are restricted to a single source dataset.

Thus, the authors' approach of transforming a query from a source dataset into a new one over a target dataset, following the datasets' relationships in the Linked Data network, is very innovative.

Significance of the results

The results of this work, the general framework and its validation through a proposed embodiment, are highly relevant for research in the field of Linked Data. From a user's point of view, being able to extend the results obtained by querying a single dataset, without the need to learn the vocabulary of different datasets, seems to be a necessary feature.

Quality of writing

The paper is very well written and organized. Arguments are very well linked, and ideas are clear.

Recommendation

Although the authors have provided an example in the introduction section to contextualize the problem, an example of the translation process itself (a step-by-step explanation of the application of the rules to a real query, showing the rewriting process at work) could also be beneficial to improve readers' understanding of the paper. Of course, there are space limits on the writing, but perhaps applying the same approach already used in the text, of providing some parts online, could help keep within the official size.

Review #4
Anonymous submitted on 09/Sep/2017
Suggestion:
Minor Revision
Review Comment:

This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing.

The article describes a framework that, given a source query, generates a target query over a different vocabulary and assesses the quality of the target query. This second point is the most interesting aspect of the article.

The article describes an instantiation of the framework, published elsewhere (reference [23]), and includes extensive experiments to validate the instantiation. From the experiments, the authors conclude that, to construct a quality predictive model, the best features are: the similarity factor; the number of terms of the source query; and the number of terms that do not belong to the target dataset. This conclusion is relevant to the problem addressed, and quite interesting.
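For concreteness, the reduced feature vector that this conclusion points to would have the following shape; how each value is extracted from a query is not reproduced here.

```python
# A minimal sketch of the three-feature representation highlighted above;
# the extraction of each value from a query is hypothetical.
def feature_vector(similarity_factor, n_source_terms, n_terms_not_in_target):
    return [similarity_factor, n_source_terms, n_terms_not_in_target]

X_row = feature_vector(0.82, 7, 2)  # illustrative values only
```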

A detailed list of remarks follows:

Page 1, abstract, line 5: I suggest dropping the sentence “Notice that, in a real scenario, there is not a reference query”. This is quite obvious.

Page 1, abstract, line 5: It is common practice not to break the abstract into paragraphs.

Page 3, column 1, lines 7/8: The sentence “Notwithstanding that the result set is what matters to the user.” is disconnected from the previous sentence. It would be better to separate this sentence with a comma instead of a full stop, as in: “… the query over the source dataset, notwithstanding that the result set is what matters to the user.”

Page 3, column 1, line -4: Change “we have filled the framework” to “we have instantiated the framework”.

Page 3, column 2: Change the title of Section 2 to “Related Work” (this is always in the singular form).

Page 4, column 1, line 2: Change “[17,2]” to “[2,17]”. In general, list the references in increasing order (there are other occurrences in the paper).

Page 4, column 2, line -6: Change “to every rule” to “to each rule”.

Page 4, column 2, line -5: Change “which associates” to “associates”.

Page 5, column 1, paragraph 3, line 3: Change “and combining properly” to “and properly combining”.

Page 5, column 2, paragraph 2, line 7: Change “and that’s the reason…” to “and that is the reason…”.

Page 9, column 1, paragraph 2, line 10: Change “BNF (Bibliothôlque National du France)” to “BnF (Bibliothèque nationale de France)”.