# EQUATER - An Unsupervised Data-driven Method to Discover Equivalent Relations in Large Linked Datasets

### Tracking #: 872-2082

Authors:
Ziqi Zhang
Anna Lisa Gentile
Eva Blomqvist
Isabelle Augenstein
Fabio Ciravegna

Responsible editor:
Guest Editors Ontology and Linked Data Matching

Submission type:
Full Paper
Abstract:
The Web of Data is currently undergoing an unprecedented level of growth thanks to the Linked Open Data effort. One escalating issue is the increasing level of heterogeneity in the published resources, which seriously hampers the interoperability of Semantic Web applications. A decade of research on Ontology Alignment has contributed a rich literature dedicated to such problems. However, existing methods can still be limited when applied to the domain of Linked Open Data, where the widely adopted assumption of 'well-formed' ontologies breaks down due to the greater degree of incompleteness, noise and inconsistency found both in the schemata and in the data described by them. Such problems become even more noticeable in the task of aligning relations, which is very important but insufficiently addressed. This article contributes to this particular problem by introducing EQUATER, a domain- and language-independent and completely unsupervised method to align equivalent relations across schemata based on their shared instances. EQUATER includes a novel similarity measure able to cope with the unbalanced population of schema elements, an unsupervised technique to automatically decide similarity cutoff thresholds for asserting equivalence, and an unsupervised clustering process to discover groups of equivalent relations across different schemata. The current version of EQUATER is particularly suited to a more specific yet realistic case: alignment within a single large Linked Dataset, a problem that is becoming increasingly prominent as collaborative authoring is adopted by many large-scale knowledge bases.
Using three datasets created based on DBpedia (the largest of which is based on a real problem currently concerning the DBpedia community), we show encouraging results from a thorough evaluation involving four baseline similarity measures and over 15 comparative models obtained by replacing EQUATER components with alternatives: EQUATER makes significant improvements over baseline models in terms of F1 measure (mostly between 7% and 40%). It always scores the highest precision and is also among the top performers in terms of recall. Together with the released dataset to encourage comparative studies, this work contributes valuable resources to the related area of research.
Tags:
Reviewed

Decision/Status:
Minor revision

Solicited Reviews:
Review #1
By Andreas Thalhammer submitted on 26/Nov/2014
Review #2
By Andriy Nikolov submitted on 15/Dec/2014
Suggestion: Minor Revision

Review Comment:

The paper describes an algorithm for aligning relations between ontologies used by datasets in the Linked Data cloud. The algorithm combines several techniques: in particular, an extensional similarity measure between relations based on the overlapping sets of arguments, a clustering mechanism for selecting equivalent relation pairs among the candidate ones, and a technique for inferring equivalence sets by exploiting the transitivity of alignments. The authors report experimental testing for determining the optimal parameters of the algorithm.

In my view, this paper makes two main contributions:
- Since the majority of the extensional techniques for ontology alignment in the Linked Data cloud have primarily focused on alignments between concepts, the algorithm provides a valuable contribution by focusing on the less studied task of relation matching.
- The use of unsupervised techniques at all stages is very important in the context of the Linked Data cloud, where one cannot expect sufficient amounts of training data.

However, there are some aspects which, in my view, could be clarified better. The paper focuses on equivalence relations between properties. However, it is not clear to what extent the semantic correspondences between relations in real-world schemata conform to the definition of equivalence. E.g., ontological concepts are often used differently in different datasets, and their instance sets only partially overlap despite being semantically very close [39]. To what extent do the same factors apply to relations? At least judging from the common errors produced by EQUATER (Section 5.2), semantic equivalence is often subjective and can often be confused with high-degree semantic similarity (Table 8).

Another aspect involves the authors' decision to use DBpedia as the evaluation dataset: a set of schemata sharing the same set of instances. This choice is justified: taking datasets originating from different sources (even those already having owl:sameAs links between instances) would usually provide only a few examples of overlapping concepts and even fewer pairs of matching relations. However, the choice to use a single dataset only could bias the results in comparison with the "canonical" ontology matching use case involving datasets originating from different sources. In that case, the noise caused by imprecise instance and concept matching, as well as different interpretations of relations in different datasets, would become additional factors possibly influencing the performance of different techniques. I would not ask the authors to perform another set of experiments with datasets from heterogeneous sources (that would probably be the subject of another paper), but a bit more discussion of these aspects and their likely influence would be valuable.
Review #3
By Matteo Palmonari submitted on 15/Dec/2014

The paper presents EQUATER, a system which, so far, is able to match relations across different ontologies used for "annotating" the same data resources. After a review of the state of the art in ontology matching, the paper presents the measures used by EQUATER for assessing the similarity of relations, assessing confidence in the judgement, and automatically determining a threshold for confidence. It also briefly introduces the use of "clustering" for improving alignments. It then evaluates EQUATER by comparing the proposed solutions to various simpler versions of itself.

My opinion is that this paper describes valuable work that should be published. The three methods presented are valuable contributions in their own right, and the limited evaluation shows that they improve on simple methods.

However, the form of the paper is unsatisfying. Besides some elements of form which are listed below, the main problem is that the paper is presented as if its contribution were a general-purpose matching system (to be compared with other such systems), while what I see as its contribution is more restricted.

This problem can also be seen in the way the "limitations of this work" are described in the paper: "Therefore, we consider the second major limitation of EQUATER as it being a partial ontology alignment method addressing a specific but practical issue - aligning relations only." This is strange, because the abstract describes EQUATER as a system "to align equivalent relations across schemata based on their shared instances". So how can it be a limitation of a relation matcher to align only relations? Of course, these are limitations if the claim is that EQUATER is a general-purpose extensional ontology matcher. But how could it be? What is presented in the paper is a method for matching relations within the same linked data set, a far narrower goal, but a worthy one.

It seems that the authors indeed aimed at creating a matcher (EQUATER) but decided to publish parts of it before finishing it. This is fine as long as there is a real contribution, and there is. But this paper is still written as if it were describing a general-purpose matcher. For instance, if the authors want to further extend EQUATER as they indicate in the text, then I advise not putting the name of the system in the title of the paper. Otherwise, later, if the system is successful, people will be very confused when talking about this EQUATER system. Moreover, there is no link in the paper to where this system can be found (for free or for a fee), so what is the point?

This makes the paper irritating to read (see detailed comments). The first part of the related work is dedicated to general-purpose ontology matchers and OAEI. But if this is really related to this work, why does the evaluation not compare with such systems?

The paper could rather adopt the following line of argument:
- Ontology matching is a very important topic --and in particular in LOD--
- Most matchers are not extensional --focus on those which are, and which use LOD--
- Most matchers, as testified by OAEI, are focussing on matching concepts and neglect relations --focus on those which pay attention to relations--
- In this paper we present techniques for matching relations extensionally based on relations used in LOD.
- More precisely, we develop techniques that are used to match relations across multiple ontologies annotating the same datasets. Such techniques may be used, either for helping matching ontologies that are jointly used for annotating the same data set (and can later be used for other datasets), or for contributing matching ontologies when sameAs links between different data sets have been established.

Concretely, this would involve reducing the parts dedicated to reviewing and criticizing general purpose matchers and focussing on the techniques contributed by this work.

Describing the presented techniques as original contributions to a specific problem is perfectly acceptable and corresponds more to what is in the paper... and it would be even more valuable if techniques were available as a library that other matchers could use for improving their performances.

Details:
- p2. "who often fail to conform to a universal schema": reference needed.
- p3. [13, 41] may not be the best references on OAEI. There has been a paper in the Journal on Data Semantics. Moreover, taking OAEI as a reference on what exists introduces a bias, since only systems suited to the tests participate. For instance, EQUATER has never participated, but this does not mean that it does not exist. There have been extensional systems using LOD, such as BLOOMS (which is mentioned later).
- p4. "It is well-known that": reference establishing this?
- p4. Not sure that the given example of foaf:Person illustrates any pathology.
- p4. "Similarity computation is typically quadratic; the more matchers...": it remains quadratic if one adds matchers. And it is not necessarily quadratic because some systems prune the search space.
- p4. PARIS does not necessarily converge, in fact (no proof has been published and the authors know counter-examples). Actually, PARIS is certainly a system that uses very similar measures to EQUATER's: even though it only generates links between entities, to do this it identifies relations that have to correspond, as do most key-based linked data matchers as well (see [1,2] below).
- p6. "the first that studies the problem of automatically deciding threshold": I am not a specialist, but there is a database system called eTuner, and nowadays most ontology matchers determine their thresholds from the data, especially because OAEI requires that the same configuration be used in all tests.
- p6. "largely" used twice, i.e., too many times for such a vague word.
- p7. I think that the "formalization of the problem" could have come before, even before the related work.
- p7. hypothesize _that_ there exists
- p8. I would have found the description of ta as arg_{\cap}/min(arg(r1),arg(r2)) an easier-to-explain formulation.
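Spelled out, the reformulation the reviewer suggests would read as follows (a sketch only, assuming $arg(\cdot)$ denotes the argument set of a relation as used in the paper under review):

```latex
ta(r_1, r_2) = \frac{\left|arg(r_1) \cap arg(r_2)\right|}{\min\big(\left|arg(r_1)\right|, \left|arg(r_2)\right|\big)}
```

That is, the overlap of the two relations' argument sets, normalised by the size of the smaller set.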
- p8. Note that sa is the denominator of \beta and the divisor of \alpha: the multiplication eliminates sub_{\cup} from the equation. The presentation with \alpha and \beta is OK because it provides the intent, but the simplified formula should be given. In addition, the sentence "As a result, relations that have high sa will share many subject..." is likely incorrect: the shared subjects (sub_{\cup}) have disappeared. This effect is in fact classical when trying to mitigate coverage and functionality.
- p9. 3.2.3: the "cognitive point of view" makes a sudden appearance (the single occurrence of the adjective "cognitive" in the whole paper).
- p9. I am surprised by the use of the adjective "exponential", since the growth in question is not exponential but asymptotic, if I understand correctly. It is unclear to me that this adjective can still be used for the given function.
- p10. I assume that, in the definition of ta^{kr}, the second line refers to kr(|arg(r_2)|) instead of kr(|arg(r_1)|); otherwise, this would break the symmetry of the initial ta function, and this would have to be explained.
- p10. I am surprised to see "spelling errors" mentioned: the probability that spelling mistakes make relations similar should be very low unless very little data is used.
- pp10-11. The techniques used there seem interesting but are not sufficiently detailed, in my opinion. I have no idea what is in references [9, 26, 35], although they are supposed to be "well-known".
- p11. In particular, it is not clear what exactly is achieved with clustering. It is unclear what is done when "links may appear too weak". This seems to be an interesting technique, but it is insufficiently described.
- p9, 12. a-priori does not need a dash.
- pp13-14. Section 4.5 starts again with OAEI and spends a whole column justifying why DBpedia is used. The text "Although DBPedia... of the problem" should rather be in the introduction of the paper than here. It is time to present an experiment.
- pp13-14. The use of "the problem" is unclear. It is not clear whether this denotes the same problem each time, and it would be better to qualify the problem ("the problem of ...") to make sure of that.
- p14. It is strange to separate a dev set and a test set when the algorithm does not do learning. But OK, the intent is clear. The whole 4.5-4.6 is difficult to read, but I have no improvement to offer.
- p14. "cP concatenating": unsure what this means. Since the elements of P are simple pairs, not concatenated pairs, it is dubious that $cP\subset P$.
- pp15-16. 5.1. It is difficult to understand the reason for this section. In a results section, space should not be devoted to criticizing the gold standard, but to discussing results. The discussion is interesting but should have occurred earlier, either to show that the task is difficult or to discard this gold standard; it is too late to do this here. The analysis of these mistakes is interesting, but it would better be part of data preparation rather than results (unless I misunderstood something; it is up to you to avoid such a misunderstanding).
- p15. "The data set is overwhelmed by negative examples": if these are mistakes, it would be better to call them that (negative examples could well be counter-examples and/or true negatives, which are not mistakes).
- p16. Table 5: reporting F-measure and recall in the same table is misleading. Reporting recall, when it is available, would help the comparison. There is no point in comparing F-measure with recall.
- p16. "confirm the hypothetical analogy": how can an analogy be confirmed?
- p21. "aligning heterogeneous resources": please be precise. "Currently, EQUATER fits best with aligning relations from different schemata used in a single Linked Dataset": actually, if this paper is about EQUATER, then this is exactly what it does. If EQUATER is a moving target, then it is better not to write papers about it.
- pp22-24: some references have first names, some others don't.
- p22. [13]: this is "matchinG", and there is a new edition from 2013.
- p22. [9] has no publisher; this makes it difficult to find.
- p22. [20] likely needs {F} around the F.

Refs.

- Danai Symeonidou, Vincent Armant, Nathalie Pernelle, Fatiha Saïs: SAKey: Scalable Almost Key Discovery in RDF Data. Semantic Web Conference (1) 2014: 33-49