An Assertion and Alignment Correction Framework for Large Scale Knowledge Bases

Tracking #: 2829-4043

Authors: 
Jiaoyan Chen
Ernesto Jimenez-Ruiz
Ian Horrocks
Xi Chen
Erik Bryhn Myklebust

Responsible editor: 
Guest Editors KG Validation and Quality

Submission type: 
Full Paper
Abstract: 
Various knowledge bases (KBs) have been constructed via information extraction from encyclopedias, text and tables, as well as alignment of multiple sources. Their usefulness and usability are often limited by quality issues. One common issue is the presence of erroneous assertions and alignments, often caused by lexical or semantic confusion. We study the problem of correcting such assertions and alignments, and present a general correction framework which combines lexical matching, context-aware sub-KB extraction, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated with one set of literal assertions from DBpedia, one set of entity assertions from an enterprise medical KB, and one set of mapping assertions from a music KB constructed by integrating Wikidata, Discogs and MusicBrainz. It has achieved promising results, with a correction rate (i.e., the ratio of the target assertions/alignments that are corrected with right substitutes) of 70.1%, 60.9% and 71.8%, respectively.
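
The correction rate referred to above is a simple ratio; a minimal sketch of how it could be computed, with illustrative counts that are not taken from the paper:

def correction_rate(corrected_with_right_substitute, total_targets):
    # Ratio of target assertions/alignments corrected with the right substitute,
    # following the definition given in the abstract.
    return corrected_with_right_substitute / total_targets

# Illustrative only: 701 of 1000 targets corrected correctly -> 0.701 (i.e., 70.1%).
print(correction_rate(701, 1000))
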
Tags: 
Reviewed

Decision/Status: 
Accept

Solicited Reviews:
Review #1
By Petr Křemen submitted on 08/Jul/2021
Suggestion:
Accept
Review Comment:

Thanks to the authors for their revision and answers. I have only one observation:
- I can still see "input KB" used on page 4

Review #2
By Heiko Paulheim submitted on 13/Jul/2021
Suggestion:
Minor Revision
Review Comment:

The authors have done a great job in addressing most of my comments; in particular, the evaluation metrics are now much clearer to me. To help other readers who might get confused like I was, it might be an option to depict a 3x2 confusion matrix ((GT exists, GT does not exist) x (correct replacement, wrong replacement, no replacement)) and illustrate the measures using that matrix.
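
For concreteness, a rough sketch of the 3x2 matrix suggested above; the row/column labels follow the reviewer's wording, the counts are placeholders, and how each metric maps onto the cells would need to follow the paper's own definitions:

# Placeholder counts only; rows (GT exists / GT does not exist) x
# columns (correct replacement / wrong replacement / no replacement).
confusion = {
    ("GT exists",         "correct replacement"): 70,
    ("GT exists",         "wrong replacement"):   10,
    ("GT exists",         "no replacement"):      10,
    ("GT does not exist", "correct replacement"):  0,  # presumably empty by construction
    ("GT does not exist", "wrong replacement"):     5,
    ("GT does not exist", "no replacement"):        5,
}

total_targets = sum(confusion.values())
# E.g., the correction rate (as defined in the abstract) would be read off the
# top-left cell; the other measures would be derived from the remaining cells.
print(confusion[("GT exists", "correct replacement")] / total_targets)
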

There are still a few open (and a few new ;-)) questions.

Old questions:
* While I see that some of my questions for clarification are now addressed (in particular, why neighborhoods were not used as candidates, and how salient semantic confusion is), it would be nice to provide some real examples. The numbers arguing for the neighborhood sizes from the cover letter should be included in the paper, since they provide a hard justification for the approach chosen.
* I see my comment on the neighborhood was eaten by the journal website's HTML encoder ;-) Let me try again: Section 4.3.1: Algorithm 1 seems to extract neighborhoods containing only statements with the same predicate as the target assertion. For example, if my target assertion was "Germany capital Berlin", the neighborhood graph would not contain, e.g., "Germany seatOfGovernment Berlin" or "Germany hasPOI Berlin_Wall". Is that really intended?
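
To make the concern concrete, a small sketch contrasting a predicate-restricted extraction (the reviewer's reading of Algorithm 1) with an unrestricted subject neighborhood; the triples are the reviewer's example and the code is illustrative, not taken from the paper:

triples = [
    ("Germany", "capital", "Berlin"),
    ("Germany", "seatOfGovernment", "Berlin"),
    ("Germany", "hasPOI", "Berlin_Wall"),
    ("France", "capital", "Paris"),
]
target = ("Germany", "capital", "Berlin")

# Predicate-restricted neighborhood (the reviewer's reading of Algorithm 1):
# keeps ("France", "capital", "Paris") but drops "seatOfGovernment" and "hasPOI".
same_predicate = [t for t in triples if t[1] == target[1]]

# Unrestricted neighborhood around the subject, which would retain those triples:
subject_neighborhood = [t for t in triples if t[0] == target[0]]

print(same_predicate)
print(subject_neighborhood)
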

New questions:
* in Fig. 4: why does the correction rate decline with higher thresholds for DBpedia, but increase for the other two datasets? There seems to be some particularity of DBpedia that the other two datasets do not share. It would be interesting to dig a bit deeper here.
* Since there are quite efficient implementations of RDF2vec out there (like jRDF2vec, which can be considered the fastest one, or even faster approximations like RDF2vec Light) that scale well to larger graphs, it would not have been too big a deal to train RDF2vec on the other two datasets as well. In the final version, I would find it neat to see results on those datasets as well.
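
For reference, the basic RDF2vec recipe (random walks over the graph fed to word2vec) is straightforward to reproduce; a minimal sketch using rdflib and gensim, with an assumed input file name and illustrative hyperparameters (this is not the jRDF2vec implementation mentioned above):

import random
from rdflib import Graph
from gensim.models import Word2Vec

g = Graph()
g.parse("kb.ttl")  # assumed input file; any RDF serialization rdflib can parse

# Adjacency list: subject IRI -> list of (predicate, object) string pairs.
adj = {}
for s, p, o in g:
    adj.setdefault(str(s), []).append((str(p), str(o)))

def random_walk(start, depth=4):
    # One random walk of up to `depth` hops, alternating entities and predicates.
    walk, node = [start], start
    for _ in range(depth):
        if node not in adj:
            break
        p, o = random.choice(adj[node])
        walk += [p, o]
        node = o
    return walk

# A handful of walks per entity, then skip-gram word2vec over the token sequences.
walks = [random_walk(e) for e in adj for _ in range(10)]
model = Word2Vec(walks, vector_size=100, window=5, sg=1, min_count=1)
# model.wv[<entity IRI>] then gives the embedding used as features for that entity.
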

Overall, I am very happy with the revision. The remaining few questions could be solved in a minor revision.

Review #3
By José María Álvarez Rodríguez submitted on 03/Aug/2021
Suggestion:
Accept
Review Comment:

The revised version of the paper has properly addressed the previous comments, making the content more understandable and justifying some conceptual and design decisions. Apart from the theoretical approach validated through the experiments, the improvement in the discussion section is especially relevant.

There are only a few minor things that could be addressed when the paper moves to the publishing process:

- Check that the numbering of references is correct according to the journal/editorial rules. Currently, the numbering order (which reference comes first, etc.) is not clear.
- When referring to SHACL and ShEx, please include a reference or footnote to the W3C recommendations.
- Although, given the context, it is not possible to compare results with previous approaches (a benchmark for this task is not available, and the metrics are not the same), it would be nice to see a quantitative comparison (e.g. correction rate) of previous approaches with the presented one.