Abstract:
The Web has evolved into a vast mine of knowledge encoded in different forms, the predominant one still being the free-text document.
This motivates the need for intelligent Web-reading agents: hypothetically, they would skim through corpora of disparate Web sources and generate meaningful structured assertions to fuel knowledge bases (KBs).
Ultimately, comprehensive KBs, such as Wikidata and DBpedia, play a fundamental role in coping with information overload.
In pursuit of this vision, this paper presents the Fact Extractor, a complete natural language processing (NLP) pipeline that reads an input textual corpus and produces machine-readable statements.
Each statement is supplied with a confidence score and undergoes a disambiguation step via entity linking, thus allowing the assignment of KB-compliant URIs.
The system makes four research contributions: (1) it performs n-ary relation extraction by applying the frame semantics linguistic theory, as opposed to binary techniques; (2) it simultaneously populates both the T-Box and the A-Box of the target KB; (3) it relies on a single NLP layer, namely part-of-speech tagging; and (4) it enables a fully supervised yet cost-effective machine learning environment through a crowdsourcing strategy.
We assess our approach by setting the target KB to DBpedia and by considering a use case of 52,000 Italian Wikipedia articles about soccer players.
From these, we extract a dataset of more than 213,000 triples with an estimated 81.27% F1 score.
We corroborate the evaluation via (i) a performance comparison with a baseline system, as well as (ii) an analysis of the T-Box and A-Box augmentation capabilities.
The outcomes are incorporated into the Italian DBpedia chapter, can be queried through its SPARQL endpoint, or downloaded as standalone data dumps.
The codebase is released as free software and is publicly available in the DBpedia association repository.