Boosting Document Retrieval with Knowledge Extraction and Linked Data

Submitted by Marco Rospocher on 08/22/2018 - 06:13

Tracking #: 1996-3209

Authors:

Marco Rospocher

Francesco Corcoglioniti

Mauro Dragoni

Responsible editor:

Andreas Hotho

Submission type:

Full Paper

Abstract:

Given a document collection, Document Retrieval is the task of returning the most relevant documents for a specified user query. In this paper, we assess a document retrieval approach exploiting Linked Open Data and Knowledge Extraction techniques. Based on Natural Language Processing methods (e.g., Entity Linking, Frame Detection), knowledge extraction allows disambiguating the semantic content of queries and documents and linking it to established Linked Open Data resources (e.g., DBpedia, YAGO), from which additional semantic terms (entities, types, frames, temporal information) are imported to realize a semantic-based expansion of queries and documents. The approach, implemented in the KE4IR system, has been evaluated on different state-of-the-art datasets, on a total of 555 queries and with document collections ranging from a few hundred to more than a million documents. The results show that expanding queries and documents with the extracted semantic content consistently outperforms retrieval that exploits only textual information; on a specific dataset for semantic search, KE4IR outperforms a reference ontology-based search system. The experiments also validate the feasibility of applying knowledge extraction techniques for document retrieval (i.e., processing the document collection, building the expanded index, and searching over it) on large collections (e.g., TREC WT10g).
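
To make the idea of semantic-based expansion concrete, the following is a minimal toy sketch and not the KE4IR implementation. It assumes a hand-written lookup table (`TOY_KNOWLEDGE`, entirely made up) in place of real knowledge extraction and Linked Open Data access, and it uses plain cosine similarity over term counts as a stand-in for whatever ranking function the paper actually uses. The function names `expand` and `rank` are likewise hypothetical.

```python
import math
from collections import Counter

# Hypothetical stand-in for knowledge extraction plus a Linked Open Data lookup:
# maps a surface word to semantic terms (an entity and/or a type).
# All entries are invented for illustration only.
TOY_KNOWLEDGE = {
    "rome": ["entity:Rome", "type:City"],
    "paris": ["entity:Paris", "type:City"],
    "capital": ["type:City"],
}


def tokenize(text):
    """Lowercase whitespace tokenizer that strips basic punctuation."""
    tokens = (w.strip(".,;:!?").lower() for w in text.split())
    return [t for t in tokens if t]


def expand(text):
    """Return term counts for the text, augmented with semantic terms."""
    tokens = tokenize(text)
    terms = Counter(tokens)
    for tok in tokens:
        for semantic_term in TOY_KNOWLEDGE.get(tok, []):
            terms[semantic_term] += 1
    return terms


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs, expand_terms=True):
    """Return document indices ordered by decreasing similarity to the query.

    Ties are broken by the original document order.
    """
    vectorize = expand if expand_terms else (lambda t: Counter(tokenize(t)))
    q = vectorize(query)
    scored = [(cosine(q, vectorize(d)), i) for i, d in enumerate(docs)]
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


docs = ["Bread recipes for beginners", "Paris hosts many museums"]
text_only = rank("capital", docs, expand_terms=False)
with_semantics = rank("capital", docs)
```

In this toy setting the query "capital" shares no word with either document, so text-only ranking cannot separate them, while the expanded representation matches the second document through the shared `type:City` term. This only illustrates the general mechanism described in the abstract under the stated assumptions.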

Full PDF Version:

swj1996.pdf

Previous Version:

Boosting Document Retrieval with Knowledge Extraction and Linked Data

Tags:

Reviewed

Decision/Status:

