Querying Biomedical Linked Data with Natural Language Questions

Tracking #: 1356-2568

Thierry Hamon
Natalia Grabar
Fleur Mougin

Responsible editor: Guest Editors Question Answering Linked Data

Submission type: Full Paper
Recent, intensive research in the biomedical field has made it possible to accumulate and disseminate biomedical knowledge through various knowledge bases that are increasingly available on the Web. Exploiting this knowledge requires creating links between these bases and using them jointly. Linked Data, the SPARQL language, and natural language question-answering interfaces provide interesting solutions for querying such knowledge bases. However, while using biomedical Linked Data is crucial, life-science researchers may have difficulty using the SPARQL language. Interfaces based on natural language question answering are recognized as suitable for querying knowledge bases. In this paper, we propose a method for translating natural language questions into SPARQL queries. We use Natural Language Processing tools, semantic resources, and RDF triple descriptions. We designed a four-step method that linguistically and semantically annotates questions, abstracts these questions, builds a representation of the SPARQL queries, and finally generates the queries. The method was designed on 50 questions over three biomedical knowledge bases used in Task 2 of the QALD-4 challenge and evaluated on 27 new questions. It achieves good performance, with an F-measure of 0.78 on the test set. The method for translating questions into SPARQL queries is implemented as a Perl module and is available at http://search.cpan.org/~thhamon/RDF-NLP-SPARQLQuery/.
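
The four steps named in the abstract (annotation, abstraction, query representation, query generation) can be illustrated with a minimal Python sketch. This is not the authors' Perl implementation: the lexicon, the example.org URIs, and every function name below are hypothetical, and the annotation step is reduced to a simple phrase lookup purely for illustration.

```python
import re

# Hypothetical toy lexicon (an assumption, not taken from the paper):
# surface phrase -> (semantic type, URI).
LEXICON = {
    "side effects": ("predicate", "http://example.org/vocab/sideEffect"),
    "aspirin": ("entity", "http://example.org/drug/aspirin"),
}


def annotate(question):
    """Step 1: tag known phrases in the question with a semantic type and URI."""
    text = question.lower()
    found = []
    for phrase, (sem_type, uri) in LEXICON.items():
        match = re.search(r"\b" + re.escape(phrase) + r"\b", text)
        if match:
            found.append({"type": sem_type, "uri": uri, "start": match.start()})
    # Keep annotations in the order they appear in the question.
    return sorted(found, key=lambda a: a["start"])


def abstract_question(annotations):
    """Step 2: keep only the semantic information needed to build the query."""
    return [(a["type"], a["uri"]) for a in annotations]


def build_representation(abstraction):
    """Step 3: pair each predicate with each entity as a triple pattern."""
    predicates = [uri for sem_type, uri in abstraction if sem_type == "predicate"]
    entities = [uri for sem_type, uri in abstraction if sem_type == "entity"]
    return [(e, p, "?answer") for p in predicates for e in entities]


def generate_sparql(triples):
    """Step 4: serialize the triple patterns as a SPARQL SELECT query."""
    def term(value):
        # Variables are written as-is; everything else is treated as a URI.
        return value if value.startswith("?") else "<" + value + ">"

    body = "\n".join(
        "  {} {} {} .".format(term(s), term(p), term(o)) for s, p, o in triples
    )
    return "SELECT DISTINCT ?answer WHERE {\n" + body + "\n}"


def question_to_sparql(question):
    """Chain the four steps from question text to query string."""
    return generate_sparql(
        build_representation(abstract_question(annotate(question)))
    )


if __name__ == "__main__":
    print(question_to_sparql("What are the side effects of aspirin?"))
```

Keeping each step as a separate function mirrors the staged description in the abstract; a realistic system would replace the phrase lookup with linguistic and semantic annotation tools and derive predicates from the RDF triple descriptions of the target knowledge bases.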

Solicited Reviews:
Review #1
By Anca Marginean submitted on 11/Apr/2016
Review Comment:

I have no further remarks.