Review Comment:
Relevance:
In this paper, the authors present a method for translating natural-language questions into SPARQL queries that can be used against biomedical knowledge bases containing linked data in order to retrieve answers. The topic of the paper is relevant to the Semantic Web Journal in general and to the special issue on Question Answering over Linked Data in particular.
Significance:
Question answering over linked data, in particular, over biomedical knowledge bases, is a topic of theoretical and practical significance. The methodology presented in the paper represents a contribution in this regard.
Organization:
Overall, the paper is relatively well organized, with appropriate Introduction, Related Work, and Conclusion sections in addition to the three main substantive sections, in which the authors describe and discuss the question translation methodology, the semantic resources used in the translation process, and the evaluation experiments and results. The paper also includes an adequate, albeit not extensive, number of relevant references. However, the related work discussed and the references included focus mainly on question translation and question answering over linked data. Since the authors apply their methodology to biomedical question answering, at least some discussion of domain-relevant issues, with corresponding references, would be appropriate.
Presentation:
The paper contains appropriate figures to help describe and illustrate the proposed methodology. In particular, the use of detailed figures that describe the proposed natural-language-to-SPARQL conversion process at each progressive step aligns well with, and facilitates the understanding of, the textual description of the process. However, presentation could be further improved by including an end-to-end illustration of an example case showing the intermediate results of each processing step. Also, it would be appropriate to attach an appendix listing natural-language questions in the test set and the (correct/incorrect) results of their translation into SPARQL queries using the methodology as well as the answers retrieved by using the (correctly translated) queries.
Language & Writing Style:
Although the language could be slightly improved (by consistently using grammatically correct and idiomatically appropriate expressions, e.g., "translation into" rather than "translation in"), the paper is overall readable and comprehensible.
Technical Content:
Overall, the proposed methodology seems sound, and the authors describe it relatively clearly and in detail. However, some steps of the translation process seem dubious or ambiguous, or at least are not presented with sufficient clarity.
For example, on p. 6, in the right column, in describing the identification of the predicate and arguments for a given example question ("What [are] the side effects of drugs used for Tuberculosis?" in Fig. 3), the authors write: "In the example from Figure 3, the predicate state with the expected arguments drugbank/drugs and Gas/String is recognized." However, it is not clear how the predicate "state" is identified, nor is it obvious that "state" is the most appropriate predicate given the semantic content of the question. Even more puzzling is the "Gas/String" part, which at best would describe the expected answer and its data type but seems out of place in this context.
The aforementioned example also illustrates what seems to be generally lacking in the paper, namely, some effort toward the semantic analysis and classification of biomedical questions (and their corresponding answers). For example, the question above could be abstracted as a question of the cause-effect category, with the canonical form -causes-, which in this case asks for the side effects caused by the drugs used for Tuberculosis.
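To make this suggestion concrete, the following is a minimal sketch of what such a classification step might look like. It is the reviewer's own illustration and is not taken from the paper: the category names, the regular-expression patterns, and the function classify_question are all hypothetical, and a real system would presumably rely on richer linguistic analysis than surface patterns.

```python
import re

# Hypothetical question categories, each paired with an illustrative
# surface pattern. These are assumptions for the sake of the example,
# not categories or rules defined in the paper under review.
CATEGORY_PATTERNS = [
    ("cause-effect", re.compile(r"\bside effects? of\b|\bcaused? by\b|\bcauses?\b", re.IGNORECASE)),
    ("treatment", re.compile(r"\b(used|indicated) for\b|\btreat(s|ment)?\b", re.IGNORECASE)),
]


def classify_question(question):
    """Return every category whose pattern matches the question, in list order."""
    return [name for name, pattern in CATEGORY_PATTERNS if pattern.search(question)]


if __name__ == "__main__":
    print(classify_question("What are the side effects of drugs used for Tuberculosis?"))
```

Under these assumed patterns, the example question from Fig. 3 falls into both the cause-effect and the treatment categories, which reflects its nested structure: the drugs are first selected by what they treat, and the side effects they cause are then requested.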
Minor Correction:
The beginning of the caption for Fig. 1 reads: "The global architecture of the system (the processing steps are in yellow, the resources in blue)." However, the figure is rendered in plain black-and-white and is not colored as described.
Suggested Improvements:
The authors are encouraged to revise the paper, both in content and form, taking the reviewer’s comments into consideration.