Review Comment:
In this paper, the authors address the problem of answering natural language questions over a combination of knowledge base (KB) entities and text sources in an explainable way. The proposed method builds on three existing methods: DecompRC, PullNet, and MDR. DecompRC decomposes multi-hop questions into sub-questions that are easier to handle at each step. PullNet is employed to expand graphs containing the KB entities, triples, and entity-linked documents that are relevant to a given question and are potential components of the final answers and answer explanations. MDR retrieves sequences of text passages from the relevant documents as answers. Accordingly, the proposed method comprises four modules: sub-question generation, graph expansion, sequence retrieval, and answer extraction. Experiments were conducted on MetaQA, WebQuestionsSP, ComplexWebQuestions, and HotpotQA. The results show that the proposed method outperforms PullNet and MDR in their respective QA scenarios: KB-based question answering and text-based question answering.
Strengths
1) This paper focuses on the hybrid question answering problem over both KB and text sources, which is an important problem that has not been well-researched.
2) The experimental results demonstrate that the proposed method outperforms the original methods, i.e., PullNet and MDR, in their respective question answering scenarios. The authors also attempt to explain these results in their analyses.
3) The authors present an extensive discussion of their findings and of the theoretical and practical implications they draw from the experiments.
4) The authors provide a link to the source code, with detailed README files in the respective folders. The code appears to be complete for replication, and the GitHub repository is appropriate for long-term discoverability.
Weaknesses
1) The readability of the methodology section is very limited. There is a lack of rigorous and consistent definition of notations used in the introduction. Also, most equations are listed without sufficient explanation. As a result, I could only get a rough notion of what each module is trying to do and could not be sure about the technical details.
2) Given the lack of clarity of the proposed method, it is difficult to assess its novelty and soundness. It is especially difficult to examine how the graph expansion and sequence retrieval modules differ from the existing methods they employ, i.e., PullNet and MDR.
3) In the experiments, the method is only compared with PullNet and MDR. The proposed method does not need to outperform all existing methods, but it should be compared with a few other existing baselines so that its competitiveness can be positioned relative to the current progress of research on this problem.
4) The authors only considered Hits@1 in several experiments. The recall and precision of returned answers need to also be examined, considering the existence of questions with multiple answers.
5) The authors claim that the method can provide explanations. However, this aspect is not evaluated in experiments. It would be better to provide a quality evaluation of generated explanations or, at least, a case study demonstrating the explainability of the method.
6) Several notations and acronyms are used in figures and text before they are formally defined or introduced, which hampers the readability of the paper.
7) It is reported that “the main differences between GraphMDR and PullNet systems are given in Appendix B.” I believe this is very important and should be concisely presented in the main text.
8) There are several places where the font of the text or the appearance of notations is inconsistent.
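To illustrate weakness 4, the following minimal sketch shows how Hits@1 can hide incomplete answer sets on multi-answer questions. It assumes the common definition of Hits@1 (the top-ranked prediction is among the gold answers) and set-based precision and recall; the function names and the example answers are hypothetical and are not taken from the paper or its code.

```python
def hits_at_1(ranked_predictions, gold_answers):
    """Return 1.0 if the top-ranked prediction is a gold answer, else 0.0."""
    if not ranked_predictions:
        return 0.0
    return 1.0 if ranked_predictions[0] in gold_answers else 0.0


def precision_recall_f1(predicted_answers, gold_answers):
    """Set-based precision, recall, and F1 over the returned answers."""
    predicted, gold = set(predicted_answers), set(gold_answers)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Hypothetical question with four gold answers; the system returns two
# answers, of which only the top-ranked one is correct.
gold = {"A", "B", "C", "D"}
ranked = ["A", "X"]

hits = hits_at_1(ranked, gold)
precision, recall, f1 = precision_recall_f1(ranked, gold)
print(hits, precision, recall, round(f1, 3))
```

In this example Hits@1 is perfect while recall is only 0.25, which is why reporting precision, recall, or F1 alongside Hits@1 would give a more complete picture for questions with multiple answers.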
In general, the quality of writing needs to be further improved. It is difficult to assess the novelty and soundness of the method given the current writing. Also, the existing experimental results are not sufficient to demonstrate the explainability of the method and to position the competitiveness of the method among existing works. Therefore, my suggestion would be Major Revision.