Multilingual Question Answering Systems for Knowledge Graphs—A Survey

Tracking #: 3530-4744

Aleksandr Perevalov
Andreas Both
Axel-Cyrille Ngonga Ngomo

Responsible editor: 
Philipp Cimiano

Submission type: 
Survey Article
This paper presents a survey on multilingual Knowledge Graph Question Answering (mKGQA). We employ a systematic review methodology to collect and analyze research results in the field of mKGQA by defining scientific literature sources, selecting relevant publications, extracting objective information (e.g., problem, approach, evaluation values, and metrics used), thoroughly analyzing this information, searching for novel insights, and methodically organizing them. Our insights are derived from 46 publications: 25 papers specifically focused on mKGQA systems, 14 papers concerning benchmarks and datasets, and 7 systematic survey articles. Covering literature from 2011 onward, this work presents a comprehensive overview of the research field, encompassing the most recent findings pertaining to mKGQA and Large Language Models. We categorize the acquired information into a well-defined taxonomy, which classifies the methods employed in the development of mKGQA systems. Moreover, we formally define three pivotal characteristics of these methods, namely resource efficiency, multilinguality, and portability. These formal definitions serve as crucial reference points for selecting an appropriate method for mKGQA in a given use case. Lastly, we delve into the challenges of mKGQA, offer a broad outlook on the investigated research field, and outline important directions for future research. Accompanying this paper, we provide all the collected data, scripts, and documentation in an online appendix.
Minor Revision

Solicited Reviews:
Review #1
By Hugo Gonçalo Oliveira submitted on 25/Oct/2023
Minor Revision
Review Comment:

This paper surveys Multilingual Question Answering Systems for Knowledge Graphs.
The current version was revised according to the comments of the reviewers.
Many comments were addressed. Relevant information that was previously missing was added, including some recent works, but the paper was not restructured.

On the suggested dimensions for the review, the paper:
(1) Is suitable as an introductory text, even if the structure is not the most friendly;
(2) Is comprehensive and has a balanced coverage, even if the structure is not balanced (Sec 4 uses half of the paper);
(3) Is well-written but, again, would highly benefit from a reorganisation;
(4) Covers material that is important for, and not limited to, the Semantic Web community.

Before some comments on the structure, I note that the (now) 12 papers that were considered despite being left out of the selection procedure are still not identified. This makes me wonder whether the adopted procedure is reproducible and actually suitable for the task. If it were, would it leave out so many relevant papers?

But most of my comments are not on the relevance of the survey nor the quality of the covered content. They are on form and organisation, towards better highlighting the contributions according to what I would expect in a survey. In my opinion, the paper was already too long and verbose. Now it is even longer, without necessarily needing to be.

Most of Sec 4.1 is a mere enumeration of systems and their descriptions, almost independent of each other, much like a compilation of paper abstracts.
Table 4 helps, but the authors should take full advantage of its value. I strongly suggest that the section be structured around the columns of this table from the beginning. This would highlight the actual analysis and make the presentation more systematic, thus making it easier to compare systems and identify trends.

The taxonomy in Sec 4.2 would be better supported if the surveyed systems were classified according to it. Again, this would highlight the contribution and make it possible to identify the most common groups (possibly by time period) or overlaps between groups, among others. Figure 1 uses a full page but does not add much.

The following sections suffer from similar problems.

Sec 4.3 seems detached from the previous sections, without links to the surveyed systems.

The problem of Sec 5 is similar to the one of Sec 4.1: datasets are described almost independently of each other.

Sec 6 would be more useful if it linked to the surveyed systems and benchmarks from which the challenges are raised.

Sec 7 is a brief summary of the approach, but it neither draws actual conclusions nor summarises the takeaways (even though some are already in the Discussion).

Minor issues:
- p2: , another example -> . Another example
- p3: therefore utilize the latter one ?
- p13: "results are as follows: 39.29%, 33.02%, 23.74%, and 24.56%" -> what metric?
- p16: used in the NLP -> use in NLP
- Sec 3 has no introductory text.
- The sentences of the last paragraph of Section 4 seem incomplete.
- Secs 5.1 and 5.2: some confusion between benchmarks and series of benchmarks.
- p25: question number?

Review #2
Anonymous submitted on 03/Dec/2023
Review Comment:

The authors have addressed my comments in a satisfactory manner, and therefore I recommend the paper for acceptance as is.