Ripple Down Rules for Question Answering

Tracking #: 956-2167

Authors: 
Dat Quoc Nguyen
Dai Quoc Nguyen
Son Bao Pham

Responsible editor: 
Guest Editors Question Answering Linked Data

Submission type: 
Full Paper
Abstract: 
Recent years have witnessed a new trend of building ontology-based question answering systems that use Semantic Web information to provide more precise answers to users' queries. However, these systems are mostly designed for English. In this paper we therefore introduce such a system for Vietnamese, which is, to the best of our knowledge, the first one built for Vietnamese. Unlike most previous work, we propose an approach that systematically builds a knowledge base of grammar rules to process each input question into an intermediate representation element. This element is then matched against a target ontology, using concept-matching techniques, to return an answer. Experimental results show that the performance of the system on a wide range of Vietnamese questions is promising, with accuracies of 84.1% and 82.4% for question analysis and answer retrieval, respectively. Furthermore, our approach to question analysis can easily be applied to new domains and new languages, thus saving time and human effort.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
By Gosse Bouma submitted on 26/Feb/2015
Suggestion:
Major Revision
Review Comment:

This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing.


The paper presents a new methodology for connecting NL parser output to SPARQL queries. The method is illustrated and evaluated using a QA system for Vietnamese.

The most important innovation of the paper lies in the proposed method for developing rules that map grammar/parser output into generic query statements. However, the presentation of this part of the work leaves some questions unanswered. Also, the evaluation should be improved.

The paper is original in that it focuses on the process of developing patterns for matching grammar/parser output and query templates. The significance of the paper in its current form is limited. The paper in general is well written, although there are some minor spelling and grammar issues that can easily be corrected.

Details:

The paper starts with a lengthy overview of previous work in QA. Much of this section can be omitted or referred to only in general terms. On the other hand, this section fails to mention current developments, such as the IBM system Watson or the QA-capabilities of Apple's Siri and related systems. Given the fact that the authors want to apply their QA system to an ontology, more attention should be given to QA over large linked data sets, as illustrated by the QALD evaluation campaigns.

3.1. The query format used in this section should be explained more carefully. The example 'which univ does Pham Duc Dang study in' introduces a term1 'university' which seems to be a concept/category term. How is this handled in the automatic translation to SPARQL?
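To make the question concrete: I would expect the category term to end up as a class (rdf:type) constraint in the generated query, along the lines of the following toy sketch (rdflib over an invented namespace; the class and property names are mine, not the paper's ontology):

from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/univ#")

# Toy graph; the namespace, classes and properties are invented, not the paper's.
g = Graph()
g.add((EX.PhamDucDang, RDF.type, EX.Student))
g.add((EX.PhamDucDang, EX.studiesIn, EX.VNU))
g.add((EX.VNU, RDF.type, EX.University))

query = """
PREFIX ex: <http://example.org/univ#>
SELECT ?u WHERE {
  ex:PhamDucDang ex:studiesIn ?u .
  ?u a ex:University .   # the concept term 'university' as a type constraint
}
"""
for row in g.query(query):
    print(row.u)   # -> http://example.org/univ#VNU

Is this roughly what the automatic translation produces, or is the concept term only used for answer-type checking?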

3.1./3.2. The examples shown in these sections seem rather straightforward compared to questions in the most recent editions of QALD (www.sc.cit-ec.uni-bielefeld.de/qald/). The remarks at the end of section 5, concerning the impossibility of answering comparative questions (who has the highest GPA), also suggest that the scope of questions is rather limited. Please elaborate on this.

3.3.2. 'a manual dictionary is built for describing concepts..in the ontology', and 'a phrase is matched by one of the relation patterns'. How much manual labor is involved in this, and how does it compare to the development of question patterns described later? In particular, do these steps not also make the translation process dependent on the ontology in undesirable ways?
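For concreteness, I picture the manual concept dictionary as something like the following (entries are illustrative, not taken from the paper), which would help readers judge the amount of manual labour involved:

# Hypothetical manual dictionary mapping Vietnamese surface forms to ontology
# classes; the real dictionary and class names may differ.
CONCEPT_DICTIONARY = {
    "trường đại học": "University",
    "sinh viên": "Student",
    "giảng viên": "Lecturer",
}

def lookup_concept(phrase):
    return CONCEPT_DICTIONARY.get(phrase.lower())

print(lookup_concept("sinh viên"))   # -> Student

If this is the right picture, it would help to report its size and construction effort.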

4. rules for question analysis
It is hard to understand the details of this section, as it uses an idiosyncratic notation for queries. Is it possible to show the output of the matching process as (schematic) SPARQL queries?

The most important argument for the method presented in this section seems to be development time. Can you also say something about expressive power? For instance, does the formalism and method allow the implementation of rules for comparative questions, list questions, complex (indirect) questions, etc., as used in the QALD competitions?
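For comparative/superlative questions in particular (such as the "highest GPA" example mentioned in section 5), the target query typically needs ordering and truncation, as in the following toy sketch (rdflib; the property and instance names are invented):

from rdflib import Graph, Namespace, Literal, XSD

EX = Namespace("http://example.org/univ#")

# Toy data with invented names.
g = Graph()
g.add((EX.An, EX.hasGPA, Literal(8.2, datatype=XSD.decimal)))
g.add((EX.Binh, EX.hasGPA, Literal(9.1, datatype=XSD.decimal)))

# "Who has the highest GPA?" rendered with ORDER BY + LIMIT.
query = """
PREFIX ex: <http://example.org/univ#>
SELECT ?s ?gpa WHERE { ?s ex:hasGPA ?gpa . }
ORDER BY DESC(?gpa)
LIMIT 1
"""
for s, gpa in g.query(query):
    print(s, gpa)   # -> http://example.org/univ#Binh 9.1

Can the intermediate representation and the rule formalism express such ordering/aggregation constructs, or are they limited to (chains of) triple patterns?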

5. Evaluation
You evaluate on an in-house ontology, developed manually using Protégé. This makes it very hard to compare your results to other work. Also, it seems the scope of the ontology is very limited, compared to current work on QA over DBpedia. The evaluation would be much more convincing if it also included results for QA over an open-domain LOD set such as DBpedia or a similar large, open resource.

Half of the correct answers in table 9 require interaction with users. Please explain what this amounted to.

Review #2
By Konrad Höffner submitted on 21/Apr/2015
Suggestion:
Major Revision
Review Comment:

The authors invented, implemented and evaluated a domain-dependent (closed-domain), single-language (Vietnamese) semantic Question Answering system for factoid questions, as well as an approach for creating question-to-template transformation rule trees.

(1) Originality
Their claim that this is the first such system for Vietnamese is, as far as I know, valid. There is previous work on the system, which is properly referenced; however, it is not quite clear which parts of the system were established in previous work and which parts are new.

The core contribution is an approach ("Ripple Down Rules") to assist the creation of query templates instead of hard coding them. It includes a decision-tree-like structure of transformation rules which can be iteratively modified to successively increase its performance.
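For readers unfamiliar with Ripple Down Rules, my reading of the underlying exception structure is roughly the following sketch (my own illustration, not the authors' code; the conditions and conclusion labels are invented):

class RDRNode:
    """One node of a Ripple-Down-Rules tree: a condition, a conclusion, an
    'except' child tried when the rule fires, and an 'if-not' sibling tried
    when it does not."""
    def __init__(self, condition, conclusion, except_=None, if_not=None):
        self.condition = condition
        self.conclusion = conclusion
        self.except_ = except_
        self.if_not = if_not

    def evaluate(self, question, default=None):
        if self.condition(question):
            if self.except_:
                refined = self.except_.evaluate(question)
                if refined is not None:
                    return refined
            return self.conclusion
        if self.if_not:
            return self.if_not.evaluate(question, default=default)
        return default

# Toy example: a later, more specific rule corrects an earlier one without editing it.
base = RDRNode(lambda q: q.startswith("ai"), "PERSON-question")
base.except_ = RDRNode(lambda q: "trường" in q, "AFFILIATION-question")
print(base.evaluate("ai học ở trường nào"))   # -> AFFILIATION-question

A short, self-contained illustration of this kind in the paper would make the incremental-modification claim easier to follow.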

(2) Significance of the Results
The "Ripple Down Rules" are shown to significantly improve the performance of the rules which along with the drastic reported time savings and the high accuracy scores leads to a high significance of the results (the times used could be included in the table however, as it is a bit unclear what took how long exactly reading the description).

The knowledge base size of 78 instances, 15 concepts and 17 relations is too small for a realistic evaluation (also, instances are not part of the ontology, as is mentioned), as it hides ambiguity, which is one of the main challenges faced by question answering approaches. The test data does not have to be all of DBpedia, but a few thousand triples would already allow a much more realistic evaluation, especially as the approach is claimed to be applicable to other domains and languages. With a bigger dataset, a discussion of the complexity of the algorithm/scalability of the system and time measurements would be welcome additions. This is the major weak point in my opinion.

The mentioning of some differences between the Vietnamese language and English is helpful for
developing other systems in that language.

(3) Quality of Writing
The writing style is certainly unusual but mostly in a refreshing way, with sharp observations that sometimes border on the comical, without feeling out of place in scientific writing. For example, instead of carefully defining web of document search and Question Answering and then analysing the difference, they go directly to the point: "Most current search engines take an (sic) user's query and returns (sic) a ranked list of related documents that are then scanned by the user to get the desired information. In contrast, the goal of QA systems is to give answers to the users' questions without involving the scanning process."

As the above sentence shows, there are unfortunately also many basic spelling and grammar mistakes.

Additionally, other parts are unnecessarily verbose. For example, they abbreviate "knowledge-based QA system for Vietnamese (KbQAS)" which I feel is a bit unwieldy in contrast to something simple like KS or even KQS. Also, they refer to it as the "KbQAS system", which is redundant, like "HIV virus" or "ATM machine". The abbreviation should also come directly after the term itself (I guess Vietnamese does not go into the abbreviation as the letter V is not appended).

Some terms have a slightly different meaning than the one used in the paper. For example, the first stage of the pipeline is called "front-end" and the second stage the "back-end", although those terms signify the presentation and data access layer of an application.

Some sections could be shortened a bit, such as 2.1 open-domain question answering. I do not think it is necessary to state the (undefined) performance percentage score of a system at TREC 2002, which was quite a while ago.

Note: As I have no knowledge of the Vietnamese language, I cannot judge the correctness of the Vietnamese phrases.

Review #3
By Shizhu He submitted on 17/May/2015
Suggestion:
Minor Revision
Review Comment:

This paper develops an ontology-based question answering system. It consists of two components, a question analysis engine and an answer retrieval component, for answering natural language questions.

The question analysis component includes pre-processing, syntactic analysis and semantic analysis modules for transforming a natural language question into a structured intermediate representation (containing more information than the Query-Triple in AquaLog [25]); it mainly relies on multiple hand-written rules using the JAPE grammars in the GATE framework. This paper is the first to develop such rules for analysing Vietnamese questions.
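To illustrate, an intermediate representation element of this kind might carry fields along the following lines (field names are my guess based on the description and on AquaLog's Query-Triple, not the paper's exact schema):

from dataclasses import dataclass
from typing import Optional

@dataclass
class IntermediateRepresentation:
    # Hypothetical fields; the paper's element may differ.
    question_structure: str           # e.g. the question category/structure label
    term1: Optional[str] = None       # concept or instance phrase, e.g. "university"
    relation: Optional[str] = None    # relation phrase, e.g. "study in"
    term2: Optional[str] = None       # second term, e.g. "Pham Duc Dang"

ir = IntermediateRepresentation(
    question_structure="Which",
    term1="university",
    relation="study in",
    term2="Pham Duc Dang",
)
print(ir)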

The answer retrieval component includes ontology mapping and answer extraction modules, which take the intermediate representation and the ontology as input to generate an answer. However, in this step the paper does not develop Vietnamese-specific services; for example, it uses the off-the-peg relation similarity services of the AquaLog system [25].
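A sketch of the kind of string-similarity fallback such a relation mapping service might rely on (assumed for illustration, not the paper's actual code; the relation names are invented):

from difflib import SequenceMatcher

ONTOLOGY_RELATIONS = ["studiesIn", "hasTutor", "hasHometown"]   # illustrative names

def best_relation(phrase, threshold=0.5):
    # Score the relation phrase against every ontology relation name and keep
    # the best match; below the threshold, fall back to asking the user.
    scored = [(SequenceMatcher(None, phrase.lower(), r.lower()).ratio(), r)
              for r in ONTOLOGY_RELATIONS]
    score, rel = max(scored)
    return rel if score >= threshold else None

print(best_relation("studies in"))   # -> studiesIn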

Even though this is the first QA system for Vietnamese, the question analysis engine and the answer retrieval used in this system are not very different from previous work.

The interesting and innovative part of this work is the knowledge acquisition approach for systematically acquiring rules that convert a natural language question into an intermediate representation. It may be very useful for languages such as Vietnamese, which lack the resources necessary for constructing machine-learning-based analysis systems, such as question classification models and ontology mapping models. Most of the effort in this work lies in managing the interaction between rules and keeping them consistent.

In summary, the idea of this paper is novel and interesting, and this paper is also clearly written.

Here are some questions:
1. I wonder about the necessity of developing a Vietnamese question analysis model for QA. Are there any prominent characteristics that distinguish Vietnamese from other languages such as English and Chinese?

2. Most question answering systems mainly address the ambiguities in the question analysis step and the answer retrieval step. Are there any ambiguities in answering questions in this paper? What methods do you use to address this problem?

3. This paper describes a rule-based system; for example, the concepts and entities are determined using a manual dictionary. I wonder whether it is hard to scale to other domains and languages. Have you considered more extensible methods?

4. The related work lacks some statistical semantic parsing methods, such as the works of Raymond J. Mooney and Percy Liang.

5. How do you distinguish questions that contain multiple answers from those with just one answer? For example, the question “Which university does Pham Duc Dang study in and who tutors him?” contains two types of answers (university and person), while the question “List all students studying in K50 computer science course, who have hometown in Hanoi?” contains one type of answer even though it covers two clauses.