Review Comment:
The authors designed, implemented and evaluated a domain-dependent (closed-domain), single-language (Vietnamese) semantic question answering system for factoid questions, as well as an approach for creating trees of rules that transform questions into query templates.
(1) Originality
Their claim that this is the first such system for Vietnamese is, as far as I know, valid. Previous work on the system is properly referenced; however, it is not quite clear which parts of the system were established in that previous work and which parts are new.
The core contribution is an approach ("Ripple Down Rules") to assist the creation of query templates instead of hard coding them. It includes a decision-tree-like structure of transformation rules which can be iteratively modified to successively increase its performance.
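To make concrete what I understand by such a rule tree, the following is a minimal sketch of a single-classification ripple-down-rules structure in Python. It is my own illustration under simplifying assumptions, not the authors' implementation: the names `Rule` and `classify`, the string-matching conditions and the template labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    """One node of a (hypothetical) ripple-down-rules tree."""
    condition: Callable[[str], bool]      # does this rule fire for the question?
    conclusion: str                       # template label proposed if it fires
    except_rule: Optional["Rule"] = None  # refinement tried when the rule fires
    else_rule: Optional["Rule"] = None    # alternative tried when it does not


def classify(root: Rule, question: str) -> Optional[str]:
    """Return the conclusion of the last rule that fired on the path."""
    result = None
    node = root
    while node is not None:
        if node.condition(question):
            result = node.conclusion      # keep it unless an exception overrides
            node = node.except_rule
        else:
            node = node.else_rule
    return result


# Illustrative tree: a default rule, two question-word rules, one exception.
root = Rule(lambda q: True, "UNKNOWN")
who = Rule(lambda q: q.lower().startswith("who"), "PERSON")
where = Rule(lambda q: q.lower().startswith("where"), "LOCATION")
company = Rule(lambda q: "company" in q.lower(), "ORGANIZATION")

root.except_rule = who     # refines the default rule
who.else_rule = where      # tried when the "who" rule does not fire
who.except_rule = company  # added later to correct a misclassified case

print(classify(root, "Who wrote this paper?"))
print(classify(root, "Who is the company behind it?"))
print(classify(root, "Where is Hanoi?"))
```

In this sketch, correcting a wrong answer means attaching one new exception node at the point where the wrong conclusion was produced, so existing rules are never edited. This is the property that, as I read the paper, makes the iterative refinement cheap.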
(2) Significance of the Results
The "Ripple Down Rules" are shown to significantly improve the performance of the rules. Together with the drastic reported time savings and the high accuracy scores, this makes the results highly significant. The time spent could be included in the table, however, as it is somewhat unclear from the description exactly what took how long.
The knowledge base size of 78 instances, 15 concepts and 17 relations is too small for a realistic evaluation (also, instances are not part of the ontology as is mentioned), as it hides ambiguity, which is one of the main challenges faced by question answering approaches. The test data does not have to be all of DBpedia, but a few thousand triples would already allow a much more realistic evaluation, especially as the approach is claimed to be applicable to other domains and languages. With a bigger dataset, a discussion of the complexity of the algorithm and the scalability of the system, along with time measurements, would be welcome additions. This is the major weak point in my opinion.
The discussion of some differences between Vietnamese and English is helpful for developing other systems in that language.
(3) Quality of Writing
The writing style is certainly unusual but mostly in a refreshing way, with sharp observations that sometimes border on the comical without feeling out of place in scientific writing. For example, instead of carefully defining web-of-documents search and question answering and then analysing the difference, they go directly to the point: "Most current search engines take an (sic) user's query and returns (sic) a ranked list of related documents that are then scanned by the user to get the desired information. In contrast, the goal of QA systems is to give answers to the users' questions without involving the scanning process."
As the above sentence shows, there are unfortunately also many basic spelling and grammar mistakes.
Additionally, other parts are unnecessarily verbose. For example, they abbreviate "knowledge-based QA system for Vietnamese (KbQAS)" which I feel is a bit unwieldy in contrast to something simple like KS or even KQS. Also, they refer to it as the "KbQAS system", which is redundant, like "HIV virus" or "ATM machine". The abbreviation should also come directly after the term itself (I guess Vietnamese does not go into the abbreviation as the letter V is not appended).
Some terms normally have a slightly different meaning than the one used in the paper. For example, the first stage of the pipeline is called the "front-end" and the second stage the "back-end", although those terms usually denote the presentation layer and the data access layer of an application.
Some sections could be shortened a bit, such as 2.1 open-domain question answering. I do not think it is necessary to state the (undefined) performance percentage score of a system at TREC 2002, which was quite a while ago.
Note: As I have no knowledge of the Vietnamese language, I cannot judge the correctness of the Vietnamese phrases.