Review Comment:
This paper presents an approach for open knowledge extraction, in which semantic relations are constructed from hyperlinks. The approach uses a frame-based formal representation of unstructured text and has two major functions: assessing whether a relation exists and generating labels for the relations. An online service, Legalo, implements the approach, and the authors evaluate it via crowdsourcing.
Merits of the paper:
1. Assigning proper labels to predicates is a difficult task. Even for specialized ontology engineers, finding a proper label takes time and effort. This work proposes to combine extractive and abstractive approaches to automatically generate labels based on a graph representation of the sentences. Although limitations remain, I consider this approach an important step toward advancing automatic triple extraction from natural language texts.
2. The method descriptions, from frame-based representation to relation assessment to label generation, are presented in a detailed manner. The definitions, axioms, and rules are clear, which helps to enhance the reproducibility of this work.
3. The authors present several useful online demos, which help readers intuitively explore and examine the results of this research.
My major concerns lie with the evaluation section and the presentation:
1. How many different human participants were employed for the evaluation experiment? While the authors mention in several places the minimum number of workers per task, e.g. "Each question was performed by at least three workers", we still do not know how many distinct participants took part in the entire experiment. This makes it difficult to assess the significance of the evaluation results.
2. For the evaluation of hypothesis 1, the authors state that the recall is always 1 because "Legalo always provides either a true or false value and raters can only answer “yes” or “no”. In other words, for this task there can not be neither true negatives or false negatives." It is difficult for me to understand why this holds. To my understanding, a recall of 1.0 means that whenever the human participants answer "yes", Legalo also returns that a relation exists. But why can the case not occur in which the human participants answer "yes" while Legalo returns no relation? In that case, recall would no longer be 1. The authors need to explain this in more detail.
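To make the concern concrete, here is a minimal sketch with hypothetical counts (not taken from the paper) showing that a single false negative, i.e. a pair where raters answer "yes" but the system reports no relation, immediately drops recall below 1:

```python
# Hypothetical rater/system outcomes, NOT the paper's data.
# Each pair is (rater_says_relation_exists, system_says_relation_exists).
outcomes = [
    (True, True),    # true positive
    (True, True),    # true positive
    (False, True),   # false positive (does not affect recall)
    (True, False),   # false negative: raters say "yes", system says "no"
]

tp = sum(1 for human, system in outcomes if human and system)
fn = sum(1 for human, system in outcomes if human and not system)

recall = tp / (tp + fn)
print(recall)  # 2 / (2 + 1) ≈ 0.667, not 1
```

Unless the paper's setup somehow rules out the fourth kind of outcome by construction, the claim that recall is always 1 needs justification.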
3. For the evaluation of tasks 3 and 4, it is unclear how the authors compute precision and recall. If I understand correctly, in both tasks Legalo generates a label, which human participants then judge as "agree", "partly agree", or "disagree". How are precision and recall calculated in this setting? The authors need to provide more explanation.
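One plausible reading, purely an assumption on my part that the authors should confirm or correct, is that "agree" counts as a correct label while "partly agree" is either counted as correct or discounted. The two choices yield noticeably different precision values, which is why the exact mapping matters:

```python
# Hypothetical crowdsourced judgments for generated labels (NOT the paper's data).
judgments = ["agree", "agree", "partly agree", "disagree", "agree"]

def precision(judgments, partial_counts=False):
    """Precision under an assumed mapping of graded judgments to correctness."""
    correct = judgments.count("agree")
    if partial_counts:
        correct += judgments.count("partly agree")
    return correct / len(judgments)

print(precision(judgments))                       # 3/5 = 0.6 ("agree" only)
print(precision(judgments, partial_counts=True))  # 4/5 = 0.8 ("partly agree" counts too)
```

The paper should state which mapping is used, and analogously how recall is defined when the system always emits some label.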
For the presentation of this paper, there are several issues as well:
1. This paper has too many sections, and some could be merged. For example, section 3 could be merged into sections 4 and 5: the semantic sources used for the implementation (such as WiBi and Watson) into section 4, and the evaluation datasets into section 5. Such an organization would help readers track the content and could reduce the length of the paper. For example, at the beginning of section 5 one may find it difficult to remember the two evaluation datasets discussed in section 3, which forces the authors to repeat parts of the dataset description in section 5. Merging the sections would let the authors describe the datasets only once.
2. Too many footnotes are used in this paper (74 in total). While a few footnotes can help explain the content, this many can confuse the reader. I noticed that some footnotes are redundant: footnotes 14, 32, and 44 all point to the source of the experimental dataset, while footnotes 13 and 72 both point to the online demo. Some explanatory footnotes, such as 18, 21, and 22, could also be merged into the main text.
3. The paper also contains a number of repeated sentences. For example, when mentioning "Legalo", the authors repeatedly explain that it is "the current implementation of OKE"; explaining this once should suffice.
4. In the Legalo prototype section, there is a lengthy description of FRED. While it helps readers grasp how FRED works, this level of detail is unnecessary, since FRED is not the major contribution of this work.
There are also some typos in the paper:
1. page 16: "In addition, this components implements two more modules: the “Property matcher” and the “Formaliser”." should be "In addition, this component implements..."
2. Also on page 16: "It depends on Legalo has core component and specialise it with two additional features..." should be "It depends on Legalo as core component and specialises it ..."
3. Hypothesis 2 in section 5.1: "Legalo is able to generate a usable predicate λ for a relevant relation φs between to entities, ..." should be "... between two entities ..." and "λ" should be "λ'".
4. page 22: "while NELL properties result from and artificial concatenation of categories learnt automatically." should be "while NELL properties result from an artificial concatenation of..."
Other small issues:
1. The link in footnote 60, http://wit.istc.cnr.it/stlab-tools/legalo-wikipedia/, does not work.
Comments
Adding a relevant reference
We should refer to an additional relevant related work that defines the term "open knowledge extraction" in the context of AI: Benjamin Van Durme and Lenhart Schubert, 2008 (http://www.cs.jhu.edu/~vandurme/papers/VanDurmeSchubertSTEP08.pdf). At the time of writing this paper we did not refer to this work; it will be properly acknowledged and surveyed in a possible publication version of the article. It defines "open knowledge extraction" as "conversion of arbitrary input sentences into general world knowledge represented in a logical form possibly usable for inference", which is perfectly compatible with the definition given in this paper. The cited work does not focus on Semantic Web technologies and languages, but it provides further support for our claims and definitions.