Language-agnostic Relation Extraction from Wikipedia Abstracts for Knowledge Graph Extension

Tracking #: 1510-2722

Authors: 
Nicolas Heist
Heiko Paulheim

Responsible editor: 
Guest Editors ML4KBG 2016

Submission type: 
Full Paper
Abstract: 
Large-scale knowledge graphs, such as DBpedia, Wikidata, or YAGO, can be enhanced by relation extraction from text, using the data in the knowledge graph as training data, i.e., using distant supervision. While most existing approaches use language-specific methods (usually for English), we present a language-agnostic approach that exploits background knowledge from the graph instead of language-specific techniques and builds machine learning models only from language-independent features. We demonstrate the extraction of relations from Wikipedia abstracts, using the twelve largest language editions of Wikipedia. From those, we can extract 1.6M new relations in DBpedia at a level of precision of 95%, using a RandomForest classifier trained only on language-independent features. We furthermore investigate the similarity of models for different languages and show an exemplary geographical breakdown of the information extracted.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
By Claudio Delli Bovi submitted on 17/Feb/2017
Suggestion:
Major Revision
Review Comment:

This paper presents a language-independent approach to relation extraction, specifically targeted to Wikipedia abstracts. Based on DBpedia as backbone, the extraction system is rather general and relies on a classifier with standard local features to identify relations between linked entity mentions. Extraction is carried out directly across the text (with no preprocessing stage, e.g., to extract parts of speech or syntactic dependencies). Crucially, inter-language links enable the system to be extended to any language supported by Wikipedia. The authors experiment with different classifiers, and report a series of experimental evaluation studies on the number of high-precision extractions, along with cross-language comparisons and topical/geographical analyses.

Overall, the paper is for the most part clearly written, and accompanied by a solid experimental evaluation. However, the proposed model is a somewhat pedestrian application of the classical classifier-based paradigm with manually engineered features, with no major novelty or improvement over familiar techniques. In many cases, some of which are even explicitly pointed out across the paper, the authors do not seem to tackle issues that would have really pushed the boundaries of the state of the art. For example, their system is crucially dependent on the availability of hyperlinks (on one side) and inter-language links (on the other). The latter problem is explicitly discussed in Section 4.2, and identified as the main obstacle when extracting relation instances in languages other than English: how would the authors deal with such a loss of information? The sparseness of Wikipedia hyperlinks is also a known issue in the field, which has fueled a number of research threads (Noraset et al., 2014; West et al., 2015; Raganato et al., 2016): it would have been interesting to investigate how to recover all this potentially useful information, instead of simply applying a conservative policy.

Another major point is that the authors mention a series of relation extraction approaches that "could be transferred to multi-lingual settings" (Section 2): why should the proposed approach be preferable over these contributions?
Furthermore, as pointed out in the second-to-last paragraph of Section 2, the recent upsurge of deep learning has led to the development of models in which explicit feature engineering has been replaced by implicit feature construction: it is not clear to me how a model with engineered features, such as the one proposed in the paper, would represent a valid alternative to end-to-end relation extraction models (Nguyen and Grishman, 2015; Lin et al., 2016; Miwa and Bansal, 2016) on "specific texts". Language-agnostic extraction is not a complete novelty either: multilingual relation extraction approaches do exist, either based on universal schemas (Verga et al., 2016) or cross-lingual projection (Faruqui and Kumar, 2015).

Finally, a great deal of relevant literature on Relation Extraction and Knowledge Base Completion is missing: apart from the contributions already mentioned, embedding methods for KB completion have been very popular recently (Bordes et al., 2013; Socher et al., 2013; Chang et al., 2014; Wang et al., 2014; Lin et al., 2015, among others), as well as graph-based methods (Gardner et al., 2014; Gardner and Mitchell, 2015) and even hybrid methods (Neelakantan et al., 2015). Exploiting potentially noise-free settings for extracting relations is a key intuition also in the approach proposed by Delli Bovi et al. (2015), where definitions are used instead of abstracts. Also, a large-scale knowledge graph with an explicit focus on multilinguality, not mentioned in the paper, is BabelNet (babelnet.org) (Navigli and Ponzetto, 2012). BabelNet was indeed used to develop a language-agnostic approach to named entity disambiguation, Babelfy (babelfy.org) (Moro et al., 2014): both are extremely relevant to the topic treated in the paper and its focus on multilinguality.

Specific issues:
- Section 3: A brief, explicit definition of the classification problem would be beneficial for the sake of clarity. What is the classification objective? What about the training instances? Also, the use of the term "model" is a bit unusual (at least in the context of Machine Learning and Natural Language Processing): the authors seem to consider a "classification model" as an individual symbolic rule (perhaps learnt by RIPPER?);
- Section 4.1: Some details about the manual validation setting would be desirable, especially considering how difficult such a task is for non-expert annotators. How many annotators have been used? What did they actually evaluate? In case of multiple annotators, what agreement did they achieve?
- Section 4.3: The notation used to describe the statement (third paragraph) is left mostly implicit or unexplained. It would be preferable to state explicitly what 's', 'p', 'o' and 'a' represent.

References:
- T. Noraset, C. Bhagavatula, and D. Downey. Adding high-precision links to Wikipedia. EMNLP, 2014.
- R. West, A. Paranjape, and J. Leskovec. Mining missing hyperlinks from human navigation traces: A case study of Wikipedia. WWW, 2015.
- A. Raganato, C. Delli Bovi and R. Navigli. Automatic construction and evaluation of a large semantically enriched Wikipedia. IJCAI, 2016.
- T. H. Nguyen and R. Grishman. Relation Extraction: Perspective from convolutional neural networks. NAACL-HLT, 2015.
- Y. Lin, S. Shen, Z. Liu, H. Luan and M. Sun. Neural relation extraction with selective attention over instances. ACL, 2016.
- M. Miwa and M. Bansal. End-to-end relation extraction using LSTMs on sequences and tree structures. ACL, 2016.
- P. Verga, D. Belanger, E. Strubell, B. Roth and A. McCallum. Multilingual relation extraction using compositional universal schema. NAACL-HLT, 2016.
- M. Faruqui and S. Kumar. Multilingual open relation extraction using cross-lingual projection. NAACL-HLT, 2015.
- A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston and O. Yakhnenko. Translating embeddings for modeling multi-relational data. NIPS, 2013.
- R. Socher, D. Chen, C. D. Manning and A. Ng. Reasoning with neural tensor networks for knowledge base completion. NIPS, 2013.
- K. Chang, W. Yih, B. Yang and C. Meek. Typed tensor decomposition of knowledge bases for relation extraction. EMNLP, 2014.
- Z. Wang, J. Zhang, J. Feng and Z. Chen. Knowledge graph embedding by translating on hyperplanes. AAAI, 2014.
- Y. Lin, Z. Liu, M. Sun, Y. Liu and X. Zhu. Learning entity and relation embeddings for knowledge graph completion. AAAI, 2015.
- M. Gardner, P. Talukdar, J. Krishnamurthy and T. Mitchell. Incorporating vector space similarity in random walk inference over knowledge bases. EMNLP, 2014.
- M. Gardner and T. Mitchell. Efficient and expressive knowledge base completion using subgraph feature extraction. EMNLP, 2015.
- A. Neelakantan, B. Roth and A. McCallum. Compositional vector space models for knowledge base completion. ACL, 2015.
- C. Delli Bovi, L. Telesca and R. Navigli. Large-scale information extraction from textual definition through deep syntactic and semantic analysis. TACL, 3, 2015.
- R. Navigli and S. Ponzetto. BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network. AIJ, 2012.
- A. Moro, A. Raganato and R. Navigli. Entity linking meets word sense disambiguation: A unified approach. TACL, 2, 2014.

Review #2
By Mikołaj Morzy submitted on 22/Apr/2017
Suggestion:
Minor Revision
Review Comment:

The paper submitted for review presents a method of extracting new knowledge graph relations from Wikipedia abstracts without any NLP processing, which makes the method language-agnostic. In addition, the method allows knowledge stored in different languages to be combined, which may result in an interesting framework for cross-validation of extracted knowledge.

The paper is well-written, the quality of presentation is good, the idea is presented clearly and the authors' train of thought is easy to follow. I also think that the concept, although not very sophisticated, works well in practice and can serve as a guideline for practical knowledge engineering. Having said that, I think that there are some issues that need to be resolved prior to the publication of the paper in SWJ.

Firstly, in the "Related Work" section the authors mention a few similar projects, but they do not provide any details as to how their solution differs from previous proposals. I think they should provide a basic comparison of their proposal with previous methods and outline the main differences. Secondly, the method presented in the paper uses a very simple set of features on which various machine learning algorithms are evaluated. It is not surprising that Random Forest tends to outperform other learning schemes, as RF is known to generalize well. It is my fear, however, that the experimental results are driven mostly by the sheer fact that the other learning schemes are not given enough features to work with. The authors mention deep learning approaches in their paper. It seems that this type of task is particularly suited for neural networks capable of using contextual information (e.g., long short-term memory networks), in particular since it has recently been shown that these solutions generalize well across different languages (vide Google's one-shot translation). On a related note, I find the lack of discussion of neural word embeddings disturbing, since these embeddings tend to preserve semantic relationships between words and could prove very beneficial in the analyzed task. Lastly, I think that it would improve the quality of the paper if some major patterns were presented (the authors present only a single rule for a negative case, and the rule is not very interesting).

On a positive note, I find the various experiments presented in the paper to be interesting and insightful. The correlation between geolocations and languages is an interesting point, as is the cross-lingual relation extraction.

To summarize, I think that the research presented in the paper is valid and merits a journal publication, provided that the paper is revised. In particular, I advise making the following amendments to the paper:
- discuss the relationship of your method to other methods presented in the literature so far,
- discuss the applicability of other learning schemes (in particular, deep learning),
- discuss the possibility of a much more sophisticated feature engineering pipeline,
- present and interpret some rules extracted from the text and used to extract novel relationships.

Review #3
By Dunja Mladenic submitted on 28/Apr/2017
Suggestion:
Major Revision
Review Comment:

The paper addresses the relevant problem of relation extraction, proposing a language-agnostic approach that seems to give good results that are practically applicable for the selected subset of relations. However, the paper would benefit from more in-depth research analysis and arguments, e.g., a comparison of the influence that different features have on the performance, a discussion of other possible features, and an experimental comparison to some related approaches.

(1) originality
The problem has been around for a while and has been addressed by several researchers in different ways. The proposed language-agnostic approach is rather straightforward and gives good results on a subset of relations (99 out of 395). The paper is of modest originality; it does give practically useful results, but it is not clear how they would compare experimentally to the existing approaches in the related work.
Showing the benefits of cross-language links and the growing number of relations obtained by including more languages is very interesting. Showing the distribution of relations and subject types is also very interesting. More discussion of the cross-lingual dimension would also be welcome and may lead to interesting research directions.

(2) significance of the results
The presented approach is technically well set, the experiments are clearly defined and the results show good performance on the subset of relations. The main drawback from the research aspect is the lack of justification for the setting. For instance, features play a crucial role in the results, so one would expect more arguments for the selected set of features, including a discussion comparing them with the features used in some existing approaches. Comparing several learning algorithms is also fine, but the rationale behind selecting them and an analysis of their performance from the technical perspective are lacking. This is in line with the general impression of the paper being very useful in a practical setting but a bit weak on providing research arguments and insights.
While the results on the 99 relations are good, it is not clear why these relations were chosen and what the performance on the other relations is. It may be that these relations are the most relevant for a certain kind of Wikipedia abstract (e.g., settlement pages or personal pages); the paper would benefit from more discussion on that.

(3) quality of writing.
The paper is clearly written and well structured. Having an illustrative example is useful. Showing the results of different algorithms and performance dimensions with graphs and tables also contributes to the clarity.