Generalized neural embeddings for link prediction in knowledge graphs: training, evaluation, explanation

Tracking #: 2014-3227

Asan Agibetov
Matthias Samwald

Responsible editor: 
Guilin Qi

Submission type: 
Full Paper
Link prediction is the task of finding missing or unknown links among inter-connected entities in knowledge graphs. It can be accomplished with a classifier that outputs the probability of a link between two entities. However, the way in which entities and networks are represented strongly determines how well classification works. Recently, several works have successfully used neural networks to create entity embeddings which can be fed into binary classifiers. Moreover, it was proposed in the literature that creating specialized embeddings separately for each relation type in the knowledge graph yields better classifier performance, as opposed to training a single embedding for the entire knowledge base. The drawback of these specialized approaches is that they scale poorly as the number of relation types increases. In this work we formalize a unified methodology for training and evaluating embeddings for knowledge graphs, which we use to empirically investigate if, and when, generalized neural embeddings -- trained once on the entire knowledge graph -- attain performance similar to specialized embeddings. This new way of training the neural embeddings and evaluating their quality is important for scalable link prediction with limited data. We perform an extensive statistical validation to empirically support our claims, and derive relation-centric connectivity measures for knowledge graphs to explain our findings. Our evaluation pipeline is made open source, and we aim to draw the community's attention to the important issue of transparency and reproducibility in the evaluation of neural embeddings.


Solicited Reviews:
Review #1
By Lei Hou submitted on 01/Dec/2018
Major Revision
Review Comment:

This paper investigates and compares two typical neural knowledge graph embedding approaches, namely generalized and specialized versions. The former learns entity embeddings from the whole knowledge graph, while the latter focuses on a single relation type at a time. Through experiments and analysis on 4 public datasets, the authors conclude that the generalized method achieves performance comparable to the specialized one.
The paper has two main advantages:
1. Clarity: The problem formalization and the method procedures are clearly presented. Readers, even those who are not familiar with this area, can understand the target problem and the proposed method well. By the way, I like Fig. 6 because it gives a good illustration of all the related triple sets.
2. Experiment: The paper conducts several lines of experiments and presents a comprehensive analysis, which validates its assumptions and supports its conclusions.
However, I still have the following concerns or suggestions:
1. The major concern is that I do not think the work discussed in this paper can be called knowledge graph embedding; that is, the employed method (in Section 2.3.2), which encourages linked entities to be closer, is not a proper choice for knowledge graph embedding. First-order proximity is widely used in network embedding, where the nodes are often homogeneous, e.g., in a social network. A knowledge graph is composed of heterogeneous nodes (different types of entities), and the links between entities do not represent similarity but convey specific semantics. As such, I suggest that the authors choose a better basic method.
2. Following the above problem, the current method essentially ignores the relation semantics. Actually, the relations in a knowledge graph not only have semantics but also exhibit inter-relation dependencies or constraints (e.g., transitivity), which can help learn better entity embeddings. I am not sure the specialized methods can preserve such properties.
3. The literature review (Section 1.1) is relatively short, and should be expanded to cover more related research.
4. I am not sure whether the multi-relatedness defined in Section 2.2.1 is necessary, because it is rarely mentioned in the following description compared with the other two metrics.
5. As for the method and evaluation:
1) In the last sentence of Section 2.3.2, should that read "standard Euclidean distance or dot product"?
2) You declare that "We always keep the 1:1 ratio for the positive and negative examples" in the left column of Page 8, while the right column says "the maximum number of negative entities per one positive triple is 10 (k = 10)". It took me some time to reconcile these two statements, so I suggest making this clearer.
3) The employed embedding method encourages linked entities to be closer. Intuitively, the element-wise difference might be a good indicator for link prediction if you use the standard Euclidean distance in 1), but I did not see any discussion of it (only concatenation, element-wise sum, or mean).
4) I acknowledge that Figures 7-10 show some valuable trends that validate the conclusions, but they are so small that readers cannot see the details. I therefore suggest adding tables to present the results.
6. There are some typos, e.g., in the notation description (Page 4), "entities and relation types are disjoint sets" should use a set operator rather than the "not equal" symbol, and the caption of Fig. 3 should start with "On the left".
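To make the suggestion in point 3) of item 5 concrete, here is a minimal sketch of the usual edge-feature operators with the element-wise difference added; the function name and the embedding values are illustrative, not taken from the paper:

```python
import numpy as np

def edge_features(u, v, op="diff"):
    """Combine two entity embeddings into one edge-feature vector.

    'concat', 'sum' and 'mean' are the operators discussed in the paper;
    'diff' (element-wise difference) is the extra operator suggested here,
    since it directly reflects Euclidean closeness of linked entities.
    """
    if op == "concat":
        return np.concatenate([u, v])
    if op == "sum":
        return u + v
    if op == "mean":
        return (u + v) / 2.0
    if op == "diff":
        return u - v
    raise ValueError(f"unknown operator: {op}")

# Toy embeddings of two entities
u = np.array([1.0, 2.0, 3.0])
v = np.array([1.5, 1.0, 3.0])
diff = edge_features(u, v, "diff")  # small norm when u and v are close
```

A binary link classifier could then be trained on these feature vectors, with the norm of the difference serving as a natural closeness signal.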

Review #2
Anonymous submitted on 18/Jan/2019
Review Comment:

In this paper, the authors investigate link prediction in knowledge graphs (KGs), treating it as a binary classification problem, and empirically study generalized neural embeddings. Two kinds of neural embeddings are considered: embeddings trained once on the entire knowledge graph, and specialized embeddings trained separately for each relation. Based on the experimental results, the first model attains performance similar to the second. The writing of this paper is good, and comparing different embedding models for this task is an interesting problem. However, the paper still has some issues, which are given as follows:

(1) The manuscript does not explain why it treats the link prediction problem as a supervised learning problem. On page 2, the authors claim that this paper is an extension of learning and evaluating KG neural embeddings. The paper uses machine learning metrics such as F1 and recall to compare different embedding models. However, to my knowledge, many knowledge graph embedding models, such as TransE [1] and TransH [2], use other metrics, such as Hits@10 or MRR, to learn and evaluate link prediction. The authors should explain why they use their chosen metrics and design a way to compare against traditional embedding models.
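For reference, Hits@k and MRR are computed from the 1-based ranks of the correct entities in a TransE-style ranking evaluation; the sketch below is illustrative and is not the paper's evaluation code:

```python
def ranking_metrics(ranks, k=10):
    """Compute MRR and Hits@k from the ranks of the true entities.

    `ranks` holds, for each test triple, the 1-based rank of the correct
    entity among all candidate corruptions, as in TransE-style evaluation.
    """
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    hits_at_k = sum(1 for r in ranks if r <= k) / len(ranks)
    return mrr, hits_at_k

# e.g. the correct entity ranked 1st, 2nd, 4th and 20th in four test queries
mrr, hits = ranking_metrics([1, 2, 4, 20], k=10)  # mrr = 0.45, hits = 0.75
```

Unlike F1 over sampled negatives, these metrics rank the true entity against all corruptions, which is why they are standard for TransE-style models.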

(2) The authors use two kinds of graphs to learn KG embeddings. The difference is that the specialized embeddings are trained per relation, while the generalized embeddings are trained only once. In essence, the two kinds of graphs are generated by different sampling strategies; thus, the specialized embeddings could also be trained only once. The experimental results therefore cannot support the conclusion.

(3) The authors say that they propose a new way to train KG embeddings. I think this is an inappropriate statement: in Section 2.3.2, the paper uses an existing embedding model. In my view, the paper proposes a new sampling method and uses an existing embedding model to obtain the dense vector space. Another inappropriate statement is in Section 1, where a knowledge graph is treated as a graph whose links may have different types. However, the embedding model used does not employ any graph information; both the generalized and the specialized graphs treat each triple as independent, and the two graphs can be viewed as different subsets of a whole KG.
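The observation that the two settings amount to different subsets of one triple set can be made concrete; the toy triples and relation names below are hypothetical:

```python
# A whole KG as a list of (head, relation, tail) triples
triples = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "interacts_with", "warfarin"),
    ("ibuprofen", "treats", "fever"),
]

# Generalized setting: one training set over the whole KG
generalized = list(triples)

# Specialized setting: one training set per relation type
specialized = {}
for h, r, t in triples:
    specialized.setdefault(r, []).append((h, r, t))
```

Both settings draw from the same triple pool; the per-relation partition is simply a different way of slicing it, which is the reviewer's point about sampling strategies.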

(4) In Section 2.3.2, the formula may be incorrect. It looks like it sums the standard Euclidean distances between a positive entity and its K negative entities. I am not sure this formula can guarantee that the positive entity ends up closer than the K negative entities. At the same time, the main idea of this embedding model is similar to TransE, which uses a margin loss function to obtain the embeddings.
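For comparison, a TransE-style margin ranking loss only vanishes once every negative pair is at least `margin` farther apart than the positive pair; the sketch below (with illustrative distances) shows how this differs from simply summing raw distances:

```python
def margin_ranking_loss(pos_dist, neg_dists, margin=1.0):
    """TransE-style margin loss over one positive pair and K negatives.

    The hinge term penalizes any negative pair that is not at least
    `margin` farther apart than the positive pair; summing raw distances
    over the K negatives, as the reviewed formula appears to do, can be
    minimized without the positive pair ever being the closest.
    """
    return sum(max(0.0, margin + pos_dist - nd) for nd in neg_dists)

# Both negatives are farther than the positive by more than the margin:
ok = margin_ranking_loss(0.5, [2.0, 3.0])       # loss is 0.0
# One negative is too close, so the hinge is active:
bad = margin_ranking_loss(0.5, [1.0], margin=1.0)  # loss is 0.5
```

This is an illustrative sketch of the TransE objective the review refers to, not the paper's exact formula.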

(5) We know that the performance of downstream tasks depends heavily on the embedding model. Moreover, the quality of the embedding model can be affected by many factors, such as negative sampling, the initialization of the embeddings, and the redundancy of triples in the KG. The paper uses two types of sampling method to build the two training sets. The sampling process is highly random, so the sampled positive and negative examples will affect the quality of the embeddings, and it is not easy to reach the conclusion drawn in this paper.

(6) The paper uses some traditional metrics to evaluate the performance of link prediction and uses the closed-world assumption to generate negative examples. In this case, the number of negative examples is far higher than the number of positive examples; the training dataset is imbalanced, and the trained classifier will tend to classify data as negative. At the same time, the testing dataset is also generated by sampling: if the two graphs contain negative examples at a similar scale, the F1 scores will tend to be similar. In the experiment section, the authors do not analyse the effects of the imbalanced dataset.
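To illustrate the concern, consider the F1 score of a classifier biased towards the majority (negative) class; the counts below are hypothetical, not taken from the paper:

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Imbalanced test set: 10 positives, 100 negatives.
# A majority-biased classifier predicts only 2 positives, one of them wrong:
# tp = 1, fp = 1, and the remaining 9 positives are missed (fn = 9).
p, r, f1 = precision_recall_f1(tp=1, fp=1, fn=9)  # recall collapses to 0.1
```

With such imbalance, F1 is dominated by the collapsed recall regardless of how well negatives are classified, which is why the effect deserves explicit analysis in the experiments.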

Analysing KG embeddings is an exciting task. In this paper, the authors give a method to analyse the two kinds of embedding models and propose two sampling methods to generate the training data for the embedding model. However, the novelty of this paper is not strong enough, and there are some serious mistakes. I do not recommend accepting this paper for this journal.

[1] A. Bordes, N. Usunier, A. García-Durán, J. Weston, and O. Yakhnenko, "Translating embeddings for modeling multi-relational data," in Proc. Adv. Neural Inf. Process. Syst., 2013, pp. 2787–2795.

[2] Z. Wang, J. Zhang, J. Feng, and Z. Chen, “Knowledge graph embedding by translating on hyperplanes,” in Proc. 28th AAAI Conf. Artif. Intell., 2014, pp. 1112–1119.