Review Comment:
In this paper, the authors investigate link prediction for knowledge graphs (KGs) and treat it as a binary classification problem. They empirically investigate generalized neural embeddings. Two kinds of neural embeddings are considered: generalized embeddings, which are trained once on the entire knowledge graph, and specialized embeddings, which are trained separately for each relation. Based on the experimental results, the generalized embeddings attain performance similar to the specialized ones. The paper is well written, and comparing different embedding models for this task is an interesting problem. However, the paper still has some issues, which are as follows:
(1) The manuscript does not explain why it treats the link prediction problem as a supervised learning problem. On page 2, the authors claim that this paper is an extension of learning and evaluating neural embeddings for KGs. The paper uses standard machine learning metrics, such as F1 and recall, to compare the different embedding models. However, to my knowledge, many knowledge graph embedding models, such as TransE [1] and TransH [2], use other metrics, such as Hits@10 or MRR, to evaluate link prediction. The authors should explain why they use these metrics and design a way to compare against traditional embedding models.
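For reference, the ranking metrics mentioned in comment (1) can be sketched as follows. This is a minimal illustration, not the evaluation protocol of the manuscript or of [1] and [2]: the function names are hypothetical, a higher score is assumed to mean a more plausible triple, and ties are resolved optimistically.

```python
def rank_of_true(scores, true_index):
    """Return the 1-based rank of the true candidate among all candidates.

    Assumption: a higher score means a more plausible triple.
    Ties are resolved optimistically (only strictly higher scores count).
    """
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def hits_at_k(ranks, k):
    """Fraction of test triples whose true entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks of the true entity for four test triples.
example_ranks = [1, 3, 12, 2]
example_hits_at_10 = hits_at_k(example_ranks, 10)
example_mrr = mean_reciprocal_rank(example_ranks)
```

Unlike F1 or recall, these metrics need no decision threshold and no explicit set of negative test examples, which is one reason a comparison with ranking-based baselines requires a separate evaluation design.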
(2) The authors use two kinds of graphs to learn KG embeddings. The difference between the two kinds of embeddings is that the specialized embeddings are trained for each relation, whereas the generalized embeddings are trained only once. In essence, the two kinds of graphs are generated by different sampling strategies. Thus, specialized embeddings could also be trained only once. The experimental results do not support the conclusion.
(3) The authors state that they propose a new way to train KG embeddings. I think this statement is inappropriate. In section 2.3.2, the paper uses an existing embedding model. In my view, the paper proposes a new sampling method and uses an existing embedding model to obtain the dense vector space. Another inappropriate statement is in section 1, where a knowledge graph is treated as a graph whose links may have different types. However, the embedding model used does not exploit any graph information. Both the generalized graph and the specialized graph treat each triple as independent. These two graphs can be viewed as different subsets of the whole KG.
(4) In section 2.3.2, the formula may be incorrect. It appears to sum standard Euclidean distances over a positive entity and the K negative entities generated for it. I am not sure that this formula guarantees that the positive entity ends up closer than the K negative entities. At the same time, the main idea of this embedding model is similar to TransE, which uses a margin-based loss function to obtain the embeddings.
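To make the contrast in comment (4) concrete, the following is a minimal sketch of a margin-based ranking loss in the style of TransE [1], written for one positive triple and K corrupted tails. It is not the formula from section 2.3.2 of the manuscript. The function names and the default margin are assumptions for illustration, and only tails are corrupted for brevity, whereas [1] corrupts either the head or the tail.

```python
import math


def l2_distance(u, v):
    """Standard Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def margin_ranking_loss(head, relation, tail, negative_tails, margin=1.0):
    """Hinge loss summed over the K negative tails of one positive triple.

    Each term is zero only when the negative tail is at least `margin`
    farther from (head + relation) than the true tail is.
    """
    translated = [h + r for h, r in zip(head, relation)]
    positive_distance = l2_distance(translated, tail)
    return sum(
        max(0.0, margin + positive_distance - l2_distance(translated, negative))
        for negative in negative_tails
    )


# Toy 2-dimensional embeddings (hypothetical values).
example_loss = margin_ranking_loss(
    head=[0.0, 0.0],
    relation=[1.0, 0.0],
    tail=[1.0, 0.0],
    negative_tails=[[4.0, 0.0], [1.0, 0.5]],
)
```

The hinge term is what enforces the ordering: a negative contributes to the loss only while it is not yet sufficiently farther away than the positive. A plain sum of distances carries no such constraint, which is the concern raised above.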
(5) It is well known that the performance of downstream tasks depends heavily on the embedding model. Moreover, the quality of the embedding model can be affected by many factors, such as negative sampling, the initial values of the embeddings, and the redundancy of triples in the KG. The paper uses two sampling methods to build two training sets. The sampling process involves a great deal of randomness, so the sampled positive and negative examples will affect the quality of the embeddings, and it is therefore difficult to draw the conclusion stated in this paper.
(6) The paper uses traditional metrics to evaluate link prediction performance and uses the closed-world assumption to generate negative examples. In this case, the number of negative examples is far higher than the number of positive examples. The training dataset is imbalanced, and the trained classifier will tend to classify data as negative. At the same time, the test dataset is also generated by sampling. If the two graphs contain a similar proportion of negative examples, the F1 scores will tend to be similar. In the experiment section, the authors do not analyse the effect of the imbalanced dataset.
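The imbalance concern in comment (6) can be illustrated with a small sketch. The label counts below are invented for illustration and are not taken from the manuscript; the helper names are hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Invented test set: 5 positive triples and 95 sampled negatives.
y_true = [1] * 5 + [0] * 95
# A degenerate classifier that labels every triple as negative.
y_all_negative = [0] * 100

degenerate_accuracy = accuracy(y_true, y_all_negative)
degenerate_f1 = precision_recall_f1(y_true, y_all_negative)[2]
```

With this split, the degenerate classifier is correct on 95 of 100 triples while its F1 on the positive class is zero, so the reported scores depend strongly on the positive-to-negative ratio used when sampling the training and test sets.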
Analysing KG embeddings is an exciting task. In this paper, the authors present a method to analyse the two kinds of embedding models and propose two sampling methods to generate the training data for the embedding model. However, the novelty of this paper is insufficient, and it contains some fatal mistakes. I do not recommend accepting this paper for this journal.
[1] A. Bordes, N. Usunier, A. García-Durán, J. Weston, and O. Yakhnenko, “Translating embeddings for modeling multi-relational data,” in Proc. Adv. Neural Inf. Process. Syst., 2013, pp. 2787–2795.
[2] Z. Wang, J. Zhang, J. Feng, and Z. Chen, “Knowledge graph embedding by translating on hyperplanes,” in Proc. 28th AAAI Conf. Artif. Intell., 2014, pp. 1112–1119.
