Review Comment:
This paper studies how to extract substructures of given entities and learn entity embeddings from those extracted substructures. The idea itself may have merit, but the paper is not well written: many terms and key technical details are not clearly defined or described, and the experimental results are not convincing. Detailed comments are given as follows.
1. This paper uses many different terms, e.g., "representative neighborhood for each target entity", "representative subgraphs of target entities", "extract relevant/interpretable/meaningful entity representations". It is not clear what the authors mean by "representative neighborhood", "representative subgraphs", and "entity representations". Do they all refer to the same thing? If so, please unify the terminology and provide precise definitions.
2. In Definition 2, the authors define a semantic relationship as a triple. However, in the subsequent sections, a semantic relationship actually refers to p^d. Please make the definition consistent with its usage.
3. Please reorganize Section 4, which contains only a single subsection (4.1).
4. Algorithm 1 and Algorithm 2 seem complicated. Could the authors provide a formal complexity analysis?
5. Key technical details are missing. In particular, it is not clear how entity embeddings are obtained after computing Specificity (or Specificity^H).
6. It is not clear how graph embeddings are created for entities of type Film/Book/Album/Country/City.
7. The authors mention that "we use three different datasets from different domains for the tasks of classification and regression". What are these three datasets?
8. Figure 6: why not conduct the same experiments on the Book data?
9. Figure 7: why not conduct the same experiments on the Book or Album data?
10. Section 6.5: why use a new dataset DBpedia Pagerank here? How does the new dataset relate to those introduced in Section 6.1?
11. Could the authors further explain how to interpret Figure 8?
12. The experimental results for the entity recommendation task (Figure 9) are not promising. The proposed method does not outperform the best-performing baselines, and in some cases it performs substantially worse (Figure 9(b) and Figure 9(e)).