Review Comment:
Introduction
In the submitted work, the authors propose a novel relational graph neural network that integrates geometric (curvature) and semantic information within relational graph attention layers, specifically during the message aggregation phase. In particular, they leverage Ollivier–Ricci curvature to downweight redundant (high-curvature) edges and to amplify bridge (low-curvature) edges, thereby contributing to the research field that integrates curvature information into knowledge graph embeddings, with a focus on the link prediction task.
Methodology: Edge Aggregation via Parallel Resistance Analogy
The authors propose a method to compute Ollivier–Ricci curvature on multigraphs (knowledge graphs) by transforming the multigraph into a weighted directed graph. The formulation closely resembles the resistance-distance framework of [2], yet that work is neither cited nor acknowledged, and the formulation is instead presented as novel. Although the formulation is elegant, under the unit-resistance assumption the resulting expression essentially collapses to 1/(number of non-loop edges), and the formalization should therefore be presented in a clearer and more explicit manner.
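For concreteness, the parallel-resistance reading can be sketched as follows (a minimal illustration of this reviewer's understanding, not the authors' code; function names and the self-loop handling are assumptions). Treating each of the k parallel edges between a node pair as a unit resistor, their combination has effective resistance 1/k:

```python
from collections import Counter

def merge_parallel_edges(multi_edges):
    """Collapse a directed multigraph into a weighted digraph.

    Each parallel edge is treated as a unit resistor, so k parallel
    edges between (u, v) combine to an effective resistance of 1/k
    (equivalently, a conductance of k). Self-loops are dropped, matching
    the "non-loop edges" reading of the formulation.
    """
    counts = Counter((u, v) for u, v in multi_edges if u != v)
    return {pair: 1.0 / k for pair, k in counts.items()}

# Three parallel relations between A and B collapse to resistance 1/3;
# the self-loop (A, A) is ignored.
weights = merge_parallel_edges(
    [("A", "B"), ("A", "B"), ("A", "B"), ("B", "C"), ("A", "A")]
)
```

This also makes explicit why the expression "collapses": under unit resistances, the merged weight depends only on the count of non-loop parallel edges.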
Methodology: Attention-Based Aggregation
The proposed message passing neural network consists of two layers, both of which employ attention mechanisms during the aggregation phase to weight different members of a node’s neighborhood.
In the first layer, neighbors are aggregated using attention weights computed as a softmax over the Ollivier–Ricci curvature values obtained in the previous phase. This mechanism effectively downweights redundant edges and favors the aggregation of bridge edges.
In the second layer, neighbors are aggregated using attention weights computed as the softmax over dissimilarity scores between the node embedding and the neighbor embeddings produced by the first layer. However, this module is not clearly motivated, and additional explanation or justification should be provided.
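The two aggregation schemes described above can be sketched as follows (this reviewer's reconstruction, not the authors' implementation; in particular, it is assumed that curvature values are negated inside the softmax, since otherwise high-curvature edges would be amplified rather than downweighted, and squared Euclidean distance is used as a placeholder dissimilarity):

```python
import math

def softmax(xs):
    # Numerically stable softmax.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def geometric_attention(curvatures):
    # Layer 1: softmax over negated Ollivier–Ricci curvature, so that
    # low-curvature (bridge) edges receive larger weights and
    # high-curvature (redundant) edges are downweighted.
    return softmax([-c for c in curvatures])

def semantic_attention(node_emb, neigh_embs):
    # Layer 2: softmax over dissimilarity between the node embedding
    # and each neighbor embedding (squared Euclidean distance here,
    # an assumption of this sketch).
    dissim = [sum((a - b) ** 2 for a, b in zip(node_emb, e))
              for e in neigh_embs]
    return softmax(dissim)

def aggregate(weights, neigh_embs):
    # Attention-weighted sum of neighbor embeddings.
    dim = len(neigh_embs[0])
    return [sum(w * e[i] for w, e in zip(weights, neigh_embs))
            for i in range(dim)]
```

Making the sign convention and the choice of dissimilarity function explicit in the paper would resolve much of the ambiguity noted above.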
The authors use the term “evolutionary” to describe the progression from geometric attention in shallow layers to semantic attention in deeper layers. However, this terminology is potentially misleading, as it is strongly associated with a different research field. A change in terminology or a clearer reformulation is therefore recommended.
Experiments
The authors provide a standard evaluation for knowledge graph completion tasks based on ranking metrics (MRR, Hits@K), comparing the proposed model against several state-of-the-art methods, including translational, semantic, MLP-based, GNN-based, and attention-based approaches. However, the experimental comparison lacks methods from the literature that explicitly leverage curvature information (e.g., [1]), which closely resemble the experimental setup and choice of scoring function and would therefore represent a particularly relevant baseline.
The effectiveness of the proposed approach is further supported by an extensive ablation study, which demonstrates the individual contributions of both the geometric and semantic aggregation components. Additionally, structural analyses provide insights into the model's performance on specific datasets, thereby strengthening the experimental results and clarifying the scenarios in which the model is likely to perform better. A further analysis of the impact of the proposed Ollivier–Ricci curvature–based methodology (selecting the top k% of edges, removing them, and then applying state-of-the-art models) reinforces the effectiveness of the semantic-aware module.
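The edge-removal protocol is simple enough to sketch (a hypothetical re-implementation by this reviewer, assuming edges are ranked by highest, i.e. most redundant, curvature; the function name and ranking direction are assumptions):

```python
def remove_top_k_percent(edge_curvature, k_percent):
    """Rank edges by Ollivier–Ricci curvature (descending) and drop the
    top k% most redundant ones, returning the surviving edges.

    edge_curvature: dict mapping an edge (u, v) to its curvature value.
    """
    ranked = sorted(edge_curvature, key=edge_curvature.get, reverse=True)
    n_drop = int(len(ranked) * k_percent / 100)
    dropped = set(ranked[:n_drop])
    return [e for e in ranked if e not in dropped]

# With k = 25%, the single highest-curvature edge is removed before a
# baseline model is retrained on the remaining graph.
edges = {("A", "B"): 0.9, ("B", "C"): 0.1, ("C", "D"): -0.4, ("D", "A"): 0.5}
kept = remove_top_k_percent(edges, 25)
```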
However, the experimental section lacks an analysis of training and evaluation times. Such results would be important to assess the computational complexity and overhead introduced by curvature computation and semantic similarity calculations, and to enable a more complete comparison with existing models.
Long-term stable URL for resources
The provided repository does not include a README file or any usage documentation. The accompanying Python code contains minimal docstrings and lacks clear usage explanations. In addition, the data files are not well organized, making it difficult to directly identify the purpose of each file. As a result, reproducing the experiments and models based on the proposed methodology is challenging. The absence of documentation significantly hinders usability, and the inclusion of a README, basic usage instructions, tutorials, or example scripts is strongly recommended. Nevertheless, the provided datasets themselves appear to be complete.
Conclusion
In conclusion, the proposed methodology could represent a valuable addition to the field, provided that the text is revised for clarity, additional information on training and evaluation times is included, and the accompanying resources are supported by more comprehensive and detailed documentation.
Questions
1. The semantic-aware aggregation module is not sufficiently motivated. Could the authors provide additional explanation regarding its role and intended effect? In particular, is this component designed to mitigate oversmoothing effects in GNNs, as discussed in the introduction?
2. In the implementation of the proposed semantic-aware aggregation module, have any optimization strategies been introduced to reduce the number of required computations? Specifically, how is the computation of similarity between non-connected nodes avoided, and how does the approach scale to semantic web datasets with a large number of entities?
3. With reference to Section 4.4, and in particular Table 3, does the removal of the top-k edges have an impact on model training time? Furthermore, could this methodology be leveraged to improve scalability when applied to large-scale knowledge graphs?
References
[1] Guo, D., Su, M., Cao, C., Yuan, F., Zhang, X., Liu, Y., and Fu, J. (2023). Curvature-driven knowledge graph embedding for link prediction. In Proceedings of the 26th International Conference on Computer Supported Cooperative Work in Design (CSCWD), pages 1226–1231. IEEE.
[2] Klein, D. J., and Randić, M. (1993). Resistance distance. Journal of Mathematical Chemistry, 12(1), 81–95.