Review Comment:
This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing. Please also assess the data file provided by the authors under “Long-term stable URL for resources”. In particular, assess (A) whether the data file is well organized and in particular contains a README file which makes it easy for you to assess the data, (B) whether the provided resources appear to be complete for replication of experiments, and if not, why, (C) whether the chosen repository, if it is not GitHub, Figshare or Zenodo, is appropriate for long-term repository discoverability, and (D) whether the provided data artifacts are complete. Please refer to the reviewer instructions and the FAQ for further information.
This paper tackles the challenge of exploiting semantic information when training knowledge graph embedding models. The expected outcome is better models, capable of more accurate predictions but also trainable on less data, the semantic information making up for the smaller number of exemplars. The authors introduce an approach based on five types of semantic axioms derived from the OWL ontology of the data and show a positive outcome in the experimental campaign. This is a significant piece of work that is likely to have an impact on the community. However, the paper cannot be published as is, and several minor and major points need to be addressed.
Approach:
* It is rightly recalled that manually defining and maintaining semantic axioms for a KG is tedious, but the working assumption of NeuRAN is that such axioms are actually available, in particular OWL axioms defining the domain, range, symmetry, and reflexivity of relations and the disjointness of classes. Producing these is already tedious and, although desirable, they are not always present in ontologies. Since those axioms appear to be a hard prerequisite for NeuRAN, the definition of axioms remains essential. This should be made clearer in the paper; and if this is a misunderstanding, more explanation should be given about how the five types of axioms are created.
* In Section 3, generating negative triples with a random process while ontological axioms are available sounds like a missed opportunity. Those axioms could be used to generate harder negative triples that are logically sound but factually wrong, instead of easier ones that are wrong on both counts (see the sketch after this list). I wonder why this is not considered here. Furthermore, the choice of random corruption over other possible strategies is not motivated.
* After reading the paper, it is still unclear to me why a neural approach was needed to incorporate the axioms into the loss function. Considering that they are ontological rules, the probabilities could be booleans: 1 if the rule is satisfied and 0 otherwise. What is the exact gain of doing otherwise? As depicted in Figure 2, this neural-style loss for the axioms implies having embeddings for the ontology ("Type embedding"), whereas it is common to discard the TBox triples when doing KGE with semantics-agnostic approaches such as TransE. That said, two separate sets of embeddings appear to be generated (one for the ABox, one for the TBox), so NeuRAN seems to still follow that practice.
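To make the two previous points concrete, here is a minimal Python sketch of both a boolean axiom check and axiom-aware negative sampling. All names and toy facts (`axioms`, `entity_type`, the example relations) are hypothetical illustrations, not taken from the paper:

```python
import random

# Hypothetical toy axioms; in NeuRAN these would come from the OWL ontology.
axioms = {
    "bornIn":    {"domain": "Person", "range": "City"},
    "marriedTo": {"domain": "Person", "range": "Person"},
}
entity_type = {
    "alice": "Person", "bob": "Person",
    "paris": "City", "berlin": "City",
}

def satisfies_axioms(s, r, o):
    """Boolean axiom check: True iff (s, r, o) respects domain/range axioms."""
    ax = axioms.get(r, {})
    if "domain" in ax and entity_type.get(s) != ax["domain"]:
        return False
    if "range" in ax and entity_type.get(o) != ax["range"]:
        return False
    return True

def hard_negative(triple, entities):
    """Corrupt the object with an entity of the correct range type,
    yielding a type-consistent but (presumably) false triple."""
    s, r, o = triple
    rng = axioms.get(r, {}).get("range")
    candidates = [e for e in entities
                  if e != o and (rng is None or entity_type.get(e) == rng)]
    return (s, r, random.choice(candidates)) if candidates else None

print(satisfies_axioms("alice", "bornIn", "bob"))   # False: violates range
print(hard_negative(("alice", "bornIn", "paris"), list(entity_type)))
# e.g. ('alice', 'bornIn', 'berlin') -- logically sound, factually wrong
```

A loss term could then weight a triple by this 0/1 indicator rather than a learned probability; the paper should explain what the neural alternative gains over such a baseline.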
Presentation:
* There are a few grammatical errors in different places (e.g., the "How could [...] is [...]" questions in Fig. 1, where the "is" should be a "be").
* The running head title reads "Running head title"
* Notations are not always convenient and not always introduced. For example, using bold letters to indicate embeddings is not the most convenient approach for the reader. In Equation (3), the argument "t" of the loss function is neither used nor introduced; I assume "t" stands for "triple", i.e. t = (s, r, o) \in T. Equations (1) and (2) are also not aligned with the content depicted in Fig. 2, where E is derived from all the P terms, whereas those two equations use only the E functions.
Experiments:
* It is unclear from reading Table 4 what the reported numbers are. This is explained in the text as AUC values but not recalled as a reading guide in the caption. It is also unclear whether a difference of 0.003 (FB) to 0.008 (WN) on average between the CKRL and NeuRAN results is statistically significant; a significance test should be reported (a minimal sketch of such a check is given after this list).
* The gold standard for each of the three sets of experiments is unclear. In particular, for triple classification it is mentioned that negative triples are generated at random, but there is no indication of whether those negatives were also part of the negatives used in the training set. For the other two experiments, it is assumed that the gold standard was part of the dataset, but this is not made clear in the paper.
* Considering the importance of the ontological axioms, there should be a table reporting the number of axioms available in both FB15K237 and WN18RR. This table could be accompanied by a short discussion of whether those numbers are deemed sufficient, and perhaps also by a discussion of the potential downsides for predictions should some of those axioms turn out to be wrong.
* In experiment 4.3, the performance is announced to be evaluated in terms of accuracy, precision, and recall, but the results for those metrics are not reported in the paper.
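As a minimal sketch of the significance check mentioned above, assuming per-run AUC scores are available for both models (the numbers below are placeholders, not values from the paper):

```python
from scipy import stats

# Placeholder AUC scores from repeated runs on the same splits;
# the authors would substitute their actual per-run results.
ckrl_auc   = [0.871, 0.868, 0.873, 0.870, 0.869]
neuran_auc = [0.874, 0.872, 0.876, 0.873, 0.871]

# Paired t-test over matched runs; stats.wilcoxon would be a
# non-parametric alternative when only a few runs are available.
t_stat, p_value = stats.ttest_rel(neuran_auc, ckrl_auc)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```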
Stable link for resources:
* This is a major concern: as of today (April 8, 2022), the repository contains only an empty README file, pushed 9 months ago.