Review Comment:
In this paper, the authors describe an approach (in two variants) to extend a well-established chemistry ontology, ChEBI, with new classes (more precisely, suggestions for new, atomic 'SubClassOf' axioms) for unseen chemicals based on their SMILES strings. The approach uses transformer-based deep learning models with domain-specific embeddings of the SMILES strings and is trained on the existing ChEBI ontology (where the SMILES strings are captured as annotations of classes). The results are promising: on existing ChEBI classes, the approach achieves good F1 scores and outperforms the authors' previous approach; on chemicals not yet covered in ChEBI, the results look good, but an exact evaluation is left for future work. The new approach has the potential to be transparent: inspecting the areas of the model's attention for (positive?) classifications can indicate reasons for these classifications.
The paper reads well and the results are interesting, though I think the presentation can and should be made clearer by addressing the following points. Also, some of the claims about this new approach should be stated a little more carefully to fit the evidence gathered so far:
- Explanations of diagrams and plots need to be clearer: all axes and colours used should be explained, ideally in the caption. E.g., in Fig. 7 it is unclear what the x-axis is and what the colours mean (perhaps the blue/red of Fig. 7(c) is also used for (a) and (b), but this should be made explicit). E.g., in Fig. 8, what is enumerated on the x-axis? (This becomes somewhat clear in the text, but should also be clear from the caption.) Also, please make sure that the numbers on axes are readable; even with quite a lot of zooming in, this is not the case in Figure 2, which is also lacking axis labels.
- Some of the claims made are not strongly supported by the evidence provided in the paper: the interpretability/explainability is discussed via an interesting example, but a suitable evaluation is left for future work. Furthermore, it seems that explanations will only be available for positive classifications: what would one do for false negatives? Similarly, the current approach addresses ontology learning in a very weak form, as it is restricted to learning atomic subclass relationships. While the results are interesting, one could also call this 'class localisation' or 'class insertion'.
- Throughout the paper, the nature of the (structured) annotations used should be made clearer: it took me a while to realise that the SMILES strings alone were used (without further statements around them), since the annotations used are first described in quite a few different ways.
More detailed comments and suggestions:
Page 2
- Line 5: rephrase 'ChEBI tries to' to 'ChEBI engineers try to' or suchlike.
- Line 12: is there a reference for ChEBI's workflow? Also, regarding "navigating the ontology scaling dilemma": is 'navigating' really what you mean here?
- Line 13: I don't understand "design decisions [..] analogously to new classes and relations".
Page 3
- Perhaps also explain what the *target* of these ontology extension approaches is (are they all aimed at atomic SubClassOf axioms between named classes?).
- Would the following be clearer? "Given the *documented, structured* design decisions by the ontology developers, how would they extend their ontology to cover a novel entity?"
- Line 39: "within these structures *whose* sub-graphs may themselves"?
- Perhaps move the explanation of SMILES to an earlier point, e.g. a small section on 'background on ChEBI'?
Page 5
- Line 16: can you be more precise about "and the system as a whole was not explainable"?
- Line 31: "based on the design decisions that are implicitly reflected in the structure of ChEBI" is one of the places that confused me (see above): isn't your approach rather focused on the (structured) annotations documenting/reflecting these design decisions?
Page 6
- Line 34: "One of these successful architectures is RoBERTa [44], *whose* architecture offers a learning paradigm"
- Line 33: here and later in the evaluation, it would be interesting to know the distribution of 'several' (direct) superclasses: how many classes have one superclass? How many have two? Etc. (a sketch of the kind of statistic I mean follows this list).
- "with a plausible real-life dataset of chemicals” isn’t it that the related use case is realistic (rather than the dataset ‘real-life’)?
Page 7, line 25: could you briefly sketch the algorithm used and/or explain the 'class merging' step (perhaps using an example)?
Section 4.1: given that we are looking at multi-label classification, could you please briefly explain precision/recall: does the whole label set for a chemical need to be correct, or is correctness counted on a per-label basis? (And perhaps drop the explanation of the usual precision and recall on page 11.) Also, perhaps illustrate the different F1 scores using a small example, e.g. along the lines of the sketch below.
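To make this concrete, something like the following would help me (a toy example with made-up labels and predictions, using scikit-learn's multi-label indicator format):

    # Toy multi-label setting: three chemicals, three candidate superclasses.
    # Rows are chemicals, columns are labels (1 = superclass applies).
    from sklearn.metrics import f1_score

    y_true = [[1, 0, 1],
              [0, 1, 1],
              [1, 0, 0]]
    y_pred = [[1, 0, 1],   # fully correct
              [0, 1, 0],   # one label missed
              [0, 0, 0]]   # one label missed

    # Micro-F1 pools all (chemical, label) decisions:
    # TP = 3, FP = 0, FN = 2, so precision = 1.0, recall = 0.6, F1 = 0.75.
    print(f1_score(y_true, y_pred, average="micro"))
    # Macro-F1 averages the per-label F1 scores (2/3, 1.0, 2/3), giving ~0.78,
    # so infrequent labels count as much as frequent ones.
    print(f1_score(y_true, y_pred, average="macro"))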
Page 10: why are Table 2 and Figure 7 restricted to ELECTRA? Or: how do these look for RoBERTa?
Page 11: please briefly explain what a 'smaller class' is. Also, "some of the predicted subclass relationships can be determined to be correct according to ChEBI, while others are incorrect" is confusing: some of these *are* correct according to ChEBI, and that is how the whole evaluation works, right? This needs clarifying; in particular, what is the verdict if, for chemical X, the predicted direct superclass is Y but it should be Y's sub- (or super-)class Z?
Page 13, line 47: how many are 'several' (see above; would a distribution be interesting)?
Section 4.3: did you eyeball the extended ontology? It seems that one could relatively easily pick some of the new classes and (ask some chemists to) check the classifications; even manually this should be feasible for quite a good sample.
Page 16, line 1: which part of the OWL DL expressivity is relevant for your approach? Isn't it restricted to/focused on working with the (inferred) class hierarchy, treating the rest of the ontology as a black box?
Page 16, line 32: "Visualisations such as those in Figs. 9 and 10b can be used to explain decision[s] made by the model, raise trust in the prediction system, [...]" looks a bit like an overstatement to me.