Review Comment:
I thank the authors for resolving many of my comments on the previous version of the paper. As I have mentioned before, the paper constitutes a large and useful body of work. I recommend its acceptance provided that my remaining concerns (of which there are quite a few, but none of them major) are addressed.
I wanted to point out the following positives in particular:
- a good in-depth discussion on relevant ontologies and semantic interoperability
- useful extra sections on explainability and accessibility
- Table 16 is an excellent contribution.
- new parts that discuss the challenges supported by the systems, and the quality of the systems. This is a huge improvement over just showing huge tables (which have now been moved to the appendix).
- the improved and more granular reference architecture.
The new version now also outlines the particular role of SW technologies in overcoming the challenges. However, I found the discussion too sparse in some cases, and there are many missing citations. I list some of these below.
Section 5.3.3:
The part on "rule-based reasoning" should be accompanied by citations; which systems use rule-based reasoning, and which ones rely on SW rules in particular?
Section 5.4.3:
This section is very sparse. If the reviewed systems rely on SW rules for decision support - as is currently implied - they should be cited here, and this aspect of the systems discussed in more detail.
Section 5.5.3:
"Explainability relies on domain knowledge". Not as a general rule - for instance, post-hoc XAI methods such as SHAP, LIME do not rely on domain knowledge.
"Additionally, many ontology reasoners provide explanations of the reasoning process, although it is not clear whether these explanations are made available to the end users." Yes, explanations can be made available to end users, but they still need to be formulated in a human-interpretable way to be useful. For instance, Protégé formulates explanations of the reasoning behind an inference. The EYE reasoner generates proofs for inferences, which can similarly be formulated in a human-readable way [https://link.springer.com/chapter/10.1007/978-3-031-54303-6_7].
Section 5.6.5:
"[..] an approach used by a few of the selected systems." Citations needed.
I also still feel that the scoring perspective requires nuancing. Some challenge aspects (such as supporting more than one type of user) may not be relevant to all types of systems. An excellent system that specifically focuses on end users rather than clinicians would thus score lower here (although some would argue that systems should always be tailored to a specific user group). This could be added as a limitation at the end of the paper, one that is shared with many other scoring/benchmarking efforts: the scoring may not be fully applicable to all possible systems, and consequently some readers may want to adapt it (e.g., for single-user systems).
Other comments:
- It is unclear how process interoperability applies to the content of Subsection 5.1.4, as it does not pertain to integrating data (e.g., sensor data, EMR data). Perhaps an option here is to integrate the contents of this subsection into the prior subsections.
- There are some inaccuracies in the SW overview:
"RDF-star, which allows the subject or object of a triple to refer to another triple"
This is not wholly accurate, as the embedded subject/object triple is not necessarily asserted (and thus not "referred" to).
https://www.w3.org/TR/rdf12-concepts/#section-triple-terms-reification
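To illustrate the distinction, consider the following minimal sketch (using the RDF-star community-group Turtle syntax; the prefixes and predicates are hypothetical, and RDF 1.2 introduces a slightly different triple-term syntax):

```turtle
@prefix ex: <http://example.org/> .

# The quoted triple below is NOT asserted: this graph does not claim
# that ex:patient1 has a heart rate of 120; it only annotates that
# (unasserted) statement with a confidence value.
<< ex:patient1 ex:heartRate 120 >> ex:confidence 0.8 .

# By contrast, the annotation syntax both asserts the triple and
# annotates it:
ex:patient1 ex:heartRate 120 {| ex:confidence 0.8 |} .
```

So "embeds" or "quotes" would be more accurate than "refers to".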
"Other important standards in the Semantic Web community are: eXtensible Markup Language (XML), a markup language and file format". I would not call XML a SW standard.
- "This explicit provision of explanations can also be considered a form of post hoc explainability"
Both interpretable ML models and post-hoc explainability methods aim to provide such explanations. Hence, it seems incorrect to place the former in the same category as post-hoc explainability. It would be more appropriate to put this into a separate "explanation quality" section.
- The last part of section 5.5.2, starting with "Only six of the selected systems report", pertains to explanations in general and thus does not belong in the subsection either.
- It is strange to put "challenges assessment" under the "summary" section 5.8, which should be, well, a summary of what came before and thus not introduce new content. I would propose adding it as a separate subsection before the summary section. Same comment for "quality assessment" under the summary section 6.5.
- Section 8 provides useful summaries on SW usage, quality assessment, and future research directions. That said, it seems that "new" takeaways from section 8.2 (e.g., such as lack of resource re-use) do not belong under future research directions; the section should not introduce new observations.
Minor comments:
- "explainability is gaining traction as a pivotal aspect of AI-driven health systems" I don't believe that AI-driven health systems have been discussed / introduced yet, so it's a bit strange to describe aspects of it here.
- "Group 4: Other reviews related to AI and technology in the health domain"
Similar to before; I don't think AI-driven health systems have been introduced yet.
- "In contrast to SSN, SAREF is targeted at industry developers rather than ontology experts [25], making it practical for real-world applications." Surely the development of an ontology by experts does not preclude it from being practical in the real world.
- The objectives listed under Section 4.1 do not fully correspond to the contributions listed in the introduction - namely, a quality assessment of the selected systems based on data, devices used, etc.
- "Semantic Web technologies can also contribute to syntactic interoperability, albeit in an indirect capacity." By offering RDF as a uniform data description language, SW technologies can also offer syntactic interoperability directly. E.g., FHIR offers a Turtle syntax, which is a serialization format of RDF.
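For instance, a FHIR resource can be exchanged directly as RDF (a simplified, illustrative fragment loosely following the FHIR R4 RDF mapping; not normative):

```turtle
@prefix fhir: <http://hl7.org/fhir/> .

# A FHIR Observation serialized in Turtle, one of the exchange
# formats FHIR supports alongside JSON and XML.
[] a fhir:Observation ;
   fhir:nodeRole fhir:treeRoot ;
   fhir:Observation.status [ fhir:value "final" ] .
```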
- "Although several situation-focused ontologies have been developed [..] none of the selected systems extend any such ontologies." How about systems using (not necessarily extending) existing ontologies?
- "The limitations of semantic-based approaches, such as scaling difficulties and inability to handle uncertainty, can be mitigated by combining them with complementary techniques such as ML and Bayesian networks."
As the authors themselves mention later on (Section 5.6.5), there exist extensions of SW technology specifically to cope with uncertainty.
- "However, the use of these technologies does not guarantee explainability." Can the authors give an example here of where SW technology does not lead to explainability?
- "While ML is often criticized for its susceptibility to producing black box models" I don't think "susceptibility" is the right word to use here.
- "An additional ethical concern is the cascade of care, a phenomenon in which incidental findings from screenings or monitoring result in further clinical care."
Please consider elaborating on how personal health monitoring systems can exacerbate the cascade of care.
- "Using platforms like GitHub rather than static files has the advantage of version control". The drawback here is that GitHub repositories can easily be deleted or made private; I have experienced this personally for several papers. A Zenodo record has the benefit that it cannot be deleted after creation, while new versions of the record can still be added. Another solution to improve accessibility is for journals to have a data availability policy.
- Please clarify the following: "This can be attributed to the fact that there is a gap in tooling support for Semantic Web representations such as RDF with standards such as FHIR." FHIR has an RDF representation (using Turtle serialization).
- In general, Turtle is a serialization format (there are others, such as N-TRIPLES) of RDF, the latter being the abstract data representation language. This should be clarified throughout the paper.
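For clarity, the same abstract RDF triple in two concrete serializations (hypothetical IRIs):

```turtle
# Turtle: supports prefixes and other abbreviations.
@prefix ex: <http://example.org/> .
ex:sensor1 ex:observes ex:heartRate .

# N-Triples: the same triple, one fully-expanded statement per line
# (shown as a comment to keep this a single Turtle document):
# <http://example.org/sensor1> <http://example.org/observes> <http://example.org/heartRate> .
```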