Searching for explanations of black-box classifiers in the space of semantic queries

Tracking #: 3469-4683

Authors: 
Jason Liartis
Edmund Dervakos
Orfeas Menis-Mastromichalakis
Alexandros Chortaras
Giorgos Stamou

Responsible editor: 
Guest Editors Ontologies in XAI

Submission type: 
Full Paper
Abstract: 
Deep learning models have achieved impressive performance in various tasks, but they are usually opaque with regard to their complex inner operation, obscuring the reasons for which they make decisions. This opacity raises ethical and legal concerns regarding the real-life use of such models, especially in critical domains such as medicine, and has led to the emergence of the field of eXplainable Artificial Intelligence (XAI), which aims to make the operation of opaque AI systems more comprehensible to humans. The problem of explaining a black-box classifier is often approached by feeding it data and observing its behaviour. In this work, we feed the classifier data that are part of a knowledge graph and describe its behaviour with rules expressed in the terminology of the knowledge graph, which is understandable by humans. We first investigate the problem theoretically, in order to provide guarantees for the extracted rules, and then study the relation between "explanation rules for a specific class" and "semantic queries that collect from the knowledge graph the instances classified by the black-box classifier to this specific class". We thus approach the problem of extracting explanation rules as a semantic query reverse engineering problem. We develop algorithms for solving this inverse problem as a heuristic search in the space of semantic queries, evaluate the proposed algorithms on four simulated use cases, and discuss the results.
Tags: 
Reviewed

Decision/Status: 
Accept

Solicited Reviews:
Review #1
Anonymous submitted on 25/May/2023
Suggestion:
Accept
Review Comment:

The authors have addressed all the concerns in my previous review.

Review #2
By Daniele Porello submitted on 26/May/2023
Suggestion:
Accept
Review Comment:

After reading the authors' responses and the new version of the paper, I believe that they have addressed my comments and requests for clarification.

Review #3
Anonymous submitted on 28/May/2023
Suggestion:
Accept
Review Comment:

This submission presents a revised version of a paper that I have recently reviewed. This new version satisfactorily addresses and corrects the main issue pointed out in my previous review.

Overall, I now believe that the paper can be accepted for publication.


Comments

Reviewer 1
We thank the reviewer for their valuable comments. We took into consideration the reviewer's comments and made revisions to our manuscript. Below we explain in detail how we addressed the reviewer’s comments.

Comment 1.1
One of the strategies used to merge queries in the proposed algorithms is to compute the Query Least Common Subsumer (QLCS). The existence of such a query depends on assuming that every conjunctive query (CQ) contains an atom of the form TOP(x), where TOP is the well-known constructor from DLs. However, this assumption is not entirely consistent with the definition of CQs, i.e., by definition, a CQ cannot have an empty body or an atom of the form TOP(x). Note that TOP is not a concept name. Therefore, it is wrong to assume that every CQ contains such an atom. In addition, this makes the use of the empty query as a shorthand for { | TOP(x)} not well-defined. One way to achieve the desired effect could be to add to the TBox the GCI $TOP \sqsubseteq A$, where $A$ is a fresh concept name, and then assume that all CQs contain the atom A(x). This does not change the set of certain answers, and should not be a problem for the query subsumption partial order. Perhaps this is what the authors meant in the first place. However, this must be carefully explained.

Response 1.1
To address the issue, we have abandoned the use of the TOP constructor and instead extended the definition of CQs to include queries with empty bodies. We acknowledge that this is not common in other works using CQs, and we have taken care to adapt all other definitions so that they are compatible with queries that have empty bodies.
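
For illustration (a sketch in generic CQ notation, not taken verbatim from the revised paper), the empty-bodied query admitted by the new definition has the same certain answers as the reviewer's proposed encoding with a fresh concept name:

% the empty-bodied query admitted by the revised definition (n = 0)
$q_\emptyset = \{\, x \mid \;\}$
% the reviewer's encoding: a fresh concept name A added via the GCI $\top \sqsubseteq A$
$q_A = \{\, x \mid A(x) \,\}$
% over any KB $\mathcal{K}$, both return exactly the named individuals
$\mathit{cert}(q_\emptyset, \mathcal{K}) = \mathit{cert}(q_A, \mathcal{K})$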

Revision: We have modified the definition of CQs in the opening paragraph of section 2.2 to allow queries with empty bodies (n ≥ 0), including an acknowledgement that this is not typical (“Generally, in the relevant literature… any special adjustments”). We have made small changes in the paragraph regarding the definition of query answers (“Given a KB… the literature as semantic queries”). We have also updated the argument in section 4.3.1 regarding the existence of the QLCS, which relied on the inclusion of the TOP constructor (“As with the Kronecker product… to construct the QLCS.”).

Reviewer 2
We thank the reviewer for their valuable comments. We took into consideration all the reviewer's comments and made revisions to our manuscript. Below we explain in detail how we addressed the reviewer’s comments.

Comment 2.1
If I understand correctly, the approach is independent of the classifier and of the features that are used by the black-box model. In particular, the concept names of the Knowledge Base are in general independent of the features used by the classifier. On the one hand, this is quite general and in principle applicable to any black-box. On the other hand, there is a sense in which the explanations provided by this framework are not explanations of the black-box model: we do not know how the model decided, and the explanation dataset and model do not mention information used by the classifier. The framework, in fact, provides a way in which someone, possibly the experts of the domain, can rationalize the black-box model's classifications by means of semantic information about the samples.

Response 2.1
As the reviewer correctly points out, explanations that do not directly operate on the inputs of the classifier are not in any way guaranteed to be linked to the semantics the classifier is using. Other works (Rudin 2019) prefer the term “summary of predictions” to characterize such explanations. We have added this distinction to the text. In lieu of any formal guarantees, we have defined some measures of quality that can inspire some confidence about the link between the explanations and the semantics of the classifier. In our Visual Genome experiment we have also cross-referenced our explanations with saliency maps on the input images, to investigate whether the semantics expressed by the explanations are correlated with the areas of the image that most influence the decision of the classifier.
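
As an illustration of the kind of cross-referencing described above, the following is a minimal sketch (with hypothetical function and variable names, not the code used in the paper) of measuring how much of a saliency map's mass falls inside the image regions annotated with the objects that an explanation rule mentions:

import numpy as np

def saliency_inside_regions(saliency, boxes):
    """Fraction of total saliency mass that falls inside annotated regions.

    saliency: 2-D array of non-negative attribution scores (H x W).
    boxes:    annotated regions as (x, y, width, height) tuples, e.g. the
              bounding boxes of objects mentioned by an explanation rule.
    """
    mask = np.zeros(saliency.shape, dtype=bool)
    for x, y, w, h in boxes:
        mask[y:y + h, x:x + w] = True
    total = saliency.sum()
    return float(saliency[mask].sum() / total) if total > 0 else 0.0

# e.g. a rule mentioning "bed" is supported for an image if the saliency map
# concentrates on the regions annotated as "bed":
# support = saliency_inside_regions(saliency_map, bed_boxes)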

Revision: We have expanded the penultimate paragraph of section 3 to further discuss the uncertainty of the semantic links (“As we have previously mentioned… the classifier is using”).

Comment 2.2
Reading the paper, I was somehow missing a clear understanding of the contribution of the axioms of the TBox. In Example 1 and, if I understand correctly, in the rendering of the explanation models of the experimental evaluation, the ontologies appear quite simple. This may be fine for the scope of this paper, which focuses on the general model and its properties. However, the authors could improve the discussion of the motivations for ontologies (if this is so). Are rich semantic characterizations of the concepts of the ontology useful? Do they affect the understandability for users? Are richer DL dialects (e.g. with Booleans) significant for the quality of explanations?

Response 2.2
We believe that we have already showcased the usefulness of external ontologies in the Visual Genome experiment, wherein the terminology would be very poor without the use of the WordNet ontology. Incorporating external ontologies also allows end users to select which terminology they would like to use; for example, in the Visual Genome experiment one could opt to use the ConceptNet ontology instead.
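
To give a rough sense of the kind of enrichment meant here (a sketch only, assuming a WordNet interface such as NLTK's; it is not the mapping actually used in the paper), an object label from the annotations can be expanded into progressively more abstract WordNet terms that explanation rules may then use:

from nltk.corpus import wordnet as wn  # assumes nltk and its 'wordnet' corpus are installed

def hypernym_terms(label, max_depth=3):
    """Expand an object label into more abstract WordNet terms (hypernyms).

    With such terms available, a rule can refer to 'animal' or 'furniture'
    instead of only the raw annotation labels.
    """
    synsets = wn.synsets(label, pos=wn.NOUN)
    if not synsets:
        return set()
    terms = set()
    frontier = [synsets[0]]  # naive choice: first noun sense of the label
    for _ in range(max_depth):
        frontier = [h for s in frontier for h in s.hypernyms()]
        terms.update(h.name() for h in frontier)  # e.g. 'canine.n.02', 'carnivore.n.01'
    return terms

# hypernym_terms("dog") -> {'canine.n.02', 'domestic_animal.n.01', 'carnivore.n.01', ...}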

Revision: We have expanded the discussion of the motivating example in section 3.1 to further justify the use of ontologies (“In a more concrete example… more understandable.”).

Comment 2.3
It seems that, in general, the quality of the explanation model, in terms of correct explanation rules, depends on the language of the ontology and on the number of its concepts, i.e. on its granularity. It seems that, in very abstract terms, an explanation model with a sufficient number of classes allows for defining rules for each prediction of the classifier. The dependence on the explanation dataset is acknowledged by the authors on page 33 (Line 50). If this is so, then it seems that there is a trade-off to consider between the richness of the knowledge base and its capability of providing correct explanation rules. Discussing this aspect is important to assess the significance of the approach.

Response 2.3
We agree with the reviewer that, as the ontology becomes richer, our explanation model is able to produce explanations for arbitrary data, thus increasing the probability that the explanation and the classifier agree on the inputs by chance rather than because of a semantic link. In terms of statistical learning, as the capacity of the model increases, so does the possibility of overfitting. We acknowledge this trade-off, and we employ some techniques in our experiments to examine whether the explanation rules are correct, namely using a holdout dataset and cross-referencing with other explanation methods.
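
As a concrete illustration of the holdout check mentioned above (a minimal sketch with hypothetical names; the measures here are plain precision and recall and may not correspond exactly to the quality measures defined in the paper), one can compare the items retrieved by a rule's semantic query against the classifier's predictions on unseen data:

def rule_quality_on_holdout(rule_answers, predictions, target_class):
    """Measure how well an explanation rule tracks the classifier on a holdout set.

    rule_answers: set of holdout item ids retrieved by the rule's semantic query.
    predictions:  dict mapping every holdout item id to the classifier's predicted class.
    target_class: the class the rule is meant to explain.
    """
    positives = {item for item, cls in predictions.items() if cls == target_class}
    covered = rule_answers & set(predictions)
    tp = len(covered & positives)
    # items satisfying the rule but classified differently behave like exceptions
    precision = tp / len(covered) if covered else 0.0
    # how much of the target class the rule covers
    recall = tp / len(positives) if positives else 0.0
    return {"precision": precision, "recall": recall}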

Revision: We have extended our discussion of this issue at the end of section 3 (“Unfortunately, as the semantic descriptions… the scope of this work.”).

Comment 2.4
A comment about exceptions. It seems that a trade-off is reasonable for exceptions as well. The fewer the exceptions, the closer the explanation model is to the black-box model, so in principle having no exceptions sounds like overfitting, thus possibly replicating the opacity of the black-box model.

Response 2.4
We do not consider the lack of exceptions to be a sign of overfitting in the general case. As we mentioned in the previous answer, overfitting is curtailed by reducing the capacity of the explanation model and is evaluated by testing whether the explanations generalize to a holdout dataset. Under the right circumstances, a lack of exceptions would actually provide a transparent and interpretable surrogate model.

Revision: As mentioned in the previous response, we have added a brief discussion regarding overfitting at the end of section 3.2. We have also extended our discussion of exceptions in section 6.

Reviewer 3

We thank the reviewer for their valuable comments. We took into consideration all the reviewer's comments and made revisions to our manuscript. Below we explain in detail how we addressed the reviewer’s comments.

Comment 3.1
Some abbreviations (e.g. sav and fol) are given in lowercase. It would be better to use uppercase for readability (e.g. SAV and FOL).

Response 3.1
We agree with the suggestion of the reviewer and have converted the abbreviations to uppercase.

Comment 3.2
The experimental part is rather rich. What about reproducibility? Are the developed system (KGrules) and the experiment datasets publicly available?

Response 3.2
All code used is publicly available at https://github.com/ails-lab/kgrules-h. We have added a link to this repository at the beginning of section 5. The semantic annotations are also included in this repository. The original datasets are publicly available from different sources. Additionally, we use a publicly available classifier pre-trained on the Places365 dataset (http://places2.csail.mit.edu/models_places365/resnet50_places365.pth.tar) and only feed Visual Genome images to it, so anyone interested can fully replicate our results.
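
For readers who want to set this up, the following is a minimal sketch of loading the linked checkpoint with PyTorch; it assumes the file follows the usual Places365 release format (a dict with a 'state_dict' entry whose keys carry a 'module.' prefix from DataParallel training), which may need adjusting:

import torch
from torchvision import models

# ResNet-50 with the 365 Places categories as output classes.
model = models.resnet50(num_classes=365)

# Assumed checkpoint layout: {'state_dict': {...}} with 'module.'-prefixed keys.
checkpoint = torch.load("resnet50_places365.pth.tar", map_location="cpu")
state_dict = {k.replace("module.", ""): v for k, v in checkpoint["state_dict"].items()}
model.load_state_dict(state_dict)
model.eval()

# Visual Genome images can then be fed through the standard ImageNet-style
# preprocessing (resize, center crop, normalisation) before calling model(...).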