The CAAPT ontologies: an ontological framework for the representation of museum-based critical cataloguing

Submitted by Erin Canning on 12/15/2025 - 09:11

Tracking #: 3991-5205

Authors:

Erin Canning

Responsible editor:

Guest Editors 2025 OD+CH

Submission type:

Full Paper

Abstract:

This paper introduces CAAPT (Computational Approaches to Addressing Problematic Terminology), an ontological framework for the representation of cultural heritage terminology guidance documents and the decision-making practices involved in this domain as linked open data. CAAPT consists of three constituent ontologies: CAAPT-O, CAAPT-UC, and CAAPT-DM. These are trialed through the instantiation of a knowledge graph populated by the contents of three cultural heritage terminology guidance documents and an institutional record that documents the use of guideline suggestions to make decisions regarding critical cataloguing actions at the Victoria and Albert Museum. This knowledge graph demonstrates the affordances of the ontologies. A linked open data vocabulary, CAAPT-V, is also introduced in order to provide a set of reference values to be used in instantiations of the ontological framework. Lastly, this paper proposes a novel approach to ontology engineering that is grounded in critical theory, namely concepts from feminist and queer theories, thus aligning the theoretical framework of the technical development work with that of the domain it is considering.

Full PDF Version:

swj3991.pdf

Previous Version:

The CAAPT ontologies: an ontological framework for the representation of museum-based critical cataloguing

Tags:

Reviewed

Long-term Stable Link to Resources:

https://github.com/eecanning/caapt/blob/fb6b7c315555cd556377148c64b741269c83f85c…

Decision/Status:

Solicited Reviews:

Review #1

Anonymous submitted on 21/Feb/2026

Suggestion:
Accept

Review Comment:

I thank the authors for the revised version. Several points raised in the first review have been addressed and the paper is clearer. The work remains interesting and relevant for the domain but some points still require clarification or deeper justification.

Regarding the modeling, the class caapt:Guide raises questions. The argument that it is expressed in natural language does not seem sufficient to justify creating a new class. Other existing classes already support natural-language content: CIDOC CRM E33 Linguistic Object, OntoLex ConceptSet, SKOS ConceptScheme. These could potentially represent the same notion. It would be useful to elaborate more on this modeling choice and provide stronger arguments.

In the trialing section, the term "reconciliation" appears to mean "disambiguation", which is the more common term in knowledge graph population; you may consider using it instead.

As mentioned in my first review, the manual work required at the beginning of the process remains a strong limitation. This affects generalizability and reusability. The authors acknowledge this in the Limitations section, but it would be helpful to propose follow-up actions to mitigate it. Many research efforts exist on automatic document processing and on disambiguation, especially for structured documents such as the ones considered in this research. This literature should be checked and the choice of a manual approach should be better justified as it impacts reproducibility of the whole framework.

In the discussion, accuracy is described as critical for the validity of the knowledge graph, and it is stated that staff confirmed that suggestions were accurately translated. However, no concrete evaluation is provided to support this claim. An empirical assessment with an inter-annotator agreement measure would strengthen the argument.
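
As an illustration of the kind of measure meant here, the following is a minimal sketch of Cohen's kappa for two annotators. The annotator labels are hypothetical and are not taken from the manuscript or the museum data; the function name `cohen_kappa` is likewise only illustrative.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement derived from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical judgements: does each KG statement match the guideline text?
annotator_1 = ["ok", "ok", "ok", "wrong", "ok", "wrong", "ok", "ok"]
annotator_2 = ["ok", "ok", "wrong", "wrong", "ok", "ok", "ok", "ok"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

In this toy case the raw agreement is 6 out of 8, but kappa is noticeably lower because both annotators choose "ok" most of the time, which is why a chance-corrected measure is more informative than a bare statement that staff confirmed the translation.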

There are a few typos to correct: Page 7, "lex-O" should be "lex-0". Page 12, last paragraph, remove "see" in "is illustrated in see Figure 6". Page 16, first paragraph, check the use of brackets in "the original predicate - to indicate...".

All the described ontologies are made available by the author as TTL files on a GitHub repository along with competency questions. The repository is well organized and contains a README file. The provided resources are not complete for replication of experiments as the museum documents and subsequent knowledge graphs could not be shared and part of the population process is manual.

Review #2

By Maria Theodoridou submitted on 28/Feb/2026

Suggestion:
Accept

Review Comment:

The author has revised the paper and taken all the reviewers' comments into consideration. I find the author's answers to the reviewers' comments sufficient and well documented. The author provided more figures that present the ontologies clearly. The addition of smaller subgraphs makes the paper easier to read and to understand.

I think it is now a good paper to be published.

Review #3

By Francesco Corcoglioniti submitted on 29/Apr/2026

Suggestion:
Accept

Review Comment:

I find that the manuscript has significantly improved since the first submission, and I'd like to acknowledge and praise the substantial effort the author has put into this revision round.

All my previous comments -- presentation-wise, content-wise, and w.r.t. linked ontologies -- have been adequately addressed by the author, through adapting the manuscript and the resources and/or clarifying the reviewer's doubts in the author's response.

Presentation-wise, the revised text is now clearer and easier to follow, and I find that the additional diagrams, and especially the running example in Turtle that accompanies the various sections of the text, are effective in conveying the author's modeling ideas.

Content-wise, I thank the author for having considered and addressed the points I raised. About my previous comment C3 (caapt:TermEntry subclass of culco:Contentious), I had not noticed the use of skos:exactMatch with culco:ContentiousIssue in CULCO, which makes it a skos:Concept: given this, I stand corrected and I find the author's solution reasonable and in line with the CULCO design (I remain a bit confused about conflating a concept with an information object, but this now concerns CULCO and not the author's contribution). About previous comment M7 (need for frac:observedIn and rdf:value), I now understand that this is a result of FrAC having been updated after the first manuscript submission: I find the author's response (avoid changes while FrAC is still evolving) reasonable, and I appreciate the addition in the discussion section of a paragraph about the (risks of) reuse of evolving ontologies with specific reference to FrAC. This can help clarify the situation for readers looking into the paper and the proposed ontology modules. About previous comments C2 and M6, I respectfully maintain my view, but I also accept the author's position and reply, given that the proposed modeling decisions do not introduce logical inconsistencies.

Overall, I think that this is a solid contribution and that my previous concerns have all been addressed. I'm happy with the revised manuscript, and I have no further concerns that would call for additional review rounds. I thus recommend acceptance, leaving it to the author to decide whether and how to integrate the remaining feedback in this review when finalizing the manuscript.

What follows is a list of typos and other presentation-related suggestions, as well as some clarifications on C2 and M6. These are provided in the hope that they can help the author in continuing this line of work; they do not represent criticisms that need addressing.

### PRESENTATION-RELATED SUGGESTIONS ###

My main (and only non-trivial) suggestion for the author is to move paragraph "Terms and entries" (incl. Figure 5 and illustrative example) before paragraph "Guidance documents" (incl. Figure 4 and illustrative example). Right now, the presentation of CAAPT-O, which starts with "Guidance documents", already touches terms and entries, so the following "Terms and entries" paragraph appears a bit redundant and/or in the wrong order to me.

Then, here is a list of typos and other spot issues/suggestions, all of which can be handled straightforwardly:

* (T1) [page 2] "... RDF syntax style Turtle ..." -> remove "style"
* (T2) [page 3] "... as well as a language used ..." -> remove "a"
* (T3) [page 5] "... offers the opportunity to consider that ..." -> what about a much simpler "... suggests that ..."?
* (T4) [table 3] Suggestion: what about adding also row(s) for FrAC and OA, which are also mentioned in the text? May also cite Lexinfo and Lex-O in the row for OntoLex.
* (T5) [page 7] "... several usage typologies, including that encoded in LexInfo" -> "those" instead of "that"
* (T6) [page 10] "... contain term entries ..." -> suggest adding "(caapt:TermEntry)" to explicitly link to content of Figure 4; same for "contain suggestions"
* (T7) [page 11] "The entry of a term is differentiated from the concept of the term following OntoLex's conceptual modeling ..." -> this works only if I interpret entry = ontolex:LexicalEntry, instead of caapt:TermEntry: suggest clearly referring to ontology concepts (ontolex:LexicalEntry vs. caapt:TermEntry) to disambiguate
* (T8) [figure 4] suggest adding a box for caapt:Suggestion, as long as it is mentioned in the main text associated to the figure
* (T9) [page 12] in the listing for "Terms and entries", I would avoid listing all usage contexts :indian_ucN, 1 <= N <= 10, also because usage contexts have not been introduced yet at this point of the text. I would collapse them into something like ':indian_uc1 ... :indian_uc10' and add a remark that they are covered later.
* (T10) [page 14] "In this entry Indian does not refer to Indian as used to describe people from India/South Asia. In this context Indian is correct" -> please check: there is one entry for the whole listing (:indian_entry), so "In this entry" is irrelevant; then my understanding is that Indian is not problematic when referred to people of India/South Asia, problems arising in the other cases.
* (T11) [figure 7] I think crm:P107i should go from crm:E39_Actor to crm:E74_Group, which would be consistent also with its usage in the illustrative example.
* (T12) [figure 8] rdfs:Resource is the top class in RDFS and thus the super-class of every other class, with no need to represent it explicitly. Therefore, the author may consider dropping the subclass relations pointing to rdfs:Resource to simplify the figure. Also, the author may consider using a notation like the one for crm:P69 in Figure 6 to denote that three properties are sub-properties of ontolex:usage. Having ontolex:usage point to rdfs:Resource is correct, but I found it misleading when interpreting the graph (I mistakenly related ontolex:usage to ontolex:reference).
* (T13) [page 16] "indicate[]" -> why "[]"?
* (T14) [page 16] "Using OntoLex classes in the domain position ..." -> I think it should be "range position" here
* (T15) [page 17] "... concepts CIDOC CRM, OntoLex, and OA ..." -> "concepts of"
* (T16) [page 18] "... in within a record ..." -> remove one of "in" or "within"
* (T17) [page 18] "Records are connected to instances of encounter through three levels: ... component of the record" -> suggest directly pointing to boxes in Figure 10, or alternatively mentioning which CIDOC CRM relations have to be followed. Right now, it's difficult to follow the text and relate it to the figure.
* (T18) [page 19] "crm:P148i_is_component_of :ex_i_sr" -> should be ":ex_i_record"
* (T19) [page 19] "oa:hasTarget :ex_i_record" -> should be ":ex_i_sr"
* (T20) [figure 9, 11] crm:P82 has a generic literal as range, and not necessarily an xsd:date, which is consistent with its use in some listings where its value is a string like "starting 16th century"@en
* (T21) [figure 9, 11, listing of pages 20-21] crm:P3_has_note is used in the listing but missing in the figures; also note that for some activities, rdfs:comment is used with similar purpose in the listing: is it rdfs:comment or crm:P3_has_note?
* (T22) [page 21] "... exact match requirements applied only to the entries and not the terms they discussed." -> please check: if I interpret entry = caapt:TermEntry and term = caapt:TermRoot, then my intuition would be that exact match should apply to the latter based on the respective lexical forms
* (T23) [page 22, figure 12] "There are 34 cases ..." -> in Figure 12, there are 19 + 1 + 13 = 33 cases (as also pointed out for the previous submission: why compute 34 = 19+1+13+1, as written in the response? Where does the additional 1 come from?)
* (T24) [page 22, figure 12] "There are also 52 cases where an entry is listed ..." -> a term? 52 = 32 + 19 + 1 in the right side of Figure 12, which refers to terms
* (T25) [page 22] "... which permiits multiple terms to be connected to the same entry." -> it would be nice to provide an example of entry with two or more terms (e.g., inline in the text, in parenthesis)
* (T26) [page 22] "... 65 terms which have been classified by multiple source documents ..." -> is 65 referring to 32 + 19 + 1 + 13 = 65 of Figure 12 and related text? If so, this sentence would be redundant
* (T27) [page 23] "... thus is capable ..." => suggest "is thus capable"
* (T28) [table 6] In the query for CQ 37, suggest dropping OPTIONAL (just the keyword, keep the triple patterns within it!) and the following FILTER(BOUND(...)) constraint, since the combination of the two has the effect of requiring the patterns in the OPTIONAL clause not to be optional at all (this was already pointed out in the previous submission)
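
The point made in T28 can be sketched independently of the actual CQ 37 query, which is not reproduced here. The toy data and the names below (`entries`, `suggestion_of`, the variable `s`) are hypothetical; the sketch only assumes that the filtered variable is bound solely inside the OPTIONAL clause.

```python
# Toy data (hypothetical): three entries, two of which have a suggestion.
entries = ["e1", "e2", "e3"]
suggestion_of = {"e1": "s1", "e3": "s3"}  # e2 has no suggestion

def optional_join(entries, suggestion_of):
    """Left join, as OPTIONAL does: unmatched rows keep ?s unbound (None)."""
    return [{"entry": e, "s": suggestion_of.get(e)} for e in entries]

def required_join(entries, suggestion_of):
    """Inner join, as a plain (non-optional) triple pattern does."""
    return [{"entry": e, "s": suggestion_of[e]}
            for e in entries if e in suggestion_of]

def filter_bound(rows, var):
    """FILTER(BOUND(?var)): drop rows where the variable is unbound."""
    return [r for r in rows if r[var] is not None]

# OPTIONAL { ... } followed by FILTER(BOUND(?s)) ...
with_optional = filter_bound(optional_join(entries, suggestion_of), "s")
# ... versus the same pattern written without OPTIONAL and without FILTER.
without_optional = required_join(entries, suggestion_of)
print(with_optional == without_optional)
```

Both forms return the same rows, so the simpler required pattern states the intent more directly.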

### CONTENT-RELATED CLARIFICATIONS ###

Although I'm fine with the author's position, I'd like to clarify here why I maintain my view regarding comments C2 and M6.

About comment C2: having caapt:TermEntry as a subclass of ontolex:LexicalConcept, and then linking a caapt:TermEntry instance to all term senses (i.e., instances of caapt-uc:UseContext / ontolex:LexicalConcept), has the effect of treating that caapt:TermEntry as the lexical concept resulting from the 'union' of all these senses. For instance, :entry_indian (a caapt:TermEntry) will be the lexical concept comprising the union of all possible meanings of 'indian', i.e., "people of India OR native people of America / Canada ...". Yes, this is not forbidden by OntoLex and there is no limit to the number of senses associated with a lexical concept. However, my understanding of ontolex:LexicalConcept is that its primary usage scenario is to have the associated senses refer to **different synonymous terms** (ontolex:LexicalEntry), to capture what in WordNet would be a **synset**. Continuing the example, a term like 'indian' would have multiple synsets (each being an ontolex:LexicalConcept), one of these synsets standing for Native Americans and being associated with specific senses of additional terms such as 'indigenous', 'Native Americans', 'First Nations', etc. (unfortunately WordNet is no longer browsable online, otherwise I would have linked some examples). Summing up, the author's solution is structurally compliant with OntoLex, but somewhat counter-intuitive to people coming from a WordNet background.

About comment M6 (use of OA selectors), this comment stems from the manuscript citing a "second layer of interpretation" centered on OA (page 18), which I interpret as the goal to provide an encounter/text quote/field/record structure that can be interpreted using **exclusively** either CIDOC CRM terms (and I'm fine with the respective solution) or OA terms. If that was the intent, it is not achieved, because it's not possible to use OA terms alone. To exemplify, here is a query to extract encounter instances and their quoted text:

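PREFIX oa: <http://www.w3.org/ns/oa#>
PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>  # assumed namespace; adjust to the CRM version used in the KG
SELECT ?encounter ?quote WHERE {
  ?encounter oa:hasTarget [ a oa:SpecificResource ; oa:hasSource ?record ; oa:hasSelector ?field ] .
  ?field a oa:FragmentSelector ;
         oa:refinedBy [ a oa:TextQuoteSelector ; oa:exact ?quote ; ^crm:P106i_forms_part_of ?encounter ] .
}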

The query uses exclusively terms from OA, except for crm:P106i_forms_part_of. If I remove the triple pattern for crm:P106i_forms_part_of, and if the KG contains exactly one oa:FragmentSelector instance for each field, as stated by the author (which I agree with), and links it to multiple oa:TextQuoteSelector(s) via oa:refinedBy, then the query would match an ?encounter to **all** ?quote texts occurring in the field, rather than just the specific ?quote text for that encounter. In my original comment M6, I was assuming a replication of the oa:FragmentSelector precisely to avoid this behavior, but that is a bad solution in its own right. I don't see a way to use only OA terms with the current modeling solution, but whether this matters depends on how "second layer of interpretation" is intended, and perhaps it is not a requirement/desideratum. I'm thus fine with the author's solution; I just want to point this out.
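
The over-matching described above can be reproduced on a toy graph. The following is a minimal sketch in plain Python rather than SPARQL; all identifiers (`enc1`, `field1`, `quote_a`, and so on) and the quoted strings are hypothetical and only mirror the shape of the modeling discussed, namely one fragment selector per field, refined by one text quote selector per encounter.

```python
# Hypothetical toy graph: one field selector shared by two encounters.
has_selector = {"enc1": "field1", "enc2": "field1"}   # oa:hasTarget / oa:hasSelector
refined_by = {"field1": ["quote_a", "quote_b"]}       # oa:refinedBy
exact = {"quote_a": "first quoted text",              # oa:exact
         "quote_b": "second quoted text"}
# Link between a quote selector and its encounter, expressed with
# crm:P106i_forms_part_of in the query above.
linked_encounter = {"quote_a": "enc1", "quote_b": "enc2"}

def quotes(use_crm_link):
    """Return (encounter, quote text) pairs, optionally using the CRM link."""
    pairs = []
    for enc, field in has_selector.items():
        for q in refined_by[field]:
            if use_crm_link and linked_encounter[q] != enc:
                continue  # keep only the quote selector tied to this encounter
            pairs.append((enc, exact[q]))
    return sorted(pairs)

oa_only = quotes(use_crm_link=False)      # OA terms alone
oa_plus_crm = quotes(use_crm_link=True)   # OA terms plus the CRM link
print(len(oa_only), len(oa_plus_crm))
```

With OA terms alone, every encounter on the field is paired with every quote in that field; only the additional CRM link narrows the result to one quote per encounter.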
