Review Comment:
I find that the manuscript has significantly improved since the first submission, and I'd like to acknowledge and praise the substantial effort the author put into this revision round.
All my previous comments -- presentation-wise, content-wise, and w.r.t. linked ontologies -- have been adequately addressed by the author, by adapting the manuscript and the resources and/or by clarifying the reviewer's doubts in the author's response.
Presentation-wise, the revised text is now clearer and easier to follow, and I find that the additional diagrams and, especially, the running example in Turtle that accompanies the various sections of the text are both effective in conveying the author's modeling ideas.
Content-wise, I thank the author for having considered and addressed the points I raised. About my previous comment C3 (caapt:TermEntry subclass of culco:Contentious), I had not noticed the use of skos:exactMatch with culco:ContentiousIssue in CULCO, which makes it a skos:Concept: given this, I stand corrected and I find the author's solution reasonable and in line with the CULCO design (I remain a bit confused about conflating a concept with an information object, but this now concerns CULCO and not the author's contribution). About previous comment M7 (need for frac:observedIn and rdf:value), I now understand that this is a result of FrAC having been updated after the first manuscript submission: I find the author's response (avoid changes while FrAC is still evolving) reasonable, and I appreciate the addition to the discussion section of a paragraph about the (risks of) reuse of evolving ontologies, with specific reference to FrAC. This can help clarify the situation for readers looking into the paper and the proposed ontology modules. About previous comments C2 and M6, I respectfully maintain my view, but I also accept the author's position and reply, given that the proposed modeling decisions do not introduce logical inconsistencies.
Overall, I think that this is a solid contribution and that my previous concerns have all been addressed. I'm happy with the revised manuscript, and I have no further concerns that would call for additional review rounds. I thus recommend acceptance, leaving it to the author to decide whether/how to optionally integrate the remaining feedback in this review when finalizing the manuscript.
What follows is a list of typos and other presentation-related suggestions, as well as some clarifications on C2 and M6, which are provided here in the hope that they can be of help to the author in continuing this line of work, but which do not represent criticisms that need addressing.
### PRESENTATION-RELATED SUGGESTIONS ###
My main (and only non-trivial) suggestion for the author is to move paragraph "Terms and entries" (incl. Figure 5 and illustrative example) before paragraph "Guidance documents" (incl. Figure 4 and illustrative example). Right now, the presentation of CAAPT-O, which starts with "Guidance documents", already touches terms and entries, so the following "Terms and entries" paragraph appears a bit redundant and/or in the wrong order to me.
Then, here is a list of typos and other spot issues/suggestions, all of which can be handled straightforwardly:
* (T1) [page 2] "... RDF syntax style Turtle ..." -> remove "style"
* (T2) [page 3] "... as well as a language used ..." -> remove "a"
* (T3) [page 5] "... offers the opportunity to consider that ..." -> what about a much simpler "... suggests that ..."?
* (T4) [table 3] Suggestion: what about also adding row(s) for FrAC and OA, which are also mentioned in the text? The author may also cite Lexinfo and Lex-O in the row for OntoLex.
* (T5) [page 7] "... several usage typologies, including that encoded in LexInfo" -> "those" instead of "that"
* (T6) [page 10] "... contain term entries ..." -> suggest adding "(caapt:TermEntry)" to explicitly link to content of Figure 4; same for "contain suggestions"
* (T7) [page 11] "The entry of a term is differentiated from the concept of the term following OntoLex's conceptual modeling ..." -> this works only if I interpret entry = ontolex:LexicalEntry, instead of caapt:TermEntry: suggest clearly referring to ontology concepts (ontolex:LexicalEntry vs. caapt:TermEntry) to disambiguate
* (T8) [figure 4] suggest adding a box for caapt:Suggestion, since it is mentioned in the main text associated with the figure
* (T9) [page 12] in the listing for "Terms and entries", I would avoid listing all usage contexts :indian_ucN, 1 <= N <= 10, also because usage contexts have not been introduced yet at this point of the text. I would collapse them into something like ':indian_uc1 ... :indian_uc10' and add a remark that they are covered later.
* (T10) [page 14] "In this entry Indian does not refer to Indian as used to describe people from India/South Asia. In this context Indian is correct" -> please check: there is one entry for the whole listing (:indian_entry), so "In this entry" is irrelevant; also, my understanding is that Indian is not problematic when referring to people from India/South Asia, with problems arising only in the other cases.
* (T11) [figure 7] I think crm:P107i should go from crm:E39_Actor to crm:E74_Group, which would be consistent also with its usage in the illustrative example.
* (T12) [figure 8] rdfs:Resource is the top class in RDFS and thus it is the super-class of every other class, with no need to represent it explicitly. Therefore, the author may consider dropping the subclass relations pointing to rdfs:Resource to simplify the figure. Also, the author may consider using a notation like the one for crm:P69 in Figure 6 to denote that the three properties are sub-properties of ontolex:usage. Having ontolex:usage pointing to rdfs:Resource is correct, but I found it misleading when interpreting the graph (I mistakenly related ontolex:usage to ontolex:reference).
* (T13) [page 16] "indicate[]" -> why "[]"?
* (T14) [page 16] "Using OntoLex classes in the domain position ..." -> I think it should be "range position" here
* (T15) [page 17] "... concepts CIDOC CRM, OntoLex, and OA ..." -> "concepts of"
* (T16) [page 18] "... in within a record ..." -> remove one of "in" or "within"
* (T17) [page 18] "Records are connected to instances of encounter through three levels: ... component of the record" -> suggest directly pointing to boxes in Figure 10, or alternatively mentioning which CIDOC CRM relations have to be followed. Right now, it is difficult to follow the text and relate it to the figure.
* (T18) [page 19] "crm:P148i_is_component_of :ex_i_sr" -> should be ":ex_i_record"
* (T19) [page 19] "oa:hasTarget :ex_i_record" -> should be ":ex_i_sr"
* (T20) [figure 9, 11] crm:P82 has a generic literal as range, and not necessarily an xsd:date, which is consistent with its use in some listings where its value is a string like "starting 16th century"@en
* (T21) [figure 9, 11, listing of pages 20-21] crm:P3_has_note is used in the listing but missing in the figures; also note that for some activities, rdfs:comment is used with similar purpose in the listing: is it rdfs:comment or crm:P3_has_note?
* (T22) [page 21] "... exact match requirements applied only to the entries and not the terms they discussed." -> please check: if I interpret entry = caapt:TermEntry and term = caapt:TermRoot, then my intuition would be that exact match should apply to the latter based on the respective lexical forms
* (T23) [page 22, figure 12] "There are 34 cases ..." -> in Figure 12, there are 19 + 1 + 13 = 33 cases (pointed out in the previous submission as well: why compute 34 = 19+1+13+1, as written in the response? where does the additional 1 come from?)
* (T24) [page 22, figure 12] "There are also 52 cases where an entry is listed ..." -> a term? 52 = 32 + 19 + 1 in the right side of Figure 12, which refers to terms
* (T25) [page 22] "... which permiits multiple terms to be connected to the same entry." -> it would be nice to provide an example of entry with two or more terms (e.g., inline in the text, in parenthesis)
* (T26) [page 22] "... 65 terms which have been classified by multiple source documents ..." -> is 65 referring to 32 + 19 + 1 + 13 = 65 of Figure 12 and related text? If so, this sentence would be redundant
* (T27) [page 23] "... thus is capable ..." -> suggest "is thus capable"
* (T28) [table 6] In the query for CQ 37, suggest dropping OPTIONAL (just the keyword, keep the triple patterns within it!) and the following FILTER(BOUND(...)) constraint, since the combination of the two has the effect of requiring the patterns in the OPTIONAL clause not to be optional at all (this was already pointed out in the previous submission)
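To illustrate T28, here is a minimal sketch of the simplification. It is not the actual CQ 37 query: the `ex:` namespace, `ex:Entry`, and `ex:note` are hypothetical placeholders that I introduce only for illustration. The original shape is kept as a comment, and the active pattern shows the equivalent query once the OPTIONAL keyword and the FILTER(BOUND(...)) constraint are dropped.

```sparql
PREFIX ex: <http://example.org/>   # hypothetical namespace, not taken from the manuscript

SELECT ?entry ?note WHERE {
  ?entry a ex:Entry .
  # Before: OPTIONAL { ?entry ex:note ?note . } FILTER(BOUND(?note))
  # The FILTER discards every solution in which the OPTIONAL part did not match,
  # so the pattern was effectively required all along.
  # After: the same triple pattern, written as a plain required pattern.
  ?entry ex:note ?note .
}
```

The equivalence relies on ?note being bound only inside the OPTIONAL block; if the variable were also bound elsewhere in the query, the two forms could differ.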
### CONTENT-RELATED CLARIFICATIONS ###
While I'm fine with the author's position, I'd like to clarify here why I maintain my view regarding comments C2 and M6.
About comment C2, having caapt:TermEntry as a subclass of ontolex:LexicalConcept and then linking a caapt:TermEntry instance to all term senses (i.e., instances of caapt-uc:UseContext / ontolex:LexicalConcept) has the effect of treating that caapt:TermEntry as the lexical concept resulting from the 'union' of all these senses. For instance, :indian_entry (a caapt:TermEntry) will be the lexical concept comprising the union of all possible meanings of 'indian', i.e., "people of India OR native people of America / Canada ...". Yes, this is not forbidden by OntoLex, and there is no limit to the number of senses associated with a lexical concept. However, my understanding of ontolex:LexicalConcept is that its primary usage scenario is to have the associated senses refer to **different synonymous terms** (ontolex:LexicalEntry), to capture what in WordNet would be a **synset**. Continuing the example, a term like 'indian' would have multiple synsets (each being an ontolex:LexicalConcept), with one of these synsets standing for Native Americans and being associated with specific senses of additional terms such as 'indigenous', 'Native Americans', 'First Nations', etc. (unfortunately WordNet is no longer browsable online, otherwise I would have linked some examples). Summing up, the author's solution is structurally compliant with OntoLex, but somewhat counter-intuitive to people coming from a WordNet background.
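To make the WordNet-style reading concrete, here is a sketch of a query that lists, for each lexical concept, the lexical entries whose senses lexicalize it. It uses only the standard OntoLex properties ontolex:isLexicalizedSenseOf and ontolex:isSenseOf; whether the proposed modules link senses and concepts through exactly these properties is an assumption on my side.

```sparql
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>

# For each lexical concept, collect the lexical entries that have a sense
# lexicalizing that concept (i.e., the members of the "synset").
SELECT ?concept (GROUP_CONCAT(DISTINCT STR(?entry); SEPARATOR=", ") AS ?synonyms)
WHERE {
  ?sense ontolex:isLexicalizedSenseOf ?concept ;
         ontolex:isSenseOf ?entry .
}
GROUP BY ?concept
```

Under the synset reading, a concept standing for Native Americans would return several synonymous entries. Under the author's modeling, I would expect a caapt:TermEntry concept to group many senses of a single term instead.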
About comment M6 (use of OA selectors), this comment stems from the manuscript citing a "second layer of interpretation" centered on OA (page 18), which I interpret as the goal of providing an encounter/text quote/field/record structure that can be interpreted using **exclusively** either CIDOC CRM terms (and I'm fine with the respective solution) or OA terms. If that was the intent, it is not achieved, because it is not possible to use OA terms alone. To exemplify, here is a query to extract encounter instances and their quoted text:
SELECT ?encounter ?quote {
  ?encounter oa:hasTarget [ a oa:SpecificResource ; oa:hasSource ?record ; oa:hasSelector ?field ] .
  ?field a oa:FragmentSelector ; oa:refinedBy [ a oa:TextQuoteSelector ; oa:exact ?quote ; ^crm:P106i_forms_part_of ?encounter ] .
}
The query uses exclusively terms from OA, except for crm:P106i_forms_part_of. If I remove the triple pattern for crm:P106i_forms_part_of, and if the KG contains exactly one oa:FragmentSelector instance for each field, as stated by the author (which I agree with), and links it to multiple oa:TextQuoteSelector(s) via oa:refinedBy, then the query would match an ?encounter to **all** ?quote texts occurring in the field, rather than just the specific ?quote text for that encounter. In my original comment M6, I was assuming a replication of the oa:FragmentSelector precisely to avoid this behavior, but that is a bad solution in its own right. I don't see a way to use only OA terms with the current modeling solution, but whether this matters depends on how "second layer of interpretation" is to be understood, and perhaps it is not a requirement/desideratum. I'm thus fine with the author's solution, and just want to point this out.
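For reference, this is a sketch of the OA-only variant discussed above, i.e., the same query with the crm:P106i_forms_part_of pattern removed. The oa: prefix is the standard Web Annotation namespace; everything else is taken from the query above.

```sparql
PREFIX oa: <http://www.w3.org/ns/oa#>

SELECT ?encounter ?quote {
  ?encounter oa:hasTarget [ a oa:SpecificResource ; oa:hasSource ?record ; oa:hasSelector ?field ] .
  # Without the CRM link, nothing ties a given oa:TextQuoteSelector to ?encounter:
  # every quote refining the shared ?field selector is returned for every
  # encounter that targets that field.
  ?field a oa:FragmentSelector ; oa:refinedBy [ a oa:TextQuoteSelector ; oa:exact ?quote ] .
}
```

With a field shared by two encounters and refined by two quotes, I would expect this variant to return four rows instead of two.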