# Pattern-based design applied to cultural heritage knowledge graphs

### Tracking #: 2517-3731

Authors:
Valentina Anita Carriero
Aldo Gangemi
Maria Letizia Mancinelli
Andrea Giovanni Nuzzolese
Valentina Presutti
Chiara Veninata

Responsible editor:
Special Issue Cultural Heritage 2019

Submission type:
Full Paper
Abstract:
Ontology Design Patterns (ODPs) have become an established and recognised practice for guaranteeing good quality ontology engineering. There are several ODP repositories where ODPs are shared, as well as ontology design methodologies recommending their reuse. Performing rigorous testing is also recommended for supporting ontology maintenance and validating the resulting resource against its motivating requirements. Nevertheless, it is less than straightforward to find guidelines on how to apply such methodologies for developing domain-specific knowledge graphs. ArCo is the knowledge graph of Italian Cultural Heritage (CH) and has been developed by using eXtreme Design (XD), an ODP- and test-driven methodology. During its development, XD has been adapted to the needs of the CH domain, e.g. by gathering requirements from an open, diverse community of consumers; a new ODP has been defined, and many have been specialised to address specific CH requirements. This paper presents ArCo and describes how to apply XD to the development and validation of a CH knowledge graph, also detailing the (intellectual) process implemented for matching the encountered modelling problems to ODPs. Relevant contributions also include a novel web tool for supporting unit-testing of knowledge graphs, a rigorous evaluation of ArCo, and a discussion of methodological lessons learned during ArCo development.
Tags:
Reviewed

Decision/Status:
Minor Revision

Solicited Reviews:
Review #1
Anonymous submitted on 18/Jul/2020
Suggestion: Minor Revision

Review Comment:

I'm almost satisfied with the submitted version of the paper, which largely improves on the previous one. Specifically, the authors have solved the issues raised in the first review. I also appreciate the restructuring of the paper, the modifications, and the additions introduced. In my first review, I observed the following:

------------------------------------------------------------

My major concern regards the proposed definition of eXtreme Design (XD). The relationship between XD and XP is not clear at all. How is XD inspired by XP? What does XD inherit from XP? What does it not? In which points is XD different from XP? In which is it not? Why is XD necessary, and why is XP not enough in the context of ontology design? The authors should answer those questions in a dedicated section that clearly illustrates the relationships between XP and XD.

Moreover, there is no reference on how to apply XD to the design phase of an ontology, which represents the main difference between the development process of an ontology-based software and an entity-relation (ER) database-oriented software. In fact, the authors limit themselves to presenting the ontology patterns adopted in the implementation of ArCo without mentioning how to identify them: applying well-known patterns is a standard practice in software engineering and does not depend on the paradigm adopted. Without such clarification the reader has the feeling that XD is simply an à la "Spiral" approach (as seems to be confirmed in Section 4.6) and that the "test driven" approach involves only the implementation (coding) of the ontology and not its design process.

------------------------------------------------------------

In Section 3, the authors clarify most of the aspects of my observation.
However, I am still not convinced about the justification of XD, since it appears to be Extreme Programming applied to the ontology design process. Moreover, the authors assert (page 5, lines 21-30): "Where XP diminishes the value of careful design, this is exactly where XD has its main focus." Extreme Programming does not diminish the value of careful design; quite the opposite: it is intended to produce high-quality software and to reduce its cost through short development cycles. An accurate design phase is a mandatory step of XP (see, for instance, Extreme Programming Installed, Jeffries, Anderson, Hendrickson). In my opinion, the authors may use the term XP in place of XD, describing how it has been applied to ArCo. The paper may be accepted for publication as it is, but I suggest that the authors consider my last concern.
Review #2
Anonymous submitted on 28/Jul/2020
Suggestion: Accept

Review Comment:

Globally, the paper has been greatly improved and the authors addressed most of my criticisms, thanks for that. Still, I think that the paper better fits the category "Descriptions of ontologies: short papers describing ontology modeling and creation efforts. The descriptions should be brief and pointed, indicating the design principles, methodologies applied at creation, comparison with other ontologies on the same topic, and pointers to existing applications or use-case experiment", because the great majority of results and modeling/methodological choices have already been analyzed in published works. However, this is not a short paper, and by shortening it the reader would not understand the main contributions of the ArCo project. In my view, this complete and exhaustive analysis merits publication even though it does not contain substantial new results, but I leave the final word to the editors.

The paper would benefit from at least a further effort to clarify some remaining critical and unclear points.

(1) The semantics adopted for the conceptual graphs needs to be clarified and made explicit. For instance, it is not clear how a relation between two classes translates into OWL axioms. Suppose A --(REL)--> B. Does this simply translate into $\exists REL.\top \sqsubseteq A$ and $\top \sqsubseteq \forall REL.B$ (in DL) or $REL(x,y) \to A(x) \land B(y)$ (in FOL), or is there something more/different? A note on that would help the reader.

(2) Even though the authors added some clarifications about the notion of situation, it would still be very helpful to have an explicit comparison with the W3C ontology patterns for representing n-ary relations in RDF and OWL, see https://www.w3.org/TR/swbp-n-aryRelations/.

(3) In Section 4.3 the authors try to explain their idea of introducing some redundancies. More specifically, some binary relations that are in principle detectable via complex queries are also directly introduced in the vocabulary of the theory. First, I would appreciate a clarification of the reason: is it a matter of computational efficiency? Second, how do the authors ensure that the information represented by the two redundant mechanisms is not discrepant?

(p.3) "By formalising the semantics of cultural properties" - ??

(p.3) "a formal evaluation of ArCo ontologies based both ..." - based ON both ...

(p.3) "Section 4 provide" - provideS

(p.3) "Section 8 discusses relevant related work and Section 7 summarises the lessons learned from the experience of developing ArCo..." - Why do you talk about Section 8 before talking about Section 7? One could also avoid Section 8 and introduce Section 8.1 at the end of Section 5 and Section 8.2 at the end of Section 3 (or 6). In addition, Sections 5.4 and 5.5 concern methodological, more than conceptual, aspects. Maybe they can be moved.

(p.10) "Requirements coming from user stories, as well those extracted from ICCD standards, are translated into Competency Questions. All CQs, and related SPARQL queries, that so far guided ArCo KG design and testing are available online." - I'm wondering what happens when a substantial change in the ontology (maybe also concerning quite high-level concepts) is necessary. In addition to taking into account how to migrate the data, one also needs to consider the new translations of CQs (and maybe also the CQs themselves). The authors do not discuss this aspect.

(p.11) "Table 1 lists some representative CQs for each module" - It sounds strange to me that no CQ involves more than one module.

(p.13) "and to (ii) D&S [11, 34] distinctions for second-order entities" - I'm not sure I understand what "second-order entities" means here.

(p.13) "For example, the Uffizi in Florence can be categorised as a Building (physical object), a Museum (a social object), and a relative Location (a spatial region), with experts understanding the Uffizi as a complex entity, whose heterogeneous features are not supposed to be analysed into three different categories, since they emerge out of co-predication [35]." - I don't see in which sense the DUL generalization is better than or different from a logical disjunction (of the three categories of Building, Museum, and Location). If the generalization corresponds to a disjunction, then by categorizing the Uffizi under this generalization the inferential power of the theory is quite low. If I'm not wrong, DOLCE adopts a multiplicativist approach, i.e., in the example, it would probably assume the existence of three different, but related, entities (a physical object, a social object, and a spatial region). This would increase the information present in the theory and its inferential power.

(p.15) "ArCo situations do not commit to the distinction between objects and events as applied in DOLCE (and CIDOC CRM), since a situation is defined as the occurrence of a description as observed, diagnosed, aggregated, invented, etc." - First, the term "occurrence" is often considered a synonym of "event" or "perdurant". Second, as far as I understand, DOLCE does not commit to a realist view on perdurants, i.e., DOLCE-perdurants may depend on cognitive processes such as observation, diagnosis, invention, etc.

(p.15) "The constructive stance prioritises the dependence of an event-like entity on its framing, so that a requirement for ontology design can be directly matched by a frame." - A clarification on that would be interesting and useful.

(p.15) "having physical size, constitution, unique qualities, or authorship are factual situations for a cultural property c, while establishing constitution via Carbon-14 or attributing authorship for c are interpretation situations." - Physical size, constitution, etc. are measured by means of physical instruments with given tolerances and resolutions and are prone to possible misuse (or maybe are just the reports of experts). It seems to me that if there is a difference with respect to Carbon-14 techniques, this difference is mainly qualitative; maybe Carbon-14 measurements are less reliable.

(p.17) "As it describes a real-word object, it can be defined as an information object" - First, "real-world"? Second, I don't see the implication.

(p.21) "A cultural property can be involved in many different situations during its life: it can be commissioned, bought or obtained, used " - It is hard for me to accept that a commissioned cultural property exists also before being realized (or maybe without ever being realized).

(p.22) Fig.13. In the RDFS label of data:NumismaticProperty/1400019640 the authors included the attributed authors / date, etc. Are these labels dynamically produced on the basis of the information in the KB?

(p.39) "a knowledge graph of Italian Cultural Heritage (CH)" - The acronym CH has already been introduced. Use ICH or just Italian CH.
Review #3
Anonymous submitted on 29/Sep/2020