Review Comment:
This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing. Please also assess the data file provided by the authors under “Long-term stable URL for resources”. In particular, assess (A) whether the data file is well organized and in particular contains a README file which makes it easy for you to assess the data, (B) whether the provided resources appear to be complete for replication of experiments, and if not, why, (C) whether the chosen repository, if it is not GitHub, Figshare or Zenodo, is appropriate for long-term repository discoverability, and (D) whether the provided data artifacts are complete. Please refer to the reviewer instructions and the FAQ for further information.
The paper gives a general introduction to knowledge engineering, focusing on ontologies, knowledge graphs, some knowledge engineering methodologies, and finally LLMs and their application in knowledge engineering. Considering the call for papers for the SWJ special issue on “the Pedagogy and Praxis of Knowledge Graphs and the Semantic Web”, the paper does not really fall into any of the categories indicated in the special issue. It is neither (1) a formal evaluation of educational material, nor (2) the description of Open Source KG/SW educational material, nor (3) a report on or application of OS KG/SW Educational Material & Tools, nor (4) a survey. It is a lightweight introduction to the subject of knowledge engineering targeted towards engineers or software developers. As it doesn’t fall into any of the categories mentioned above, the paper has to be rejected for formal reasons.
Treating the paper as an introduction to knowledge engineering, please take into account the following comments:
p1, line31: the definition of declarative knowledge requires a bibliographical reference
p1, line39: the definition of a knowledge base considers neither axioms nor (logical) constraints.
p1, line47-49: besides expressivity, also consider granularity, level of detail and (computational) complexity.
p2, line1-3: the source you refer to (Blumauer and Nagy) itself refers to Lassila and McGuinness: The Role of Frame-Based Representation on the Semantic Web, 2001 [1], for the classification of knowledge organisation systems.
p2, line34-43: Natural language is also a knowledge representation! Please also distinguish “formal” knowledge representation systems in your enumeration.
p2, line49: You introduce the term “embedding” here, which requires an explanation.
p3, line 10-12 and 16: You are using the terms “knowledge”, “information”, and “data” without a proper definition. Data and information should be clearly distinguished from knowledge and vice versa. Consider making use of the DIKW pyramid [2].
p3, line 46-5: bibliographical references for logics, semantics, meaning, interpretation, and truth are missing. Alternatively, you might give proper definitions.
p4, line 1-20: It is unclear why you are introducing the formal notation of entailment and interpretation, while other important terms (e.g. “soundness”, “completeness”, “derivation”) are only explained in natural language. However, as the formalism is not required later in the paper, I suggest skipping it.
p5, line 1: semantic networks require a bibliographical reference.
p5, line 17-21: the graphical notation (diamond) is unclear and not explained.
p5, line 30: “qnames” require further explanation and a bibliographical reference.
p5, line 36: Why are you referring to “resources” and not “entities”?
p5, line 37-39: The important use case of using blank nodes for existential assumptions is not mentioned.
p5, line 42: Datatyped literals are not mentioned
p6, line 1-15: I would not refer to the example as “reification” but as a representation of n-ary relations. Otherwise this will be confused with RDF reification.
p6, line 37-42: If the example RDF is supposed to be RDF Turtle serialization, the period at the end of each statement is missing.
p6, line 43: The meaning of the filled box is unclear.
p6, line 47: A bibliographical reference for “frames” is missing.
p7, line 9-22: Please provide a formal explanation of the frame-based inference mechanism
p8, line 16: Please provide a bibliographical reference for Protégé.
p8, line 30: Bibliographical reference for Aristotle’s “Metaphysics” is missing.
p8, line 40: “Clarity” should refer to “Explicit”.
p9, line 1-5: Bibliographical references for SNOMED, GO, and CHEBI are missing.
p9, line 32-51: RDF classes and properties haven’t been defined (City rdf:type rdfs:Class.)
p10, line 18-19: OWL ontologies can refer to different instances of description logics, depending on the OWL version/flavor used (OWL2, OWL2 DL, OWL2 EL, OWL2 QL, OWL2 RL, OWL2 Full)
p10, line 22: Besides classes and relationships, the T-Box can also contain axioms.
p11, line 38-51: symmetric properties, antisymmetric properties, reflexive properties are missing
p12, line 18: There is no “owl:subClassOf”!
p14, line 51: The formal definition of a knowledge graph doesn’t include the possibility of attributive triples.
p15, line 28-31: the term “crowdsourcing” doesn’t need this explicit explanation.
p16, line 1-15: Your definition of “property graph” is wrong. A property graph is a data model of various graph-oriented databases, where pairs of entities are associated by directed relationships, and entities and relationships can have properties [3].
p16, line 12: Give examples of open knowledge graphs.
p16, line 13-17: Why are knowledge graphs well suited for the applications mentioned? Please provide a rationale or justification.
p16, line 44-49: Knowledge graphs do not necessarily contain only nodes (entities) that refer to named entities. Named entity recognition also provides classes for common entities (e.g. under the class “misc”), which might be referred to in a knowledge graph.
p17, line 17/23: You probably mean “DBpedia” instead of “Wikipedia” here.
p17, line 25: “Chicago” might refer to many more entities, e.g. Chicago the band.
p17, line 28-46: A table would provide a better overview.
p17, line 44-45: Entity classification is a special case of link prediction. However, it can also be treated as a (traditional) (multi-class) classification problem.
p18, line 22: Knowledge graph embeddings are typically created using unsupervised learning techniques. The primary goal of these embeddings is to represent entities and relationships in a continuous vector space while preserving the structural and semantic information of the knowledge graph. This is achieved by optimizing objective functions that capture the relationships between entities, such as translational models…
p18, line 22: “composition, inverse, and antisymmetry” require explanations and dedicated examples.
p19, line 1-14: Figure 8 is not explained in sufficient detail in the text.
p20, line 1-2: It is not explained how shape-based constraints can be tested automatically.
p20, line 21-26: The example for “opaque URIs”, “geo:Locality1”, is not well chosen, as the issue of multilinguality mentioned in line 22 does not hold for this English language-based example. Better use Q-Identifiers from Wikidata as an example.
p20, line 30-51: Too little information is given on ontology engineering methodologies. Not a single methodology is referenced or mentioned.
p20, line 45-51: Provide a figure illustrating ODPs and ontologies derived from ODPs for better understanding.
p21, line 1-20: It remains unclear to what extent and how exactly FAIR principles can be applied to knowledge representation artefacts.
p21, line 27-28: “Foundation Models” and LLMs are not synonymous [4]. Bibliographical references are missing.
p21, line 24-44: Discussion of LLMs as knowledge representations is missing.
p22, line 1-21: How exactly can knowledge graphs be constructed with the help of LLMs?
p22, line 30: The term “SAT-style analogies” needs to be explained (and referenced).
p22, line 47-51: It remains unclear why LLMs in particular should be well suited for the representation of common sense knowledge. Common sense knowledge has often never been recorded in text that is available for the training of LLMs. This needs to be explained.
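To make the comment on p16, line 1-15 more concrete, the property graph model described there can be sketched roughly as follows. This is only a minimal illustration in Python and not the formalization given in [3]; all class names, field names, and the example data are my own assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """An entity with arbitrary key-value properties (hypothetical class)."""
    node_id: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed edge between two nodes that may carry its own properties."""
    source: str    # node_id of the start node
    target: str    # node_id of the end node
    rel_type: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class PropertyGraph:
    nodes: Dict[str, Node] = field(default_factory=dict)
    relationships: List[Relationship] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_relationship(self, rel: Relationship) -> None:
        # Both endpoints must already exist in the graph.
        if rel.source not in self.nodes or rel.target not in self.nodes:
            raise KeyError("both endpoints must be added before the relationship")
        self.relationships.append(rel)

    def outgoing(self, node_id: str) -> List[Relationship]:
        # Direction matters: only edges starting at node_id are returned.
        return [r for r in self.relationships if r.source == node_id]


def build_example() -> PropertyGraph:
    """Build a two-node graph with one directed relationship (illustrative data)."""
    g = PropertyGraph()
    g.add_node(Node("n1", {"kind": "Person", "name": "Alice"}))
    g.add_node(Node("n2", {"kind": "City", "name": "Chicago"}))
    # The relationship itself has a property, which a plain RDF triple cannot carry directly.
    g.add_relationship(Relationship("n1", "n2", "LIVES_IN", {"since": 2015}))
    return g
```

The point of the sketch is that properties are attached to both entities and relationships, and that relationships are directed; a corrected definition in the paper should make these two aspects explicit.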
References:
[1] Lassila, O. and McGuinness, D.L. 2001. The Role of Frame-Based Representation on the Semantic Web. Technical Report #KSL-01-02. Stanford University.
[2] Ackoff, R.L. 1989. From Data to Wisdom. Journal of Applied Systems Analysis. 16, (1989), 3–9.
[3] Angles, R. 2012. A Comparison of Current Graph Database Models. 2012 IEEE 28th International Conference on Data Engineering Workshops, Arlington, VA, USA, 171–177.
[4] Bommasani, R., Hudson, D.A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M.S., Bohg, J., Bosselut, A., Brunskill, E., et al. 2021. On the Opportunities and Risks of Foundation Models. arXiv preprint arXiv:2108.07258.