Modeling vs Encoding for the Semantic Web

Werner Kuhn
The Semantic Web emphasizes markup over modeling. It is being built on the premise that ontology engineers can say something useful about the semantics of terms by expressing themselves in a markup language. This assumption has never been systematically tested, and the shortage of documented practical applications of Semantic Web ontologies suggests it may be wrong. Rather than blaming OWL and its expressiveness (in whatever flavor) for this state of affairs, we should improve the modeling techniques with which OWL code is produced. I propose, therefore, to separate the concern of modeling from that of encoding semantics in a markup language. Modeling semantics is a design task, encoding it is an implementation. Semantic Web research should produce languages for both. Ontology modeling languages should support ontological distinctions and translation into any encoding language (RDF, OWL of some flavor, or something else).
Responsible editor: 
Krzysztof Janowicz

Review 1 by Thomas Lukasiewicz:

Contents: The paper argues that the Semantic Web community should adopt a more modest engineering view of semantics. It suggests that OWL is a powerful encoding language, but too weak as a modeling language. The paper then proposes to develop separate modeling languages to complement such encoding languages.

Evaluation: I quite like the ideas of the paper, in particular the observation that there are problems with current markup in the Semantic Web, which may be resolved by introducing separate modeling languages; I also like the suggestion that ontologies actually constrain the domain.

I have some doubts, though, that modeling languages will completely solve the current problems. Recently, I see these problems increasingly being solved by machine learning techniques for (often fully) automatic annotation of and/or extraction from Web content. How does this observation match (or interplay with) the modeling initiative described in the paper? Could you discuss/comment?

Minor comments: The paper is not always easy to read; would it be possible to increase its readability, perhaps by adding more examples?

The paper should also be formatted in the SWJ style. In particular, the text body should be written using only one font size (currently, some parts are written in significantly smaller fonts). Furthermore, the layout should be improved: explicit enumeration should be used wherever necessary, and the rest of the text should then be written as continuous prose.

Review 2 by Giancarlo Guizzardi:
This paper discusses a very important point and calls attention to an aspect of ontology engineering which has been neglected for far too long in the Semantic Web literature. Actually, the point in question is very similar to the point I make in "Theoretical Foundations and Engineering Tools for Building Ontologies as Reference Conceptual Models", which has been accepted in this very same inaugural issue of the journal. As a consequence, I believe this journal issue will benefit from having these two papers complementing each other and reinforcing each other's main argument.

Below, I include some comments that I believe can help to clarify the paper on some specific points.

Abstract: "Semantic Web research should produce languages for both" -> More generally, I would say that ontology research should produce languages for both.

In the same paragraph, "Ontology modeling languages should support ontological distinctions and translation into any encoding language (RDF, OWL of some flavor, or something else)." -> I would phrase this differently, since the sentence can be misread as stating that "ontology modeling languages should guarantee translatability to any encoding language". Since the latter idea has been suggested in the past (and since I don't believe that is what you mean), I think it would be worthwhile to make your point more explicit here.

Page 2: "At those times, both fields used a single paradigm for encoding and modeling: relational algebra for databases and logical devices for user interfaces." -> In the field of databases, in the 80's there was already an understanding (at least in the research environment) of the need to separate conceptual, logical, and physical design. Conceptual-level modeling languages, termed Semantic Data Modeling languages (e.g., Chen's ER diagrams, Abrial's Semantic), already existed. I take the opportunity here to cite a fragment of the aforementioned paper submitted to this same inaugural issue: "At this point, I would like to echo the historical report of Janis Bubenko regarding an analogous discussion taking place in the conceptual modeling community in the 70's between supporters of Conceptual Data Models (e.g. ER diagrams) and those of the Relational Data Model [3]. As summarized by Bubenko, '[t]oday the battle is settled: conceptual data models are generally used as high-level problem oriented descriptions. Relational models are seen as implementation oriented descriptions'".

Page 3: "As the saying 'words don't mean, people do' expresses, it is people who mean something when they use a term." -> The clarity of this point should be improved. With this phrasing, one could read it as taking a functionalist view of semantics (as opposed to a referentialist one), and I don't think that is what you intended.

Page 4: "While it is possible to 'ontoclean' diagrammatic modeling languages like UML [6], formal reasoning support requires different kinds of languages." -> Since the references are not numbered, I am assuming here that reference [6] is to (Guizzardi, 2005). If so, I truly appreciate the reference to this work, but I would put it differently. The proposal in [6] is much broader in scope than an "OntoClean" UML profile, if only because OntoClean deals only with taxonomic relationships, whilst the proposal in [6] addresses a much larger set of ontological concepts including, for instance, part-whole relations, different sorts of dependence, moments, formal and material relations, quality dimensions and domains, etc. The important point that should nonetheless be highlighted is that both proposals are concerned with truly "ontological level" distinctions, not merely with logical and epistemological ones.

Another point that needs clarification is whether by "formal reasoning" you mean "automated reasoning". If so, I don't think an ontology modeling language MUST support automated reasoning, since that is only one of the possible applications of ontologies. In contrast, what such languages MUST support is human problem-solving and, when convenient, they could be (possibly partially) mapped to (possibly multiple different) encoding languages that support automated reasoning. It seems to me (judging from the remainder of the text) that you agree with these points.

Page 5: "…but functional languages offer functions to model perdurants (and qualities)." -> Notice that in the usual interpretation, qualities are endurants. If functions are the representation of perdurants, then I assume you mean function in the programming sense (akin to an OO method). A function in this sense models a specification of a behavior. So it is unclear here whether you really mean a "behavior modeling a quality", or simply what is called an "attribute function" or a "mapping" that maps an endurant to a certain quale (only implicitly associated with a quality).
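
The two readings distinguished above can be illustrated with a minimal sketch. Python is used only for illustration, and every name in it (`Room`, `temperature_of`, `heat`) is a hypothetical example, not something taken from the paper: `temperature_of` is an attribute function that maps an endurant to a quale, whereas `heat` specifies a behavior, that is, a change of the endurant's state over time.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Room:
    """Hypothetical endurant: an entity that persists through time."""
    name: str
    temperature_c: float  # the quale currently taken by the temperature quality


def temperature_of(room: Room) -> float:
    """Attribute function: maps an endurant to a quale; nothing happens."""
    return room.temperature_c


def heat(room: Room, delta_c: float) -> Room:
    """Behavior: specifies a process as a transition between two states."""
    return replace(room, temperature_c=room.temperature_c + delta_c)


office = Room("office", 18.0)
warmer = heat(office, 3.0)  # same endurant, later state
print(temperature_of(office), temperature_of(warmer))
```

Under these assumptions, only `heat` could plausibly be said to model a perdurant; `temperature_of` merely observes a quality value at a given state.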



A refreshing read - we should probably be reminded more often that "choices" which have become mainstream in a research community are not necessarily the best, and that the achievement of bigger visions often necessitates rethinking. It is also good to be reminded that there is quite a bit of confusion in the community about what "semantics" is, and what "semantics" should be for the semantic web. There is far too little fundamental discussion about this.

That said, it should be noted that the alternatives put forth in the paper, which are claimed to be "closer" to achieving (some of) the semantic web goals, also need to be validated as to their usefulness. While I can follow the conceptual arguments, a significant amount of research is required to spell out these alternative ideas before they can be tested in the field. This doesn't mean it shouldn't be done. But it could perhaps be argued that it might be wise (e.g., considering the short attention span of Computer Science as a field, or of funding agencies) to first identify the added value which can be obtained with the methods currently proposed.

One aspect of the paper left me puzzled - I was wondering what role the author sees for "formal semantics" (in whatever sense) in his perspective. I'd like to read something about this.

Thanks to Giancarlo and Thomas for very helpful reviews and to Pascal for a great challenge (formal semantics). I have tried to address all points in the revised version. Now, at least the author knows better what he is talking about, and maybe this will help readers, too ;-).

Giancarlo's point (well taken, of course) that the database community had already separated modeling from encoding in the 70's makes the semantic web situation look even poorer!

I realize that my arguments for functional languages were (and remain) a bit obscure, due to space limitations. Their intention is only to point out opportunities, not to explain details.

Regarding "functionalist" vs "referentialist" views of semantics, I confess not to understand the difference well enough to take sides. I have tried to stay away from "isms" and to propose a pragmatic view of what ontologies do, and a less pragmatic one of how they could be improved.