Preventing Interoperability Problems Instead of Solving Them

Paper Title: 
Preventing Interoperability Problems Instead of Solving Them
Eero Hyvönen
A major source of interoperability problems on the Semantic Web is the different vocabularies used in metadata descriptions. This paper argues that instead of solving interoperability problems we should focus more effort on avoiding the problems in the first place, in the spirit of Albert Einstein's quote "Intellectuals solve problems, geniuses prevent them". For this purpose, coordinated collaborative development of open source vocabularies and centralized publication of them as public vocabulary services are proposed. Methods, guidelines, and tools to facilitate this have been developed on a national level in the Finnish FinnONTO initiative, and are now in pilot use with applications and promising first results.

Review 1 by Martin Raubal:
The paper tackles an important problem and suggests collaborative development of open source vocabularies. I'm just not too sure how feasible this approach is. On the other hand, this is also meant to be a vision paper, so that's perfectly fine. The whole approach reminds me of developing standards that everyone needs to follow, but as we all know, that is not easy. How will you ensure that everyone follows? The same holds for ontology development: it seems like everyone wants to develop his/her own spatial ontology (for example).

I would also like the authors to address the following problem: What happens if standards change? How can you take care of that, will you have to change the vocabulary accordingly? How to keep up-to-date on these issues? Also: What happens when new concepts are defined?

The whole approach fits perfectly with the idea of a data infrastructure; see all the recent efforts and problems in establishing spatial data infrastructures (SDI).

I really like the view that vocabulary work is as much a social process. It would be great if you could expand on that a bit; it is really important and often forgotten!

Review 2 by Giancarlo Guizzardi:

The main claim of the paper, namely, that "instead of solving interoperability problems we should make a big effort to avoid them" is certainly a very important research goal to be pursued. For this reason, I am happy to see a discussion addressing this topic in this inaugural issue. Exactly because of the importance of the topic, there are some issues in this particular presentation of the article that can be improved for clarification.

Firstly, I feel that the scope of the proposed solution should be better characterized in the presentation. The major source of interoperability problems is people committing to different conceptualizations that are not completely manifested via a representation (leading to the so-called "False Agreement Problem"). The point is that this problem can also appear in collective modeling if people falsely believe that they share the same conceptualization, which is represented by a shared model (artifact). In other words, if the representation mechanism used to build this shared model is not expressive enough to make explicit the differences between ontological commitments, one can easily run into the false agreement problem even if the model is collectively constructed.

I understand that the page limit is insufficient to allow for an in-depth discussion of the details of the proposal. However, in the current presentation, it is not easy to understand how the proposed solution differs from existing centralized solutions in practice. Moreover, as previously mentioned, there are many serious semantic interoperability problems that can hardly be solved using only lightweight ontologies and fully automatically derived mappings between them. In the end, the paper gives the impression that the discussed solution focuses more on Terminological Interoperability than on Semantic Interoperability.

On page 3, when discussing the Ten Commandments, the author writes [commandment 1] "Add machine semantics. Start transforming thesauri into machine interpretable (lightweight) ontologies in order to boost their usage on the Semantic Web." In my opinion, the importance given to this commandment is exaggerated. For the sake of interoperability, it is much more important to have fully expressive reference models (perhaps lacking tractable machine processable semantics) than to have shallow lightweight ontologies which are computationally interesting.
In the same paragraph, commandments 4 and 5 are prescribed: "4) Reuse the others' work. 5) Maintain interoperability with the past and other ontologies. Otherwise benefits of collaboration are lost." We have a situation now in which many existing domain ontologies lack both expressivity and truthfulness regarding the underlying domain. In contrast, many of these ontologies tend to be strongly biased by computational and/or application-specific concerns. Building an interoperability model that aims at aggregating the "maximum common denominator" of all envisaged models can cause all the unwanted biases present in these specific ontologies to be imported into the shared model. This bottom-up approach to reference model construction is very common in metamodeling, and it requires very meticulous human intervention with the proper methodological tools so that this problem can be properly circumvented (something which in a sense is the opposite of complete automation). For this reason, I would hesitate to prescribe these as rules to be generally applied (as the term commandment suggests).
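
To make concrete what a "lightweight" transformation in the sense of commandment 1 might look like, the following minimal sketch turns the broader-term (BT) links of a toy thesaurus into subclass-style triples and follows them transitively. The terms, the `BROADER` table, and the `subClassOf` predicate name are hypothetical illustrations and are not taken from the paper or from FinnONTO. Note that thesaurus BT links are not always genuine subclass relations, which is one reason why such a conversion adds only limited semantics.

```python
# Toy thesaurus: term -> its broader term (BT). All entries are hypothetical.
BROADER = {
    "lake": "body of water",
    "river": "body of water",
    "body of water": "geographic feature",
}


def to_triples(broader):
    """Emit one (narrower, predicate, broader) triple per BT link."""
    return [(n, "subClassOf", b) for n, b in sorted(broader.items())]


def ancestors(term, broader):
    """Follow BT links upward, stopping at a root or when a cycle is detected."""
    seen = []
    while term in broader and broader[term] not in seen:
        term = broader[term]
        seen.append(term)
    return seen


triples = to_triples(BROADER)
lake_ancestors = ancestors("lake", BROADER)
```

The only inference gained here is the transitive hierarchy; nothing in the triples states what distinguishes a lake from a river, which is the kind of expressivity a reference model would need to provide.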

Regarding your list: Why does 6) keep funding agencies happy? Should this be what drives research? Regarding 7): the question is how to integrate them!

Overall, this is a good paper. Maybe you could be a bit more generic in your conclusions and the overall vision.

There are several language problems that should be fixed:
p1: "link their own content"; "as suggested by the CIDOC CRM or FRBR..."; "on the web scale in the Linked Data..."??; "more semantic confusion"; "to study." missing space afterwards;
p2: "hearth" should be "heart"; "vocabularies can be aligned";
p3: "4) Reuse others'..."; "idea is to provide"; "systems through REST"; "raising from"; "is production use"??;
p4: "from tens of memory"??; "only over time";



Very interesting work. IMO, it goes well together with Christoph Schlieder's and probably also my own vision statement. The three statements focus on related topics, but each argues from a different perspective. I am especially interested in points (2) and (7) in the guidelines presented on page 3. Both are among the key reasons for semantic heterogeneity and, hence, for later interoperability issues.

Your main argument seems to be to focus more on the collaborative/social creation and sharing of ontologies in the first place. This will reduce the necessity to construct mappings or even translate between ontologies later on. While I agree with this argumentation, it leads to at least three interesting challenges:

(1) I like the idea of collaborative ontology development and of involving users and domain experts in this process as early as possible. However, from my own experience in creating ontologies and working with domain experts, it turns out that there is a huge gap between the conceptualizations of the user, the domain expert, and the ontology engineer - in fact, they do not speak a common language even if they use the same terms. For instance, the users of a system have a naive understanding of the terms used and of real-world processes compared to the domain experts. Classical examples are Public Participation GIS and the involvement of citizens in urban planning in general. Additionally, how can the users and experts understand and judge whether the final ontology implemented by a computer scientist reflects their initial conceptualizations? We addressed this problem by extending METHONTOLOGY with similarity rankings as a common language and tried to measure how much the conceptualizations of the domain experts differ from the results of the implemented ontology. While this turned out to be a useful starting point, the issue is much more complex and a very interesting field for future work. What was your experience in the projects so far?

(2) Conceptualizations evolve in space and time. What kind of ontology evolution methodologies should be used, and how should top-level ontologies like FinnONTO be versioned? Will this lead to interoperability problems in the long term and finally require ontology matching again? You write that ontologies can be aligned to FinnONTO during their creation. Is this always an alignment in terms of establishing GCI axioms, or do you also support semantic annotation?

(3) How can one ensure that the top-level ontologies are general enough not to exclude local definitions without being overly general and hence meaningless (in their task of restricting the interpretation of data)? What role does ontology modularization play in this context?
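
As an illustration of the measurement idea mentioned under (1), one simple way to quantify how far an expert's similarity ranking deviates from the ranking produced by an implemented ontology is a rank correlation. The sketch below uses Spearman's rho for tie-free rankings. It is only an assumed stand-in and not necessarily the measure used in the work referred to above; the concept names and both rankings are hypothetical.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items.

    Returns 1.0 for identical orderings and -1.0 for exactly reversed ones.
    """
    if len(rank_a) != len(rank_b) or set(rank_a) != set(rank_b):
        raise ValueError("both rankings must contain the same items")
    if len(set(rank_a)) != len(rank_a):
        raise ValueError("rankings must not contain duplicates")
    n = len(rank_a)
    if n < 2:
        raise ValueError("need at least two items to compare rankings")
    pos_b = {item: i for i, item in enumerate(rank_b)}
    # Sum of squared differences between each item's positions in the two rankings.
    d2 = sum((i - pos_b[item]) ** 2 for i, item in enumerate(rank_a))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical rankings of concepts by similarity to "river", most similar first.
expert_ranking = ["stream", "canal", "lake", "road"]
ontology_ranking = ["canal", "stream", "lake", "road"]

rho = spearman_rho(expert_ranking, ontology_ranking)
```

A value close to 1 would suggest that the implemented ontology orders the concepts much as the expert does; it says nothing about why the two rankings differ, which is where the harder questions raised in (1) begin.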