Are we better off with just one ontology on the Web?

Tracking #: 2253-3466

Authors: 
Armin Haller
Axel Polleres

Responsible editor: 
Guest Editor 10-years SWJ

Submission type: 
Other
Abstract: 
Ontologies have been used on the Web to enable semantic interoperability between parties that publish information independently of each other. They have also played an important role in the emergence of Linked Data. However, many ontologies on the Web do not see much use beyond their initial deployment and purpose in one dataset and therefore should rather be called what they are – (local) schemas, which per se do not provide any interoperable semantics. Only a few ontologies are truly used as a shared conceptualisation between different parties, mostly in controlled environments such as the BioPortal. In this paper, we discuss open challenges relating to true re-use of ontologies on the Web and raise the question: "Are we better off with just one ontology on the Web?"
Tags: 
Reviewed

Decision/Status: 
Minor Revision

Solicited Reviews:
Review #1
By Oscar Corcho submitted on 26/Jul/2019
Suggestion:
Minor Revision
Review Comment:

This is a nice review paper that considers some of the key aspects that need to be taken into account to understand how ontology engineering has evolved in the last three decades, and raises questions and challenges on how it could be made more accessible to a larger community (the Web community in general).

I like the set of categories that are being proposed by the authors in order to make an analysis. I may think of others, of course, but I consider that this group of categories is sufficiently representative of aspects that I normally have to consider when reusing ontologies for some purpose (Linked Data generation, ontology creation, etc.). Probably I am only missing a stronger view on the curation processes, even though this is partially mentioned when referring to initiatives like schema.org or Wikidata.

I am also missing, when the authors talk about repositories and communities, a broader view of current domain ontology repositories that are going in the same direction as BioPortal. Examples are the ETSI communities for the development of SAREF, the work around FiBO for the financial industry, the SPAR Ontology network for publishing, and my own work with open data for cities (github.com/opencitydata). I think that such efforts also demonstrate that there is a new group of activities/communities organised around some domain, which could be a good alternative to the strongly-led-by-one-organisation focus of schema.org or the bottom-up approaches of DBpedia or Wikidata.

As for the analysis of Wikidata and schema.org, I am not fully convinced by how they have been approached: schema.org is decently presented and its process well understood, but the description of Wikidata's approach, its comparison with the DBpedia ontology (why not YAGO or other similar ones), etc., is not sufficiently well presented. I think that this would be a clear place to make changes in the paper if the authors have time, to provide a more comprehensive description of this approach and also to support some of the claims that this will probably be one of those one-for-all approaches that will succeed in the end.

So in summary:
- Nice analysis of typical aspects to consider when building and maintaining ontologies nowadays. The role of ontology curator communities may be further stressed if considered relevant by the authors.
- Good diagnosis of schema.org, but not so clear for Wikidata and comparison with other initiatives.
- I would recommend also adding some descriptions of and comparisons with domain-focused catalogues (BioPortal and the others mentioned above, whichever ones the authors decide to use) as another way in which several interlinked ontologies are put together.

There are a few minor typos in the text, mostly grammar based.

Review #2
Anonymous submitted on 03/Aug/2019
Suggestion:
Major Revision
Review Comment:

The paper concerns the question of why ontologies are not shared globally on the Web as much as many envisioned in the early 2000s.

First, reasons for ontology heterogeneity are listed and an abstraction/reusability hierarchy of ontology types is presented.

The causes of heterogeneity listed in [14] seem to concern only concept hierarchies and miss relational heterogeneity. Many "ontologies" are actually metadata models, e.g. CIDOC CRM, while others are more like concept hierarchies. This issue should be clarified, as it may have an effect on the notion of upper ontologies etc. in Fig. 1, which are different in nature in different cases.

After this the authors categorize and discuss different challenge types related to re-using ontologies (Section 2). Nothing particularly new is presented here, but this section provides a useful recap of the issues with many references.

In the next section, the issues are discussed using schema.org as a case study. Wikidata is also considered as another widely used global ontology.

In conclusion, the authors argue that the winner-takes-it-all principle will finally apply also to ontologies and that we are better off with one sustainable ontology on the Web, such as schema.org or Wikidata. This argument remains only an opinion, and I would have liked to see at least some scientific facts supporting it, e.g., statistics on how the usage of global ontologies, such as schema.org and Wikidata, has been developing. Also, there seems to be a trend that, instead of one cross-domain universal global ontology, global domain-area ontologies, such as geo-ontologies (e.g. Geonames), actor ontologies (e.g. VIAF, TGN), medical ontologies (e.g. SnomedCT), gene ontologies etc., are emerging, and this could be discussed in the paper. A possible future is a set of linked ontologies in the Linked Open Data (or Ontology) Cloud, not just one global ontology.

Even if this paper is a vision paper in a special category, I think that some deepening of the discussion, perhaps explicating several possible futures, is still needed in finalizing the paper.

Minor comments

vs a 24 -> vs. a 24

Use camel-back notation in the title and headings
Linked Open Data cloud -> Linked Open Data Cloud
i.e. -> i.e.,
e.g. -> e.g.,
versioning Klein -> versioning, Klein
that tests -> that test

Review #3
By Boyan Brodaric submitted on 03/Sep/2019
Suggestion:
Minor Revision
Review Comment:

This well-written paper considers the problem of low ontology re-use on the web and suggests that convergence to a single ontology would increase re-use. It then evaluates two candidate ontologies against specific evaluation criteria and concludes the Wikidata ontology holds more promise for overall re-use than the schema.org ontology, primarily because of the limited scope of the latter. This is a timely discussion, given the popularity of lightweight ontologies amongst a certain cadre of users, mainly typical web developers, and is well worthy of publication. However, the paper would benefit from more exploration of the following:

- By advocating for the Wikidata ontology, or something like it, the authors effectively promote the development of another upper level ontology with domain extensions. But, in what sense is this better than adopting an existing upper-level ontology and extending it? The knock cited against importing upper-level ontologies is complexity and unintended inferences, but it is not evident that Wikidata is a better alternative: a casual perusal of the top of the Wikidata ontology reveals many similarities to other upper level ontologies, while its internal consistency is not discussed (see next point), so its logical implications are unclear. A similar question exists for the domain elements – how can a single ontology meet specialized needs? As it is unlikely to satisfy users of specialized ontologies, such as in the biosciences and others, does this mean the target audience consists of unspecialized users outside of narrow domains? If so, is this sufficient to increase overall re-use?

- Another issue is the downplaying of ontology quality criteria in the evaluation. While some quality criteria are considered, e.g. completeness and adaptability, other recognized criteria such as internal consistency and semantic accuracy (to intended meaning) are largely ignored. These ignored criteria are significant, because they represent challenges especially to bottom-up driven ontologies, which arguably have greater potential for contradiction and imprecision due to the lack of a top-down semantic backbone and reliance on variable input data. Greater consideration of these criteria in the evaluations would be more informative and effective.

Smaller issues:

- p1, l28, right - please reword: “we have discovered that the two main open-source upper ontologies other than BFO, DOLCE and SUMO, are not used in any of 430 Linked Open Datasets that were investigated for the study.”

- p1, l35, right - please elaborate: “unintended inferences that result from importing the upper ontology in a mid-level or domain ontology.” Is it that the upper-level ontologies are inconsistent, that the import reveals inconsistencies in the local ontology, or that there is a misalignment between ontologies?

- p2, l38, left - suggest: “raising doubts [about] its sustainability”

- p5, l20, left - please elaborate: “This means, for the ontology part, it reuses existing ontologies where possible, but mints URIs for terms in the Wikidata namespace.”