A content-focussed method for reengineering thesauri into semantically adequate ontologies

Tracking #: 960-2171

Daniel Kless
Ludger Jansen
Simon Milton

Responsible editor: 
Krzysztof Janowicz

Submission type: 
Full Paper
The re-engineering of vocabularies into ontologies can save considerable time in the development of ontologies. Current methods that guide the re-engineering of thesauri into ontologies often convert vocabularies in a merely syntactic way and ignore the problems that stem from interpreting vocabularies as statements of truth (ontologies). Current re-engineering methods also do not make use of the semantic capabilities of formal languages in order to detect logical mistakes and to improve vocabularies. In this paper, we introduce a content-focused method for building domain-specific ontologies based on a thesaurus, a popular type of vocabulary. The method results in an ontology that not only adheres to the semantics of the description logic OWL, but also contains a semantically rich description of the modeled entities, enables non-trivial automated reasoning, and can be integrated with other ontologies following the same development principles. We explain the motivation and sub-activities for each of the steps in our method and illustrate their application through a case study in the domain of agricultural fertilizers based on the AGROVOC Thesaurus. Foremost, our method shows that considerable manual effort is required to derive a semantically rich ontology from a thesaurus, particularly for the alignment to a top-level ontology and for the identification and formal specification of membership conditions. Applying our method is likely to change the structure of a thesaurus considerably. Our method is particularly useful where a highly reliable is-a hierarchy or consistent definitions are crucial.


Solicited Reviews:
Review #1
By Eetu Mäkelä submitted on 19/Feb/2015
Review Comment:

In general, all my previous comments have been addressed, and the organization and argumentation in the article have both improved. I therefore suggest accepting the article. Nonetheless, below you'll find a couple of minor issues I suggest ironing out before publication.


Page 2: "The combination of computational tractability, the strong reasoning support for consistency checking and generating the inferred class hierarchy, as well as the use of XML-based syntaxes and unique identifiers (IRIs/URIs) are considerable advantages and a reason for the high popularity of OWL-DL as compared to other truth conditional logical systems." is slightly too strong on the one hand and too hand-wavy on the other hand to my taste.

Page 4: RDF Semantics (RDFS) -> RDFS Semantics

Page 5: "One has to consider, though, that there is an RDF-based semantics for the OWL syntax [42]. However, this semantics nullifies important syntactical distinctions of OWL, including the distinction between class terms and individual terms, or between TBox and ABox.". I don't agree with this statement. What even is a syntactical distinction and how is that important? As this assertion is anyway a runaway one, you could just delete it.

Page 6: wonky layout

Page 25: neclected->neglected

Page 25: "The authors very much thank Jens Wiebensohn from the Agricultural Science at the University of Rostock for sharing his knowledge about fertilizers." <- should there be a "Department" or something in this affiliation?

Review #2
By Stefano Borgo submitted on 20/Feb/2015
Review Comment:

The content of this version of the paper is much better: weaknesses have been addressed in a fair way and, where possible, resolved; the motivations and the methodology are clear and fairly well exemplified; and limitations are discussed to a reasonable extent. I’m satisfied with the work done by the authors and with the way they used my comments/observations. Point 16 and the related issue, which was not clear to the authors, does not apply to the new version, so I consider that solved as well. There are some typos and a few things to fix/add as reported below. Besides these, in my view the paper is ready and can be published.

Please rewrite:
pg.18-19 subsection “Application of the step to the fertilizer ontology”: the discussed points (a), (b) etc. do not correspond to the points (a), (b) etc. at pg.17
pg.24 col.1: you explain why you classify fertilizers as dispositions and not as roles; add an explanation for dispositions vs functions also.

Please correct:
pg.2 col.1 “we are not aware of any method that describes a […] method” (??)
“has can find”
pg.3 col.2 “hast”
pg.15 col.2 “we were not able without being able to”
pg.19 col.1 “usefulness” -> “useful” (perhaps even better to use “suitable”)
pg.20 col.1 “retorted” (perhaps you meant “resorted”)
“which not pre-occupied with” (??)
col.2 “relationships n OWL”
pg. 24 col.2 “(step 7) or may be” (“or” is misplaced)
missing information in reference [28]
cite the latest version of DOLCE; see the chapter "Foundational Choices in DOLCE" in the Handbook on Ontologies (2009)