OntoSeer - A Recommendation System to Improve the Quality of Ontologies

Tracking #: 3242-4456

Pramit Bhattacharyya1
Samarth Chauhan
Raghava Mutharaju

Responsible editor: 
Guest Editors Tools Systems 2022

Submission type: 
Tool/System Report
Abstract:
Building an ontology is not only a time-consuming process, but it is also confusing, especially for beginners and the inexperienced. Although ontology developers can take the help of domain experts in building an ontology, in many cases they are not readily available for a variety of reasons. Ontology developers have to grapple with several questions related to the choice of classes, properties, and the axioms that should be included. Apart from this, there are aspects such as modularity and reusability that should be taken care of. From among the thousands of publicly available ontologies and vocabularies in repositories such as Linked Open Vocabularies (LOV) and BioPortal, it is hard to know the terms (classes and properties) that can be reused in the development of an ontology. A similar problem exists in choosing the right set of ontology design patterns (ODPs) to implement from among the several available. Generally, ontology developers make use of their experience in handling these issues, and the inexperienced ones have a hard time. In order to bridge this gap, we developed a tool named OntoSeer that monitors the ontology development process and provides suggestions in real time to improve the quality of the ontology under development. It can provide suggestions on the naming conventions to follow, the vocabulary to reuse, the ODPs to implement, and the axioms to be added to the ontology. OntoSeer has been implemented as a Protégé plug-in. We conducted a user study of the tool in order to evaluate the quality of the recommendations. Almost all the users were satisfied with the recommendations provided by OntoSeer, and a majority of them agreed that OntoSeer reduces their modelling time. The source code and the instructions to install and use the plug-in are publicly available at https://github.com/kracr/ontoseer. A short video demonstrating the use of OntoSeer is available at https://youtu.be/iNQOJGZkZKQ.
Full PDF Version: 

Major Revision

Solicited Reviews:
Review #1
Anonymous submitted on 28/Sep/2022
Major Revision
Review Comment:

The paper describes OntoSeer, a tool that monitors the ontology development process and provides real-time suggestions to improve the quality of the ontology.
In particular, the tool provides several functionalities:
- Recommendation of classes, properties, axioms, and ontology design patterns
- Suggestions of names to use during development
- Integration with Protégé
The tool relies on several resources: competency questions, existing ontologies, Ontology Design Patterns, and vocabularies.

Quality, importance, and impact

The tool is very relevant and helpful. The Related Work section clearly reports the differences with respect to other existing tools.
However, the evaluation is not precise, and the results are not convincing. The number of users is tiny, only 21, with only two expert users. The results do not clearly prove the tool's effectiveness. Both the graphs and the tables are confusing. I suggest reorganizing/rewriting Section 4.

Clarity, illustration, and readability

The whole system is not described in detail. Technical details are missing. Perhaps a sketch of the system architecture would be helpful.
It is not clear how suggestions are generated. You talk about indices and ranked lists, but no details are given. You use about one page to describe the Jaro-Winkler distance/similarity, a well-known measure; this space could instead be used to provide technical information. All your suggestions seem to be computed by string similarity/distance. It is unclear whether you rely on the ontology structure to build more valuable suggestions.
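For context on why a full page seems excessive: the Jaro-Winkler measure fits in a short routine. A minimal Python sketch (an illustrative reimplementation of the well-known measure, not OntoSeer's actual code):

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: fraction of matching characters, penalized by transpositions."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if not len1 or not len2:
        return 0.0
    window = max(len1, len2) // 2 - 1  # how far apart matching chars may sit
    match1, match2 = [False] * len1, [False] * len2
    matches = 0
    for i, c in enumerate(s1):
        for j in range(max(0, i - window), min(i + window + 1, len2)):
            if not match2[j] and s2[j] == c:
                match1[i] = match2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count transpositions among the matched characters (in order).
    t, j = 0, 0
    for i in range(len1):
        if match1[i]:
            while not match2[j]:
                j += 1
            if s1[i] != s2[j]:
                t += 1
            j += 1
    t //= 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3

def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Boost the Jaro score for strings sharing a common prefix (up to 4 chars)."""
    sim = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == 4:
            break
        prefix += 1
    return sim + prefix * p * (1 - sim)
```

On the textbook example, `jaro_winkler("MARTHA", "MARHTA")` yields about 0.961; for ontology terms, `jaro_winkler("hasAuthor", "hasCreator")` ranks well above `jaro_winkler("hasAuthor", "depicts")`, which is presumably how string similarity drives the ranked suggestion lists.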
You put together search results coming from an API with those produced by your algorithm. These two result sets are not homogeneous. How do you combine them?
It is unclear how the index is updated. Can I download an updated index, or do I need to rebuild it?
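Since the paper leaves the combination question open, one conventional answer would be per-source min-max score normalization before merging. A generic Python sketch of that idea (the function names and scoring scheme are illustrative assumptions, not OntoSeer's documented method):

```python
def normalize(scores):
    """Min-max normalize a list of raw scores into [0, 1]."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 1.0 for s in scores]

def merge_ranked(local, remote):
    """local/remote: lists of (term, raw_score) from incomparable scorers,
    e.g. a local similarity index and a remote repository API.
    Each list is normalized separately, the best score per term is kept,
    and a single globally ranked list is returned."""
    merged = {}
    for results in (local, remote):
        if not results:
            continue
        norm = normalize([score for _, score in results])
        for (term, _), score in zip(results, norm):
            merged[term] = max(merged.get(term, 0.0), score)
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)

# Example: a [0, 1] similarity list merged with an API's arbitrary-scale scores.
ranked = merge_ranked([("Person", 0.9), ("Agent", 0.4)],
                      [("Person", 310), ("Organization", 120)])
```

Here `ranked` places "Person" first with a normalized score of 1.0, since it tops both sources. Documenting whichever scheme OntoSeer actually uses would resolve the homogeneity concern above.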

Long-term stable URL for resources

The GitHub repository is well documented.

Minor issues

- abstract: please avoid URLs in the abstract. You can add references to both the video and the repository in the text.
- page 8, row 34: "We evaluated OntoSeer through user study by considering the following questions." -> "We evaluated OntoSeer through a user study by considering the following questions.".
- page 13, row 33 "with all the owl files and user response are available at https://github.com/kracr/ontoseer/tree/master/Evaluation" -> Please, put the URL with the hyperlink in the footnotes.
- page 14, row 40 "OntoSeer plug-in are publicly available at https://github.com/kracr/ontoseer." -> URL in the footnotes.

Review #2
Anonymous submitted on 04/Nov/2022
Minor Revision
Review Comment:

The authors propose a Protégé plug-in named OntoSeer to help ontology developers create new ontologies.
The tool supports several options like recommending Ontology Design Patterns, axioms, and vocabularies. It helps with following well-recognized naming conventions and supports class hierarchy validation.
The authors introduce several other plug-ins and tools for comparison, including OntoCheck, OOPS!, OntoClean, the ODEClean plug-in of WebODE, and OnToology.
Afterwards, the authors present how the features of OntoSeer operate.

Finally, the authors provide an in-depth analysis based on a user study to assess the usefulness of the proposed plug-in.

This final part is particularly interesting. It is clear that the adoption of OntoSeer increases user awareness of best practices and helps them save time while designing new ontologies.
Even though the number of respondents and the outcomes are not enough to highlight statistically significant differences, the aspects mentioned above are nonetheless clearly supported.
The overall work does not have an exceptional novelty in terms of theoretical/technical proposal. However, the tool seems to be useful for ontology designers, and the user study constitutes a novel contribution that could guide future research.
Finally, the authors make available the plug-in source code (thus following the modern best practices in scientific paper writing) and a short explanatory demonstration.

Nevertheless, some important aspects need to be clarified, and I will detail them in the following.

In Section 3.1 (Class, Property and Vocabulary Recommendation), the process of retrieving the most similar terms needs to be clarified. The authors mention the creation (or not, depending on the case) of an index, but more details are required. Furthermore, can the local and remote retrieval results (based on different indices) be merged into a single ranked list?

Analogously, Section 3.3 (Axiom Recommendation) is unclear: the index and the Jaro similarity seem to be used in combination, but the exact details of this combination are missing. An example could be useful here.

Review #3
By Shyama Wilson submitted on 20/Jan/2023
Major Revision
Review Comment:

--- Review Comments ---
This is a paper submitted as 'Tools and Systems Report'. Thus, I would like to start by summarizing my thoughts in light of the Journal's requirements (https://www.semantic-web-journal.net/reviewers).
(1) The authors have mentioned the requirements of the tool and clearly described its capabilities.
(2) The described tool is accessible on the web (GitHub). Moreover, data files are available in the repository, which contains a README file that makes it easy for users to access the data and install the tool. The use cases used for the evaluation and the developed ontologies (i.e., links) are also available in the repository.
(3) Quality, importance, and impact of the described tool or system
Considering the user-friendliness, functions, and installation, I found that the tool is fine. However, when I tried the tool with Protégé 5.0.0, only the class hierarchy validation function worked and the other functions did not (i.e., errors were shown in the Protégé logs). I then tried the tool with Protégé 5.5.0, and all the functions described in Sections 3.1-3.5 worked. However, I occasionally experienced Protégé windows crashing. This may not be a big issue when creating an ontology for testing purposes, but it will definitely be a hassle when developing real ontologies, and real users may have a bad experience. I noticed that this issue was also reported by two users in your user study.

The authors have addressed a timely issue by introducing this tool.
However, in my opinion, the authors have not provided sufficiently convincing evidence to show the impact of the tool in developing a high-quality ontology. For instance, the authors stated that OntoSeer, based on the user study (21 respondents), helps to reduce the ontology modeling time. But for a journal publication at this level, I was expecting rigorous testing, somewhat akin to your evaluation of the ODP recommendation. For example, the authors could give sample use cases for developers to model an ontology/ontologies with or without OntoSeer, keep track of the modeling time spent by the developers (i.e., two independent groups), and then analyze the modeling time. Additionally, evaluating the user-friendliness of the tool is crucial, although it is not presented in this paper.

(4) The clarity, illustration, and readability of the paper are good. The authors have clearly described the capabilities, but they did not discuss the limitations of the tool, which is equally important.

*** Comments for the sections***
--Abstract: Page 1--
Line 26: “Apart from this, there are aspects such as modularity and reusability that should be taken care of”. In the abstract, you mention two ontology quality characteristics: modularity and reusability. I could see how you addressed reusability; however, it is not clear how you handled modularity in OntoSeer. Modularity is a crucial characteristic of an ontology, so it is necessary to clearly explain how you address it in the appropriate section.

Line 30: “we developed a tool named OntoSeer, that monitors the ontology development process and provides suggestions in real-time to improve the quality of the ontology under development”. Reading this, I thought that OntoSeer automatically tracks and displays modeling pitfalls and suggestions. Later, however, I realized that users have to invoke/click the functions whenever they wish to review the specified quality aspects (i.e., class hierarchy validation, ODP recommendation, etc.). The quoted sentence, in my opinion, misleads the reader right away; I therefore recommend rewording it.

Page 2, Line 2: “Experienced developers face issues while building ontologies, and this problem only magnifies in the case of inexperienced ontology developers…”. What are the issues you are referring to? The problems experienced by novice developers are unclear to me. Please give some examples or cases.
Page 2, Line 5: It is unclear to me how you addressed ontology modularity using ODPs. I suggest defining the modularity you are referring to and then describing how it is addressed through OntoSeer.
Page 2, Line 12: Here, you only mention reusability, not modularity. Why?

--Related Work:--
Although the authors have covered several relevant works, some significant works are missing, e.g., the XD Analyzer of the XD Tools, which uses ODPs (http://ontologydesignpatterns.org/wiki/Main_Page):
• “XD Analyzer: The aim of this tool is to provide suggestions and feedback to the user with respect to how good practices in ontology design have been followed, according to the XD methodology (for instance, missing labels and comments, isolated entities, unused imported ontologies).”
• RepOSE – debugging tool
Optional (Readings)
o See Ontology Summit 2013 – Tools
o Protégé Debugging tools

Moreover, I suggest adding a table that summarizes and compares the related tools/works.

3.1. Class, Property and Vocabulary Recommendation
This function retrieves a related set of vocabularies from existing repositories for the selected classes/properties. It is great to have such a feature for ontology developers. However, as I observed, in order to add a suggested vocabulary, developers need to spend some time examining it and need to understand how to add it. In this situation, it would be better if the tool could provide some modeling suggestions that combine the existing ontology with the suggested vocabulary. I know this is not straightforward; it needs some research and would require a reasonable amount of time to address.
3.2. Class and Property Name Recommendation
This function properly works as explained in the paper. Mainly, this explains the suitable naming conventions for classes and properties that help in improving the readability and maintenance of an ontology.
As I understand it, the current tool merely recommends suitable naming conventions; the user must then remember them in order to apply them later. This could be solved (i.e., the usability of the tool could be improved) by enabling users to apply a suitable convention to the ontology at the same time as OntoSeer suggests it.
3.3. Axiom Recommendation
This function recommends some appropriate axioms to be added to the ontology.
Similar to my suggestion for Section 3.2, it would be great if users/developers could add the appropriate axioms at the same time as OntoSeer suggests them.
3.4. ODP Recommendation
This is a significant function for both experienced and inexperienced developers. However, it could be challenging for inexperienced developers to understand how the suggested ODP(s) map onto the existing ontology. In this case, it would be useful if OntoSeer could show an example modeling solution with the suggested ODP, as is done in the XD Analyzer tool.

3.5. Class Hierarchy Validation
This function validates the class hierarchy according to OntoClean. This is also a significant function for both experienced and inexperienced developers.
In the paper, you have explained the OntoClean characteristics (i.e., rigidity, identity, unity), but you have not explained the OntoClean principles/constraints, which are crucial when assessing hierarchies. It would be better to explain them.
Ex: OntoClean states: “Given two properties p and q, where q subsumes p, the following constraints must hold:
1. If q is anti-rigid, then p must be anti-rigid
2. If q carries an identity criterion, then p must carry the same criterion
3. If q carries a unity criterion, then p must carry the same criterion
4. If q has anti-unity, then p must also have anti-unity
5. If q is externally dependent, then p must be……"
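Such constraints lend themselves to mechanical checking, which is presumably what the class hierarchy validation does. A hypothetical Python sketch (the meta-property encoding and class names are illustrative, not OntoSeer's actual data model; the truncated fifth constraint is completed here under the usual OntoClean reading that p must also be externally dependent, which should be verified against the original):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Cls:
    """A class annotated with OntoClean meta-properties."""
    name: str
    rigidity: str                    # "rigid", "anti-rigid", or "semi-rigid"
    identity: Optional[str] = None   # identity criterion carried, if any
    unity: Optional[str] = None      # unity criterion, or "anti-unity"
    externally_dependent: bool = False

def check_subsumption(p: Cls, q: Cls) -> list:
    """Return the OntoClean constraints violated by 'q subsumes p'."""
    violations = []
    if q.rigidity == "anti-rigid" and p.rigidity != "anti-rigid":
        violations.append("1: q is anti-rigid, so p must be anti-rigid")
    if q.identity is not None and p.identity != q.identity:
        violations.append("2: p must carry q's identity criterion")
    if q.unity == "anti-unity":
        if p.unity != "anti-unity":
            violations.append("4: q has anti-unity, so p must have anti-unity")
    elif q.unity is not None and p.unity != q.unity:
        violations.append("3: p must carry q's unity criterion")
    if q.externally_dependent and not p.externally_dependent:
        violations.append("5: q is externally dependent, so p must be too")
    return violations

# Classic pitfall: the anti-rigid class Student must not subsume the rigid class Person.
student = Cls("Student", "anti-rigid", identity="person-id")
person = Cls("Person", "rigid", identity="person-id")
print(check_subsumption(person, student))  # constraint 1 is violated
```

Surfacing violation messages of this kind as hints in the tool would make the validation verdicts self-explanatory.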

Moreover, if you could display these OntoClean principles as hints in OntoSeer, it would help developers get a good understanding of the hierarchy validation.
Furthermore, when using this feature through OntoSeer, I initially struggled to determine which classes should be put to the test: whether I need to test the hierarchical classes already in the ontology, or the hierarchical classes before adding them to the ontology. This situation is not clearly explained in the paper.
Additionally, there is no reporting/recording feature for the hierarchy validation that was performed. This would help with post-validation after classes are added to the ontology.

*** Suggestions for the tool’s Usability ***
• Having a feature to keep a record of the recommendations/validations would be helpful, but the tool does not have one.
• The tool's usability needs some improvements, e.g., some features (adding naming conventions, adding axioms, etc.) could be automated rather than requiring developers to perform them manually.

--User Study--
+ The necessary evidence pertaining to the user study has been presented.
+ The results' explanation has also been given.
- Page 11, Line 39: “..Fourteen, that is, 66.67% of users believed that OntoSeer saves modeling time while, the remaining seven chose to be neutral……”
I am not satisfied with the test performed for modeling time. Testing the modeling efficiency is crucial for this type of tool (i.e., it is one of the main objectives), and rigorous testing is required at this level. Please kindly read my comments given at the beginning: “For instance, the authors stated that Ontoseer, based on the user study (21 respondents), helps to reduce…………….”

- Quite general; the same points are repeated throughout the content.
- Limitations: no clear explanation is given. In addition to the summary of the contents, it is important to add a rigorous discussion highlighting the limitations.