TermIt: Managing Normative Thesauri

Tracking #: 3487-4701

Petr Křemen
Michal Med
Miroslav Blasko
Lama Saeeda
Martin Ledvinka
Alan Buzek

Responsible editor: 
Guest Editors Tools Systems 2022

Submission type: 
Tool/System Report
Thesauri are popular, as they are well-understood by domain experts, yet formal enough to boost use cases like semantic search. Still, as the thesauri size and complexity grow in a domain, proper tracking of the concept references to their definitions in normative documents, interlinking concepts defined in different documents, and keeping all the concepts semantically consistent and ready for subsequent conceptual modeling, is difficult and requires adequate tool support. We present TermIt, a web-based thesauri manager aimed at supporting the creation of thesauri based on decrees, directives, standards, and other normative documents. In addition to common editing capabilities, TermIt offers term extraction from documents, tracking term definitions in documents, term quality and ontological correctness checking, community discussions over term meanings, and seamless interlinking of concepts across different thesauri. We compare TermIt to other tools and evaluate its features and usability in E-government scenarios in the Czech Republic.

Minor Revision

Solicited Reviews:
Review #1
Anonymous submitted on 27/Jul/2023
Minor Revision
Review Comment:

Overall, the authors have addressed my comments. However, the paper still needs polishing before it is published.

Features of the tool have been mentioned in the abstract. However, the last sentence of the abstract is unclear. I suggest the authors briefly elaborate on the results of the evaluation.

In the list of contributions, it would be good to add references and/or links to the resources, for instance a link to the "browser plug-in for web document annotation". A reference/link to TermIt (when it is first mentioned in the paper) is missing. The related work section is still a bit flat.

Page 3, line 24 is not necessary.
Page 3, line 31 - not a common practice to begin sentences with "E.g.".
Page 4, line 3 - the term UFO has already been introduced at the beginning of section 2. Abbreviations should be used consistently.
Page 6, line 8 - paragraphs should have more than 1 sentence. This appears at other places in the paper as well.
Missing references to some technologies in section 3.
Page 9, line 14 - "w.r.t" should be spelled out or paraphrased.
Page 9, line 15 - can the authors provide the names of the project and if possible links to them?

References should be checked for consistency.

On a side note, I recommend the authors highlight their changes in a different colour in future revisions.

Review #2
By Patricia Martin-Chozas submitted on 27/Jul/2023
Minor Revision
Review Comment:

Although the authors have addressed my comments from the previous revision, I still have some concerns, for instance regarding the title of the paper. The title states “TermIt: Managing Normative Thesauri”, but then, on page three, the authors state that the tool was already presented in a previous paper. My point here is: what is the real purpose of the paper? It is clear neither from the title nor from the abstract or the introduction.

The abstract has been improved, but there is room for further improvement: it states that “thesauri are popular”, but the authors do not justify why. In which context are they important? For NLP tasks? In that case, which ones? The paper lacks motivation, both in the abstract and in the introduction. Why were the resources from MPP and PBR selected? More context to justify the need for this work should be added.

Continuing with the introduction, Figure 1 needs a deeper description. The authors have added a legend that explains the meaning of the arrows and boxes. However, the in-text reference to the figure is sparse (“as exemplified in Figure 1…”). Such a complex diagram needs to be accompanied by a thorough description in the text.

In Section 2, I still have the same concern as in the previous revision: it is supposed to present the background of the paper, where the reader would expect something like previous work by the authors on the topic or the beginning of the research, but it only contains references to the vocabularies used by the application: SKOS and the UFO ontology. The authors did not address this comment in this version. In my opinion, they should either change the name of the section (to “applied vocabularies”, for example) or change the content of the section so that it really explains the background of this work.
With regard to Fig. 2, which describes the architecture, the authors have also skipped my remarks from the previous review: I strongly recommend indicating the input and output data of each component. In the current version of the diagram, neither the meaning of the arrows nor the data flow is clear. I think the authors should look for examples of complete architectures and redo this figure.

In Section 4, I was surprised by the first sentence: “Various tools addressing the problems P1-P3 were investigated, which we described in a survey report”, and a footnote to a Google Sheet. This seems to me quite non-academic. I suggest the removal of this footnote and the addition of this tool comparison as a table in the Related Work section.

My final comment concerns the conclusions section, which seems rather thin: it is limited to a brief summary of what is presented and a paragraph on future work. I suggest improving this section with some discussion of limitations and next steps, for instance.

Minor remarks:
P1, L34 and 41: unify quotation marks
P3 L10: the use of “languages” in that context is not accurate, I recommend the use of “models” or “vocabularies”.
P4 L13. I suggest adding a complete example for the use of the UFO ontology as it was done for SKOS in Listing 1.
P4 L32. I’m confused by the use of “vocabularies” here. In the SW, vocabularies are used to denote semantic models. If the tool is used to manage thesauri, why use another term for them?

Review #3
By Harshvardhan J. Pandit submitted on 20/Aug/2023
Review Comment:

I am satisfied with the responses to my comments, and recommend this article be accepted, provided the other reviewers' comments are specifically addressed (as judged by their reviews).

I am not completely satisfied with the comparison of TermIt with other existing tools, as it only compares them based on self-identified problems (P1-P3 in the paper) rather than a broader set of requirements associated with thesauri management. However, I have interpreted this as a tool developed only to address these challenges as identified in the motivation section. The authors can add a statement to this effect in the conclusion, with a possible future direction being either to incorporate existing features into TermIt or to feed lessons learnt here (from application) to specific tools, e.g. VocBench. I also encourage reiterating that the tool is available for reuse with its code and a permissive license.

Minor issues:
1. Please use numbers in the contributions list for easier reference to it.
2. Footnote 17 is for a Google Docs document. This is not good for longevity. Please either put a copy of this in the long-term resource repository, or upload it to an archive, e.g. Zenodo.