Review Comment:
This manuscript was submitted as 'Tools and Systems Report' and should be reviewed along the following dimensions: (1) Quality, importance, and impact of the described tool or system (convincing evidence must be provided). (2) Clarity, illustration, and readability of the describing paper, which shall convey to the reader both the capabilities and the limitations of the tool.
==========
The paper describes the Linked Open Vocabulary (LOV) initiative in detail. The authors provide an extensive overview of the current state of LOV, describe the system’s functionality as well as the entire process from analyzing new vocabularies to enabling a LOV user to search for reusable vocabularies and vocabulary terms, and show how LOV is adopted in other applications and research activities.
The paper is well structured and well written, and I enjoyed reading it. The authors motivate their work very clearly, illustrate the impressive amount of harvested vocabulary indicators, and describe the different ways Linked Data practitioners can access all the functionality LOV offers. The system itself is also of high quality and works just as described. Overall, the paper and the system leave a satisfying impression. However, there are still a few small uncertainties in the paper; thus, I recommend accepting the paper provided that the authors revise it according to the following suggestions.
General remarks:
Whereas bullet points are very helpful for structuring a document and drawing attention to specific statements, their excessive use achieves the opposite. In Section 3, you use them just fine to briefly enumerate various information types, categories, and the like, but in Sections 6 and 7 the items are far too long to make any single point clearly. Please use bullet points to enumerate solely the categories (e.g., “Catalogues of generic vocabularies/schemas”, “Catalogues of ontologies for a specific domain”), and go into detail in the following paragraphs. However, this is my personal preference, and other reviewers may of course prove me wrong.
Furthermore, please have a native speaker check the paper. There are various mistakes in spelling and (especially) grammar, which at times disturb the flow of reading.
Abstract:
“LOV goes beyond existing Semantic Web vocabulary search engines and takes into consideration the value’s property type, matched with a query, to …” – what is meant by “the value” and what is its property type? Please make that clear.
Section 2:
Table 1 describes the LOV dataset content by vocabulary element type. Please clarify what you mean by “Instances” and “Datatypes”, especially as they have a median value of 0. It seems that you refer to rdfs:Class, rdf:Property, and rdfs:Datatype, but the reader can get confused, as there is no rdfs:Instance according to http://www.w3.org/TR/rdf-schema/.
In Table 3, the reader can observe that searching for agents is the most prominent search type. But why? Traditionally, an ontology engineer directly filters vocabularies that represent the domain of interest. Please provide a brief explanation for this surprising result, particularly as agent search was introduced only in version 3 and users were presumably not yet accustomed to that type of search.
About Figures 1–5: In Figure 5, you provide a description of the axes. Please do so for the other figures (Figures 1–4) as well to make them more self-explanatory.
Section 3:
The Data Layer of Figure 6 is basically not explained. What are the differences between the “LOV Catalogue” and the other components in the Data Layer, and what are their benefits for LOV? Please provide more details on that.
Regarding the inter-vocabulary relationships, especially Specialization, Generalization, and Extension: how exactly are these links established? For example, if LOV finds a class from V1 to be a subclass of a class from V2, does it automatically establish V1 as a specialization of V2, or is there some kind of threshold that must be exceeded? Please make that clear.
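To make the question concrete, here is a minimal Python sketch of one possible interpretation; the function name, the threshold, and the rule itself are hypothetical illustrations of the ambiguity, not something taken from the paper:

```python
# Hypothetical sketch: when does a set of rdfs:subClassOf links between two
# vocabularies justify a vocabulary-level "specialization" relationship?
# The threshold below is an ASSUMPTION, not something stated in the paper.

def is_specialization(subclass_links: int, threshold: int = 1) -> bool:
    """subclass_links: number of classes in V1 declared as subclasses of
    classes in V2. Returns True if V1 should be recorded as a
    specialization of V2 under this (assumed) counting rule."""
    return subclass_links >= threshold

# With threshold=1, a single cross-vocabulary subclass axiom suffices:
assert is_specialization(1) is True
# A stricter curation policy might require more evidence:
assert is_specialization(2, threshold=3) is False
```

Whether LOV uses the single-axiom rule, a higher threshold, or something else entirely is exactly what the paper should state.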
Figure 8 seems to be superfluous. First, the general process is clear from your description. Second, the figure is rather confusing, since the difference between “Is a good fit?” and “Does meet quality?” is not explained. Is there one at all? It seems that a vocabulary falls in the scope of LOV if and only if it meets the quality requirements 1–5, or is the scope something else? Please clarify.
Furthermore, the five requirements could also be checked automatically, yet for LOV this is performed manually. Two questions arise: a) Which guidelines do the curators follow to ensure a vocabulary meets the five requirements? b) What is the cost or effort (on average) for the curators to review/validate a submitted vocabulary? Please clarify these aspects. (This was also mentioned in the previous reviews.)
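To illustrate what an automated check could look like, here is a rough Python sketch. The field names and the content of the five checks are placeholders of my own invention, since the paper does not give the requirements in a machine-checkable form:

```python
# Hypothetical sketch of automating the five-requirement review.
# The check names below are ASSUMED placeholders, not the paper's actual
# requirements; they only show that such a pipeline is mechanically feasible.

def meets_lov_requirements(vocab: dict) -> bool:
    checks = [
        vocab.get("parses_as_rdf", False),    # placeholder check 1
        vocab.get("dereferenceable", False),  # placeholder check 2
        vocab.get("has_metadata", False),     # placeholder check 3
        vocab.get("has_version_info", False), # placeholder check 4
        vocab.get("reuses_vocabs", False),    # placeholder check 5
    ]
    return all(checks)

candidate = {"parses_as_rdf": True, "dereferenceable": True,
             "has_metadata": True, "has_version_info": True,
             "reuses_vocabs": False}
assert meets_lov_requirements(candidate) is False  # one check fails
```

Even if curators keep the final say, such a pre-check could reduce the manual effort asked about in b).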
Section 3.3 is a bit off in its structure. Three of the four data access possibilities are described in Sections 3.3.1 to 3.3.3, but the fourth one is described in Section 3.4. Readability would improve if 3.3.1 described the search engine, 3.3.2 the additional UI facets and elements helping to navigate the vocabulary catalogue, 3.3.3 the data dump, 3.3.4 the SPARQL endpoint, and 3.3.5 the API, since they are all part of data access.
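As a side note on the SPARQL access path: programmatic use is straightforward, as this Python sketch shows. The endpoint URL is an assumption on my part (it may differ or have moved); no request is actually sent here, only the request URL is built:

```python
# Sketch of programmatic access to the LOV SPARQL endpoint via HTTP GET.
# ENDPOINT is an ASSUMED URL; adjust to whatever the paper documents.
from urllib.parse import urlencode

ENDPOINT = "http://lov.okfn.org/dataset/lov/sparql"  # assumed endpoint

query = """
SELECT ?vocab WHERE {
  ?vocab a <http://purl.org/vocommons/voaf#Vocabulary> .
} LIMIT 10
"""

# Per the SPARQL protocol, the query travels in the 'query' parameter.
url = ENDPOINT + "?" + urlencode({"query": query})
assert url.startswith(ENDPOINT)
```

Documenting a ready-to-copy snippet like this in 3.3.4 would make the endpoint section more useful to practitioners.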
Section 4 and Section 5:
The listing in Section 5 provides fairly convincing evidence for an adequate impact of LOV on the Linked Data and ontology community. While, in my opinion, this is sufficient for accepting the paper, a user study illustrating the actual impact of LOV on that community is still needed. Specifically, Section 4 describes the relevance of LOV in three activities supporting data publication and ontology engineering, yet this relevance is never demonstrated. A user study could do so (such a study is merely a nice-to-have for this submission). One simplistic example: the participants of the study would have the task of finding equivalent classes for the FOAF classes in Table 7, with and without LOV.
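To make the proposed task concrete: in the “with LOV” condition, a participant could essentially issue one query per FOAF class. A Python sketch (the endpoint URL and the assumption that equivalence links are stored as plain owl:equivalentClass triples are mine, not the paper’s):

```python
# Sketch of the "with LOV" condition of the proposed user study: ask the
# LOV triple store for classes declared equivalent to a given FOAF class.
# Endpoint and data layout are ASSUMPTIONS for illustration only.

def equivalent_class_query(class_iri: str) -> str:
    """Build a SPARQL query finding owl:equivalentClass links in
    either direction for the given class IRI."""
    return f"""
SELECT DISTINCT ?eq WHERE {{
  {{ <{class_iri}> <http://www.w3.org/2002/07/owl#equivalentClass> ?eq }}
  UNION
  {{ ?eq <http://www.w3.org/2002/07/owl#equivalentClass> <{class_iri}> }}
}}
"""

q = equivalent_class_query("http://xmlns.com/foaf/0.1/Person")
assert "equivalentClass" in q
```

Measuring time and recall for this task with and without LOV would give exactly the kind of evidence Section 4 currently lacks.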
Section 6:
There are no clear statements distinguishing LOV from the other catalogues. For instance: “Catalogues of generic vocabularies/schemas similar to LOV catalogue. Example of catalogues falling in this category are vocab.org, ontologi.es, JoinUp Semantic Assets or the Open Metadata Registry”. But what are the differences between LOV and these catalogues? Is LOV the only manually curated catalogue, or are there further differences? This should be made very clear in the related work section, even if it has already been mentioned elsewhere in the paper.
In “Search Engines of ontology terms”, the service vocab.cc (http://vocab.cc/) is missing. Like LOV, it allows searching for prominent vocabularies and vocabulary terms. Please clarify how LOV differs from vocab.cc and add it to Table 8.
Section 7 and Section 8:
Both include “Future Work” in their titles. Please concentrate on the shortcomings of LOV in the discussion and provide a brief outlook on future work in the conclusion of the paper.
Minor remarks about the paper:
- Generally, when referring to a section, a figure, or a table, capitalize the reference, e.g., “In Section 3, we will…” or “…as it is illustrated in Table 2.3”.
- Refer to equations with the number in parentheses, i.e., instead of “Equation 1 shows...” use “Equation (1) shows...”
- In various tables: instead of describing the number of vocabularies with “Nb Vocabs”, use “# of vocabs” or "#vocabs"
- Table 6: Instead of N, use |V| to denote the number of vocabularies in the set of vocabularies. Fewer variables improve readability.
- “…auto-completion together with http://prefix.cc for namespace…” → use just prefix.cc with a footnote pointing to http://prefix.cc
Minor remarks about the system:
- In the SPARQL editor it says “press CMD – Spacebar for autocomplete” in the completionNotification HTML div. On a Mac, that shortcut is bound to something else; “CTRL – Spacebar” works, however.
Typos:
- The last two decades has → The last two decades have
- breakdown of LOV dataset content → breakdown of the LOV dataset content
- 27.98% vocabularies → 27.98% of the vocabularies (this error occurs several times; please fix all occurrences)
- LOV architecture is composed → The LOV architecture is composed
- The information concerning vocabulary terms use in Linked Open Data → The information concerning the use of a vocabulary term in the Linked Open Data cloud
- In both case → in both cases
- We list below some tools... → Below we list some tools...
- Maguire et al. [17] use LOV search API → Maguire et al. [17] use the LOV search API