Review Comment:
Review Summary:
This paper describes the LOD that is developed as part of the OntoIOp registry. I'm confident that this will become an important linked data set in the future. While there is no doubt about the dataset's importance, improvements are necessary to make it easily accessible to a larger audience. The description of the dataset lacks sufficient clarity and detail to be useful to the novice user. The description of the dataset in Section 2 needs to be elaborated (adding detail and precision). Lists/tables and simple statistics could help address this issue (compare previous LOD papers in the journal). Furthermore, the figures need to be better tied in by explaining the depicted relationships and using them as examples in Sec. 2.
The authors remain vague on the maturity of the dataset, which is a concern, though it might be less pressing once sufficient detail is provided. The current state (what is there, what is missing) should be stated more explicitly.
While some major rewriting/editing is necessary, I see no technical problems with the described data set. The issues raised about clarity and accessibility to the community at large can be easily fixed. I support accepting this paper contingent on the "lack of detail and clarity" issue being addressed.
More details on the 3 evaluation criteria:
(1) Quality of the dataset.
I have no doubt that the relationships between the included logics and languages are correctly captured. However, the maturity/completeness of the dataset is an issue: as I understand it, not all mappings/relations between logics and languages are included yet. Be clear about which ones have been modeled and which are left for the future.
As a side issue: While one cannot reasonably expect the dataset to ever be complete, some mechanism for recording the non-existence of mappings/translations could help differentiate between non-mappability and incomplete knowledge. I'm not sure whether that is within the scope of the OntoIOp registry.
(2) Usefulness (or potential usefulness) of the dataset.
The usefulness is not as clearly visible as would be desirable. Neither Hets nor Ontohub use the dataset, though potential future applications are hinted at. The authors do provide some example queries that help understand how the dataset may be useful by itself.
(3) Clarity and completeness of the descriptions.
This is my chief concern. For a LOD description, I expect more detail than what is provided in Section 2. While the explanation of the provenance is sufficient, the explanation of what the dataset describes requires elaboration. This should be at a level at which non-logicians can understand the basic ideas and use the LOD. For example, you need to explain the difference between logics and languages -- this will not be clear to most users (as often one language is associated with a single logic and vice versa).
Also, a better explanation of the intuitions behind "mapping", "translation", "serialization", "sublanguage", etc. is needed. Explain why mappings/translations are modeled as types as opposed to binary relations.
The current scope of the LOD is a bit vague; some lists/tables to summarize the dataset would be very helpful:
- explain the kind of items (maybe each of the "subdirectories" of the URLs) from http://purl.net/dol/registry that are reflected in the directories in http://purl.net/dol/
- how many of each of the types of items and relationships does the dataset include?
- list & briefly explain the kinds of mappings available; it wouldn't hurt to include the hierarchy of mapping relations from [13]
- what languages and logics are currently included? Given the manageable scope of 29 logics, 43 translations, and 14 languages, it would be easy to list them in a table/figure.
The figures could be more helpful if the text explained what the depicted relations in Fig. 1 and 2 are: most, I believe, are mappings (though I'm not sure whether sublanguage relations are mappings; at the beginning of Sec. 2 mappings are restricted to logics), but serializations are also included. Is the color coding of expressivity/decidability in Fig. 2 captured in the dataset?
Some minimal working example would be very helpful: one (or more) logics with one (or more) languages and two serializations as well as mappings to other logics/languages and metadata (showing how VoID and SKOS are utilized).
Lesser, though more general concerns about the described project/dataset:
1) The maturity/completeness of the LOD: the OntoIOp registry is still very much under development. While publications on the underlying research are very valuable, I'm not sure about the value of a description of the registry's LOD at this stage. It seems highly likely that the description will be outdated as soon as it is published. That defeats the purpose of describing the dataset to others for them to use/reuse.
2) Ability for others to contribute: the purpose of the registry is to enable the community to contribute descriptions of languages, logics, and translations. However, for maintaining the registry, the authors propose to generate it automatically from Hets. This runs counter to the desired openness: it would require others to first extend Hets instead of directly contributing to the directory/dataset. I personally think that the LOD should not be permanently tied to any specific software, as such a tie poses a significant barrier for the community to contribute. Other mechanisms for maintaining/updating the registry are needed.
Other things that need to be fixed in the final version:
- given that the paper is less than 5 pages in content, the abstract is unnecessarily long. It includes much background information (2nd paragraph, 1st sentence of 3rd paragraph, last paragraph) that would be better placed in the main part.
- p 4: last paragraph of Sec. 3 needs a rewrite to improve clarity
- if possible, the wealth of technical terminology should be reduced to what is essential. This is not supposed to be a description of the entire OntoIOp project, but of the dataset only.
You also need to separate and explain more clearly the differences between the DOL language, the Lola vocabulary and the language of the OntoIOp registry at the beginning, and clearly distinguish between what is a project (OntoIOp) vs. an artifact (registry, DOL, Lola)
- I can't quite appreciate the relevance of the example on p. 2 as it only uses the language and syntax statements that relate to the registry.
- The URL to Lola on p. 3 needs to be updated