Review Comment:
The paper presents a solution for opening data silos associated with Digital Humanities (DH) projects by using the Component Metadata Infrastructure (CMDI). The solution is applied to a number of datasets about textbooks managed by the Georg Eckert Institute (GEI), which are used as a test case. Those datasets are heterogeneous in (i) semantics, (ii) formats and (iii) sizes. The authors provide an analysis of the GEI datasets from different perspectives, which include data visualisation, exploration, and interaction. This analysis aims at identifying general requirements that Linked Data applications should have and linked datasets should provide. Finally, the authors conclude that the CMDI is a fair solution for enabling the reuse, interoperability, and interaction of data in the DH domain.
==== Overall comments ====
The paper is in general well structured.
Nevertheless, the paper shows significant weaknesses that, in my opinion, prevent its publication in its current form.
=== Strengths ===
The problem of opening data silos by solving heterogeneity issues is challenging and relevant to the Semantic Web Journal.
The idea of identifying requirements for homogenising heterogeneous datasets in DH in a bottom-up fashion (i.e. by analysing existing applications that use legacy data) is interesting.
=== Weaknesses ===
The analysis of existing datasets resulting from past GEI projects should be significantly extended. More specifically, the paper lacks a link between the description of the data visualisation, exploration, and interaction schemata (as resulting from legacy applications) and the adoption of CMDI.
First, it is unclear why CMDI should be adopted, besides the fact that "CLARIN assumed that metadata for language resources and tools existed in a variety of formats, the descriptions of which contained specific information for a particular research community". Then, the authors completely overlook how CMDI should be adopted.
The paper lacks a discussion of how CMDI has been used. To be honest, the proof of concept presented in Section 3.4 does not add any scientific value to the paper.
Furthermore, the paper lacks scientific rigour. It gives a high-level view of the problem, but never tackles the problem with the depth expected of a research paper.
The related work section is not sufficient and in general too shallow.
The authors consider only a very limited range of state-of-the-art works that tackle a similar problem.