Review Comment:
The paper describes the Linked Data version of the Rijksmuseum Collection with the use cases of data aggregation and querying for solving complex research questions.
The paper fits well in the Linked Data Descriptions category of the journal.
Section 1 provides an introduction to the topic in very generic terms without much substantive content, e.g., regarding the dataset or related work. Section 2 describes the digitization project underway in the museum. This section also contains little content regarding the actual case and should be made more concise.
Section 3 describes the dataset model and statistics about it. The model in use is EDM, which is not described in detail but only by reference. More details would have been useful here, e.g., whether full EDM was used or whether additional properties were introduced. It would also have been good to know how well the EDM model fit the use case, what other lessons were learned in applying it, why and how “handles” and purl.org URIs were used, and what the experiences of using them were.
Figure 1 illustrates the data model and related vocabularies. This is useful, but the figure should be explained in the text, which would be helpful to the reader.
The data set is “linked” because it makes use of the structured vocabularies AAT and Iconclass in RDF form. The links to, e.g., other collections and data sets are therefore indirect via these thesauri, and exist only if the thesauri are also used in the other collections. Collection items do not, e.g., have direct links to the same objects described in DBpedia. It is not clear whether links to related pictures etc. can be aggregated in Europeana using EDM. Linking (for getting the fifth star) is therefore not very “rich”, but it is useful nonetheless.
Figure 2 illustrates the frequencies of concepts used in annotations. They could be discussed in more detail. Does 2a mean that only 4-5 AAT concepts are actually used? If so, linking to AAT is trivial.
After this, Section 5 describes applications of the dataset. Supporting multilingual access has been a driving force behind the project, as well as the desire to establish compatibility with Europeana. It is not explained how well the data finally fits with Europeana; a discussion of data quality would be important in this category of papers. Multilingual access and data compatibility are important aspects of usefulness but not really “applications”, as the section title suggests. Next, “curser search” [11] is described briefly as an application. I tried the demo and it worked, but the claim that this “provides an ideal basis for users to explore the collection” is not substantiated without further explanation. It is not easy to see the benefits by looking at the clusters in the demo interface. Finally, the use of the RDF base as a research artifact for answering a research question about themes in bibles is discussed. It turns out that an additional dataset is needed for this, mapping collection objects to objects in a bibliographical dataset. This is described in Section 6.
The last section, Discussion, summarizes the work and opens some avenues for further development.
This paper presents an important and extensive linked dataset. The general approach seems quite appropriate. The paper does not present novel scientific results, which is acceptable in this category of data description papers. Instead, the authors are advised to focus on 1) evidence of data quality, 2) usefulness, and 3) clarity/completeness of the descriptions. In my view the presentation still needs more rigor and major revisions, as explained above. In particular, the dataset should be described in more detail, the design choices made should be justified and explained explicitly, and lessons learned should be discussed. Footnote 14 gives a web address for a “description” of the data, but there is no detailed documentation about the data. These revisions require more space, but that can be obtained by, e.g., shortening Sections 1-2, perhaps explaining concept usage in Fig. 2 only verbally, and leaving out or shortening several non-informative, speculative, or “museum political” paragraphs now present in the text, such as the last two paragraphs in Discussion. Some concrete evidence and discussion of the linked data quality is also needed, in addition to explaining the museum's process and goal for high quality.