Review Comment:
This manuscript was submitted as 'Data Description' and should be reviewed along the following dimensions: Linked Dataset Descriptions - short papers (typically up to 10 pages) containing a concise description of a Linked Dataset. The paper shall describe in concise and clear terms key characteristics of the dataset as a guide to its usage for various (possibly unforeseen) purposes. In particular, such a paper shall typically give information, amongst others, on the following aspects of the dataset: name, URL, version date and number, licensing, availability, etc.; topic coverage, source for the data, purpose and method of creation and maintenance, reported usage etc.; metrics and statistics on external and internal connectivity, use of established vocabularies (e.g., RDF, OWL, SKOS, FOAF), language expressivity, growth; examples and critical discussion of typical knowledge modeling patterns used; known shortcomings of the dataset. Papers will be evaluated along the following dimensions: (1) Quality and stability of the dataset - evidence must be provided. (2) Usefulness of the dataset, which should be shown by corresponding third-party uses - evidence must be provided. (3) Clarity and completeness of the descriptions. Papers should usually be written by people involved in the generation or maintenance of the dataset, or with the consent of these people. We strongly encourage authors of dataset description papers to provide details about the used vocabularies; ideally using the 5 star rating provided here.
The paper describes the current state of the Rijksmuseum collection as Linked Data. It presents the history of the data and its characteristics, and provides some statistics and an overview of the links from the collection. The paper is solid work that reflects the state of the art in semantic technologies for the cultural heritage domain. It is clear and well written; however, there are some weak points regarding the description of the usefulness of the data, and little is said about some aspects of the dataset, which I specify below.
(1) Quality and stability of the dataset - evidence must be provided.
No evidence as to the quality and stability of the dataset is provided. There is, however, a link to an object, which gives some indication that the dataset is stable. Nevertheless, there is no mention of how frequently the dataset is updated, or of the major difficulties in keeping it up to date while keeping the links to other datasets stable. It is also unclear what procedure is used to maintain version numbers.
You mention in Section 6 that there is a danger of the Getty vocabularies disappearing. What is this statement based on? Please add a reference. The authors also write that the museum chooses to maintain its own vocabulary. What implications does this have for your dataset and for others?
The authors should also consider adding more description of the Iconclass vocabulary: how was it linked, and is it sufficient? Furthermore, the authors write that 189,041 objects have at least one Iconclass annotation. What about the remaining objects? How many of them are linked to other annotations, and to which ones?
In the last paragraph of Section 6 the authors mention a matching process. Was it an automatic process? If it was, how accurate is it, and what are the major difficulties?
(2) Usefulness of the dataset, which should be shown by corresponding third-party uses - evidence must be provided.
The authors describe the usage of the data rather than its usefulness. There are some statistics about the number of people who are using the collection, and some older references to systems which demonstrate that the collection was in use 10 years ago and is in use nowadays through Europeana's API, but they say nothing about the purposes of these usages. How does this collection extend other collections, and what benefits does it bring to other users? Do you know who uses the collection? What types of API requests are made? What has changed between the previous systems you refer to and today's systems?
(3) Clarity and completeness of the descriptions.
Related to Figure 1, it would be interesting to see a figure representing the hierarchy of these concepts in your model, such as how many concepts are related to ic:71 via skos:broader.
Most of the predicates listed in Table 1 are straightforward and easy to understand, but the authors could expand the text in Section 5 and describe these predicates more thoroughly. For example, was dcterms:hasPart also used to specify objects belonging to an exhibition or a collection? How many "edm:type" values are there in the Rijksmuseum collection? Is it a closed set of types? Do you use any vocabulary for the types?
In the discussion section, the authors make some claims without providing any references, for example "only a limited number of institutions have managed to make their collection available as Linked Data" and "... many institutions are hesitant to do so, in fear of losing a possible revenue stream".
In the last paragraph, "digitised objects are added on a daily basis and employees extend and refine information ...": is this done manually? Can you please elaborate on how the information is extended and refined?
Some minor comments:
* The abbreviations presented in Fig. 1 should be explained. For example, I assume ic: is the abbreviation for Iconclass, but this is not stated explicitly before the figure is presented.
* Section 5 on page 4: state explicitly that "textual description" is dc:description, the same way you exemplify the other predicates in this paragraph.
* The URL in reference [1] is not reachable.