Review Comment:
I acknowledge that the authors have addressed most of my previous comments and improved the quality of the dataset.
I have a major concern about the sustainability of the dataset, which I consider extremely important for accepting this paper as a Linked Dataset description. Although the authors claim that they do a weekly update, the dumps are only made available for two specific points in time (September 2013 and March 2014). Why is that? How will the dataset be sustained in the future? The query at [1] (results at [2]) shows that the last updates were made in April, which is not encouraging.
[1] select distinct ?x ?y where {?x a ; ?y}
[2] http://bit.ly/1rkXsfQ
Before going further, I would like an explanation of what will happen in this respect. In my opinion, this is fundamental to deciding whether the paper can be accepted as a Linked Dataset description.
I now provide my review according to the three sets of topics that are considered for Linked Dataset descriptions: (1) Quality of the dataset. (2) Usefulness (or potential usefulness) of the dataset. (3) Clarity and completeness of the descriptions.
(1) Quality of the dataset:
The quality of the dataset has improved since my last review, and most of my comments have been taken into account. I enumerate some of them:
- The threshold for deciding whether to select a dataset has been updated, and it now looks much more sensible and less ad hoc.
- There is still the issue of whether properties with the same name in different datasets really refer to the same property. Some reconciliation has been done, though, for time and geographical information, which is good.
- I am happy with how the aspect related to slices has been dealt with in the paper.
- I am also happy with the initial interlinking that has been done with LinkedGeoData. I understand that more work could be done in this respect, but I also understand that this would require a huge effort.
(2) Usefulness:
This is clearly a useful dataset. It has a strong dependency on OpenSpending, but given that this is a well-maintained project, it seems that the data will become more and more useful over time.
(3) Clarity and completeness:
The descriptions are of good quality, and it is acknowledged that the source code is made available and documented.
Final set of comments to improve readability:
- In the second paragraph, the links to CORDIS or Greece public spending look very ad-hoc and weird. I would even suggest removing them as they are such a small set of data in the context of the whole dataset that is presented here, but I leave this decision up to the authors.
- I would suggest adding the namespaces in table 1 to services like prefix.cc, since not all of them are there. I would also recommend renaming the caption of table 1 to "Namespaces and prefixes used in the paper".
- In section 3.1 you state something that I do not understand: "Apart from the fixed data cube meta model, the structure of each dataset is completely up to the creator". I suggest removing it, since it does not add anything to the description. In fact, I do not agree with it unless it is explained differently. Or are you referring to a generic data cube model instead of the RDF Data Cube? If so, it may be good to present this data cube model together with the RDF Data Cube model in section 3.1, so that everything is easier to understand.
- Be careful in general with cross-references to sections. For instance, in section 4 you refer to the RDF Data Cube vocabulary as being described in Section 3, but it is actually described in section 4 in the end.
Typos
- "datasets Such" --> "datasets. Such"
- "allows to serve" --> "serves"
- "can can" --> "can be"
- "adressed" --> "addressed"
- "skos:ConceptSchema" --> "skos:ConceptScheme"