Review Comment:
Before reviewing this paper I thoroughly read a previous version published at SEMANTiCS 2016 [1] (as mentioned in the cover letter). Both papers are well written and describe 1) metrics to characterise RDF archives (versioning) and 2) five categories of queries to benchmark the performance of data retrieval from RDF archives (the first three types of queries appear in the evaluation). These metrics are reasonable, and they were novel when first published. However, I am not convinced that the current version provides enough additional insight to be included in the special issue, even though it would be a good fit.
The current version shares the same essential ideas with the previous one, provides more detail on those ideas, and, more importantly, offers more comprehensive evaluations with two additional datasets and more complex queries. In other words, most of the new content lies in the evaluations. These extra evaluations are probably useful for researchers in related areas, but they do not seem to provide more insight into the benchmark described in the paper.
* Evaluation of metrics of dataset configuration (metrics in Section 3.1)
Instead of reporting these metrics for one dataset (bear-a), two more tables are given for DBpedia Live (bear-b) and Open Data portals (bear-c) respectively. Again, I find these tables very informative; however, they do not further justify the proposed metrics. They would make an important contribution to a paper surveying popular RDF datasets from an archiving point of view, but less so for the purpose of evaluating a benchmark. The metrics are reasonable (to me), and applying them to one dataset already demonstrates their utility; adding two more datasets adds little to the contribution of the paper.
* Evaluation of atomic types of queries (queries in Section 3.2)
Five types of queries are proposed to cover a broad spectrum of data retrieval from RDF archives. In both versions only three types are evaluated. In this version two more datasets are used, together with more complex queries (i.e. queries with more than a single triple pattern). The same argument stated above applies here too: it would be more interesting to have queries from all five categories than to repeat similar evaluations with different queries from the same categories. The evaluation samples a relatively short range of the spectrum, while the more interesting and more demanding types of queries are not discussed. In addition, the evaluation compares the performance of two engines, Jena and HDT, and HDT outperforms Jena in most of the tests. The paper shows that HDT is faster than Jena under the independent copy policy, which suggests that HDT is faster in general, regardless of the RDF archiving scenario. Later the paper concludes that HDT implements the delta copy policy more efficiently than Jena; however, it is not clear to me whether this conclusion holds after ruling out the general advantage of HDT over Jena. If not, the only conclusion we can draw from the evaluation is that HDT is faster than Jena, which is unsurprising and irrelevant to the main contribution of this paper.
In summary, I would like to see a more comprehensive evaluation that covers a broader spectrum of RDF-archive-related queries, which would probably require a major revision. Meanwhile, this paper is a good fit for the special issue and could be useful for researchers in related fields; therefore I set the decision to Minor Revision to increase the chance of its acceptance.
[1] Fernández, J. D., Umbrich, J., Polleres, A., & Knuth, M. (2016, September). Evaluating Query and Storage Strategies for RDF Archives. In Proceedings of the 12th International Conference on Semantic Systems (pp. 41-48). ACM.