Review Comment:
Summary
The paper tackles the problem of prescriptive performance analysis and presents a library named PAPyA that enables performing this type of analysis over knowledge graphs. PAPyA implements a whole pipeline for transforming RDF graph data into relational representations over which prescriptive analysis can be performed; its goal is to reduce manual work in the phases of graph processing preparation and data loading. The performance of PAPyA is assessed using several testbeds with large RDF graphs generated using WatDiv and SP2Bench. The tool is publicly available on GitHub.
Positive Points (PPs)
PP1) The paper describes PAPyA, a Python library that supports users in evaluating SPARQL query processing on top of Big Data settings. Several parameters can be configured to analyze the dimensions that affect reproducibility.
PP2) A showcase demonstrating PAPyA's functionality and main features.
PP3) Many options to visualize the outcomes of an experimental study.
Negative Points (NPs)
NP1) It is not clear how PAPyA can be extended to meet the requirements in Section 3.1.
NP2) Nothing is mentioned about the number of users who have utilized PAPyA and whether they consider that it actually reduces the work of query processing assessments.
NP3) It is not clear how different configurations of query rewriting, optimization, and evaluation can be included in PAPyA.
Detailed Comments
The development of PAPyA rests on the assumption that the performance of query processing over RDF graphs is sensitive to various parameters, and PAPyA aims to enhance reproducibility and facilitate finding the best combination of the relevant parameters (e.g., schema, partitioning technique, and storage format) during the application of the Bench-Ranking methodology (previously published by the authors); the sketch below illustrates the resulting search space.
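To make the size of that search space concrete, consider a minimal sketch; this is not PAPyA's actual API, and the dimension values are illustrative ones taken from the Bench-Ranking setting:

    from itertools import product

    # Hypothetical dimensions; the concrete values are illustrative only.
    schemas = ["ST", "VT", "PT"]                           # single/vertical/property tables
    partitioning = ["horizontal", "subject", "predicate"]
    storage_formats = ["csv", "avro", "orc", "parquet"]

    # Bench-Ranking must rank every combination of the three dimensions.
    configurations = list(product(schemas, partitioning, storage_formats))
    print(len(configurations))  # 3 * 3 * 4 = 36 candidate configurations

Even three modest dimensions already yield 36 configurations to benchmark and rank, which is precisely the manual effort the library claims to reduce.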
It is important to make clear in the introduction that this paper extends the work presented in [5] and to explicitly state the novel contributions. Figures and tables that are also part of [5] should be cited accordingly (e.g., Figure 1 and Tables 1 and 2).
Although PAPyA is agnostic to the KPIs, it would be easy to include new metrics. For example, dief@t and dief@k are very informative metrics (see https://github.com/SDM-TIB/diefpy) that enable quantifying the diefficiency, i.e., the continuous performance, of an engine; a usage sketch follows.
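A minimal sketch of computing these metrics, assuming the function names documented in the diefpy repository (load_trace, dieft, diefk); the trace file and query identifier are placeholders:

    import diefpy

    # Answer traces record, per engine and query, the timestamp of each produced answer.
    traces = diefpy.load_trace("traces.csv")        # placeholder path
    dt = diefpy.dieft(traces, "Q9.sparql")          # dief@t: diefficiency until time t
    dk = diefpy.diefk(traces, "Q9.sparql", 1000)    # dief@k: diefficiency until the first k answers

Note that such trace-based metrics require the executed engines to report answer timestamps, a requirement the paper should discuss.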
It is not clear how the query optimization techniques can be configured in an experimental setting.
It is unclear how the SQL queries specified in the Executor are executed. Are they pushed down into the database management system, or is there local optimization and processing? This is another parameter that could impact the outcome of the experiments, as the sketch below illustrates.
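For instance, in PySpark (assuming a JDBC source, which may or may not match PAPyA's setup), a single data source option determines whether a filter runs inside the DBMS or in Spark:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # With pushDownPredicate=true (Spark's default), filters are evaluated by the
    # underlying DBMS; with false, Spark scans the table and filters locally.
    # Either choice changes what the benchmark actually measures.
    df = (spark.read.format("jdbc")
          .option("url", "jdbc:postgresql://localhost/bench")  # placeholder URL
          .option("dbtable", "triples")                        # placeholder table
          .option("pushDownPredicate", "false")                # force local filtering
          .load())
    df.filter(df.predicate == "rdf:type").count()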
Along the same lines, it is unclear how the SPARQL queries are translated into SQL. The transformation process considerably impacts the outcome of the experiments. Are users able to select among various strategies for transforming SPARQL into SQL? Please check the paper by Karim et al. (2021) to see how the representation may considerably impact the outcome in terms of execution time; the example below illustrates the point.
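As a purely hypothetical illustration (not PAPyA's actual translation), the same SPARQL basic graph pattern can yield very different SQL depending on the target relational schema:

    # SPARQL pattern: SELECT ?n WHERE { ?p rdf:type :Prof . ?p :name ?n . }

    # 1) Over a single triples table: one self-join per additional triple pattern.
    sql_triples_table = """
    SELECT t2.object AS n
    FROM triples t1 JOIN triples t2 ON t1.subject = t2.subject
    WHERE t1.predicate = 'rdf:type' AND t1.object = ':Prof'
      AND t2.predicate = ':name'
    """

    # 2) Over a property table with one column per predicate: a single scan, no join.
    sql_property_table = """
    SELECT name AS n FROM property_table WHERE type = ':Prof'
    """

The first strategy pays one join per triple pattern, whereas the second answers the same query with a single scan; users should be told which strategy the Executor applies.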
The authors claim that PAPyA is able to reduce the effort needed to analyze the performance of Big Data systems when processing large RDF graphs. However, how has this statement been validated? How large are the savings? How easy is this framework for the users? Do they actually find PAPyA easy to use, and does it reduce their work? A user study would be needed to support this statement.
Evaluation and Recommendations
The paper describes a library that has the potential to be very useful for the community. However, there are issues that reduce the value of this current version (see the negative points above). The recommendation is a major revision addressing the comments previously presented.
References
Farah Karim et al. Compact representations for efficient storage of semantic sensor data. J. Intell. Inf. Syst. 57(2): 203-228 (2021)