Review Comment:
This paper presents the use of the Sampo framework for network analysis. The novelty of the presented work seems to lie in the transformation of the historical data into linked data (LD), which is then visualized, to some extent, in the already existing Sampo framework. The network analysis itself, however, is carried out in separate Google Colab and Jupyter notebooks. I am unsure how this paper fits the call for a "Tool and Systems Report".
In addition to my concern about the fit for the call, there are several major problems with this paper that need to be addressed.
Even though it is stated multiple times, the goal of this paper remains unclear, as different signals are given throughout. The analysis of the datasets seems, to my knowledge, to be well done. However, the authors state multiple times that the conclusions from this analysis are not the main goal of the paper, which makes sense given that the authors are not experts in the field. Yet there does not seem to be any focus on the stated goal either. The tool itself is not analysed in a way that would answer the question of whether using LD and the Sampo framework is sufficient for this sort of analysis, or whether it even provides more support than conventional tools. The authors do not show how the tool/system supports this sort of analysis (better) as opposed to simply using NetworkX without transforming the data into LD. I suggest that the authors also take a modern network that has previously been analysed without transformation to LD and conduct the same analysis on it. That would provide a direct comparison and would enable the authors to draw conclusions about the actual system/tool/approach, rather than having to state multiple times that the data is historical and hence could be incomplete and biased. (This is an important limitation, but it has nothing to do with the tool and everything to do with the analysed data, and hence does not contribute to the goal of the paper.)
The abstract, introduction and conclusions (called discussion in this work) are not aligned with each other. For example, the abstract and conclusions mention four datasets, but in the introduction and throughout the paper only two datasets are introduced, discussed and analysed. A third dataset is mentioned briefly as being part of future work, but the abstract presents it as if four datasets were transformed into LD, made available and also analysed in the paper itself.
Given the major issue with the contribution of this paper, I refrain from going into too much detail on form and grammar. However, there are inconsistencies throughout the paper, e.g. in the usage of abbreviations and in how figures are referred to (Fig. X vs. Figure X).
Lastly, I have some concerns about how the LD was transformed into a NetworkX network, as this is not described in detail in the work itself. It is also very unclear to me why the two datasets were combined, or taken together, when calculating the measures for the specific actor that are reported in Table 3. This is very unconventional, especially given that certain measures, such as betweenness or eigenvector centrality, take the entire network into account. By combining the two datasets, the numbers are calculated over the merged network in its entirety rather than over the relevant network.
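
To illustrate the last point, here is a minimal sketch of how merging two networks can change the betweenness of an actor who appears in both. The two edge lists are hypothetical toy data, not the authors' datasets, and the actor name "x" is made up. The sketch is written in plain Python so that it runs without dependencies; it computes normalized betweenness for an undirected, unweighted graph, which is the quantity I would expect NetworkX's `betweenness_centrality` to report with its default settings.

```python
from collections import deque
from itertools import combinations


def build(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs(adj, source):
    """Return distances and shortest-path counts from source to reachable nodes."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness(adj, actor):
    """Normalized betweenness of actor: share of shortest paths passing through it."""
    info = {node: bfs(adj, node) for node in adj}
    dist_v, sigma_v = info[actor]
    total = 0.0
    others = [node for node in adj if node != actor]
    for s, t in combinations(others, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s or actor not in dist_s:
            continue  # s and t (or s and actor) are not connected
        if dist_s[actor] + dist_v[t] == dist_s[t]:
            total += sigma_s[actor] * sigma_v[t] / sigma_s[t]
    n = len(adj)
    pairs = (n - 1) * (n - 2) / 2
    return total / pairs if pairs else 0.0


# Hypothetical toy data: actor "x" appears in both datasets.
edges_a = [("x", "a1"), ("x", "a2"), ("a1", "a2")]
edges_b = [("x", "b1"), ("b1", "b2"), ("b2", "b3")]

separate_a = betweenness(build(edges_a), "x")
separate_b = betweenness(build(edges_b), "x")
combined = betweenness(build(edges_a + edges_b), "x")

print(separate_a, separate_b, combined)
```

In this toy case the shared actor has a betweenness of 0 in each dataset on its own, but 0.6 in the merged graph, simply because it becomes the only bridge between the two datasets. A value reported for the combined network therefore says little about the actor's position in either network individually.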