Review Comment:
Structural Quality Metrics to Evaluate Knowledge Graph Quality
==============================================================
Summary
-------
This article describes quality metrics that can be applied to Knowledge Graphs. In addition, the authors applied them at scale by reviewing the quality of 6 major knowledge graphs, namely Wikidata, DBpedia, YAGO, Freebase, GoogleKG and Raftel. Based on these experiments, they provide an analysis of what KG administrators should do to improve their KGs. In particular, they found that the quality of a KG should not be assessed only through "scale-related indicators such as the number of classes and properties".
Structure and writing
---------------------
- Article structure is easy to follow.
- Writing quality: OK, although some effort could be made to improve the flow.
Major comments
--------------
- The Introduction lacks positioning. The authors should present a stronger story by better motivating their approach and the need for it. In addition, a quality-related example would have been helpful.
- The Related Work section seems a bit short to me, in the sense that it misses some important related efforts. For instance, the Semantic Web works from Debattista are not referenced [1], [2]. The same goes for the generic data quality effort in [3]. In the same vein, this section lacks a detailed comparison of the quality metrics already existing in the literature, which is needed to better understand the gaps that the authors are filling.
- Section 4 lacks a discussion of the benefits of these new quality metrics compared to those already existing in the literature. Similarly, it would have been interesting to have a stronger motivation for the need for such metrics, demonstrating for example that they cover necessary aspects that have been ignored until now. Finally, an aggregated score/metric could have been a nice addition too (even though Table 4 provides some aggregation).
- Section 5 is only a description of the respective scores of the six datasets used to demonstrate the metrics. A stronger discussion, leading to suggested guidelines and actions to improve the quality of each of these datasets, would have been very useful. This would have let the reader understand the value of such a new set of quality metrics in the context of designing/building a Knowledge Graph.
[1] Luzzu—A Methodology and Framework for Linked Data Quality Assessment (Debattista et al.)
[2] Evaluating the Quality of the LOD Cloud: An Empirical Investigation (Debattista et al.)
[3] Requirements for data quality metrics (Heinrich et al.)
Minor comments
--------------
- It seems that the article is Korean-focused. I do not really understand the motivation behind this restriction.
- I do not see the need for Section 3. What are its findings, and how are they used to justify the need for the 'structural quality metrics' introduced in the next section?
- On page 5 line 37, "4.2.2 have been examined in previous studies" needs references.
- On page 7 line 44, typo. "memeber" → "member"
- Having both Fig. 2 and Table 4 seems unnecessary.
- I do not really understand the relevance of the Appendix section…
Overall [REJECT]
----------------
This article presents 6 quality metrics to be used in order to evaluate Knowledge Graphs. In addition, the authors present experiments related to 6 large Knowledge Graphs among which 5 are very popular.
However, in my view, the authors neither motivated their approach sufficiently nor used the experimental results to draw conclusions on how to improve the reviewed Knowledge Graphs. In addition, I think the overall positioning of the article should be revised to better highlight the community's need for such new metrics. Finally, I found the article not fully related to Wikidata (whereas the special issue is "Wikidata 2022"): the authors only consider Wikidata as one of the KGs in their experiments and do not put Wikidata at the center of their efforts.
For these reasons, I do not think this article fits within the scope of this Semantic Web Journal issue.