Review Comment:
Summary
-------
This article presents a methodology to extract empirical ontology design patterns from knowledge graphs. (In particular, the authors focused their study on the case of Wikidata.) According to them, this approach can be especially useful for knowledge graphs whose ontology is "loosely" present and not enforced at all when new triples are added. The solution they describe is able to output both probabilistic OWL ontology design patterns and probabilistic ShEx shapes. A specific emphasis is put on the statistical aspect of their approach, since it is version-dependent: its recommendations may vary from one point in time to another, depending on the updates that occurred in between. This aspect makes the approach particularly suitable for tracking how practice evolves as the data is updated. To illustrate the solution, detailed sections present and discuss the results.
Structure and writing
---------------------
- The article structure is clear and lets the reader follow the logical path of the presented solution.
- Very good writing quality.
Comments
--------
I only have minor comments regarding the article.
- It would be great to have more information about the time it took to run such analytics and the resources used, so as to give an idea of the "heaviness" of the process.
- To me, one of the most interesting parts is §6.4. First of all, it would be great to have the years and more details for the dates ("April version from now on") of the dumps the authors consider. Also, since some time has elapsed since April, it would be great for the final version of the article to include at least a third checkpoint, e.g. July 2023, so as to have a more solid section: three points in time are better for drawing conclusions and describing tendencies.
- A section with ideas for other datasets would be a nice addition to the article. Even though I admit it is not part of the main scope, as a reader I'd like to know more about the possible uses of the approach for other datasets, instead of just one line in the Conclusion: "Moreover, we would like to test the method on knowledge graphs other than Wikidata". (For example: What to do for datasets having strong ontologies? Could this be used to discover pattern-errors? …)
- The link to the supplementary material is very much appreciated. Nevertheless, it could've been more "complete" in the sense that I'd have liked to see a do-it-all script, or an example of how to run it completely. Also, I couldn't find the .png generator in the scripts (maybe I did not search for it properly though). → Anyway, this could be easily added and does not impact the review at all. ;-)
Overall [Minor Revision]
------------------------
This article presents a very interesting approach for extracting (empirical) ontology design patterns and ShEx shapes from Semantic Web knowledge graphs.
I would be happier if:
- §6.4 was augmented with an additional checkpoint and the findings updated accordingly.
- Some more details were provided when it comes to applying this approach to other datasets.
Compared to the original article (reference [5] in the article) presented at the Wikidata workshop (co-located with ISWC), there is a sufficient amount of new contributions.
For these reasons, I believe this effort is a great piece of work which deserves to be part of the Semantic Web Journal.
Thank you! ☺
Comments
ODPs and Wikidata
Thanks, I find this topic definitely of interest. FYI, this (arXiv) paper may be loosely related (in the end, a different topic though): https://arxiv.org/abs/2205.14032 - Ontology Design Facilitating Wikibase Integration - and a Worked Example for Historical Data, by Cogan Shimizu, Andrew Eells, Seila Gonzalez, Lu Zhou, Pascal Hitzler, Alicia Sheill, Catherine Foley, Dean Rehberger