Review Comment:
Summary
The paper presents the Medieval Charters Knowledge Graph (MCKG), an RDF-based dataset designed to support scholarly exploration of medieval charters. The infrastructure is hosted on Wikibase Cloud, and the modeling approach uses a hybrid of CIDOC-CRM and Wikidata-style properties. Validation is performed using Shape Expressions (ShEx) and sheXer, with shapes stored as EntitySchemas in the instance.
The authors are transparent about simulating community contributions and demonstrate that the platform is extensible and viable for cultural heritage applications. Although the dataset is at an early stage, it aligns well with Semantic Web principles and provides significant scholarly value.
Overall, the paper presents a well-executed dataset with strong potential impact. It is technically sound, clearly written, and thoughtfully designed for future extensibility and community involvement. A few minor revisions are necessary to improve metadata completeness and usability, but the core contribution is solid and fits the goals of the Semantic Web Journal and this special issue.
Evaluation Against SWJ Dataset Criteria
Dataset metadata (name, URL, version, license): URL provided; license and version not specified
Topic coverage, domain, and source description: Highly relevant and clearly defined.
Use of standard vocabularies (RDF, OWL, SKOS, FOAF, CIDOC-CRM): Strong vocabulary reuse (CIDOC-CRM, FOAF, Wikidata)
Language expressivity & modelling patterns: Clearly explained hybrid model (CIDOC + Wikidata)
Validation schema and completeness: ShEx shapes provided and validated via sheXer
Internal linking (within dataset): Well-modeled through events and roles
External linking (to other KGs): Not implemented yet; acknowledged as future work
Documentation clarity and completeness: Excellent; transparent and well-written
Usefulness and third-party usage: No external adoption yet; simulated contributions
Long-term hosting and accessibility: Stable on Wikibase Cloud
README / data completeness: Generally complete.
Known limitations: Clearly stated
5-star vocabulary reuse assessment: Not provided; recommended by SWJ guidelines
Section-by-Section Comments
Introduction
Provides a clear and concise motivation for the work
Well framed within the context of digital humanities and Semantic Web practices
Related Work
Covers relevant datasets and projects (e.g., FactGrid, WarSampo)
Could benefit from a more explicit comparison to datasets with active external linking
Methodology and Modelling
Hybrid model (CIDOC + Wikidata properties) is pragmatic and clearly described
Event-based modelling aligns with best practices in cultural heritage KGs
The authors acknowledge modelling trade-offs (e.g., direct document links for incidental mentions)
Optional: Consider exporting the schema as an OWL ontology in a standard RDF serialization (e.g., Turtle) for improved interoperability
Validation and Shape Expressions
ShEx shapes are auto-generated using sheXer and published as EntitySchemas
Example SPARQL queries function as intended
Recommendation: Provide a direct link to the EntitySchema list for easier access
Infrastructure and Dataset Access
Dataset is hosted on Wikibase Cloud with stable URIs and an accompanying GitHub repository
However, no explicit license or version number is visible—this should be addressed
Evaluation
Summary statistics provided (2,211 entities, 12,429 statements)
Competency questions are practical and meaningful
Contributions are currently simulated—this is clearly stated
Corpus coverage metrics (e.g., percentage of AMSPO corpus processed) would improve transparency
Applications and Utility
Use cases (e.g., querying persons by office, exploring geographic coverage) show real potential
No third-party adoption yet—understandable at this stage, but should be noted explicitly.
Additional Insights from Graph Exploration
UI Navigation Asymmetry
The graph currently supports one-way navigation: entity pages (e.g., persons, places) list the documents in which the entity appears, but document pages do not display the entities they mention. This creates a usability barrier for non-SPARQL users.
External Linking to Other Knowledge Graphs
Currently, the dataset does not include any data-level links to external datasets such as Wikidata, GeoNames, DBpedia, or VIAF. This prevents it from reaching the 5th star in the 5-Star Linked Open Data model. The authors acknowledge this and list it as future work.
Required and Recommended Revisions
Required
Add License and Version Information
Clearly specify an open license (e.g., CC0, CC-BY) and include versioning metadata for the dataset. This is critical for proper reuse, citation, and compliance with Linked Open Data standards.
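To illustrate what such metadata could look like, the following is a minimal sketch that builds a small VoID and Dublin Core description in Turtle. The dataset URI, the version string, and the choice of CC0 are placeholders chosen for illustration, not values taken from the paper; the authors would substitute their own.

```python
def dataset_metadata_turtle(dataset_uri, title, license_uri, version):
    """Return a small Turtle description of a dataset (VoID + Dublin Core)."""
    lines = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{title}" ;',
        f"    dcterms:license <{license_uri}> ;",
        f'    owl:versionInfo "{version}" .',
    ]
    return "\n".join(lines)


# Placeholder values (assumptions): the URI and version are hypothetical,
# and CC0 is only one of the licenses the authors might choose.
example = dataset_metadata_turtle(
    "https://example.org/mckg",
    "Medieval Charters Knowledge Graph",
    "https://creativecommons.org/publicdomain/zero/1.0/",
    "0.1.0",
)
print(example)
```

A description of this kind could be published in the GitHub repository and referenced from the paper, so that the license and version are machine-readable as well as human-readable.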
Improve Navigability from Document Pages
Document pages currently do not display the entities they reference. To improve usability, especially for non-technical users, add inverse properties or helper statements (e.g., mentionsEntity) so that the entities mentioned in a charter are directly visible on that charter's page.
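One way to obtain such helper statements is to derive them mechanically from the existing entity-to-document links. The sketch below shows the idea on plain tuples; the identifiers (Q10, Q500) and the forward property name appearsIn are hypothetical, and the actual property identifiers in the Wikibase instance would need to be substituted.

```python
def inverse_mentions(statements, forward_property, inverse_property="mentionsEntity"):
    """For each (entity, forward_property, document) statement, emit the
    helper statement (document, inverse_property, entity)."""
    inverse = set()
    for subject, prop, obj in statements:
        if prop == forward_property:
            inverse.add((obj, inverse_property, subject))
    # Sorted so that the output order is deterministic.
    return sorted(inverse)


# Hypothetical sample data (assumption): two persons appear in charter Q500.
statements = [
    ("Q10", "appearsIn", "Q500"),
    ("Q11", "appearsIn", "Q500"),
    ("Q10", "holdsOffice", "Q77"),  # unrelated statement, ignored
]
helper_statements = inverse_mentions(statements, "appearsIn")
```

The resulting statements could then be uploaded with whatever bulk-editing workflow the authors already use, which keeps the forward links as the single source of truth.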
Strongly Recommended
Add External Entity Linking (for 5-Star LOD Compliance)
The dataset currently lacks data-level links to external knowledge graphs such as Wikidata, VIAF, GeoNames, and DBpedia. Begin linking key entities (especially places, persons, and offices) to existing URIs to enable broader interoperability and LOD Cloud eligibility. Tools like LIMES, SILK, or OpenRefine reconciliation can assist with this.
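As a rough indication of how a first linking pass might work, the sketch below proposes candidate matches by comparing normalised labels with Python's standard difflib module. This is a deliberately simple stand-in for the tools named above, not a description of how they work; the identifiers, labels, and external URIs are invented for illustration, and every candidate would still need manual review by a domain expert.

```python
from difflib import SequenceMatcher


def normalise(label):
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def candidate_links(local_labels, external_labels, threshold=0.9):
    """Return (local_id, external_uri, score) for label pairs whose
    similarity ratio is at least `threshold`."""
    candidates = []
    for local_id, local_label in local_labels.items():
        for ext_uri, ext_label in external_labels.items():
            score = SequenceMatcher(
                None, normalise(local_label), normalise(ext_label)
            ).ratio()
            if score >= threshold:
                candidates.append((local_id, ext_uri, round(score, 2)))
    return candidates


# Hypothetical sample data (assumption): neither the IDs nor the URIs are real.
local_labels = {"Q1": "Santiago de  Compostela", "Q2": "Oviedo"}
external_labels = {
    "https://example.org/external/A": "santiago de compostela",
    "https://example.org/external/B": "Lugo",
}
candidates = candidate_links(local_labels, external_labels)
```

Label similarity alone is error-prone for medieval names with variant spellings, so a high threshold combined with expert confirmation is the safer design choice.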
Include a 5-Star Vocabulary Reuse Self-Assessment
The Semantic Web Journal encourages authors to assess their dataset using the 5-Star Vocabulary Use guidelines. Including this will help demonstrate the dataset's alignment with Linked Data best practices.
Clarify Future Community Contribution Plan
While the authors currently simulate contributions, a roadmap for real user engagement would improve confidence in long-term sustainability. Suggestions include contributor guidelines, editorial workflows, validation mechanisms, or outreach to domain experts and institutions.
Provide Corpus Coverage Metrics
To help reviewers and future users understand dataset completeness, include quantitative coverage information. For example, state how many charters from the AMSPO corpus have been processed out of the total, and what percentage of the entities referenced in those charters (people, places) have been modeled.
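The requested metric is a simple ratio; a minimal sketch of how it might be computed and reported is given below. The counts in the usage example are placeholders and are not figures from the paper or the AMSPO corpus.

```python
def coverage_percent(processed, total):
    """Return the share of processed items as a percentage, to one decimal."""
    if total <= 0:
        raise ValueError("total must be a positive number")
    if processed < 0 or processed > total:
        raise ValueError("processed must lie between 0 and total")
    return round(100 * processed / total, 1)


# Placeholder counts (assumption): NOT actual corpus figures.
charter_coverage = coverage_percent(150, 600)
print(f"Charters processed: {charter_coverage}% of the corpus")
```

Reporting the same ratio separately for charters, persons, and places would make it clear which parts of the corpus are already well represented.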
Add a Direct Link to EntitySchemas
The paper references Shape Expressions (ShEx) and their use in validating entities via Wikibase EntitySchemas, but no direct link is given. Include a link to the EntitySchema list.