Review Comment:
The article "SemanticTafsir: Building a Cultural Heritage Ontology and Knowledge Graph from the Quranic Exegesis of al-Tabari" makes a significant contribution to the digital representation of Islamic intellectual heritage. It details the creation of an ontology and knowledge graph based on the tafsir of al-Tabari. The project transforms TEI-encoded manuscript data into an OWL-based semantic framework, making the rich exegetical tradition not only machine-readable but also interoperable within the broader Linked Open Data (LOD) ecosystem.
This contribution is highly original. While previous efforts have focused on representing Quranic text or hadith individually, this initiative semantically models a tafsir corpus, specifically Tafsir al-Tabari. The ontology developed here successfully captures verse-level commentary, narrator chains, hadith citations, and thematic structures. Moreover, the resulting knowledge graph is accessible through a SPARQL endpoint, supporting complex queries and enabling new scholarly interactions with classical Islamic sources. The open-access nature of the ontology, the accompanying source code, and query examples on GitHub reflect a strong commitment to transparency and reusability.
The technical aspects of the ontology are rigorous. Built in OWL 2 using Protégé and tested with reasoners, the ontology exhibits logical consistency and sound modeling principles. It applies recognized ontology design patterns such as part-whole relations, n-ary structures, and enumerated value sets. Particularly notable is the reuse of the SemanticHadith ontology for modeling hadith, along with alignments to Schema.org, Dublin Core, DBpedia, and Wikidata. These design choices support interoperability with existing cultural heritage datasets. Nevertheless, the article would benefit from a more explicit statement of the OWL 2 language level targeted (OWL 2 DL, one of the EL, QL, or RL profiles, or OWL 2 Full), as this affects reasoning capabilities and scalability.
Although the article is well written and structured, there are areas where greater clarity or additional detail would strengthen the work. First, the potential role of CIDOC CRM is notably absent. Given that CIDOC CRM is an ISO standard widely used in cultural heritage informatics to support interoperability, it could provide a valuable bridge between SemanticTafsir and other heritage knowledge graphs. Its mention would also allow readers to better situate this ontology among other domain-spanning efforts such as the Hypermedia Dante Network (HDN), which semantically models literary sources including Dante’s Divine Comedy. Given that the article references digital work on Dante, it seems particularly appropriate to include HDN as a more recent and relevant comparison than the brief reference to Dante in the introduction currently allows.
The article's discussion of hadith is also somewhat underdeveloped. While Figure 1 provides a motivational scenario involving hadith, the actual meaning, role, and relationship of hadith within tafsir are not well explained in the main text. A more robust explanation of how tafsir utilizes hadith, and how this is reflected in the ontology, would help readers unfamiliar with Islamic exegetical traditions. Similarly, Figure 2, which outlines the conceptual model, is not adequately explained. The color coding of classes and properties is not described either in the caption or in the accompanying text. Moreover, while the authors claim to use owl:equivalentClass and owl:equivalentProperty to align new ontology terms with external vocabularies like Schema.org, DBpedia, and SemanticHadith, the actual mappings are not explicitly illustrated within the article itself. Although these alignments are available in the supplementary material, the lack of a summary or mapping table in the main body of the text limits transparency. Including such a table, even in abbreviated form, would significantly improve clarity and support reusability. Listing the most critical classes and their external equivalents would not only showcase the ontology’s interoperability but also help readers understand how SemanticTafsir fits within the broader semantic web landscape.
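As an illustration of the kind of abbreviated mapping table suggested here, a minimal Python sketch follows. The `st:` terms and the pairings are placeholders invented for this comment, not the mappings published by the authors; only the external vocabulary terms (`schema:CreativeWork`, `schema:Person`, `dcterms:title`) are real.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    local_term: str
    relation: str
    external_term: str


# Placeholder rows: illustrative only, NOT taken from the SemanticTafsir ontology.
ALIGNMENTS = [
    Alignment("st:Commentary", "owl:equivalentClass", "schema:CreativeWork"),
    Alignment("st:Narrator", "owl:equivalentClass", "schema:Person"),
    Alignment("st:hasTitle", "owl:equivalentProperty", "dcterms:title"),
]


def format_mapping_table(rows):
    """Render alignments as a fixed-width plain-text table."""
    header = Alignment("SemanticTafsir term", "Relation", "External term")
    all_rows = [header, *rows]
    # Column width = longest cell in that column, header included.
    widths = [max(len(row[i]) for row in all_rows) for i in range(3)]
    lines = [
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in all_rows
    ]
    # Separator line between the header and the data rows.
    lines.insert(1, "  ".join("-" * w for w in widths))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_mapping_table(ALIGNMENTS))
```

A table of this shape, restricted to the most critical classes and properties, would likely fit in a single column of the article.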
Another area for enhancement is the modeling of narrator types. In Section 3.7, the ontology defines narrator types as individuals in a value set, such as sahabi and rawi. However, the article does not clarify whether these are linked to corresponding entities in Wikidata, such as Q17638669 for rawi. Including these links would enrich the ontology’s semantic network and allow for better integration with global knowledge graphs. It would also help address the typological ambiguity of modeling narrator types as individuals rather than subclasses.
The reasoning capability described in the article appears to be limited to ontology validation, not inference at the knowledge graph level. It would be worth exploring whether and how reasoning could be extended to the populated graph itself, particularly in support of inference-driven queries or consistency checks. The article states that multiple reasoners were used (HermiT, Pellet, FaCT++) but does not indicate which was adopted for production use or what reasoning profiles were tested. Greater clarity on this front would enhance the methodological robustness of the work.
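To make the distinction between ontology validation and graph-level inference concrete, the sketch below materializes inferred types over a toy set of triples using the RDFS subclass rule. It is a simplified stand-in for what a reasoner or triple store would do, and the `ex:` identifiers and the class hierarchy are hypothetical, not drawn from the published model.

```python
def materialize_types(triples):
    """Apply (x rdf:type C), (C rdfs:subClassOf D) => (x rdf:type D) until no new triple appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = {(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"}
        type_pairs = {(s, o) for s, p, o in inferred if p == "rdf:type"}
        for instance, cls in type_pairs:
            for sub, sup in subclass_pairs:
                new_triple = (instance, "rdf:type", sup)
                if cls == sub and new_triple not in inferred:
                    inferred.add(new_triple)
                    changed = True
    return inferred


# Hypothetical populated graph: one narrator typed with the most specific class only.
TOY_GRAPH = {
    ("ex:narrator_1", "rdf:type", "ex:Sahabi"),
    ("ex:Sahabi", "rdfs:subClassOf", "ex:Narrator"),
    ("ex:Narrator", "rdfs:subClassOf", "ex:Person"),
}

if __name__ == "__main__":
    for triple in sorted(materialize_types(TOY_GRAPH) - TOY_GRAPH):
        print(triple)
```

A query for all instances of `ex:Person` returns `ex:narrator_1` only after this step, which is the kind of inference-driven query the article could report on.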
Regarding evaluation, the article references a set of competency questions as part of the ontology design and testing process. While it is commendable that SPARQL queries were used to validate the graph’s expressiveness, the criteria for assessing the results, particularly in terms of accuracy and completeness, are not defined. Were human experts consulted? Were any benchmarks used? Providing even a brief explanation of the evaluation methodology would significantly enhance confidence in the conclusions drawn.
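One simple way to define such criteria, offered here only as a suggestion, is to compare the results of each competency-question query against an expert-curated answer set and report precision (accuracy of what was returned) and recall (completeness). The sketch below assumes such a gold standard exists; the question label and result identifiers are hypothetical.

```python
def precision_recall(returned, expected):
    """Compare query results against an expert-curated answer set."""
    returned, expected = set(returned), set(expected)
    true_positives = len(returned & expected)
    # Precision: share of returned answers that are correct.
    precision = true_positives / len(returned) if returned else 0.0
    # Recall: share of expected answers that were actually returned.
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical results for one competency question.
CQ_RESULTS = {
    "CQ1": {
        "returned": {"hadith_1", "hadith_2", "hadith_3"},
        "expected": {"hadith_1", "hadith_2", "hadith_4", "hadith_5"},
    },
}

if __name__ == "__main__":
    for name, data in CQ_RESULTS.items():
        p, r = precision_recall(data["returned"], data["expected"])
        print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

Reporting these two figures per competency question, together with who compiled the expected answers, would address both of the questions raised above.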
Figure 4, which outlines the knowledge graph construction framework, has a presentation problem similar to that of Figure 2: it employs various types of arrows and visual markers to represent different processes or relationships, but these visual distinctions are not explained either through a legend or in the figure caption. As a result, readers are left to infer the meaning of directional flows, stages of transformation, or distinctions between components. Including a legend or providing a more detailed caption that explicitly clarifies the function of each visual element, especially the different kinds of arrows, would significantly improve interpretability.
A minor yet noticeable typographical issue appears in Section 3.7, where Figure 3 is erroneously referenced as “??” instead of by its proper number.
Looking ahead, the project’s future directions are promising and aligned with contemporary trends in digital humanities and knowledge representation. Extending the ontology to include other tafsir texts, developing a natural language interface for query construction, and aligning with additional Islamic knowledge domains (e.g., fiqh, theology) are all logical next steps. In particular, a focus on CIDOC CRM compatibility would significantly enhance the graph’s ability to interoperate with museum and manuscript data across disciplines. The potential for machine learning applications trained on the annotated graph is another exciting avenue, especially for automated tagging and content analysis of unstructured tafsir texts.
In conclusion, SemanticTafsir represents a well-constructed, original, and technically sophisticated effort to model Islamic interpretive literature using semantic web technologies. It bridges classical scholarship and modern informatics, providing scholars, educators, and technologists with a robust tool for exploring and preserving tafsir literature. With improvements in visual clarity, ontology mapping transparency, and methodological reporting, this project could become a foundational infrastructure for the semantic representation of Islamic knowledge.