Review Comment:
This paper presents the Semantic Quran dataset, a dataset that contains information in 43 languages, including Arabic, Amharic and Amazigh.
First, the authors describe the datasets from which the data has been extracted: Tanzil and the Quranic Arabic Corpus. Then, they present the ontology they have designed to describe the “localization”, or position, of data in the Quran, together with morpho-syntactic descriptions.
When defining the ontology, they state that it is a “general-purpose linguistic vocabulary”. I would say that this description has a much wider scope than that of the ontology they are proposing, which is explicitly tailored to represent the information in the Quran. In this sense, I would suggest that they reconsider this definition.
Regarding “localization” or provenance information, I realize they have not reused standard provenance vocabularies, such as PROV-O, but
In my previous review I said: “Regarding multilingualism, which is one of the main characteristics of this dataset, the authors have simply relied on the rdfs:label property, and have assigned the corresponding language tag to the label. This seems to be enough because only one translation is provided for each preferred label. Why not using skos:altLabel? Or even prefLabel with the corresponding language tag? What if different alternatives were provided for each label, maybe coming from different resources? Why not representing each label as one lexical item and then using “translation or equivalent links” between them? I would suggest the authors to justify this”. The authors have not addressed this point in the new submission.
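To make the question about labels concrete, here is a minimal sketch of the kind of modelling I have in mind, assuming several alternative translations may exist per language. The URI, the label strings and the helper names (`label_triples`, `to_ntriples`) are hypothetical and are not taken from the dataset; only the rdfs:label and SKOS property URIs are standard. The first label per language is emitted as skos:prefLabel and any further ones as skos:altLabel, since SKOS allows at most one preferred label per language tag.

```python
# Illustrative sketch only: the chapter URI and the labels are hypothetical.
SKOS_PREF = "http://www.w3.org/2004/02/skos/core#prefLabel"
SKOS_ALT = "http://www.w3.org/2004/02/skos/core#altLabel"


def label_triples(subject, labels_by_lang):
    """Return (subject, predicate, (text, lang)) triples.

    The first label listed for a language becomes skos:prefLabel;
    any further alternatives become skos:altLabel.
    """
    triples = []
    for lang, labels in labels_by_lang.items():
        for i, text in enumerate(labels):
            pred = SKOS_PREF if i == 0 else SKOS_ALT
            triples.append((subject, pred, (text, lang)))
    return triples


def to_ntriples(triples):
    """Serialise the triples as N-Triples lines with language tags."""
    lines = []
    for s, p, (text, lang) in triples:
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{s}> <{p}> "{escaped}"@{lang} .')
    return "\n".join(lines)


chapter = "http://example.org/quran/chapter/1"  # hypothetical URI
labels = {"en": ["The Opening", "The Opener"], "fr": ["L'Ouverture"]}
triples = label_triples(chapter, labels)
print(to_ntriples(triples))
```

With plain rdfs:label and a language tag, the two English alternatives would be indistinguishable in status; the preferred/alternative split, or separate lexical items linked by translation relations, makes that distinction explicit.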
As for the linking phase, it is still not clear to me why they do not take advantage of the morpho-syntactic information contained in the resource. Could they further clarify this in section 5?
• Arabis -> Arabic (section 2.2)
• … can be improve -> … can be improved (end of section 5)