Review Comment:
This paper describes a multilingual corpus based on the Koran, in 42 languages, many of which are severely under-resourced. As such, this work is of interest to researchers in both translation and comparative linguistics.
It was very difficult to review this paper, as the resource described did not seem to be available! The authors must clarify how they intend to maintain the resource's availability and support its future evolution.
The resource itself is described by an ontology that reuses several other linked data vocabularies. The authors do not clarify why they made these particular choices, e.g., why GOLD was used for basic linguistic categories as opposed to ISOcat or OLiA, or why existing linked data models of corpora, such as POWLA, were not used.
The linking section is particularly weak. Links are made to DBpedia and Wiktionary, but it is not clear whether any disambiguation is applied. The only description of this mapping is an XML snippet showing the input to one of the authors' systems. What this matching actually does needs to be defined more clearly; from reading the original paper, I guess the "trigram" metric means it is a fuzzy string-based match? Furthermore, there should be some evaluation of the correctness of these mappings.
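To make concrete what I assume is meant by a "trigram" match, here is a minimal sketch of one common variant: the Dice coefficient over sets of character trigrams. The function names, the lowercasing, and the space padding are my own assumptions for illustration and are not taken from the authors' system, which may compute the metric differently.

```python
def trigrams(s):
    """Return the set of character trigrams of s (lowercased, space-padded)."""
    # Padding is an assumption: it lets word boundaries contribute trigrams.
    padded = "  " + s.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a, b):
    """Dice coefficient over trigram sets: 1.0 for identical strings, 0.0 for no overlap."""
    ta, tb = trigrams(a), trigrams(b)
    return 2 * len(ta & tb) / (len(ta) + len(tb))


# Identical labels score 1.0; variant spellings score somewhere in between.
print(trigram_similarity("Mecca", "Mecca"))
print(trigram_similarity("Mecca", "Makkah"))
```

A score of this kind compares surface strings only, so two distinct entities with similar labels would be linked unless a threshold and some disambiguation step are applied. This is why the paper should state both.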
The use case section is a bit too technical and is only really understandable to someone familiar with SPARQL. A more general, accessible overview of the intended uses would be desirable.
Minor
p1. "English and American labels"? American is not a separate language!