Review Comment:
This dataset description is about the OWL/RDF version of two manually constructed lexicons providing morphological, syntactic and semantic information for the languages Spanish and Catalan.
(1) Quality of the dataset: the two lexicons have been created manually by linguistic experts in the PAROLE/SIMPLE projects. The data quality can be assumed to be excellent. Therefore, I strongly recommend accepting this paper.
(2) Usefulness: In the context of the emerging Linguistic Linked Open Data cloud, this dataset can be expected to be of great value, as there are a number of highly interesting linkings that could be created based on this dataset.
(3) Clarity
The paper lacks clarity. Many details on the lexicons which have been converted to OWL/RDF are not described explicitly. Important references are missing. The "big picture" is missing as well. I had to google "PAROLE SIMPLE lexicons" and look at the "Final SIMPLE Spanish Lexicon Report" in order to get an overview of the two projects and the lexicons created in these projects.
Please address the following comments in order to improve the clarity:
General comments:
1) The introduction needs to be completely restructured. This section does not "introduce" anything. Some content of the introduction could be moved to sec 2.
2) Please make the relationship between Lexinfo and Lemon clear. Although Lemon is one of the keywords, the first mention of "the Lemon model" occurs on p. 3, sec. 4. A reference for Lemon is missing.
3) The paper does not mention which parts-of-speech are included in the lexicons.
4) It would be nice to mention the relationship between GENELEX and LMF somewhere in the paper.
Some more detailed comments:
- briefly introduce the Lexinfo model and the Lemon model, describe the relationship between the two
- p.1 Sanfilippo et al. : add a proper reference to the references section
- the references list a paper on LMF, but it is never cited in the text; since LMF derives from GENELEX, it would be good to mention LMF
- sec 3: "Syntax semantic linking in the PAROLE/SIMPLE model is also complex and, in most cases, useless." Why is it useless? Please elaborate.
*********** Revised Review *****************************************************************
This paper describes the OWL/RDF version of
- two manually constructed lexicons providing morphological,
syntactic and semantic information for the languages Spanish and Catalan,
- the PAROLE/SIMPLE lexicon model.
The latter has been mapped to an OWL/RDF ontology based on the LexInfo/lemon lexicon model.
Quality of the dataset
The two lexicons have been created manually by linguistic experts in the PAROLE/SIMPLE projects.
Therefore, the quality of the original data can be assumed to be very high.
Since the mapping to OWL/RDF is grounded in a mapping of the PAROLE/SIMPLE lexicon model
to the LexInfo/lemon lexicon model, this is likely to hold for the
dataset described in the paper as well.
Usefulness (or potential usefulness) of the dataset
Medium. Currently, the dataset is not linked to other datasets in the LLOD cloud.
Yet, this lexical resource provides syntactic information which could be
very useful in NLP applications.
There are a number of highly interesting linkings that could be created in the future based on this dataset,
e.g., a linking to lemonUby at the sense level based on subcategorization frames.
Clarity and completeness of the descriptions
The paper lacks clarity. Many details on the lexicons which have been converted to OWL/RDF are not described explicitly. Important references are missing. The "big picture" is missing as well. I had to google "PAROLE SIMPLE lexicons" and look at the "Final SIMPLE Spanish Lexicon Report" in order to get an overview of the two projects and the lexicons created in these projects.
Please address the following comments in order to improve the clarity:
General comments:
1) The introduction needs to be completely restructured. This section does not "introduce" anything. Some content of the introduction could be moved to sec 2.
2) Please make the relationship between Lexinfo and Lemon clear. Although Lemon is one of the keywords, the first mention of "the Lemon model" occurs on p. 3, sec. 4. A reference for Lemon is missing.
3) The paper does not mention which parts-of-speech are included in the lexicons.
4) It would be nice to mention the relationship between GENELEX and LMF somewhere in the paper.
5) sec.2: this section refers to Classes and properties. In the context of RDF/OWL and
LMF-based lexicon models "Class" is a very ambiguous term:
- class in the UML sense, which is how lexicon models use the term
- rdfs:Class
- owl:Class
The same is true for Property: rdf:Property, owl:ObjectProperty, or owl:DatatypeProperty?
This data modeling section would gain much clarity if you were more specific when using the terms
Class and Property.
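To illustrate the kind of explicitness meant here, the following is a minimal sketch in plain Python. The triples and all ex: names are hypothetical and are not taken from the dataset or the paper; the point is only that each term carries an explicit rdf:type declaration, so a reader can tell which sense of "class" or "property" applies.

```python
# Hypothetical triples: the ex: names are illustrative assumptions,
# not terms from the PAROLE/SIMPLE dataset.
TRIPLES = [
    ("ex:LexicalEntry", "rdf:type", "owl:Class"),
    ("ex:SyntacticFrame", "rdf:type", "rdfs:Class"),
    ("ex:hasSense", "rdf:type", "owl:ObjectProperty"),
    ("ex:writtenForm", "rdf:type", "owl:DatatypeProperty"),
    ("ex:note", "rdf:type", "rdf:Property"),
]

CLASS_KINDS = {"rdfs:Class", "owl:Class"}
PROPERTY_KINDS = {"rdf:Property", "owl:ObjectProperty", "owl:DatatypeProperty"}


def declared_kinds(term, triples=TRIPLES):
    """Return the set of rdf:type values explicitly declared for a term."""
    return {o for s, p, o in triples if s == term and p == "rdf:type"}


def describe(term, triples=TRIPLES):
    """Say whether a term is declared as a class or a property, and of which kind."""
    kinds = declared_kinds(term, triples)
    classes = sorted(kinds & CLASS_KINDS)
    props = sorted(kinds & PROPERTY_KINDS)
    if classes:
        return f"{term} is a class ({', '.join(classes)})"
    if props:
        return f"{term} is a property ({', '.join(props)})"
    return f"{term} has no explicit class or property declaration"


if __name__ == "__main__":
    for name in ("ex:LexicalEntry", "ex:hasSense", "ex:writtenForm", "ex:unknown"):
        print(describe(name))
```

A table in the paper that lists each model element together with its declared type would serve the same purpose.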
Some additional comments:
- briefly introduce the Lexinfo model and the Lemon model, describe the relationship between the two
- p.1 Sanfilippo et al. : add a proper reference to the references section
- the references list a paper on LMF, but it is never cited in the text; since LMF derives from GENELEX, it would be good to mention LMF
- sec 3: "Syntax semantic linking in the PAROLE/SIMPLE model is also complex and, in most cases, useless." Why is it useless? Please elaborate.
The paper needs to be revised for minor spelling and grammar issues.