Review Comment:
The manuscript entitled “A Semantic Meta-Model for Data Integration and Exploitation in Precision Agriculture and Livestock Farming” describes a semantic model based largely on existing standards for integrating complex datasets for commercial/industrial agricultural applications as part of an EU Horizon 2020 initiative.
Originality
While there are models and ontologies for specific domain areas already in existence, there are no semantic models for integrating the range of complex data described in the manuscript. It is ambitious! As such, this topic is an area of great interest to many across organizations and sectors.
Significance of the results
The paper outlines a clear methodology and results for integrating very complex data. The manuscript clearly describes a process used to 1) understand the impacts of the lack of interoperability and the difficulties in integrating data, 2) understand and document current data variability as well as different use cases and data use objectives towards maximizing agricultural/livestock outputs, 3) identify appropriate existing standards, 4) use the standards to develop a cohesive model across highly variable datasets, and 5) finally apply the model for standardizing and querying data.
Using a survey, the authors worked with 7 companies carrying out very different operations to assess data challenges and needs. A copy of the instrument was included, as well as a summary of the results. The authors use a combination of standards (e.g. the DCAT and QB vocabularies for labelling classes, as well as thesauri such as AGROVOC for standardizing terms) to generate a graphical semantic model based on RDF and OWL. Competency criteria were provided, as were 10 clear examples of implementation and SPARQL queries, illustrating how the model could be used to standardize very complex data and help to generate knowledge. I also liked the suggestion that datasets explicitly state which data standard they conform to (to help make data FAIR).
Both the methodology and the results are compelling.
I have a few questions/suggestions.
1. The authors evaluated and mention a number of pertinent ontologies and thesauri including the EOL, which is an OBO Foundry Library ontology. It seems like there are a number of other OBO Foundry ontologies that would be pertinent (EnvO, FoodOn, AgrO). Can the authors describe why these were not evaluated/mentioned?
2. As the questionnaires collect data from human participants (even if only opinions and input rather than sensitive data such as health records), were research ethics protocols followed? Was an institutional research ethics review conducted?
3. The examples illustrated how specific data could fit the model. Could the authors mention areas of weakness in the model? What kinds of data have they encountered that do not fit? Is there future work planned to tackle these areas?
4. Can the authors provide some examples of how the model helps to harmonize datasets across use cases or data providers (e.g. pig data from different farms using different data collection instruments and/or devices), so that we can see the “starting material” and the harmonized end products?
5. The authors use fields like “weight” to capture the measured mass of animals, but they also note that there could be differences in precision of that measurement which depend on the instrumentation used. Do the authors have fields to address differences in precision (e.g. “measurement precision value”)?
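To make point 5 concrete, here is a minimal Python sketch of what I have in mind. It is not taken from the manuscript or the model: the class name, the field names (including `precision_kg` as a stand-in for a “measurement precision value”), the animal identifier, and the numbers are all hypothetical, and the comparison rule is only one simple way such a field might be used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightObservation:
    """One animal weight reading plus the precision of the instrument used.

    All names here are illustrative assumptions, not terms from the model.
    """
    animal_id: str
    value_kg: float
    precision_kg: float  # hypothetical "measurement precision value" field


def agree_within_precision(a: WeightObservation, b: WeightObservation) -> bool:
    """True when two readings differ by no more than their combined precision."""
    return abs(a.value_kg - b.value_kg) <= a.precision_kg + b.precision_kg


# The same (hypothetical) pig weighed on a coarse scale and on a finer one.
coarse_scale = WeightObservation("pig-001", 104.0, 1.0)
fine_scale = WeightObservation("pig-001", 103.4, 0.1)

print(agree_within_precision(coarse_scale, fine_scale))
```

With precision recorded next to each value, a consumer of the harmonized data could decide whether two readings from different devices are compatible instead of treating every “weight” as equally exact.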
Quality of writing
The manuscript is written very well, and is an excellent example of a systematic approach to developing semantic models. To my mind, everything is there. The process and decisions made are well documented, the use cases and partnerships and real world implementation are presented. There is good technical documentation. Overall, I enjoyed reading this work very much, and support its publication with some very minor revisions.
Spelling mistakes:
Line 39, p 3, “reasearch”
Line 42/43, p 8, “requiremears”
Lines 36/37, p 9, “fata fusion”
Line 50, p 16, “transorm”
Long-term stable URL for resources
The link to the authors' permanent repository is given on the second page of the article. It directs readers to the semantic model’s stable resource page, where one can find the license, terms and definitions, PURLs, curation status, JSON schema, OWL file, and other information. The repository is not GitHub, Figshare or Zenodo, so its discoverability may be reduced. It is maintained by the Open Geospatial Consortium, which is supporting the development of this data standard as part of the EU H2020. It doesn’t have a README file per se, but I think there is sufficient documentation, although I could not find any of the data mapping examples, which might be nice to include there.