A Semantic Meta-Model for Data Integration and Exploitation in Precision Agriculture and Livestock Farming

Tracking #: 3156-4370

Dimitris Zeginis
Evangelos Kalampokis
Raúl Palma
Rob Atkinson
Konstantinos A. Tarabanis

Responsible editor: 
Guest Editors Global Food System 2021

Submission type: 
Full Paper
In the domains of agriculture and livestock farming, large amounts of data are produced by numerous heterogeneous sources, including sensor data, weather/climate data, statistical and government data, drone/satellite imagery, video, and maps. This plethora of data can be used in precision agriculture and precision livestock farming to provide predictive insights into farming operations, drive real-time operational decisions, and redesign business processes. The predictive power of the data can be further boosted if data from diverse sources are integrated and processed together, thus providing further, as yet unexplored insights. However, the exploitation and integration of data in precision agriculture is not straightforward, since the data: i) cannot be easily discovered across the numerous heterogeneous sources and ii) use different structural and naming conventions, hindering their interoperability. The aim of this paper is to: i) study the characteristics of data used in precision agriculture and livestock farming, ii) study the user requirements related to data modeling and processing from nine real cases in the agriculture, livestock farming and aquaculture domains, and iii) propose a semantic meta-model based on W3C standards (DCAT, PROV-O and the QB vocabulary) that enables the definition of metadata facilitating the discovery, exploration, integration and access of data in the domain.

Solicited Reviews:
Review #1
By Emma Griffiths submitted on 14/Sep/2022
Minor Revision
Review Comment:

The manuscript entitled “A Semantic Meta-Model for Data Integration and Exploitation in Precision Agriculture and Livestock Farming” describes a semantic model based largely on existing standards for integrating complex datasets for commercial/industrial agricultural applications as part of an EU Horizon 2020 initiative.

While there are models and ontologies for specific domain areas already in existence, there are no semantic models for integrating the range of complex data described in the manuscript. It is ambitious! As such, this topic is an area of great interest to many across organizations and sectors.

Significance of the results
The paper outlines a clear methodology and results for integrating very complex data. The manuscript clearly describes a process used to 1) understand the impacts of the lack of interoperability and the difficulties in integrating data, 2) understand and document current data variability as well as different use cases and data use objectives towards maximizing agricultural/livestock outputs, 3) identify appropriate existing standards, 4) use the standards to develop a cohesive model across highly variable datasets, and 5) finally apply the model for standardizing and querying data.

Using a survey, the authors worked with 7 different companies carrying out very different operations to assess data challenges and needs. A copy of the instrument was included, as well as a summary of the results. The authors use a combination of standards (e.g. DCAT and the QB vocabulary for labelling classes, as well as thesauri such as AGROVOC for standardizing terms) to generate a graphical semantic model based on RDF and OWL. Competency criteria were provided, as were 10 clear examples of implementation and SPARQL queries, illustrating how the model could be used to standardize very complex data and help to generate knowledge. I also liked the suggested explicit statement of datasets conforming to a particular data standard (to help make data FAIR).

Both the methodology and the results are compelling.
I have a few questions/suggestions.

1. The authors evaluated and mention a number of pertinent ontologies and thesauri including the EOL, which is an OBO Foundry Library ontology. It seems like there are a number of other OBO Foundry ontologies that would be pertinent (EnvO, FoodOn, AgrO). Can the authors describe why these were not evaluated/mentioned?
2. As the questionnaires pertain to human data (even though these are just opinions and input, and not necessarily sensitive in the way health data are), were research ethics protocols followed? Was an institutional research ethics review conducted?
3. The examples illustrated how specific data could fit the model. Could the authors mention areas of weakness in the model? What kinds of data have you encountered that do not fit? Is there future work planned to tackle these areas?
4. Can the authors provide some examples of how the model helps to harmonize datasets across use cases or data providers, e.g. pig data from different farms using different data collection instruments and/or devices, so that we can see the “starting material” and the harmonized end products?
5. The authors use fields like “weight” to capture the measured mass of animals, but they also note that there could be differences in precision of that measurement which depend on the instrumentation used. Do the authors have fields to address differences in precision (e.g. “measurement precision value”)?

Quality of writing
The manuscript is written very well, and is an excellent example of a systematic approach to developing semantic models. To my mind, everything is there. The process and decisions made are well documented, the use cases and partnerships and real world implementation are presented. There is good technical documentation. Overall, I enjoyed reading this work very much, and support its publication with some very minor revisions.

Spelling mistakes:
Line 39, p 3, “reasearch”
Line 42/43, p 8, “requiremears”
Lines 36/37, p 9, “fata fusion”
Line 50, p 16, “transorm”

Long-term stable URL for resources
The link to the authors' permanent repository is given on the second page of the article, which directs readers to the semantic model’s stable resource page where one can find the license, terms and definitions, PURLs, curation status, JSON schema, OWL file, and other information. The repository is not GitHub, Figshare or Zenodo, so its discoverability may be reduced. It is maintained by the Open Geospatial Consortium, which is supporting the development of this data standard as part of the EU H2020. It doesn’t have a README file per se, but I think there is sufficient documentation, although I could not find any of the data mapping examples, which might be nice to include there.

Review #2
Anonymous submitted on 01/Oct/2022
Review Comment:

Suitable for publication

Review #3
By Rui Zhu submitted on 13/Oct/2022
Review Comment:

The revisions to the Related Work section and to the motivation in the Introduction helped to clarify some of my comments. The authors' responses also addressed most of my concerns. I suggest accepting this paper for publication.