Review Comment:
The paper describes Linked Data/Semantic Web technologies in use. It models a collection of real world observations as SOSA Observations and uses a series of other ontologies to bring in classes and properties for describing the sensor platform (elephant seals) and observed features.
The custom ontology and the dataset are well presented online in Linked Data form.
I do question some of the modelling in the custom ontology BiGe-Onto, for example, the class Region could just be a GeoSPARQL Geometry, but this is out of scope for this paper which is just about the dataset.
I find the paper uses the ontologies it lists sensibly except for elements of SOSA/GeoSPARQL. The paper declares Geometry objects, e.g. http://linkeddata.cenpat-conicet.gob.ar/page/geometry/point_-63.57_-42.773, with a time and date, but the time and date are of the Sensor visiting the geometry, not a property of the geometry itself. It would be better to associate the time with the Observation, along with the location where it was made. Geometries are also allocated to the sensor in an unordered list that can only be temporally ordered by looking at the geometry time property. It would be better to order the geometries using an rdf:Seq or other RDF collection (see SOSA's extension ontology for an OrderedCollection).
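A minimal Turtle sketch of the two suggestions above, offered as an illustration only. The ex: namespace, ex:obs_1, ex:sensor_AMLJ, ex:track_AMLJ, ex:point_2 and the timestamp value are hypothetical and not taken from the dataset; only the first geometry URI is the one quoted above. Note that geo:hasGeometry has domain geo:Feature in GeoSPARQL, so attaching it to an observation implies that the observation is also a geo:Feature.

```turtle
@prefix sosa: <http://www.w3.org/ns/sosa/> .
@prefix geo:  <http://www.opengis.net/ont/geosparql#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace, not from the dataset

# The time is stated on the Observation, not on the Geometry.
ex:obs_1
    a sosa:Observation ;
    sosa:madeBySensor ex:sensor_AMLJ ;                      # hypothetical sensor URI
    sosa:resultTime "2005-11-30T00:00:00Z"^^xsd:dateTime ;  # illustrative value only
    geo:hasGeometry <http://linkeddata.cenpat-conicet.gob.ar/page/geometry/point_-63.57_-42.773> .

# The visited geometries are held in an explicitly ordered container,
# so their order does not have to be recovered from a time property.
ex:track_AMLJ
    a rdf:Seq ;
    rdf:_1 <http://linkeddata.cenpat-conicet.gob.ar/page/geometry/point_-63.57_-42.773> ;
    rdf:_2 ex:point_2 .                                     # hypothetical second point
```

With this shape the Geometry resources carry only spatial information and can be reused by any number of observations.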
The paper also creates confusion in how it uses geometry in Figure 2 where the geometry example isn't linked to the rest of the example and should probably be in Figure 1.
This Platform/Observation geometry is really the only technical (modelling and data) problem I have with the paper. There are some small ontology errors listed below but these are minor.
The paper's references need work. Many don't include all reference parts, e.g. URIs, and some are to outdated resources, e.g. old versions of standards. I suspect some of the issues are related to LaTeX formatting. I have noted bad references that need fixing.
Below are a series of small points that should be easy to address. They need to be resolved but only the modelling issues above and references are holding this paper back from instant publication.
* Can the online location of the BiGe Ontology be quoted when it is first referenced on Page 1 (reference [4])? It is given later in the paper, in Table 3, but it's not obvious to the reader that the ontology is online until Table 3 is reached.
* Could a web front end be provided for the dataset's SPARQL endpoint: http://linkeddata.cenpat-conicet.gob.ar/sparql? This will make the data much easier to query. There are many easily installable ones to choose from, e.g. https://triply.cc/docs/yasgui/
* In the dataset, the Dublin Core class FileFormat is used incorrectly as a predicate, e.g.:
dcterms:FileFormat "PDF" .
dcterms:FileFormat "PDF" .
The correct RDF uses dcterms:format, which is a property, not dcterms:FileFormat, which is a class:
dcterms:format "PDF" .
dcterms:format "PDF" .
* BiGe-Onto Ontology: this ontology uses mixed forms of URI, e.g. bigeonto:belongsTo, bigeonto:has_location. Can this be standardised in a new ontology version?
* the reference given for QUDT, [13], is not to QUDT itself but to an extension, and its URI is broken. Please just refer to QUDT proper: http://qudt.org.
* no persistent URI is provided for the GeoSPARQL reference, [14]. Should be http://www.opengis.net/doc/IS/geosparql/1.0
* reference to "SPARQL query language" is outdated - the URI for 1.0 is given, should be for 1.1: https://www.w3.org/TR/sparql11-query/
* reference to OWL TIME, [15], does not quote persistent URI, should be https://www.w3.org/TR/owl-time/
* there is a space in the URI http://www.w3id.org/cenpat-gilia/bigeonto/ in Table 3 that needs to be removed for it to work (be able to be clicked on)
* Figure 1 has two spelling mistakes: sosa:host -> sosa:hosts, "average depht" -> "average depth"
* acronym TDR is not explained when first mentioned, page 5, column 1, line 30. Assume Temperature Depth Recorder?
* Figure 2 contains a geo:Geometry instance not linked to any other instances. It should be moved to Figure 1 where it may be linked to the Platform that the text describes it links to
* Incorrect SOSA class use - restatement of Platform/Geometry observation above
In the data I find this (turtle pseudo code):
PREFIX geom:
PREFIX platform:
PREFIX sensor:
sensor:AMLJ
a sosa:Sensor ;
rdfs:label "viaje_config #57" ;
sosa:isHostedBy platform:SES_AMLJ ;
sosa:resultTime "2005-11-30"^^xsd:date ;
geo:hasGeometry geom:point_-63.659_-42.784 ,
...
geom:point_-35.293_-43.747 ,
...
geom:point_-63.874_-42.835 ;
.
sensor:AMLJ is declared of type sosa:Sensor, which is fine for the predicate sosa:isHostedBy but not for sosa:resultTime. sosa:resultTime's documentation states its domain (schema:domainIncludes) as sosa:Actuation, sosa:Observation and sosa:Sampling. While SOSA uses schema:domainIncludes rather than rdfs:domain, and thus technically anything may be used as the subject of sosa:resultTime, the obvious intention is for a temporal thing, an activity, to use it. It makes no sense for the sensor sensor:AMLJ to have a sosa:resultTime. I understand what is being modelled here - all the observations have relative time starting at the sensor's sosa:resultTime - but different modelling must be used. Perhaps look into the use of an ObservationCollection (https://www.w3.org/TR/vocab-ssn-ext/#sosa:ObservationCollection) with a sosa:phenomenonTime to contain the current sosa:resultTime value.
If the multiple geometries given for the sensor indicate the location of observations, then they should be attached to each observation, not the sensor. Observations in the dataset do already indicate sosa:resultTime but not location.
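A minimal Turtle sketch of the ObservationCollection suggestion, under stated assumptions. The ex: identifiers (ex:trip_57, ex:sensor_AMLJ, ex:obs_1, ex:obs_2) are hypothetical. sosa:hasMember is assumed to be the membership property, as in the draft SSN extensions vocabulary linked above; check the term names against the version of the draft actually adopted.

```turtle
@prefix sosa: <http://www.w3.org/ns/sosa/> .
@prefix time: <http://www.w3.org/2006/time#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace, not from the dataset

# The start date moves from the Sensor to a collection of its observations.
ex:trip_57
    a sosa:ObservationCollection ;
    sosa:madeBySensor ex:sensor_AMLJ ;                       # hypothetical sensor URI
    sosa:phenomenonTime [
        a time:Instant ;
        time:inXSDDate "2005-11-30"^^xsd:date                # value currently on the sensor
    ] ;
    sosa:hasMember ex:obs_1 , ex:obs_2 .                     # assumed draft property

# The Sensor keeps only time-independent statements.
ex:sensor_AMLJ
    a sosa:Sensor .
```

Each member observation would then carry its own sosa:resultTime and its own geometry, which removes the need for any temporal property on the sensor.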
* Since geometry is associated with the Platform (the seal), but in an unordered array of points (we don't know which is the first, last, next, etc. point) and not a POLYLINE, how can individual observations be linked to their location?
* The reference for D2RQ, [18], is to an un-linked conference poster. A link must be provided, e.g. http://wifo5-03.informatik.uni-mannheim.de/bizer/pub/Bizer-Cyganiak-D2R-.... Better would be a reference - perhaps a footnote - to the tool's online documentation (https://www.csee.umbc.edu/courses/graduate/691/spring14/01/examples/d2rq... or http://d2rq.org/d2r-server)
* the URI is used in the dataset, e.g.
.
But it should be - with an s, "hosts"
Page 1
Column 1 Line 37 query -> queries
1 39 accessible for machines -> accessed by machines
2 33 remove 'the'
2 34 rephrase "To meet Linked Data requirements, datasets must be described with rich metadata such as controlled vocabularies in a particular form - RDF - and published as a findable resource with a unique identifier."
2 38 reference to SSN (ref [3]) should be (from specref.org):
Armin Haller; Krzysztof Janowicz; Simon Cox; Danh Le Phuoc; Kerry Taylor; Maxime Lefrançois. Semantic Sensor Network Ontology. 19 October 2017. W3C Recommendation. URL: https://www.w3.org/TR/vocab-ssn/ ED: https://w3c.github.io/sdw/ssn/
2 39 reference [4] has some funny numbers at the end that need reformatting
2 41 specie -> species
2 42 "collected along two decades" -> "collected over two decades"
2 43 SES is only defined later, needs to be defined here
2 46 You can't study the demography of non-humans. demography -> ecology
2 50 "and contribute" -> "and to contribute"
2 51 "species behind the changes" -> "species from changes"
Page 2
1 3 Reference [6] has an online accessed date but no URL
1 10 "During their terrestrial phase they are also characterized by high fidelity to the site where they have previously been" -> "During their terrestrial phase they frequently revisit previous years' sites"
ending of language checks
2 41 censuses -> census
Page 8
1 39 the link for the example FoI, SDN:P01::DEPTHC01, is broken. Is http://vocab.nerc.ac.uk/collection/P01/current/DEPTHC01/, should be http://vocab.nerc.ac.uk/collection/OG1/current/DEPTH/
2 21 class is foaf:Person, not foaf:Person
* the sentences between Page 8 and 9 seem to be broken. They read:
Page 8:
One crucial aspect is how to access and analyze data, and especially how to get only that part of data which is of interest for a given research question.
Page 9:
solves the access part, and SPARQL allows to query only a subset of the data.
I suspect a sentence is covered by Table 6.