Review Comment:
This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing.
--------------------
The authors have a great idea -- an automated workflow that publishes heterogeneous sensor-derived data almost as fast as it comes off the sensors, both in the SSN ontology model and in (user-designed) data cubes suited to analysis applications. This is both novel and worthwhile.
Then the problems start. The title is good, but the first two-thirds of the abstract consists of misleading, disconnected statements without context.
Grammar problems start here and plague the paper, making it sometimes hard to understand what is meant.
The introduction immediately talks about "event processing" and contradicts the abstract on the nature of an "event": is it "chronologically ordered" or "chronologically independent"? I do not think that event processing, as defined in the first sentence, is relevant to the paper's work (and the author who wrote on page 7 that "sensed data stream", "sensor data" and "event-data" are used interchangeably clearly agrees with me). JMS middleware is used (sensibly enough) for event-oriented programming, but its significance in the overall problem is minor. Instead, the paper is really about streaming heterogeneous sensor data. I cannot see why it talks so much about event processing and complex event processing throughout.
The paper has a running example around a smart building – which works well.
There are a lot of poor word choices, overused commas, grammatical errors, etc. -- too many for me to report. The authors are advised to do a thorough rework. Figure 1 adds no value. Section 2 is much too introductory for this journal: 2.1 should be removed, 2.2 is much too long and mostly irrelevant, and the Figure 3 diagrams are lifted and must be properly attributed ([38] is possibly ok, but [33] certainly is not; see the LOD diagram website for the right attribution for both).
Sec 2.3 -- why does the user "not [have] to worry about domain concepts and restrictions on quality" if they use Semantic Sensor Networks?
3.1.1 What is a "channel" in this context? What is a "sensor"? (It seems a single sensor can measure multiple qualities in your example.)
3.1.2 It seems weird to convert "raw", quite readable, date-and-time character strings to "milliseconds" along the processing chain! Does that not reduce interoperability and require the reference time to be known? Later in the process you clearly convert again into a different date/time format -- where did the base reference time come from this time? Why do this?
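To illustrate the interoperability concern, here is a minimal Python sketch (the timestamp and the alternative epoch are invented for illustration): an ISO 8601 string is self-describing, whereas a raw millisecond count is meaningless unless producer and consumer agree on the reference epoch.

```python
from datetime import datetime, timezone

# An ISO 8601 string carries its own calendar context and timezone.
reading = "2015-01-15T09:30:00+00:00"  # hypothetical sensor timestamp
dt = datetime.fromisoformat(reading)

# Converting to "milliseconds" is only meaningful relative to an agreed
# epoch; a consumer that assumes a different reference decodes a
# different instant.
unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
other_epoch = datetime(2000, 1, 1, tzinfo=timezone.utc)  # hypothetical base

millis_unix = int((dt - unix_epoch).total_seconds() * 1000)
millis_other = int((dt - other_epoch).total_seconds() * 1000)

# Same reading, two incompatible encodings.
assert millis_unix != millis_other
```

The string survives hand-off between systems unchanged; the integer only survives if the epoch travels with it.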
3.1.4 "Event enricher" -- is it using a linked-data "knowledge base" for its "meta-data"? It looks like it, but the writing has a strange way of saying so. It says the W3C SSN ontology is used, but it does not look like it -- it certainly does not show in Listing 4.
3.1.5 "Event middleware" -- say this is JMS and be done with it!
3.2 "Proposed methodology". It seems you have ideas for, but have not implemented, publishing to the LOD cloud. This section should then be deleted (along with all the earlier LOD background) -- it is enough to say that you propose to publish to the LOD cloud. Also remove the claims in the conclusion that you have shown how to do it.
3.2.1 Again, too much well-known material here, including some (all the linking technologies) that is irrelevant to this work. You mention two techniques usually used together, and say you (unusually) use only one of them. Why only that one, then? And why mention the other at all?
4.1 Is this OWL? What part of it? Or something else? Describe your model in the terms of the modelling language you actually use (you are not using the language of OWL). Ref [27] does not speak to a "lack of vocabularies" as you suggest. Since it uses the W3C Data Cube ontology, it certainly does not support your design of a fresh data cube ontology, as is implied. Why is the design of [27], which also puts SSN together with a data cube, not good enough for your use case?
Fig 5(a) is useless (it is covered by 5(b)). Provide the ontology of 5(b) online if possible -- it helps understanding, since you do not formally describe it in the paper. Use double colons for namespace prefixes in the diagram. It is not clear why you model this data as an ontology at all -- why is it not a database schema, since its only use is internal to your software (it seems)?
4.2 The "following rules" are nonsense, as members of D and M are individuals (instances of classes) but members of P are properties. The extended discussion about "intended" and "desired" object URIs is confusing.
Fig 8: do you really mean "rdf:type" to be there? Surely you only want domain properties (you could filter on namespace).
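A minimal sketch of what such a namespace filter could look like (the edwh namespace URI and the example triples are invented, and plain tuples stand in for a real RDF store):

```python
# Keep only predicates in the domain namespace when extracting
# properties for a diagram like Fig 8, dropping infrastructure
# triples such as rdf:type.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EDWH = "http://example.org/edwh#"  # hypothetical stand-in for the paper's namespace

triples = [
    ("ex:obs1", RDF + "type", "edwh:Observation"),
    ("ex:obs1", EDWH + "hasValue", "23.5"),
    ("ex:obs1", EDWH + "observedAt", "2015-01-15T09:30:00Z"),
]

# Filter on the predicate's namespace: only domain properties survive.
domain_props = [(s, p, o) for s, p, o in triples if p.startswith(EDWH)]

assert all(not p.startswith(RDF) for _, p, _ in domain_props)
```

The same one-line filter works as a FILTER(STRSTARTS(...)) clause in a SPARQL extraction query.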
Fig 12 is good to have -- but you should use a much more informative screenshot. Otherwise remove it.
Listing 5: several of these edwh: namespace terms are missing from Fig 5(b).
Sec 5.1, page 14, right column: you should have explained much earlier that you can only do one-dimensional cubes -- this is a major limitation, and stating it earlier would clear up confusion when you explain how the system works (and also why you use different cubes for quarter, hour, day and month).
Sec 5.1: it becomes clear here that you do not use the W3C RDF Data Cube. Why not? You have to justify your independent design! Would your auto-generation not work, for some reason I cannot see?
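To make the point concrete, auto-generating observations in the standard W3C Data Cube vocabulary looks mechanical enough. A hypothetical sketch (dimension and measure names are invented, not taken from the paper):

```python
# Minimal sketch: emitting one qb:Observation as plain
# (subject, predicate, object) triples. The qb: namespace is the real
# W3C Data Cube namespace; everything in ex: is invented.
QB = "http://purl.org/linked-data/cube#"

def make_observation(obs_id, dataset, dims, measures):
    """Emit triples for one qb:Observation attached to a qb:DataSet."""
    subj = f"ex:{obs_id}"
    triples = [
        (subj, "rdf:type", QB + "Observation"),
        (subj, QB + "dataSet", dataset),
    ]
    triples += [(subj, prop, val) for prop, val in dims.items()]
    triples += [(subj, prop, val) for prop, val in measures.items()]
    return triples

obs = make_observation(
    "obs1", "ex:temperatureCube",
    dims={"ex:refPeriod": "2015-01-15T09:00Z"},
    measures={"ex:temperature": "23.5"},
)
assert len(obs) == 4
```

If generation really is this regular, the burden is on the authors to say what the standard vocabulary cannot express for their one-dimensional cubes.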
A lot of space is devoted to evaluation in Section 6. Unfortunately the evaluation does not seem to address the thrust of the paper -- except perhaps "accuracy", which tests whether your software does some of what it should do correctly, something like a reverse-engineering test. The part of the third evaluation that looks at the performance of generating a data cube could be worthwhile if done comprehensively (vary some more parameters).
The rest is all about comparing the construction and retrieval speed and size of your data cube design against the W3C Data Cube. I find these results rather insignificant. Had you written a paper about your design and how it improves in many ways over the RDF Data Cube, they might have been worth having in an evaluation. As it stands, too many assumptions are left unexplained, neither your queries nor the alternative models are adequately described, and the performance is not much different anyway (nor is there any explanation of how the observed differences might scale, what is causing them, or at least what the pattern of differences is). I cannot really see how a reader of this paper can use the evaluation. You do not draw any general conclusions about feasible limits on data cube sizes or numbers, or on the number or frequency of sensor readings. On the other hand, it seems that you *have* set up a W3C RDF Data Cube version of your processing pipeline -- an explanation of this would make a much better paper! How did you align the SSN and Data Cube ontologies?
Why do you assume zero readings are missing data? And why only for some cubes but not others?
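A tiny sketch of why this matters (values invented): a literal zero is a legitimate reading and should be distinguishable from an explicitly absent value.

```python
# Hypothetical reading series: 0.0 is a valid measurement (e.g. 0 °C),
# while None marks a genuinely missing sample.
readings = [21.4, 0.0, None, 19.8]

# The paper's apparent rule: drop zeros as "missing".
treat_zero_as_missing = [r for r in readings if r not in (None, 0.0)]
# The safer rule: drop only explicitly absent values.
treat_none_as_missing = [r for r in readings if r is not None]

# The zero-as-missing rule silently discards a legitimate observation.
assert len(treat_none_as_missing) - len(treat_zero_as_missing) == 1
```

For a temperature cube in particular, conflating 0 with "absent" would bias every aggregate computed over the cube.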
Tables 4 and 5 appear out of sequence.
There are a lot of irrelevant references throughout, and much unrelated "related work" (e.g. [17], [25], and maybe [28], which otherwise needs more detail; [2] needs more detail and comparison, as it sounds highly relevant; [18] also needs expansion, as the difference you identify is rather tiny, deserving a much smaller paper). These should disappear with a tightening of the paper -- there is no need to reference every paper you have read or written. [16] must be updated to the 2014 final version. [9] and [10] are duplicates. Use e.g. {IoT}, {RDF} in titles in BibTeX to preserve upper case where appropriate.
The authors might also be interested in this breaking news: http://www.w3.org/2015/01/spatial, where the SSN and RDF Data Cube ontologies might also be combined. If you made your combination available, it might influence the standard.