Review Comment:
This is a timely and excellent contribution, fulfilling the purpose of the 10-year special issue: it is well written, provocative and - in many respects - spot on, arguing that the SW community should (re-)focus on what it is good at and on what it was founded for: the Web. The authors argue sharply and in an opinionated manner, which is not bad at all for such a vision/position statement.
I have to admit I regret not having read the paper earlier - I just gave a keynote at DEXA which made a couple of similar points, cf. http://polleres.net/presentations/20190827DEXA_keynote.pdf (which I don't mean to be cited or anything; I think the article is mostly complete in what it wants to pursue, but the authors may want to have a look), while coming from a slightly different angle: I think the community has achieved quite some take-up, but maybe not as visibly and maybe not in the open and decentralized way we wished for.
Anyway, we need not agree 100% there, and I think the conclusions are similar: there's a lot left to be done and to work on, to make the story continue... and here is where the authors might think of adding a more conciliatory end, i.e. highlighting/summarizing the open research questions and topics, and - as they suggest - prioritizing them from their perspective. This is IMHO the missing part, where the current end of the conclusions comes across as a bit too negative. While these open challenges *are* spread over the paper, I think it would be worthwhile/valuable to make them more explicit again in the conclusions.
I have a couple of editorial/detailed remarks, to follow, which should help the final version:
p.2 When you first mention ShEx and SHACL, I thought it may be worthwhile to give references... think of the people not tied closely to our community - if you want to make the paper readable for them as well, you shouldn't expect them to know all acronyms of specs and technologies. It might make sense to read over the paper again with this in mind, which would make it more accessible.
p.3
and we risk baby being thrown --should be-->
and we risk being the baby thrown
or
and we risk the baby being thrown
?
ex Tela quod libet --should be--> ex falso quodlibet
p.4
"Linked Data" as bigger than "Big"
-->
"Linked Data" is bigger than "Big"
"Big Data solutions derive their strength from
a rigorous, extensive schema, which strongly contrasts
with rdf’s highly normalized triple format.
While there have been solutions that leverage Big Data technologies
to address rdf use cases such as querying [16], they
require reformatting data to fit the Big Data paradigm."
I think this needs to be toned down, as it's too narrow (doesn't apply to *all* big data technologies). Suggestion:
"Many Big Data solutions derive their strength from
a rigorous, extensive schema, which strongly contrasts
with rdf’s highly normalized triple format.
While there have been solutions that leverage Big Data technologies
to address rdf use cases such as querying [16], they
often require reformatting data to fit the Big Data paradigm."
p.5
"By keeping
data in millions of small personal data stores close to
people, we are in a much better position to safeguard
people’s most precious digital assets. The challenge
then of course is in connecting these distributed pieces
of data at runtime, which the Solid project [21] does
through Linked Data."
While I agree this is one challenge which is interesting to work on, it should be mentioned that there are many more challenges here, such as reversing network effects and/or providing an appealing/convenient user experience, that additionally need to be overcome, not all of which are technological. Might be worth a mention.
As for the AI and ML section, my honest feeling was that this one got a bit carried away and wasn't as clear and understandable as the other ones. I have a couple of remarks/questions there:
"Developing such approaches is crucial to reduce the high manual
currently required for participating in the SemanticWeb."
-->
"Developing such approaches is crucial to reduce the high manual effort
currently required for participating in the SemanticWeb."
"For instance, semantics and inference
can pre-label data that improve the accuracy of models"
-->
"For instance, semantics and inference
can pre-label data that improves the accuracy of models" ???
I am not sure I got what you want to say here. More details/reformulate?
"Or, post-execution explainability could be achieved by
reasoning over semantic descriptions of nodes."
--> What do you mean by a node here? Again, it is not clear what this means.
"Some more fundamental
questions also need to be answered, such as training
a model under the open world assumption."
Again, I do not understand what you mean here exactly by open world; can you be a bit more specific? Example? I mean, aren't most/all ML applications in AI learning from a partial observation of the world and generalising the models...?
"Semantic inference and first-order logic might lead to
less spectacular conclusions"
... less spectacular than what?
"Maybe this is the better way to position ourselves in one
of the next waves to come: reinforcement learning."
??? How do you get to reinforcement learning here now? I think you need to explain this jump a bit. It might actually be justified to pull this out of the drawer: referring to the 2001 SW article again, you could argue that agents need to plan and expose behavior to act rationally on our behalf, and that the community already went down that route at some point (semantic web services) to some degree, but also forgot about the premises: who would annotate/formally write the semantic descriptions of preconditions and effects of services?
p.6
In section 6, when you call for prioritizing, as mentioned, I would appreciate it if you came up with such a prioritization, even a subjective one.
Likewise here:
"It is impossible
to tell whether the remainder is trivial or not; and many
of the experiences above reveal that some of the most
complex research problems appear exactly there"
Which are these complex problems exactly?
Again, I would find it valuable to have named them (in the opinions of the authors): which do you think are the hard nuts to crack? Open questions? What is the kind of research needed and viable?
"Such endeavours have not been attempted at the research level, let alone they
would be ready for implementation by skilled engineers."
FWIW, I think there *were* at least attempts in this direction, e.g. ActiveRDF
http://www2007.wwwconference.org/htmlpapers/paper272/index.html
and I think this was not the only one, i.e., to easily wrap RDF into dynamically typed programming languages for developers.
"We have been wrong before" --> concrete examples where?
"This brought us as a community into a disconnect with
the place where we can make a difference: the Web.
There, new technologies still emerge every day—just
not ours." --> this is something I am not convinced of
(we may disagree, which is ok). The Web industry and big players have hired some of the brightest minds from this community, and the shift away from strings to entities in Web search demonstrates that something has changed/made an impact. However, this happened behind closed doors, maybe to a smaller extent, with the technologies being implemented differently/proprietarily and in a different role than we had envisioned.
"To this end,
positioning semantic technologies as compliment to machine
learning is a necessity."
-->
To this end,
positioning semantic technologies as *complement* to machine
learning is a necessity.
Plus, as mentioned above, in the end, I'd be happy about a more conciliatory, hopeful summary/end.
I think the paper, as it ends now, gives too much the impression of a pure rant (which it isn't, but the abrupt end might be read as such).
HTH, Axel Polleres