Review Comment:
The paper argues that the Semantic Web should be based more on knowledge than it currently is; at present it rests mostly on data (following a wave of interest in Big Data in research and industry).
The paper provides very nice cross-disciplinary motivation for why a more knowledge-oriented approach is needed, discussing humanity as a species that is able to transfer and share knowledge between individual entities, using explicit symbols.
The paper then points out that machines nowadays mostly process data to learn new knowledge, often from scratch and over and over again. This knowledge is not made available to humans in an understandable way but is actioned immediately, so it does not allow the body of human knowledge to grow.
This seems like a regression from what were the goals of the Semantic Web.
The paper also gives two cases showing how sharing explicit knowledge helps to increase the common body of human knowledge: eScience, and knowledge evolution inspired by experimental cultural evolution.
The paper reads well and I agree with the general theses of the paper.
Below, I mention some less clear or arguable issues that could be clarified in the final version of the paper:
*** Definition of "knowledge" ***
The central notion of the paper is "knowledge" and it would be very helpful to have some definition in the paper. Is it understood as "justified true belief"?
I am raising this issue since the paper discusses "knowledge" in many aspects, including also conflicting knowledge that can co-exist in the form of micro-theories.
The paper generally criticizes approaches for learning knowledge from data from scratch. But what kind of knowledge is it that is being learnt? I would guess that it might be "local knowledge" that is learnt by some machine learning algorithm for a limited scenario, concerning "dynamic" knowledge such as whether to recommend some product to a particular customer or what the energy demand of some household will be at a particular time.
When it comes to "global knowledge", this is increasingly re-used by such machine-learning-based approaches via feature engineering, drawing on resources such as Wikipedia or knowledge graphs. I do hope that this trend continues and that knowledge is transferred.
Then, when knowledge that was learnt from scratch evolves while being used for a particular task in a "local" scenario, can it become obsolete for other scenarios on a more global scale?
*** Knowledge representation for machines ***
"The semantic web could be characterised by one of its early slogans: a web for machines."
I think it should be processable by both machines and humans.
The case where the Web is machine-understandable because human programmers share common vocabularies, developed for common tasks, and machines are then expected to "understand" those and return human-understandable knowledge and results, may not be the only one.
I can imagine a case in which some knowledge representation is understandable to machines in a machine-to-machine scenario and, if the machines were autonomous, they could even develop their own language as they evolve, understandable to them but not really to humans.
It is not necessarily the case that a language that is easily interpretable by humans is also better for a machine. Maybe just numbers would be easier for a machine to grasp?
Therefore, I think it needs to be stressed more that Semantic Web languages should provide a common platform to exchange and evolve knowledge between humans, between machines, and between humans and machines.
*** Evolutionary computation ***
The second case discussed in the paper (experimental cultural evolution) is very interesting and inspiring.
It is also less developed and clear than the first one (eScience).
Here the paper mentions evolutionary computing and genetic programming approaches, which usually produce offspring from parents through artificial selection and mutation, enabling a population to evolve.
This seems not to be the case for cultural evolution. Though the author gives some hints on how it differs, and notes that there is no inheritance used in cultural evolution, it would be very helpful to have more explanation on this topic and on how it should work in the Semantic Web scenario.
Comments
A bold and inspiring piece!
A very nice piece of writing:
1. articulating very clearly the value of explicitly expressed knowledge, both in humans and machines, that can be communicated (as opposed to actionable but implicit knowledge that has to be relearned all the time).
2. unashamedly expressing the ambition that such explicit knowledge, in formats that are interpretable by machines, can contribute to the next step in the knowledge ecosystem (storytelling, teaching, book writing, monasteries, universities, semantic webs)
3. using eScience as a good illustration of what could be achieved (a good choice, because eScience is a field where more progress in "real semantics" has been made than elsewhere)
A minor complaint would be that the final section on knowledge dynamics (and the role of evolutionary mechanisms in knowledge dynamics) is rather disconnected from the main thesis of the rest of the paper. The whole "in defense of explicit knowledge" argument of the paper could have been made without that final section.
Finally, I'd like to point out that Euzenat's whole argument about the value of explicit knowledge in a form processable by machines is also very relevant to the major debate currently raging in Artificial Intelligence: should we not just rely fully on statistical techniques that learn actionable patterns from data? This paper is a clear articulation of the viewpoint that the answer to this question is 'no':
"Nowadays, web users are not expected to provide knowledge, nor to access it. It seems that they are mere data provider, mostly through their actions, e.g. click, buy, like. These data are machine processable, but not open. They are kept secret, in silos, to the exclusive exploitation of a single organisation. They are processed by corporations which eventually learn knowledge from that data. But this knowledge, in turn, is not shared nor even prone to be communicated because not necessarily expressed in an articulated language. Instead, it is directly actioned. Hence, knowledge does not improve."
Amen to that.