Review Comment:
This paper is a good contribution to the ongoing discussion of the notion of "knowledge graph". It is highly important for scientists and practitioners to understand what is behind a keyword that suddenly becomes popular. As the authors write: "Since usage has evolved, it is appropriate to develop a definition that follows how the term is currently used." (p.3) Being able to discriminate between a marketing buzzword and an emerging discipline could make a huge difference in resources and in researchers' effort and time.
The paper is essentially a discussion of current definitions of "knowledge graph" in the literature, contrasting them in the light of a wealth of real-life cases that are presented and analyzed. As the authors claim: "Our purpose with this paper is to survey the evolving notion of a knowledge graph, to describe the general space, and to provide an explicit operational description of a knowledge graph." Well done. This is a well written and easily readable paper, well structured and organized. In summary, it is a highly useful article, and on this basis I recommend acceptance for publication with minor revisions.
As proof of the value of the article (the amount of discussion it could provoke, which in my opinion should be the measure of the impact of a scientific article), I will discuss (rather critically) the notion of KG advanced in the paper. I am not suggesting that the authors must address these points; I would just like to point out the issues in case they help to refine some aspects of the paper.
My main criticism of the paper (and of the current use of the notion in the literature today) is that the ontological (in the philosophical sense) notion of knowledge graph is absent. That is, what is the object of inquiry?
I will quote several pieces of text from the paper to show that this concern is relevant. Let us begin with the following expressions in the Introduction: "We provide an updated definition along with a set of knowledge graph requirements [...] We discuss how knowledge graphs as defined are crucial component for the future of the Web have great potential change in data science and domain sciences." We face here an attempt to show that an undefined object, characterized by a list of requirements, has a certain potential for (again) a vague discipline. This is not a problem of the authors, nor will I blame them for it; it is the fate of those who work on KGs today: working towards what could have potential for developing the potential that the work of previous practitioners has created... We definitely need an external point of reference and an anchor to other developments.
The key word here is "knowledge". But in what sense? The authors write: "Knowledge graphs provide an opportunity to expand our understanding of how knowledge can be managed on the Web [...]" I like this claim very much. Now, trying to refine the above claim, the authors write: "Google was one of the first to promote a semantic metadata organizational model described as a “knowledge graph,” and many other organizations have since used the term in published research on knowledge management and graph databases." True. Even though the notion was used before, it was Google's use that gave "knowledge graph" the publicity and prominence it has today.
In 2. Related Work: "[...] we believe that knowledge graphs created for specific domains such as Biology can be considered knowledge graphs if they follow the other requirements." In section 3, where the authors conceptualize "knowledge graph", one can find a thorough discussion and a clear proposed definition. Let us see: "a knowledge graph represents knowledge, and does so using a graph structure" This is a good starting point, but it still avoids telling us what animal we are talking about. It looks like: a knowledge graph is an XXX that represents knowledge and that uses a graph structure. What is XXX? How can one interpret the sentence: "Knowledge graphs use a limited set of relation types" (p.4)? What are the objects that "use a limited.."? A set of entities, clearly. Probably nodes and edges. But soon we will learn that knowledge graphs include semantics and more. The same holds for the following two sentences: "Knowledge graph meaning is expressed as structure" and "Knowledge graph statements are unambiguous". Clearly a knowledge graph includes a set of sentences and a semantics. Or with this: "All identified entities in a knowledge graph, including types and relations [...]". Hence a KG is a complex object that includes entities. We also learn that context is another property that at least the relations in a knowledge graph must have.
In this regard, the more formal definition given falls short:
"Graph: A set of assertions (edges labeled with relations) that are expressed between entities (vertices) where the meaning of the graph is encoded in its structure.
Unambiguous Graph: A graph where the relations and entities are unambiguously identified.
Knowledge Graph: An Unambiguous Graph with a limited set of relations used to label the edges that encodes the provenance, especially justification and attribution, of the assertions."
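To make the nesting of these three definitions concrete, here is a minimal Python sketch of how I read them. It is only an illustration under my own assumptions, not the authors' formalization: the names `Assertion`, `is_identifier`, `is_unambiguous_graph` and `is_knowledge_graph` are hypothetical, "unambiguously identified" is approximated by "written as an IRI", and the `example.org` identifiers are placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Assertion:
    """One labeled edge: subject --relation--> obj, with optional provenance."""
    subject: str
    relation: str
    obj: str
    justification: Optional[str] = None  # why the assertion is believed
    attribution: Optional[str] = None    # who asserted it


def is_identifier(term: str) -> bool:
    # Assumption: an entity or relation counts as unambiguously
    # identified if it is written as an IRI.
    return term.startswith(("http://", "https://", "urn:"))


def is_unambiguous_graph(assertions: Iterable[Assertion]) -> bool:
    # "Unambiguous Graph": every entity and every relation is identified.
    return all(
        is_identifier(term)
        for a in assertions
        for term in (a.subject, a.relation, a.obj)
    )


def is_knowledge_graph(assertions: Iterable[Assertion],
                       allowed_relations: Set[str]) -> bool:
    # "Knowledge Graph": an Unambiguous Graph whose edge labels come from
    # a limited set of relations and whose assertions carry provenance.
    assertions = list(assertions)
    return (
        is_unambiguous_graph(assertions)
        and all(a.relation in allowed_relations for a in assertions)
        and all(bool(a.justification) and bool(a.attribution)
                for a in assertions)
    )


EX = "http://example.org/"  # placeholder namespace
with_provenance = Assertion(
    EX + "aspirin", EX + "treats", EX + "headache",
    justification=EX + "study1", attribution=EX + "curator1",
)
without_provenance = Assertion(EX + "aspirin", EX + "treats", EX + "fever")
ambiguous = Assertion("aspirin", "treats", "headache")
relations = {EX + "treats"}
```

Under these assumptions, `[with_provenance]` passes all three levels, `[without_provenance]` is an Unambiguous Graph but not a Knowledge Graph, and `[ambiguous]` is only a Graph. Read this way, the definition amounts to a labeled graph plus constraints on identifiers, vocabulary and provenance.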
This would be no more than a special type of Semantic Network (with a special type of identifiers, etc.). But, as the authors implicitly and explicitly state, knowledge graphs involve much more than a mathematical definition. In fact, one realizes that editors, visualizations, extraction, integration, learning, semantics for different epistemologies, accessing methods, concerns about usability, reusability, web interfaces, etc. are important aspects of the object known as knowledge graph.
So, a question remains unanswered: what is the field of KG research? How does it differ from the tradition of KR? Or of KB? Or of IR? Or of DB? KGs, judging from the cases in the useful catalog given in the text, seem to be a virtuous combination of real-life software, formal representation techniques, knowledge bases and information retrieval, plus the increasing weight that other techniques (e.g. machine learning) are having in capturing the human semantics involved in text, images, videos, etc. It is important, in my opinion, to remark that the field goes beyond representation of knowledge in the classical formal sense (semantic networks, concept maps, etc.), because it includes multimedia and any other media that could carry human knowledge, and that it encompasses the software and machinery that automate information and knowledge (databases, knowledge bases, information retrieval), because it includes any technique and machinery to deal with that knowledge (capturing, extracting, transforming, visualizing, using, etc.). From the above, the KG field is a combination of science and technology. It is not surprising that the Google patent that used "KG" (and not the many theoretical works that used that notion as developments of Semantic Networks) is now considered the starting point of the field. KGs are interesting as long as they are real-life software (and perhaps more) capturing, integrating, transforming, enriching and providing human knowledge.
In this regard KGs are closely related to the Semantic Web project. One could say that the field of KGs, in some sense, implements the idea of the SW through these "objects" that are KGs, which have a more limited scope (they are less universal than the whole Web with semantics) in serving particular purposes, but have been shown to be more "practical" in the short term.
An important practical detail: the following site is a key resource for the paper
http://graphs.whyis.io
and it was not working on Nov 10th (server error).