A Unified and Evolvable Knowledge Graph Management Mechanism for Medical Data

Tracking #: 3126-4340

Authors: 
Linlin You
Gengxiang Chen
Hongli Li
Xuan Jiang
Yuren Zhou

Responsible editor: 
Guest Editors SW Meets Health Data Management 2022

Submission type: 
Full Paper
Abstract: 
To ease the management of medical data changing over time, knowledge graphs (KGs) have been widely utilized in life science and healthcare as a common approach to preserving domain-related knowledge. However, the dynamic feature of data can lead to KGs in different versions, which manage the same knowledge but with some of the semantic triples altered. To reduce the overwhelming knowledge duplication of versioned KGs to save storage spaces, and analyze the evolution correlations hidden behind these versions to accelerate the information query process, an appropriate method to integrate the knowledge of all the versions is of great significance. Therefore, in this study, a unified and evolvable knowledge graph management mechanism (EKG2M) is proposed, in which evolution records among versions can be first computed and then merged into an Evolvable Knowledge Graph (EKG) designed according to a unified evolution data structure. After the merging, the generated EKG can be used to support not only conventional operations, i.e., searching for entities or relationships, but also novel queries, i.e., revealing entity evolution routes. Moreover, the assessment of EKG2M shows that it is able to detect the evolution record accurately and reduce both storage space and query time efficiently and effectively.
Tags: 
Reviewed

Decision/Status: 
Reject

Solicited Reviews:
Review #1
Anonymous submitted on 01/Jun/2022
Suggestion:
Accept
Review Comment:

This is a well-written manuscript about an Evolvable Knowledge Graph Management Mechanism integrating the knowledge of all knowledge graph versions in an evolution ontology for medical data. The authors have presented convincingly the state-of-the-art, the originality of their work, as well as their methodology and results. A minor comment concerns the conclusion of the paper. The authors should emphasize more the primary outcome of their work, and discuss the impact and implications of their study, in comparison also with previous works which should be cited.

Review #2
Anonymous submitted on 03/Jul/2022
Suggestion:
Reject
Review Comment:

# A Unified and Evolvable Knowledge Graph Management Mechanism for Medical Data

-- Summary

The work tackles an important issue: managing evolving knowledge graphs and describes an application in the medical domain.
Unfortunately, the paper is missing a problem formalization, and the data model formalization is too vague; hence it is not possible to judge whether the proposed solution is in fact appropriate.
Overall, it is also unclear why the proposed solution is specifically targeted or well suited for the medical domain.
Finally, the experimental evaluation is completely missing a comparison to a baseline system.
Therefore, it is unclear whether this solution is practically advancing the state of the art.

Strong points:

S1) important research area

S2) considers end-to-end pipeline

S3) experiments consider real datasets

Weak points:

W1) lack of formalism

W2) limited and ill-specified scope

W3) unclear experimental evaluation

Detailed Comments:

D1) The introduction lists 3 research challenges. These challenges are not formalized in corresponding problems later on. The work is missing: a formal model of the data and a formal definition of the problems under study.

D2) Overall, the minimal formalization present is incomplete at some points and inconsistent at others. In the definition given with equation 1, the contents of the sets V_i are not defined, nor are the contents of F, A, R. Also, why is V_i = SO_i? Why have both? This is confusing. Furthermore, the title and abstract talk about a Knowledge Graph, while in the text the term ontology is sometimes used, including in core parts of the solution. This is a fundamental inconsistency, which invalidates the work at its core.

D3) A running example of the evolution problem should be presented and used across the paper to clarify the presentation. Overall, using examples cannot in any way substitute a detailed formalization of the data model and the problem.

D4) The work seems to consider only subsequent versions of an ontology. On this note, then, it appears that the goal is to find which definitions map to each other across 2 versions. This is a subset of the ontology alignment problem. If this is the case, a number of questions are left unanswered:
- Why is this mapping required to be done automatically? A new ontology version is usually generated from the previous one with some edits, thus the ontology editor can simply mark these connections.
- Why is no comparison given with ontology alignment methods?
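
To make the first question concrete: when two consecutive versions are available as sets of triples, the low-level changes between them follow from a plain set difference. The sketch below is only an illustrative baseline, not the authors' detection algorithm; the function name `diff_versions` and the toy triples are hypothetical.

```python
def diff_versions(old, new):
    """Return (added, removed) triples needed to turn `old` into `new`.

    Both arguments are sets of (subject, predicate, object) tuples.
    """
    added = new - old      # triples present only in the newer version
    removed = old - new    # triples present only in the older version
    return added, removed


# Hypothetical toy versions (not taken from the paper's dataset).
v1 = {("gene", "is_a", "region"), ("exon", "part_of", "transcript")}
v2 = {("gene", "is_a", "biological_region"), ("exon", "part_of", "transcript")}

added, removed = diff_versions(v1, v2)
```

Such a delta only lists which triples were added or removed; deciding that a removed and an added triple describe the same evolved entity is the alignment step the questions above refer to.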

D5) Figure 1 (B): typo in the label "evolvale". Also, "Other cases" is very vague and hand-wavy. It should be removed or replaced with something concrete.

D6) It is unclear what the purpose of the unified data structure (UDS) is. The KG or ontology needs to be stored within a triplestore in order to enable inference and querying. So what is the role of the UDS in the overall picture of a KG management system? Moreover, ontologies are not very large, so why is space efficiency an issue? Finally, the UDS simply stores all information without any specific or novel idea. It looks like a very naive encoding technique already used in temporal databases with record timestamps. What is supposed to be special about this data structure?
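
For reference, the temporal-database style of encoding mentioned above can be sketched as follows: each triple is stored once together with the interval of versions in which it is valid. This is a minimal illustration under assumed names (`VersionedTriple`, `merge_versions`, `snapshot`) and toy data; it is not the UDS as defined in the paper.

```python
from collections import namedtuple

# One record per maximal run of consecutive versions containing the triple.
VersionedTriple = namedtuple("VersionedTriple", "s p o first last")


def merge_versions(versions):
    """Merge an ordered list of triple sets into interval-stamped records."""
    open_records = {}  # triple -> index of its currently open record
    records = []
    for i, triples in enumerate(versions):
        # Close the interval of every triple missing from this version.
        for t in list(open_records):
            if t not in triples:
                del open_records[t]
        for t in sorted(triples):
            if t in open_records:
                # Triple persists: extend its validity interval.
                idx = open_records[t]
                records[idx] = records[idx]._replace(last=i)
            else:
                # Triple is new (or reappears): open a fresh record.
                open_records[t] = len(records)
                records.append(VersionedTriple(*t, first=i, last=i))
    return records


def snapshot(records, i):
    """Reconstruct the set of triples valid in version `i`."""
    return {(r.s, r.p, r.o) for r in records if r.first <= i <= r.last}


# Hypothetical toy versions: the third version reverts the change of the second.
a = {("gene", "is_a", "region"), ("exon", "part_of", "transcript")}
b = {("gene", "is_a", "biological_region"), ("exon", "part_of", "transcript")}
versions = [a, b, a]
records = merge_versions(versions)
```

Here six triples across three versions collapse into four records, and every version remains recoverable through `snapshot`.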

D7) Overall, the text uses too many abbreviations, which are confusing.

D8) No discussion is given of the issues that a wrong detection can cause.

D9) Only one dataset is used. Moreover, the dataset is not characterized. How large is it, how many triples and how many distinct entities does it contain, and how much does it change over time?

Review #3
Anonymous submitted on 08/Mar/2023
Suggestion:
Reject
Review Comment:

This paper presents a unified and evolvable knowledge graph management mechanism (EKG2M), in which evolution records among versions can be first computed and then merged into an Evolvable Knowledge Graph (EKG) designed according to a unified evolution data structure.

Overall, this is an easy-to-read and understandable paper focusing on an interesting problem. However, the formalization is substandard, everything presented already exists in one form or another, and the paper fails to identify what is new here. Furthermore, the evaluation section is substandard, requiring more details and a comparison with existing works.

--Introduction.
It is not clear what exactly “the evolutionary analysis” mentioned in the introduction is.

--Related work
Several key papers are missing from the related work:
On ontology evolution:
Fouad Zablith, Grigoris Antoniou, Mathieu d'Aquin, Giorgos Flouris, Haridimos Kondylakis, Enrico Motta, Dimitris Plexousakis, Marta Sabou: Ontology evolution: a process-centric survey. Knowl. Eng. Rev. 30(1): 45-75 (2015)
On language of changes
Vicky Papavassiliou, Giorgos Flouris, Irini Fundulaki, Dimitris Kotzinos, Vassilis Christophides: On Detecting High-Level Changes in RDF/S KBs. ISWC 2009: 473-488
On exploring evolution
Haridimos Kondylakis, Nikos Papadakis: EvoRDF: evolving the exploration of ontology evolution. Knowl. Eng. Rev. 33: e12 (2018)

--Evolution detection
“is a set of biological sequence ontologies (SOs)”: what makes biological ontologies so important here?

Is the evolution detection algorithm deterministic and its results unique? What about its correctness and its complexity?

--Evaluation

The evaluation performed is modest and based on a single ontology.

As there are many works on change detection, a comparison with at least one existing work is required.

How were the nine queries used for the query time evaluation selected? What are their characteristics? Random selection is not appropriate here.

It is not clear how the accuracy was evaluated. How was the “truth” constructed in order to compare it with the results of your algorithm?

Storage compression can be further enhanced by dictionary encoding.
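
As a minimal sketch of what dictionary encoding means here: every distinct term is mapped to a small integer identifier and triples are stored as integer tuples, so that long IRIs or labels are stored only once. The function names and toy triples below are hypothetical and serve only as illustration.

```python
def dictionary_encode(triples):
    """Replace each term by an integer id; return (encoded triples, id -> term list)."""
    term_to_id = {}
    encoded = []
    for triple in triples:
        # setdefault assigns the next free id the first time a term is seen.
        encoded.append(
            tuple(term_to_id.setdefault(term, len(term_to_id)) for term in triple)
        )
    # Ids were assigned in insertion order, so the key order is the id order.
    id_to_term = list(term_to_id)
    return encoded, id_to_term


def dictionary_decode(encoded, id_to_term):
    """Inverse of dictionary_encode."""
    return [tuple(id_to_term[i] for i in triple) for triple in encoded]


# Hypothetical toy triples with repeated terms.
triples = [
    ("gene", "is_a", "region"),
    ("exon", "is_a", "region"),
    ("exon", "part_of", "gene"),
]
encoded, id_to_term = dictionary_encode(triples)
```

The saving grows with term length and repetition, which is why this technique is common in RDF stores.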