Morph-KGC: Scalable Knowledge Graph Materialization with Mapping Partitions

Tracking #: 3135-4349

Authors: 
Julián Arenas-Guerrero
David Chaves-Fraga
Jhon Toledo
María S. Pérez
Oscar Corcho

Responsible editor: 
Elena Demidova

Submission type: 
Full Paper
Abstract: 
Knowledge graphs are often constructed from heterogeneous data sources, using declarative rules that map them to a target ontology and materializing them into RDF. When these data sources are large, materializing the entire knowledge graph may be computationally expensive and unsuitable for cases where rapid materialization is required. In this work, we propose an approach to overcome this limitation, based on the novel concept of mapping partitions. Mapping partitions are defined as groups of mapping rules that generate disjoint subsets of the knowledge graph. Each of these groups can be processed separately, reducing the total amount of memory and execution time required by the materialization process. We have included this optimization in our materialization engine Morph-KGC, and we have evaluated it over three different benchmarks. Our experimental results show that, compared with state-of-the-art techniques, the use of mapping partitions in Morph-KGC presents the following advantages: i) it significantly decreases the time required for materialization, ii) it reduces peak memory usage, and iii) it scales to data sizes that other engines currently cannot process.
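As a rough illustration of the idea described in the abstract, the sketch below groups mapping rules so that no two groups can emit the same triple, then materializes one group at a time. It is a simplified assumption, not Morph-KGC's implementation or API: the Rule class, its fields, and the functions partition_by_predicate and materialize are hypothetical, and rules are partitioned only by a constant predicate, which is one simple way to guarantee disjoint outputs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical, simplified mapping rule with a constant predicate."""
    subject_template: str  # e.g. "http://ex.org/person/{id}"
    predicate: str         # constant predicate IRI
    object_column: str     # source column that provides the object value


def partition_by_predicate(rules):
    """Group rules by predicate; triples from different groups always differ."""
    groups = defaultdict(list)
    for rule in rules:
        groups[rule.predicate].append(rule)
    return dict(groups)


def materialize(rules, rows):
    """Yield triples group by group, removing duplicates within each group."""
    for group in partition_by_predicate(rules).values():
        # A fresh set per group: it only ever holds one group's triples,
        # because no other group can produce a duplicate of them.
        seen = set()
        for rule in group:
            for row in rows:
                triple = (
                    rule.subject_template.format(**row),
                    rule.predicate,
                    str(row[rule.object_column]),
                )
                if triple not in seen:
                    seen.add(triple)
                    yield triple


# Toy input (made up for illustration): two rules share a predicate.
rules = [
    Rule("http://ex.org/person/{id}", "http://ex.org/name", "name"),
    Rule("http://ex.org/person/{id}", "http://ex.org/age", "age"),
    Rule("http://ex.org/person/{id}", "http://ex.org/name", "name"),
]
rows = [
    {"id": 1, "name": "Ana", "age": 30},
    {"id": 1, "name": "Ana", "age": 30},
]
triples = list(materialize(rules, rows))
```

Because duplicate elimination is confined to a single group, the set of already-seen triples never has to cover the whole knowledge graph, which is the intuition behind the reduced peak memory claimed in the abstract.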
Tags: 
Reviewed

Decision/Status: 
Accept

Solicited Reviews:
Review #1
By Maria-Esther Vidal submitted on 31/May/2022
Suggestion:
Accept
Review Comment:

The authors have addressed the comments indicated in the two previous reviews. The current version of the paper shows the benefits of the developed techniques. It puts in perspective the need for efficient physical operators to execute the process of knowledge graph construction.