smart-KG: Partition-Based Linked Data Fragments for Querying Knowledge Graphs

Tracking #: 3232-4446

This paper is currently under review
Amr Azzam
Axel Polleres
Javier D. Fernandez
Maribel Acosta

Responsible editor: 
Ruben Verborgh

Submission type: 
Full Paper
RDF and SPARQL provide a uniform way to publish and query billions of triples in open knowledge graphs (KGs) on the Web. Yet, provisioning a fast, reliable, and responsive live querying solution for open KGs is still hardly possible through SPARQL endpoints alone: while such endpoints provide remarkable performance for single queries, they typically cannot cope with highly concurrent query workloads from multiple clients. To mitigate this, the Linked Data Fragments (LDF) framework sparked the design of alternative low-cost interfaces, such as Triple Pattern Fragments (TPF), that partially offload the query processing workload to the client side. On the downside, such interfaces come at the expense of higher network load due to the necessary transfer of intermediate results to the client, which also degrades query performance compared with endpoints. To address this problem, in this work we investigate alternative interfaces able to ship partitions of KGs from the server to the client, with the aim of reducing server-resource consumption. To this end, we first align the formal definitions and notation of the original LDF framework to uniformly present partition-based LDF approaches. Instead of retrieving the exact triples matching a particular query pattern, these novel LDF interfaces retrieve a subset of materialized, compressed graph partitions to be further evaluated on the client side. Then, we present smart-KG, a concrete partition-based LDF approach. Our proposed approach is a step towards a better-balanced share of the query processing load between clients and servers: it ships graph partitions driven by the structure of RDF graphs, grouping entities described with the same sets of properties and classes, which results in a significant reduction in data transfer.
Our experiments demonstrate that smart-KG significantly outperforms existing Web SPARQL interfaces, both on pre-existing benchmarks for highly concurrent query execution and on a novel query workload benchmark that we introduce, inspired by query logs of existing SPARQL endpoints.
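The idea of grouping entities that are described with the same set of properties can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `partition_by_predicate_set` and `partitions_for_star`, the example triples, and the choice to group by predicates only (ignoring classes, compression, and how the server materializes partitions) are all assumptions made for illustration.

```python
from collections import defaultdict

def partition_by_predicate_set(triples):
    """Group triples by the set of predicates used by their subject.

    Hypothetical helper: a simplified stand-in for structure-driven
    partitioning; classes and compression are deliberately ignored.
    """
    # First pass: collect the set of predicates each subject is described with.
    predicates_of = defaultdict(set)
    for s, p, _ in triples:
        predicates_of[s].add(p)

    # Second pass: place each triple in the partition keyed by that set.
    partitions = defaultdict(list)
    for s, p, o in triples:
        partitions[frozenset(predicates_of[s])].append((s, p, o))
    return dict(partitions)

def partitions_for_star(partitions, star_predicates):
    """Return the keys of partitions that could answer a star-shaped pattern.

    A partition is relevant if its predicate set covers every predicate
    that the star pattern asks for (assumed selection rule).
    """
    wanted = frozenset(star_predicates)
    return [key for key in partitions if wanted <= key]

# Illustrative data only; these triples do not come from the paper.
TRIPLES = [
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", '"Bob"'),
    ("ex:bob", "foaf:knows", "ex:alice"),
    ("ex:vienna", "rdfs:label", '"Vienna"'),
]
PARTITIONS = partition_by_predicate_set(TRIPLES)
```

In this sketch, a client asking for all subjects with a `foaf:name` would be handed only the partition shared by `ex:alice` and `ex:bob`, and would evaluate the rest of the query locally; the partition containing `ex:vienna` would never be transferred.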
Full PDF Version: 
Under Review