Helio: a framework for implementing the life cycle of knowledge graphs

Tracking #: 3083-4297

Authors: 
Andrea Cimmino
Raúl García-Castro

Responsible editor: 
Aidan Hogan

Submission type: 
Tool/System Report
Abstract: 
Building and publishing knowledge graphs (KGs) as Linked Data, either on the Web or within private companies, has become a relevant and crucial process in many domains. This process requires users to perform a wide range of tasks that make up the life cycle of a KG, and these tasks usually involve different, unrelated research topics, such as RDF materialisation or link discovery. There is already a large corpus of tools and methods designed to perform these tasks; however, the lack of a single tool that gathers them all leads practitioners to develop ad-hoc pipelines that are not generic and, thus, not reusable. As a result, building and publishing a KG has become a complex and resource-consuming process. In this paper, a generic framework called Helio is presented. The framework aims to cover a set of requirements elicited from the KG life cycle and to provide a tool capable of performing the different tasks required to build and publish KGs. Helio thereby aims to provide users with the means to reduce the effort this process requires and to prevent the development of ad-hoc pipelines. Furthermore, the Helio framework has been applied in many different contexts, from European projects to research work.
Full PDF Version: 
Tags: 
Reviewed

Decision/Status: 
Accept

Solicited Reviews:
Review #1
Anonymous submitted on 22/Mar/2022
Suggestion:
Accept
Review Comment:

I am very pleased with how the authors addressed my comments from the first reviewing round. Publication recommended.

Review #2
By Umutcan Serles submitted on 20/Apr/2022
Suggestion:
Minor Revision
Review Comment:

I would like to thank the authors for their detailed answers and their effort to address my comments. My concerns have been addressed to a great extent, yet I still have some minor points (with reference to the rebuttal letter) that may improve the paper even more:

R3C5: I think the confusion comes from the word "injecting". What is meant by "injecting" is obviously generating HTML based on RDF data, not "injecting" RDF into HTML (e.g., as JSON-LD).

R3C4-B: I think the curation module is still described a bit too ambitiously, as if "any" tool can be connected. There is already a prerequisite that tools either support interacting with SPARQL endpoints or are compatible with the RDF4J interface. I understand the argument about curation tools being generically connected to the framework; however, one example would be nice (the delta use case in Annex E apparently has a SHACL module connected: which one, and how does it work?).

- Generally, it may make sense to mark the requirements as mandatory or optional, since it is emphasised later in the paper that the usage of some modules is optional.