Visual Notations for Viewing RDF Constraints with UnSHACLed

Tracking #: 2834-4048

Sven Lieber
Ben De Meester
Pieter Heyvaert
Femke Brückmann
Ruben Wambacq
Erik Mannens
Ruben Verborgh
Anastasia Dimou

Responsible editor: 
Karl Hammar

Submission type: 
Full Paper
The quality of knowledge graphs can be assessed by validation against specified constraints, which are typically use-case specific and modeled manually by human users. Visualizations can improve the modeling process as they are specifically designed for human information processing, possibly leading to more accurate constraints, and in turn higher quality knowledge graphs. However, it is currently unknown how such visualizations support users when viewing RDF constraints, as no scientific evidence for the visualizations’ effectiveness is provided. Furthermore, some of the existing tools are likely suboptimal, as they lack support for edit operations or common constraint types. To establish a baseline, we have defined visual notations to represent RDF constraints and implemented them in UnSHACLed, a tool that is independent of a concrete RDF constraint language. In this paper, we (i) present two visual notations that support all SHACL core constraints, built upon the commonly used visualizations VOWL and UML, (ii) analyze both notations based on cognitively effective design principles, (iii) perform a comparative user study between both visual notations, and (iv) present our open-source tool UnSHACLed incorporating our efforts. Users were presented with RDF constraints in both visual notations and had to answer questions based on visualization task taxonomies. Although no statistically significant difference in mean error rates was observed, all study participants preferred ShapeVOWL in a self-assessment for answering RDF constraint-related questions. Furthermore, ShapeVOWL adheres to more cognitively effective design principles according to our comparison. Study participants argued that the increased visual features of ShapeVOWL made it easier to spot constraints, but that a list of constraints, as in ShapeUML, is easier to read. They also argued that more deviations from the strict UML specification and the introduction of more visual features could improve ShapeUML.
From these findings we conclude that ShapeVOWL has a higher potential than ShapeUML to represent RDF constraints effectively, but also that the clear and efficient text encoding of ShapeUML can be improved with visual features. A one-size-fits-all approach to RDF constraint visualization and editing will be insufficient. Therefore, to support different audiences and use cases, user interfaces of RDF constraint editors need to support different visual notations. In the future, we plan to incorporate different editing approaches, informed by visualization task taxonomies, and non-linear workflows into UnSHACLed to improve its editing capabilities. Further research can build upon our findings and evaluate a ShapeUML variant with more visual features or investigate a mapping from both visual notations to ShEx constraints.


Solicited Reviews:
Review #1
By Marek Dudas submitted on 14/Aug/2021
Review Comment:

The authors addressed all comments and thanks to adding the follow-up qualitative study, the paper now includes some valuable findings.
What I would still like to see is a more high-level study analyzing the usual workflow of users working with RDF constraints, i.e., what they actually do and where the visualization can help them. An evaluation could then include some real-world scenarios that involve working with the visualization.
However, in its current state, focusing on the low-level evaluation of the visual notations, the paper already represents a valuable contribution and might be considered a first step towards such a high-level study.

Review #2
Anonymous submitted on 29/Aug/2021
Review Comment:

The authors have substantially revised their paper, addressing satisfactorily the reviewers' concerns. Theoretical discussions and clarifications have been provided and the evaluation section has been rewritten to a great extent, improving aspects that were in need of revision and focusing more on qualitative aspects rather than quantitative. The scope of the title has also been properly adjusted. The data, its analysis and the code have been made publicly available.

In its new form, the paper's originality, significance and quality of writing warrant publication, and I can recommend that the paper be accepted.