Visual Notations for Viewing and Editing RDF Constraints with UnSHACLed

Tracking #: 2680-3894

Authors: 
Sven Lieber
Ben De Meester
Pieter Heyvaert
Femke Brückmann
Ruben Wambacq
Erik Mannens
Ruben Verborgh
Anastasia Dimou

Responsible editor: 
Karl Hammar

Submission type: 
Full Paper
Abstract: 
The quality of knowledge graphs can be assessed by validating them against specified constraints, which are typically use-case specific and modeled manually by human users. Visualizations can improve the modeling process as they are specifically designed for human information processing, possibly leading to more accurate constraints and, in turn, higher-quality knowledge graphs. However, it is currently unknown how such visualizations support users when viewing RDF constraints, as no scientific evidence for the visualizations’ effectiveness is provided. Furthermore, some of the existing tools are likely suboptimal, as they lack support for edit operations or common constraint types. To establish a baseline, we have defined visual notations to view and edit RDF constraints, and implemented them in UnSHACLed, a tool that is independent of a concrete RDF constraint language. In this paper, we (i) present two visual notations that support all SHACL core constraints, built upon the commonly used visualizations VOWL and UML, (ii) analyze both notations based on cognitively effective design principles, (iii) perform a comparative user study between both visual notations, and (iv) present our open-source tool UnSHACLed, which incorporates our efforts. Users were presented with RDF constraints in both visual notations and had to answer questions about them. Although no statistically significant difference in mean error rates was observed, a majority of participants made fewer errors with ShapeVOWL, and all preferred ShapeVOWL in a self-assessment for answering RDF constraint-related questions. Study participants argued that the richer visual features of ShapeVOWL made it easier to spot constraints, but that a list of constraints – as in ShapeUML – is easier to read; they also suggested that further deviations from the strict UML specification and the introduction of more visual features could improve ShapeUML. From these findings, we conclude that ShapeVOWL has the potential for more user acceptance, but also that the clear and efficient text encoding of ShapeUML can be improved with visual features. A one-size-fits-all approach to RDF constraint visualization and editing will be insufficient; therefore, to support different audiences and use cases, user interfaces of RDF constraint editors need to support different visual notations. In the future, we plan to incorporate different editing approaches and non-linear workflows into UnSHACLed to improve its editing capabilities. Further research can build upon our findings and evaluate a ShapeUML variant with more visual features or investigate a mapping from both visual notations to ShEx constraints.
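As an illustrative aside for readers unfamiliar with RDF constraints: the sketch below is not taken from the paper (the shape, data, and IRIs are hypothetical) and merely shows the kind of SHACL constraint the abstract refers to, validated here with the pyshacl Python library.

    # Minimal sketch of a SHACL constraint and its validation (hypothetical example).
    from rdflib import Graph
    from pyshacl import validate

    shapes_ttl = """
    @prefix sh:  <http://www.w3.org/ns/shacl#> .
    @prefix ex:  <http://example.org/> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    ex:PersonShape a sh:NodeShape ;
        sh:targetClass ex:Person ;
        sh:property [
            sh:path ex:name ;
            sh:datatype xsd:string ;
            sh:minCount 1          # every person needs at least one name
        ] .
    """

    data_ttl = """
    @prefix ex: <http://example.org/> .
    ex:alice a ex:Person .         # violates sh:minCount 1 on ex:name
    """

    shapes = Graph().parse(data=shapes_ttl, format="turtle")
    data = Graph().parse(data=data_ttl, format="turtle")

    conforms, _, report_text = validate(data, shacl_graph=shapes)
    print(conforms)     # False: the data graph violates the shape
    print(report_text)  # human-readable validation report

Visual notations such as ShapeUML and ShapeVOWL aim to convey exactly this kind of shape graphically rather than textually.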
Full PDF Version: 
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
Anonymous submitted on 01/Mar/2021
Suggestion:
Major Revision
Review Comment:

This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing.

This paper describes a new tool (or, rather, a pair of tools) to help with reading and editing RDF constraints. The two tools are evaluated against Moody's Physics of Notations and then undergo an empirical evaluation involving 12 participants. The paper is original, and the new tools have the potential to be significant in the field. Furthermore, the paper is generally well written (a list of typos is included at the end of this review). There are some significant shortcomings with the article, however, which occur mainly in the data analysis (which can be rectified) and in one particular issue with the study design (which it is too late to rectify, as data collection has already occurred). I think that with some changes the paper could be accepted for publication, although it will perhaps not be as wide in scope as the authors intended it to be.

The first three sections of the paper are very strong and would on their own merit some form of publication. Things start to go slightly wrong in section 4, however. On page 14, section 4.1, semiotic clarity is discussed, but it is never properly introduced. We have the definition from [8] (the correspondence between symbols and their reference concepts), but then two further concepts, symbol redundancy/overload and symbol excess/deficit, are introduced. Importantly, it is never stated how semiotic clarity relates to these two new concepts. These new concepts are measured against the competing notations, but then semiotic clarity is re-introduced as a conclusion. In section 4.2, about perceptual discriminability, it is claimed that ShapeUML uses shape as a variable, but it is then immediately stated that only one kind of shape is used, which means it is not a variable. I find the claim (page 14, right column, line 42) that two different shapes can be distinguished because of the incoming edge to be spurious: I can distinguish between the collection of line-shape pairs, but not the shapes themselves. Furthermore, this could easily be obviated by using more shapes. Why use just rectangles and ellipses? In section 4.3, the claim that ShapeVOWL has high semantic transparency does not seem supportable, even with the references. It may have higher semantic transparency than ShapeUML, but that does not mean that it has high transparency in absolute terms.

In general, section 4 relies too much on a single framework (Moody's Physics of Notations), whilst other frameworks could complement it: Gestalt principles, Bertin's semiology of graphics, and the use of pre-attentive features, to name but three. A stronger theoretical comparison could be made if more than one framework were utilised.

Section 6 details the analysis of an empirical study. The number of participants, whilst low, is appropriate given the target user group and the within-subjects design used. However, it does not seem to be a full within-subjects design, as participants did not see each example in both notations. That is a minor issue, however; the larger problem is that large portions of text after 6.4.2 cannot be supported. No statistical difference was found between the notations in 6.4.2 (not even remotely close to significance), but section 6.4.3 then proceeds to explain various "differences" in the data and to give reasons for why these "differences" occur. As an example, on page 24, right column, line 2, we are told that there were "slightly better scores" in ShapeVOWL, together with a potential reason why. However, we have already been told there was no statistical significance, and furthermore the rates of 21% versus 25% probably equate to one participant. Unfortunately, this pattern repeats: despite the differences in scores being more plausible as random variation than as any actual difference, minor "improvements" of ShapeVOWL over ShapeUML are explained in both 6.4.3 and 6.4.4. The conclusion drawn at the end is then that ShapeVOWL is better, which, as mentioned, is entirely unsupported by the data.
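To illustrate the reviewer's point numerically: the sketch below uses hypothetical counts (the number of questions per participant is assumed, not taken from the paper) to show that a 21% versus 25% error-rate gap at this scale is entirely compatible with chance, even under the generous assumption that all answers are independent.

    # Illustration with assumed counts: is a 21% vs 25% error-rate gap significant?
    from scipy.stats import fisher_exact

    n_answers = 12 * 24                    # assume 12 participants x 24 questions
    errors_vowl = round(0.21 * n_answers)  # ~60 errors with ShapeVOWL
    errors_uml = round(0.25 * n_answers)   # ~72 errors with ShapeUML

    table = [
        [errors_vowl, n_answers - errors_vowl],
        [errors_uml, n_answers - errors_uml],
    ]
    _, p_value = fisher_exact(table)
    print(f"p = {p_value:.2f}")  # well above 0.05: consistent with random variation

Since answers are clustered within participants, the effective sample size is even smaller than this calculation assumes, making a significant result less likely still.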

The qualitative data analysis is more compelling, but even taken together with the quantitative results it does not provide remotely enough evidence for the claims on page 28, right column, lines 14 and 16-18 that "ShapeVOWL is preferred" and that the "work strongly suggest[s] that ShapeVOWL will find more user acceptance than ShapeUML".

In order for this work to be published, the entire analysis section needs to be rewritten to properly reflect the results of the analysis. In essence, it would become much shorter (effectively excising sections 6.4.3 and 6.4.4), but would be much more robust. I would like to see a more rounded theoretical comparison, too. I did like this work, but only what is in the data should be explained.

Other minor issues:
Page 5, right column, line 29: is redefining the compartments of a UML diagram not going to confuse people who are used to reading UML diagrams?
Page 9, figure 4: the Venn diagram strongly suggests inclusive or, not exclusive or, given that the intersection is shaded the same as (A-B) and (B-A). This could be a typo, as exclusive or is available in figure 5 as "one of".

Minor typos:
Page 1, left column, line 49: de described
Page 3, left column, line 9: cognitive effective --> cognitively effective
Page 7, right column, line 46: striked through --> struck through
Page 10, left column, line 21: suggest the red 6 is incorrect

Review #2
By Marek Dudas submitted on 07/Mar/2021
Suggestion:
Major Revision
Review Comment:

The paper focuses on the timely problem of SHACL visualization. Little work exists in this domain (compared to, e.g., ontology visualization), and users often seem to make do with textual representations or resort to surrogate tools (such as using a concept map visualization tool for SHACL shapes in Allotrope Data Models).

Given that, the authors chose the rather unambitious goal of conducting a user study to find out whether users perform better with visualizations they are familiar with. Although this might result in a valuable contribution (even more so had it followed up on related work such as [1]), there are several shortcomings in how the study was conducted and analyzed:

1. The definition of the actual tasks users are expected to perform with the help of the visualization is missing.
2. The task given to the test subjects seems to be (it is not very clearly stated in the paper) counting occurrences of various entities in various models. This probably enabled easier execution and measurement of the experiment but might be too far removed from real-world scenarios. Or was the goal only to measure how well the users understand each visual notation? Then there is the problem that the test subjects sometimes (and it is not clear how often) did not understand the question (the task) itself, which the authors themselves acknowledge.
3. Only the number of errors was measured. With time measurement, the results might have been more interesting.
4. A complete list of the questions seems to be missing.
5. That having only 12 participants would lead to statistically non-significant results could have been expected (the power sketch below illustrates this). The authors should focus more on qualitative analysis, e.g., on why the errors were made by the participants (the authors already include some discussion in this direction) instead of how many were made.
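A quick power calculation supports point 5; the sketch below is illustrative only, as the assumed effect size is not a figure from the paper.

    # Post-hoc power sketch for a within-subjects comparison with 12 participants,
    # assuming a medium effect size (Cohen's d = 0.5) on the paired differences.
    from statsmodels.stats.power import TTestPower

    power = TTestPower().solve_power(
        effect_size=0.5,  # assumed medium effect, not a value from the paper
        nobs=12,          # 12 participants
        alpha=0.05,
    )
    print(f"power = {power:.2f}")  # roughly 0.3-0.4, far below the usual 0.8 target

In other words, even if ShapeVOWL were genuinely better by a medium-sized margin, a study of this size would more often than not fail to detect it.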

Apart from the user study, the paper provides a thorough description of two novel (although to some extent previously described) visual notations for RDF constraints and their comparison with respect to Moody's design principles. That part is well written and in itself provides a valuable contribution to the field.

Some minor issues and typos:

Section 2.2 is called “Constraint Types” but talks about generating constraints automatically. This is confusing, as the reader expects an enumeration of the constraint types first.

The authors state that because TopBraid Composer is a commercial tool, details are unavailable. Most commercial tools offer some sort of trial version for testing purposes through which the details can be obtained. There are other commercial tools, like Stardog or Metaphactory, that support editing and using SHACL, and the authors do not even mention these.

Section 2.3: “OntoPad and shaclEditor are visually editors” -> visual editors

[1] Dasgupta, Aritra. Experts’ Familiarity Versus Optimality of Visualization Design: How Familiarity Affects Perceived and Objective Task Performance. In: Cognitive Biases in Visualizations. Springer, Cham, 2018, pp. 75-86.

Review #3
Anonymous submitted on 27/Apr/2021
Suggestion:
Major Revision
Review Comment:

This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing.

The paper introduces two visualization methods focusing on RDF constraints and evaluates their effectiveness. The first method follows the UML visualization paradigm, while the second follows the VOWL paradigm. Both methods cover a number of constraints defined by the Shapes Constraint Language (SHACL). The two visualizations have been implemented, and they are assessed regarding (a) their compliance with a number of design principles and (b) a user experiment in which participants evaluated the system after conducting different visualization-based tasks.

The visualization systems presented in the paper are solid work in their own right. The authors have covered a significant portion of the SHACL specification, exhibited a command of the principles that should be followed, and have actually followed those principles to a considerable extent. The proposal also offers the added value of including a (somewhat implicit) comparison between the visualization paradigms of UML and VOWL.

The main issues with the paper are the scope of the title and the coverage of the human participant-based evaluation. Firstly, the title includes "Editing (of) RDF Constraints"; however, the editing part is not adequately covered. Section 5 mainly lists high-level requirements, and besides that part, editing is only referred to in the "data panel" passage, where raw data editing is listed, and in the "interactions" section, where a brief passage regarding editing is included. The scope of the paper could be reduced to "Visualization", and the authors could describe the editing part in another paper.

The first issue with the coverage of the human participant-based evaluation pertains to the types of tasks involved in the evaluation. The evaluation includes only counting tasks, where visualized items (nodes, edges) are examined in isolation, i.e., a query can be answered by examining each item by itself, without taking into account the relationships between nodes, e.g., "which/how many constraints are placed on a person's address". Literature on the effectiveness of visualization for information retrieval tasks has identified a distinction between simple and complex tasks, and the paper could benefit from examining both types of tasks within the experiment. Secondly, the authors should present some statistics on the datasets used in the evaluation, such as the number of nodes, relationships, depth, etc., so that readers can more readily assess the effectiveness of the visualization for datasets of different sizes. Wherever appropriate, the text could be complemented with relevant discussions.

In section 2.3, the phrase "The tool only statically visualizes RDF constraints" should be further analyzed ("statically" in what sense, and as contrasted with which mode?). The same applies also to the phrase "collaborative workspace for RDF datasets and use different visualizations which are not specified".

Having listed the above, the paper length is already considerable and some readers may find it trying. Parts that are deemed by the authors to be nonessential could be removed.

It's probably best not to include URLs within the text, but list them as footnotes or references.