Path-based and triplification approaches to mapping data into RDF: user behaviours and recommendations

Tracking #: 3585-4799

Authors: 
Paul Warren
Paul Mulholland
Enrico Daga
Luigi Asprino

Responsible editor: 
Armin Haller

Submission type: 
Full Paper
Abstract: 
Mapping complex structured data to RDF, e.g. for the creation of linked data, requires a clear understanding of the data, but also a clear understanding of the paradigm used by the mapping tool. We illustrate this with an empirical study comparing two different mapping paradigms from the perspective of usability, in particular from the perspective of user errors. One paradigm uses path descriptions, e.g. JSONPath or XPath, to access data elements; the other uses a default triplification which can be queried, e.g. with SPARQL. As an example of the former, the study used YARRRML, to map from CSV, JSON and XML to RDF. As an example of the latter, the study used an extension of SPARQL, SPARQL Anything, to query the same data and CONSTRUCT a set of triples. Our study was a qualitative one, based on observing the kinds of errors made by participants using the two paradigms with identical mapping tasks, and using a grounded approach to categorize these errors. Whilst there are difficulties common to the two paradigms, there are also difficulties specific to each paradigm. For each paradigm, we present recommendations which help ensure that the mapping code is consistent with the data and the desired RDF. We propose future developments to reduce the difficulty users experience with YARRRML and SPARQL Anything. We also make some general recommendations about the future development of mapping tools and techniques. Finally, we propose some research questions for future investigation.
Tags: 
Reviewed

Decision/Status: 
Accept

Solicited Reviews:
Review #1
By Sergio Rodriguez Mendez submitted on 09/Feb/2024
Suggestion:
Accept
Review Comment:

* Summary:
The manuscript presents an empirical user behaviour study that compares two different RDF mapping paradigms: (1) path-based (focusing on YARRRML) and (2) triplification (focusing on SPARQL Anything).
The paper presented an analysis of the common difficulties found in the study, along with recommendations and proposed future improvements to reduce the difficulty users experience.

* Overall Evaluation (ranging from 0-10):
[Q]+ Quality: 10
[R]+ Importance/Relevance: 7
[I]+ Impact: 8
[N]+ Novelty: 8
[S]+ Stability: 9
[U]+ Usefulness: 9
[W]+ Clarity, illustration, and readability: 10
[P]+ Impression score: 9

* Dimensions for research contributions (ranging from 0-10):
(1) Originality (QRN): 8
(2) Significance of the results (ISU): 8.7
(3) Quality of writing (QWP): 9.7

* Overall Impression (1,2,3): ~88
* Suggested Decision: [Accepted]

* General comments:
None

* Feedback:
{
Thank you for addressing the comments and highly improving the manuscript's readability.
}

* Minor corrections:
- The font of the captions for the figures/tables is much smaller than the regular text font. The font size disparity on some pages is notable and doesn't look good (see page 7, table 1).
- Fig 1. "three..." --> "Three..."
- Page 26: "XQUERY" --> "XQuery"
- The paper includes citations for R2RML, RML, YARRRML, and SPARQL Anything. However, it has no JSON, XML, JSONPath, XPath, and XQuery citations. I recommend including those citations. For XML-related technologies, the authors should refer to the W3C Recs.

Review #2
By Miel Vander Sande submitted on 14/Feb/2024
Suggestion:
Major Revision
Review Comment:

This paper presents a qualitative study of two tools for mapping various formats to RDF: YARRRML, a high-level syntax for the RDF Mapping Language (RML), and SPARQLAnything, a SPARQL abstraction of heterogeneous data sources.
According to the authors, each one represents one of the two dominant paradigms in RDF mapping: path descriptions and default triplification. Path description approaches rely on path expression languages such as JSONPath or XPath to reference values in the data, while default triplification approaches turn data formats into a default RDF representation so they can be processed further using SPARQL.
The goal of the study is to analyse the usability of each tool individually in order to make recommendations on their further development.

I found the methodology presented in this paper refreshing. As the authors also state in the Related Work section, there is little qualitative research in the semantic web / RDF community, let alone usability research. Given the contrast with the number of papers that present tools, it seems that this community is not very concerned with people actually using them. Hence, it’s great to see this paper attempts to set an example in this area.
The paper is also well-written and reads quite fluently.

Unfortunately, the content of the paper has some significant weak spots that need to be addressed. My main concern is that the foundation (i.e., goals, setup, data collection and analysis) of the study is simply too light for a full journal paper. The study conducted in this paper is not unlike case study research in software engineering [1]; the authors would have made a much stronger case if their research were rooted in such an established methodology.
The remaining concerns are summarised below:

1. Scientific value

While the research methodology itself is ok, the scientific value of this study is questionable. The study is more of an informal exploration than a full-fledged scientific research study. It is up to the SWJ whether this is accepted.

First, there are no clear research questions and hypotheses that can be tested. These are leading elements even in the aforementioned case study research. Despite this study being exploratory, the authors must have had some expectations of the errors users would make in both YARRRML and SPARQLAnything?
Second, why were participants only recruited from the RDF community? Are they really the target audience of these tools? A concrete description of the study’s targeted population would be of great help.
Third, what the study is actually testing is ambiguous. The authors acknowledge that, at least partially, they were testing the learning curve of these tools. Couldn’t this have been avoided with more intensive training or a more extensive selection of subjects? Also, is it the learning experience of RML, SPARQL, YAML, YARRRML, JSONPath, XPath or SPARQLAnything that is being observed? And what is the effect of this learning experience on the results? Why did the authors not introduce a cold-start? Aren’t you introducing bias? The analysis in Sections 7 and 8 could go a lot deeper in trying to separate those effects.
In general, the attempt to explain the errors of users is a bit superficial. For instance, which users had experience with tools similar to SPARQLAnything (i.e., JSON2RDF), and did the difference in triplification method (JSON2RDF does not use `rdf:_1`) cause some of the mistakes?
Fourth, some of the findings in user behaviour are mostly exposing bad tool design and therefore do not offer much to the external validity of the study. Some of the YARRRML path concatenation errors might be caused by the tool itself. YARRRML offers a shorter syntax to RML, but does little to improve the actual user experience. For instance, would users still make the same mistakes if paths were surrounded by `forEach` instead of `tripleMap` and `path`? While I really appreciate that the authors discuss the limitations of their study, it does not exempt them from designing a study that tests something meaningful.

2. Paper title & definitions

Given my remarks on external validity, the paper is not really in a position to draw conclusions for path-based and triplification approaches other than YARRRML and SPARQLAnything. Hence, I suggest softening that claim in the intro and title. This would also make the scope of the paper more concrete, which is a good thing.
Instead of ‘Path-based and triplification approaches to mapping data into RDF: user behaviours and recommendations’ a suitable title could be something along the lines of ‘Evaluating user behaviour for YARRRML and SPARQLAnything, a path-based and triplification approach to mapping data into RDF’.

Next, the definitions of the paradigms are a little misleading. SPARQLAnything also supports path languages, while (R2)RML for instance uses SQL and other query languages. RML and YARRRML are agnostic of the language with which values are selected in the source. This is defined in the reference formulation and is by design in order to be extensible.
Would it be more accurate to coin them as query/sparql-driven and source-driven?

3. Structure and formatting

Overall, the flow of the paper is pretty good, but I would suggest some alterations to the order of sections to help inexperienced readers.
First, I expected Section 9 a lot earlier in the paper; right after the intro. This way, the reader has the necessary context to understand the exercises and the results.
Second, using author names with reference instead of [12] would make the related work section easier to read.
Third, Sections 5 and 6 would fit better in the appendix. They contain unnecessary detail that makes the paper long. Sections 7 and 8 should then provide enough context for the reader, so I suggest revising those sections for that purpose.

[1] https://link.springer.com/article/10.1007/s10664-008-9102-8

Review #3
Anonymous submitted on 21/May/2024
Suggestion:
Minor Revision
Review Comment:

Declaration: I reviewed the initial version of this paper (as Reviewer 3), but not the revision in between.

I appreciate the effort that went into the revisions, which helped to improve the paper and appreciate the changes and extensions; in particular:

*) I acknowledge that the authors aimed to address my previous concerns over the use of the term "usability" (by removing the term "usability analysis" from the title and adding a reference to a definition). However, the use of the term and the claimed scope of the paper still remain somewhat murky:

- In the introduction, the authors provide a definition in terms of five components (i.e., learnability, efficiency, memorability, errors and satisfaction) and state that their scope is limited to errors and "to a considerable extent" learning experience.

- In other parts of the paper, however, they suggest that the paper addresses usability in a broader sense, e.g.,
- Abstract: "empirical study comparing two different mapping paradigms from the perspective of usability"
- p. 4: "We have focussed here on usability in the sense of Nielsen's [6] five components."; Given that only 1-2 out of the five components are considered, such statements are misleading.

The confusion may also arise from different levels of abstraction: usability assessments are typically performed for particular system implementations and include aspects such as UI, interaction design etc.
By contrast, the authors state that their goal is to contrast mapping *paradigms* in terms of their *conceptual* difficulties (using YARRRML and SPARQL Anything only as "examples" of these paradigms) and their methodology seems less focused on usability than on user behavior. I recommend clearly scoping the paper and, at a minimum, using the chosen terms consistently.

*) Parts of the paper are still exceedingly verbose - this has not changed since the original version of the paper despite my and other reviewers' concerns and suggestions. Sections 4-5 in particular are highly anecdotal and still read more like a technical report. I'm also not convinced that the large number of listings provides particular value. Therefore, I suggest shortening Sections 4-5 and moving the detailed listings to an appendix.

*) I appreciate the addition of a Recommendations section, which adds significantly to the value of the contribution. The "recommendations for future development" for both paradigms in particular should be useful.

The value of the "recommendations for users", on the other hand, is a bit less clear. I understand that these recommendations are synthesized from common errors made by users in the study, but (i) most of them are not a particular outcome of the study, but quite basic and not particularly insightful (i.e., mostly covered in the documentation), and (ii) the general validity of the prioritization is a bit unclear given the low n and the very particular mapping task. Nevertheless, the list of pitfalls to avoid might be helpful to some.

*) I appreciate the added "Comparing the paradigms" section.

*) I appreciate that the authors extended the "overview of the study" section and included some details on the methodology.

Some comments on that: I understand that the study is exploratory and qualitative and that the experimental design is therefore not as rigorous (or: narrow-focused) as a confirmatory study would be, but the description of the experimental design is still quite limited; open questions include:

*) "The participants were free to choose whether to work with SPARQL Anything or YARRRML":
- Was the equal number of participants (9) in each group then just a coincidence, or how did the authors ensure a balanced sample across the groups?
- In the limitations section, the authors state that neither of the participant groups had much knowledge of the two specific technologies under trial → we therefore do not learn about their respective merits/usability in the hands of more advanced users.

*) The authors also note that "both sets of participants had difficulties with JSON arrays". This leads me to wonder whether/which of the reported problems/errors were due to the familiarity with the respective source format vs. truly attributable to characteristics of the mapping paradigm/language.

*) Thank you for adding a discussion on limitations;
As the goal of such qualitative, exploratory studies is to produce new hypotheses, I was hoping for more insights and directions for future work. Given that the stated goal of the paper was to discern "conceptual differences" between mapping paradigms (using SPARQL Anything and YARRRML only as "examples"), I was also expecting more general insights on that level.

*) Terminology: "Triplification" vs. "Path-based" paradigm

I'm still not sure if the distinction between "path-based" vs. "triplification" is clear and commonly accepted or if this may be a false dichotomy. The implication that e.g., R2RML is not a "triplification approach" seems to be inconsistent with common use of the term (cf., e.g., https://www.snap4city.org/drupal/node/359, https://pypi.org/project/triplify-csv/). Although there does not seem to be a single commonly accepted definition in the literature, [1] for instance defines "triplification" as "The process of transforming data stored in relational databases (RDBs) into sets RDF triples is known as triplification" - which would include "path-based approaches".

Section 2.2 seems to confirm that, as the authors state that "All approaches discussed in this paper are used to create RDF triples. In that sense, they are examples of triplification."

As this distinction of two paradigms is the key premise of the paper, I recommend clarifying this, clearly defining the terms, and using them consistently.

[1] Marx, Edgard, et al. "RDB2RDF: A relational to RDF plug‐in for Eclipse." Software: Practice and Experience 43.4 (2013): 435-447.

-----------

# Strengths

+ A comparison between mapping paradigms and their respective strengths and weaknesses is relevant and highly useful both from a research and practical perspective.

+ The paper is now more readable and the added recommendations make it a more useful contribution.

+ Provides empirically derived recommendations for further development of mapping tools

+ Well written and clearly structured

+ Originality

+ Followed some reviewer suggestions and added sections and clarifications

# Weaknesses

- Limited general insights about the relative merits of mapping paradigms.

- Excessively verbose and anecdotal sections

- (Small n, no rigorous experimental design etc) -> ok, but for a qualitative study: limited general insights/hypotheses

- Inconsistent formatting of listings

- Falls somewhat short of the goal to yield fundamental insights on mapping paradigms and their relative merits (rather than their particular implementations)

# Minor

I appreciate the minor changes already implemented.

*) The experiment was apparently structured into "questions", but it's not clear to me whether there were any "questions" or whether this refers to the "mapping problems/tasks" to be performed by the study subjects.

*) Code listing formats:

I appreciate that the authors changed the listings from low-res images to text, but:
- the listings are still formatted inconsistently (tables with horizontal lines vs. box only)
- the tabular form makes them difficult to read
- I'd recommend removing the table lines that appear in some listings, including line numbers consistently, and adding syntax highlighting

*) The explanation of and reference to Figure 1 at the very end of the paper (beginning of conclusions) is unexpected and seems misplaced.

Review #4
Anonymous submitted on 07/Jun/2024
Suggestion:
Accept
Review Comment:

First of all, I would like to thank the authors for addressing most of my concerns. I still believe that 9 users is not a high number of participants for the user evaluation; however, I know that it requires a lot of effort and time to do such an evaluation. Generally, since the other reviewers are happy with the structure and writing of the paper and since the revised version has been improved based on the comments of the previous review round, I also recommend the paper to be accepted.