Quantifiable Integrity for Linked Data on the Web

Tracking #: 3409-4623

Christoph Braun
Tobias Kaefer

Responsible editor: 
Elena Demidova

Submission type: 
Full Paper
We present an approach to publishing Linked Data on the Web with quantifiable integrity, using Web technologies, in which rational agents are incentivised to contribute to the integrity of the link network. To this end, we introduce self-verifying resource representations that include Linked Data Signatures, whose signature value is used as a suffix in the resource's URI. Links among such representations, typically managed as web documents, therefore contribute to preserving the integrity of the resulting document graphs. To quantify how well a document's integrity can be relied on, we introduce the notion of trust scores and present an interpretation based on hubs and authorities. In addition, we show how the choice of trust score induces specific agent behaviour when agents optimise their scores, both in general and using a heuristic strategy called the Additional Reach Strategy (ARS). We discuss our approach in a three-fold evaluation: First, we evaluate the effect of different graph metrics as trust scores on the induced agent behaviour and the resulting evolution of the document graph. We show that trust scores based on hubs and authorities induce agent behaviour that contributes to integrity preservation in the document graph. Next, we evaluate different heuristics for agents to optimise trust scores when general optimisation strategies are not applicable. We show that ARS outperforms other potential optimisation strategies. Last, we evaluate the whole approach by examining the resilience of integrity preservation in a document graph when resources are deleted. To this end, we propose a simulation system based on the Watts-Strogatz model for simulating a social network. We show that our approach produces a document graph that can recover from such attacks or failures.
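The core idea of self-verifying representations can be illustrated with a minimal sketch. Note the assumptions: the paper uses Linked Data Signatures, whereas this sketch substitutes a plain SHA-256 content digest for the signature value, and all URIs and function names are illustrative, not taken from the paper.

```python
import hashlib


def mint_uri(base: str, content: str) -> str:
    """Mint a URI whose suffix is a digest of the document content.

    A SHA-256 digest stands in for the Linked Data Signature value
    that the paper appends as the URI suffix.
    """
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return f"{base}/{digest}"


def verify(uri: str, content: str) -> bool:
    """Check that a retrieved representation matches its URI suffix.

    The representation is 'self-verifying': re-derive the digest from
    the content and compare it against the suffix of the URI.
    """
    expected = uri.rsplit("/", 1)[-1]
    actual = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return expected == actual


# Illustrative N-Triples document linking two resources.
doc = ('<https://example.org/alice> '
       '<http://xmlns.com/foaf/0.1/knows> '
       '<https://example.org/bob> .')

uri = mint_uri("https://example.org/docs", doc)
assert verify(uri, doc)                  # intact content verifies
assert not verify(uri, doc + " x")       # any modification is detected
```

Because links to such a URI fix the digest of the target, a document that cites another document also pins down that document's content; this is the sense in which links contribute to integrity preservation in the document graph.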

Solicited Reviews:
Review #1
Anonymous submitted on 11/Apr/2023
Review Comment:

The revised version and cover letter have addressed the comments and concerns in my previous review. The quality of the paper has been improved to a satisfactory level.

Review #2
Anonymous submitted on 18/Apr/2023
Review Comment:

This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing. Please also assess the data file provided by the authors under “Long-term stable URL for resources”. In particular, assess (A) whether the data file is well organized and in particular contains a README file which makes it easy for you to assess the data, (B) whether the provided resources appear to be complete for replication of experiments, and if not, why, (C) whether the chosen repository, if it is not GitHub, Figshare or Zenodo, is appropriate for long-term repository discoverability, and (D) whether the provided data artifacts are complete. Please refer to the reviewer instructions and the FAQ for further information.

Review #3
By Marvin Hofer submitted on 11/May/2023
Review Comment:

In the revised version of the paper, the authors have implemented noticeable changes addressing the previous reviewers' comments.

The majority of changes affect the displayed figures and listings. The changes made to them increase the conciseness and readability of the paper.
- Figure 1 is now correct
- Listing 2 depicts a more evident example
- The name of Section 4 is now more fitting, and there are now references to earlier examples
- Figure 3 was reworked, is now more precise, and fits better into the text explanation
- Figure 5 is new and supports the explanations in section 8.1

The paper now includes a discussion section (9.4) that considers three aspects of the proposed work: first, the evaluation of the resilience of the trust score; second, the non-evident document updates; and third, the fact that the notion of trust refers only to the structural integrity of an RDF document, not to content-level inconsistencies.

Only two open comments remain that could be addressed to further increase replicability: the first concerns missing parameter values in the evaluation section, and the second concerns redundant text paragraphs. However, these points do not directly impact the scientific contribution of the paper and its topic.

With the changes made and no significant open points remaining, the current version is ready to be accepted.