Access Control and the Resource Description Framework: A Survey

Tracking #: 1084-2296

Authors: 
Sabrina Kirrane
Alessandra Mileo
Stefan Decker

Responsible editor: 
Bernardo Cuenca Grau

Submission type: 
Survey Article
Abstract: 
In recent years we have seen significant advances in the technology used to both publish and consume structured data using the existing web infrastructure, commonly referred to as the Linked Data Web. However, in order to support the next generation of e-business applications on top of Linked Data, suitable forms of access control need to be put in place. This paper provides an overview of the various access control models, standards and policy languages, and the different access control enforcement strategies for the Resource Description Framework (the data model underpinning the Linked Data Web). A set of access control requirements that can be used to categorise existing access control strategies is proposed and a number of challenges that still need to be overcome are identified.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
Anonymous submitted on 13/Jul/2015
Suggestion:
Major Revision
Review Comment:

The paper presents a survey on different Access Control frameworks suitable for open/distributed environments in general and, more specifically, for the RDF data model and the Semantic Web.
Briefly, the survey is structured as follows: Section 2 provides an overview of Access Control models such as RBAC (Role Based Access Control) or ABAC (Attribute Based Access Control)
and standards such as XACML or WAC. For each model or standard, the main works adopting it are illustrated.
Section 3 introduces different policy languages based on Ontologies (e.g. KAoS) or rules (Rei, Protune and Proteus).
Section 4 focuses on Access Control and the RDF data model. Here, access rights are generally modelled by means of rules (e.g. Jena rules), predefined views, ontologies (e.g. PPO), etc.
Different techniques are used to authorize and propagate access rights (SPARQL queries, OWL inference engines, RDFS entailment, semantic networks etc.).
Section 5 presents a list of pivotal requirements for Linked Data Access Control. This gives the opportunity to frame different approaches w.r.t. reasoning capabilities,
negotiation support, explanation, just to mention a few.
Finally, the conclusions summarize the presented overview and delineate key aspects that deserve to be investigated in future work.

Given the large number of works on this topic and the different approaches that have been proposed so far, I think that a survey can be helpful to orient those who intend
to contribute in this area. In this respect, the paper examines a substantial number of papers and
tries to provide a general schema in order to systematize them.
However, I personally find that several improvements should be made to benefit both readability and significance.
In the "detailed notes" below I will focus on precise points where the exposition is somewhat inadequate. Here, I just give you general impressions and advice.

First, Section 3 -- which focuses on a few policy languages and introduces them in more detail -- and Section 5 -- where the exposition is driven by a sort of grid of salient aspects --
look to me more significant and easier to read than Sections 2 and 4, which instead appear as a long and somewhat *flat* list of works.
In "Context relating to the subject, object, transaction and environment", for example, the presentation is rather vague; it is not clear what a context concretely is and how it is
specifically used. Also, in "Using Rules to create views" two works are concisely presented without emphasizing their pros and cons.
The problem is that a large part of these sections just provides an overview in which about 20-30 lines are dedicated to each single paper. In the end, the impression is that you do not
examine any of them in depth. Perhaps you should select a restricted number of relevant works and just mention the others.

Second, the presentation generally lacks an original analysis that emphasizes analogies as well as differences among the surveyed works. One important aspect of an overview is that
it serves as a general cross-review of what has been done so far and of the strengths of one work with respect to another.
In this respect, a comparison based also on concrete examples would be very useful.

Detailed notes:

-- Introduction:
"[...] but also to identify challenges that still need to be overcome". Which challenges?

-- Section 2.1.3
"As OWL is monotonic, using either approach, it is not possible to remove assertions
from the knowledge base, as such role deactivation cannot be supported."
This sentence is not completely clear to me. This seems a matter of static vs dynamic knowledge bases rather than monotonic semantics.

-- Section 2.1.5
"Given that it is not possible to specify
policies based on relations between instances
using description logic, the authors use two
predicates requiresTrue and requiresFalse
to specify run time constraints based on user
attributes."
This explanation is too vague to grasp what is really going on.

-- Section 3
"which evaluates twelve different
policy languages against a set of criteria, that
are deemed necessary for ensuring security and
privacy in a Semantic Web context."
Which criteria? Are they exhaustive? Can they be improved? How?

-- Section 3.1
"Additionally, access control
policies specified using different vocabularies
can easily be merged."
Merging ontologies is quite a difficult task, which may require checking that the semantics of the involved concepts is not jeopardized (look at the large
literature about modules, conservative extensions etc.).

-- Section 3.1.1
"Administration is primarily concerned
with subsumption based reasoning and the determination
of disjointness."
Why?

-- Section 3.2
"One of the benefits of a rule based approach
is that it is possible to support access control
policies that contain instance dependencies or
variables."
The fact that a policy contains variables is not a benefit per se. Explain properly what variables add in terms of expressiveness.

-- Section 3.2.1
"Like KAoS, Rei
provides support for four distinct policy types,
permissions, prohibitions, obligations and dispensations."
What specifically does "dispensation" mean in this context?

"a set of ontologies, used to represent Rein
policy networks (resources, policies, metapolicies
and the Rein policy languages)
and access requests"
What is a metapolicy? You never mentioned it before.

"In order to cater for scenarios where
part of the policy is private and part is public,
the authors propose a hybrid approach, where
Rein acts both as a client and a server."
Here, "hybrid" is too vague to me. How does this approach work specifically?

-- Section 3.2.2

"Protune policies are
specified using rules and meta-rules"
Meta-rules are a key feature of Protune; for instance, they are used to define confidential parts of a policy (which is an essential ingredient of the trust negotiation interaction model, see also
a comment below). Here, I would use some examples to explain how meta-rules are used in this context.

"Protune explanations are provided
by a component known as Protune-x"
Protune-x is another key feature of the framework, which provides explanations in controlled natural language.
Compared with standard Prolog tracers, it uses many heuristics
to facilitate human understanding. Here too, examples would be useful to give a reasonable and yet effective overview of Protune-x.

-- Section 4
"This section, presents" --> "This section presents"

-- Section 4.2.1
"If multiple explicit conflicting
policies exist, conflicts are resolved using
logical conjunction. In the case of implicit
policies, conflicts are resolved using logical disjunction."
Give at least the intuition behind this: why do explicit and implicit policies trigger different conflict resolution behaviours?
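
For reference, one possible reading of the quoted strategy can be sketched as follows. This is only an illustration under stated assumptions: the function name `resolve`, the boolean grant/deny encoding, and the interpretation of "conjunction" and "disjunction" are hypothetical and are not taken from the surveyed paper.

```python
def resolve(decisions, explicit):
    """Combine conflicting access decisions (True = grant, False = deny).

    Assumed reading of the quoted passage (not confirmed by the paper):
    - explicit policies are combined by conjunction, so a single
      explicit denial blocks access;
    - implicit (derived) policies are combined by disjunction, so a
      single derived grant is enough.
    """
    return all(decisions) if explicit else any(decisions)


# The same pair of conflicting decisions under both strategies.
conflict = [True, False]
explicit_result = resolve(conflict, explicit=True)
implicit_result = resolve(conflict, explicit=False)
print(explicit_result, implicit_result)
```

Under this reading the same conflict is denied when the policies are explicit and granted when they are implicit, which is the asymmetry the question above asks the authors to motivate.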

-- Section 5.1
"One requirement,
which was not included is monotonicity. According
to Bonatti and Olmedilla [10], the addition
of new evidences and policies should not
negate any of the previous conclusions. However,
given the need to support negative access
control policies, and also changes in contextual
constraints, one could argue that access control
should in fact be non-monotonic."
I find this sentence rather foggy.
Here, the problem is closely related to the trust negotiation model and the fact that policies are not entirely public (consider again the use of meta-rules in Protune).
In this model if the policies are entirely public, then reasoning is essentially local and all required credentials can be provided beforehand. On the contrary, when
private parts of access policies involve true negotiation, at a given point the lack of a credential C may satisfy some negation as failure in a rule's body, which in turn triggers
some information disclosure. If some steps later you realize that C exists, then how can you retract the information that you have already disclosed?
This issue is clearly put forward in [8] where conditions C1 and C2 are introduced to prevent the case described above (see Remark 6.1).
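
The retraction problem described in this note can be made concrete with a small sketch. The rule, the credential names, and the helper `disclosed` are hypothetical illustrations; they are not taken from Protune or from [8], and the sketch does not model conditions C1 and C2.

```python
def disclosed(credentials):
    """Items released under a toy policy evaluated with negation as failure.

    Hypothetical rule: release 'price_list' unless the credential
    'competitor_id' is known. The test 'not in' plays the role of
    negation as failure: absence of the credential counts as falsity.
    """
    items = set()
    if "competitor_id" not in credentials:
        items.add("price_list")
    return items


# Step 1: the credential has not been presented, so the item is released.
before = disclosed({"employee_id"})

# Step 2: later in the negotiation the credential turns out to exist.
after = disclosed({"employee_id", "competitor_id"})

# Monotonic reasoning would guarantee that 'before' is a subset of 'after'.
# Here it is not, yet 'price_list' has already been sent to the requester.
print(before, after, before <= after)
```

Adding evidence shrinks the set of conclusions, so the policy is non-monotonic, and the disclosure made in step 1 cannot be undone in step 2.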

-- Section 5.1
"However, a number
of researchers have also proposed using
OpenID"
Why? What are the pros and cons?

-- Section 5.3
The discussions of usability and effectiveness are not very informative and should go into more detail.
In particular, which of the presented performance evaluations is the most comprehensive? What should benchmarks look like?

Review #2
By Luca Costabello submitted on 23/Jul/2015
Suggestion:
Accept
Review Comment:

This is a much needed survey on access control for RDF. Although a number of prior art surveys on the subject exist, they are fragmented and scattered across a number of research papers. This survey is therefore a valuable contribution to Linked Data researchers and the broader Semantic Web community, and has the potential to become a standard reference for researchers in the field. The survey is particularly useful for PhD students and researchers approaching the field of access control, as it nicely introduces the main challenges, and proceeds with describing the most prominent works in the field.

The survey is comprehensive and covers more than ten years of Semantic Web research in access control.
The authors describe and compare models, policy languages (ontology-based and rule-based), and enforcement mechanisms. Besides, they discuss a list of requirements for RDF access control frameworks. The paper also describes a number of important open research questions (above all the lack of unified access control benchmarks and the need for usability assessment campaigns). The authors show solid awareness of the RDF access control landscape: the survey cites over a hundred AC-related academic publications, and covers research in the field from the early Semantic Web days to the latest Linked Data-oriented works. It includes an in-depth summary of 28 access control frameworks for Linked Data, which have been grouped according to common features. Such frameworks are also qualitatively compared along different dimensions: specification requirements (policy granularity, partial results, etc.), enforcement requirements (e.g. explanation, conflict resolution), administration requirements (e.g. usability, understandability), and implementation aspects (e.g. supported triplestores). It is worth mentioning that every work described in the survey is summarized in a concise and effective manner. Also, follow-up papers for each surveyed framework have been included, and deltas are clearly described. The survey also includes additional topics, such as propagation of authorizations and the inference problem, thus covering a broad range of access control-related fields.

This is a well written and well structured survey paper. As a minor side note, I have the impression that the reader is not given a complete chronological evolution of access control works. This is a deliberate choice - the authors decided to follow a "thematic approach" - but it would have been interesting to also have a glimpse at the chronological evolution, for example to identify rising/fading trends in the field. This can however be done to some extent, e.g. the authors conclude their survey with requirements drawn from the most recent Linked Data-oriented works, and they acknowledge that "the focus has shifted [...] on Linked Data".

Talking about Linked Data, there could be room for additional access control requirements derived from recent Linked Data access strategies (e.g. Linked Data Fragments) and novel languages for navigating and consuming triples on the Web (e.g. nSPARQL), although the authors acknowledge that little has been done to date in that direction.

Maybe the only limitation of the survey is the lack of focus on Linked Data developers. Whether this is deliberate or not, the paper is missing an assessment of tools availability on the Web (i.e. is this access control framework available for download? Is the codebase open source? Which software license?). Such assessment would be valuable for both researchers and practitioners, and would also help transfer access control for RDF from academia into the mainstream.

Details

- Sec 2.2: The survey does not include the ongoing work around access control in the W3C Linked Data Platform Working Group (LDP) (https://dvcs.w3.org/hg/ldpwg/raw-file/default/ldp-acr.html). Although still a preliminary work, it might be worth mentioning the use cases and the requirements derived from LDP.

- For better readability, Table 1 should be shown on the same page, or merged and rotated sideways.

- In Sec 5.1, it would be nice to have an example to support the claim that access control is non-monotonic.

- Typo: sec 5.1, under "Attributes, context & evidence":
"As the requester many" --> may ?

- sec 5.2, "explanations": Even if this is said in the conclusions, it is worth mentioning also in this section that whether explanations are provided or not depends on the use case. Access control is put into practice to protect against adversarial behavior. If this holds, any explanation would "tip off" a potential intruder, thus weakening the access control mechanism.

- sec 5.3, table 3, "usability" column: The line of works by Costabello et al. also includes an additional policy manager web app described in:
L. Costabello, S. Villata, I. Vagliano, and F. Gandon. Assisted Policy Management for SPARQL Endpoints Access Control. 12th International Semantic Web Conference (ISWC), Demo session, 2013.

Also, in sec 5.4, table 4, under column "flexibility...", Costabello et al. have also implemented and tested their framework with Jena Fuseki (see [19]).