Analysis of Ontologies and Policy Languages to Represent Information Flows in GDPR

Tracking #: 2857-4071

Authors: 
Beatriz Esteves
Víctor Rodríguez-Doncel

Responsible editor: 
Guest Editors ST 4 Data and Algorithmic Governance 2020

Submission type: 
Survey Article
Abstract: 
This article surveys existing vocabularies, ontologies and policy languages that can be used to represent informational items referenced in GDPR rights and obligations, such as the ‘notification of a data breach’, the ‘controller’s identity’ or a ‘DPIA’. Rights and obligations in GDPR are analyzed in terms of information flows between different stakeholders, and a complete collection of 57 different informational items that are mentioned by GDPR is described. 13 privacy-related policy languages and 9 data protection vocabularies and ontologies are studied in relation to this list of informational items. ODRL emerges as the language that can partially represent the highest number of rights and obligations in GDPR if complemented with DPV and GDPRtEXT, since 39 out of the 57 informational items can be modelled. Online supplementary material is provided, including a simple search application and a taxonomy of the identified entities.
Tags: 
Reviewed

Decision/Status: 
Minor Revision

Solicited Reviews:
Review #1
By Tassilo Pellegrini submitted on 20/Aug/2021
Suggestion:
Accept
Review Comment:

I want to thank the authors for improving the paper according to the reviewers' comments. The paper is in a very nice shape.
The work is innovative and well suited to be considered an introductory text at an advanced level for the specific (academic and industrial) research communities, given that it requires some profound knowledge of the fundamentals of semantic web engineering and knowledge modelling. But the authors do a good job of explaining the problem area and connecting it to real-world scenarios in the context of GDPR governance and compliance. Hence the work also contributes to the wider semantic web community. The accompanying materials provided are appropriate to fulfill the publication criteria.

One minor improvement could be made: the line spacing at the bottom of page 9 deviates from the style sheet.

----
This manuscript was submitted as 'Survey Article' and should be reviewed along the following dimensions: (1) Suitability as introductory text, targeted at researchers, PhD students, or practitioners, to get started on the covered topic. (2) How comprehensive and how balanced is the presentation and coverage. (3) Readability and clarity of the presentation. (4) Importance of the covered material to the broader Semantic Web community. Please also assess the data file provided by the authors under “Long-term stable URL for resources”. In particular, assess (A) whether the data file is well organized and in particular contains a README file which makes it easy for you to assess the data, (B) whether the provided resources appear to be complete for replication of experiments, and if not, why, (C) whether the chosen repository, if it is not GitHub, Figshare or Zenodo, is appropriate for long-term repository discoverability, and (D) whether the provided data artifacts are complete. Please refer to the reviewer instructions and the FAQ for further information.

Review #2
By Guido Governatori submitted on 17/Nov/2021
Suggestion:
Major Revision
Review Comment:

I have some concerns about the current version of the paper. More specifically, I have doubts about the method used to evaluate the capabilities of the selected languages. It is not clear to me how the authors determined whether a language is able to capture, represent or model the different privacy aspects. The positive case is simpler: an example of how a GDPR aspect can be encoded suffices (taken either from the language documentation or newly provided by the authors of the paper). The negative case (that a language is not able to model an aspect) is not so simple: in general an impossibility proof is needed, and such proofs are generally hard; otherwise the authors must specify how their evaluation was conducted and what its parameters are (for example, saying that they were not able to model something is not enough). In either case, I believe this evidence must be provided, either directly or with an explicit reference. None of it is present in the paper. This also affects the conclusions of the paper.

Page 2, column 1, line 12: It seems to me that the exact formulation of the research question is too broad, namely "can ... be used and extended", and in particular the "extended" part. I would suggest rephrasing it as "ARE existing policy languages and vocabularies SUITABLE for meeting ..."

Figure 1: Instead of a double arrow (with the same tip for both directions), I would use a different notation for cases where a request needs a corresponding notification. For instance, if we consider RE, one reading is that the data controller can ask the data subject to erase the data. Obviously, this is nonsense, but the semantics of the double arrow is not clear.

Page 8, c1, l14-15: Why do you consider obligations and duties on the one side (and what is the difference between them?), while on the other side you consider rights but not permissions? Also, in my jurisdiction a duty is a special subclass of obligation (and, according to legal theory, rights are a subclass of permissions). So it would be important to start by defining the meaning of the key terms (what is a right, what is an obligation, what is a duty, ...).

Figure 2, colour coding: Why is LegalRuleML not designated as an extension of RuleML?

Page 13, c1, l12-13: After the submission of the paper, LegalRuleML was adopted as a full OASIS Standard. In the same section, I have seen that for other languages you give references to work where the language is used for GDPR. Here is one for LegalRuleML: M. Palmirani, G. Governatori. Modelling Legal Knowledge for GDPR Compliance Checking. JURIX 2018, 101-110. This paper also uses the PrOnto ontology described in Section 4.3.5.

Page 16, table 5: Why is the last item separated from the others?

Page 2, c2, l37-47: After reading Articles 15 and 20 of the GDPR, I don’t understand why you wrote that LegalRuleML is not able to capture RPD and RDM. These two articles can easily be represented in LegalRuleML. The same holds for the restrictions in Table 7: I don’t see any reason why LegalRuleML would not be able to represent them.

Page 25, c1, l1-13: I am not sure why LegalRuleML (and PrOnto) have not been included here. The deontic concepts in LegalRuleML can be linked with whatever (deontic) ontology one wants, so one can create a specific deontic ontology for the obligations, permissions, rights and prohibitions of GDPR. The combination of LegalRuleML and PrOnto has been successfully applied to GDPR use cases. Moreover, G. Governatori and R. Iannella. A modelling and reasoning framework for social networks policies. Enterprise Information Systems, 5(1):145–167, 2011 shows that ODRL 2.0 can be modelled by FCL, and the RuleML serialisation of FCL (or Defeasible Deontic Logic) has been one of the inspirations/sources for LegalRuleML (G. Governatori. Representing business contracts in RuleML. International Journal of Cooperative Information Systems, 14(2-3):181–216, 2005).