Enhancing Data Use Ontology (DUO) for Health-Data Sharing by Extending it with ODRL and DPV

Tracking #: 3459-4673

Harshvardhan J. Pandit
Beatriz Esteves

Responsible editor: 
Cogan Shimizu

Submission type: 
Ontology Description
The Global Alliance for Genomics and Health is an international consortium that is developing the Data Use Ontology (DUO) as a standard providing machine-readable codes for automation in data discovery and responsible sharing of genomics data. DUO concepts, which are encoded using OWL, contain only the textual descriptions of the conditions for data use they represent, and do not specify the intended permissions, prohibitions, and obligations explicitly - which limits their usefulness. We present an exploration of how the Open Digital Rights Language (ODRL) can be used to explicitly represent the information inherent in DUO concepts to create policies that are then used to represent conditions under which datasets are available for use, conditions in requests to use them, and to generate agreements based on a compatibility matching between the two. We also address a current limitation of DUO regarding specifying information relevant to privacy and data protection law by using the Data Privacy Vocabulary (DPV), which supports expressing legal concepts in a jurisdiction-agnostic manner as well as for specific laws like the GDPR. Our work supports the existing socio-technical governance processes involving use of DUO by providing a complementary rather than replacement approach. To support this and improve DUO, we provide a description of how our system can be deployed with a proof of concept demonstration that uses ODRL rules for all DUO concepts, and uses them to generate agreements through matching of requests to data offers. All resources described in this article are available at: https://w3id.org/duodrl/repo.

Minor Revision

Solicited Reviews:
Review #1
Anonymous submitted on 12/Jun/2023
Review Comment:

The authors have reviewed the paper, enhanced its clarity in various sections, and added examples. Table 1 has been revised and typos fixed, and the matching algorithm on pp. 13-14 has been illustrated.

Review #2
Anonymous submitted on 13/Jun/2023
Minor Revision
Review Comment:

The paper has been improved in this new version.

However, my specific comments do not seem to have been addressed.

In particular, I suggested:

"Things to improve include:
- Table 1 is not fully comprehensive.
- Evaluate the effort needed in non-directly mappable concepts and how useful they are.
- Define the updated ontology as such."

Concerning the first suggestion, although it is mentioned in the cover letter, no changes have been made. It is clear what the intention of the table is, but explaining the meaning of the different columns would help.

The other 2 comments seem to have been ignored. I would appreciate an explanation of why there is no need to change anything on those.

Review #3
Anonymous submitted on 16/Jul/2023
Minor Revision
Review Comment:

The authors have addressed my comments satisfactorily. Only (hopefully) minor modifications are needed:

1 - the meaning of one phrase is still unclear to me;
2 - some of the conditions in Algorithm 1 may have to be refined/corrected;
3 - the new text contains some typos.

Concerning point 2, I would like to have a quick look at the new version before publication, because the matching algorithm might be wrong.

Detailed comments:

1) On page 5, line 11 the sentence "where the concepts to be matched in a policy are pre-determined" is unclear to me. Does it refer to DUOS or to [19]? The former would be correct, the latter would not (because [19] is vocabulary-agnostic).

2) IMHO if a prohibition denies data usage with some spatial condition SC1 (eg Ireland and Germany) while the data request wants to use the data with another spatial condition SC2 that *overlaps* SC1 (eg Ireland and France), then the answer should be DENY. This is because a GRANT decision permits data processing using SC2, which also includes forbidden spatial conditions (Ireland). On the contrary, Algorithm 1 would not DENY the request because SC1 and SC2 are not equivalent (line 30).

Similarly, the request is denied only if offer:purpose is equivalent to request:purpose (line 36), but it should also be denied if the two purposes overlap (eg when request:purpose is a subclass of offer:purpose).

Dually, for permissions, the request is denied whenever offer:purpose is not equivalent to request:purpose (line 46), but the request is acceptable if request:purpose is a subclass of offer:purpose.

In other words, according to my understanding of the matching process - and without the help/guidance of a formal semantics for the policy language - the current matching algorithm seems wrong.
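
To make the behaviour I would expect concrete, here is a minimal Python sketch of the matching semantics described above. It is not the authors' Algorithm 1 and does not use ODRL or DUO terms: the purpose hierarchy, the dictionary layout of offers and requests, and all function names are hypothetical, and each rule is assumed to carry a single constraint (either spatial or purpose).

```python
# Hypothetical purpose hierarchy (child -> parent); names are illustrative only.
PARENT = {
    "CancerResearch": "DiseaseResearch",
    "DiseaseResearch": "HealthResearch",
    "HealthResearch": None,
    "Marketing": None,
}


def ancestors_or_self(purpose):
    """Return the purpose followed by all of its ancestors."""
    chain = []
    while purpose is not None:
        chain.append(purpose)
        purpose = PARENT.get(purpose)
    return chain


def is_subclass_or_equal(sub, sup):
    """True if `sub` equals `sup` or is a descendant of it."""
    return sup in ancestors_or_self(sub)


def purposes_overlap(a, b):
    """In a tree hierarchy, two purposes overlap when one subsumes the other."""
    return is_subclass_or_equal(a, b) or is_subclass_or_equal(b, a)


def match(offer, request):
    """Return "GRANT" or "DENY" for a request against an offer."""
    # Prohibitions: any overlap with a forbidden condition denies the request.
    for rule in offer.get("prohibitions", []):
        if "spatial" in rule and set(rule["spatial"]) & set(request["spatial"]):
            return "DENY"
        if "purpose" in rule and purposes_overlap(rule["purpose"], request["purpose"]):
            return "DENY"
    # Permissions: the requested purpose must be subsumed by a permitted one.
    for rule in offer.get("permissions", []):
        if is_subclass_or_equal(request["purpose"], rule["purpose"]):
            return "GRANT"
    return "DENY"  # nothing permits the request


offer = {
    "prohibitions": [{"spatial": {"IE", "DE"}}],
    "permissions": [{"purpose": "DiseaseResearch"}],
}
print(match(offer, {"spatial": {"IE", "FR"}, "purpose": "CancerResearch"}))
print(match(offer, {"spatial": {"FR"}, "purpose": "CancerResearch"}))
```

Under these assumptions the first request is denied because its locations overlap the prohibited ones (Ireland), while the second is granted because the requested purpose is a subclass of the permitted purpose. An equivalence-only test, as I read lines 30, 36 and 46 of Algorithm 1, would give the opposite result in both cases.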


Page 2, line 36: permissible -> permitted (?)

Page 3, line 8: the sentence "[ODRL] is *the* W3C standard [...] to model rules and policies" sounds too strong to me; please consider that there exist also W3C rule languages such as SWRL and RIF. I would write that ODRL is *a* W3C standard...
As a side note, also the OASIS standardisation organisation published policy languages such as XACML and Legal RuleML.

Page 18 lines 48-49: the sentence "the extent of what and how they wish to utilise our suggestions" sounds weird, it may be better to reorder it (eg what to utilise and how).

Finally, in my previous review I forgot to suggest adding a reference for "sticky policies" on p.2, line 33.