The RESCS Ontology: linking Open Research Data from multiple sources to support interdisciplinary investigations

Tracking #: 2746-3960

Authors: 
Kurt Baumann
Andrea Bertino
Laura Rettig
Sebastian Sigloch
Daniela Subotic
Ivan Subotic

Responsible editor: 
Stefan Schlobach

Submission type: 
Ontology Description
Abstract: 
The availability of open repositories and the application of Semantic Web techniques are paving the way towards new usage scenarios for research data. This paper describes the ontology developed within the first phase of the Connectome project. The goal of the Connectome project is to make data from different providers interoperable and thus improve its use through both generic and discipline-specific services. On the basis of the RESCS (RESearch CommonS) ontology defined through an intensive exchange with various researchers, data providers and funders, we give a detailed description of the ontology. The paper concludes with a brief outlook on possible tools and applications which could take advantage of the Connectome knowledge graph in the future.
Tags: 
Reviewed

Decision/Status: 
Reject

Solicited Reviews:
Review #1
By Nuno Freire submitted on 10/Apr/2021
Suggestion:
Reject
Review Comment:

The paper presents an ontology for representing research assets at a general level, aiming to achieve metadata interoperability across disciplines. It describes how the ontology was developed in the context of the Connectome project: how requirements were identified, which existing ontologies RESCS could have been based on, and which ontologies were chosen. The paper also presents the main design aspects of the ontology, the technical details of its formal specification, and the technical details of its deployment.

The RESCS ontology addresses an area that is currently of interest to many researchers who are working on the same or a similar research problem. The design of the ontology was done in consultation with domain experts, which is a positive aspect. The paper, however, does not provide an adequate level of detail to be a solid contribution for other researchers addressing the same problem.

The paper does not describe how this work relates to the research done by others on ontologies for research assets. It mentions several ontologies that include the key types needed for research assets. These ontologies were analyzed by the authors and two of them were chosen as the base for RESCS, but the paper does not present the details of the comparison of all the ontologies, which would be very useful for other researchers.

The RESCS ontology should be described in more detail. Important aspects of its design cannot be understood by the reader of the paper. Figure 2 is the most detailed view of the RESCS ontology provided in the paper, and it should be complemented with more details about the properties and classes it comprises. In the particular case of the properties defined by the RESCS ontology, the reader has no way of knowing their definitions.
(Note: The types shown in Table 1 do not match those shown in Fig. 2 (recs:Organization and recs:Person / schema:Organization and schema:Person) )

The authors describe how the RESCS ontology is being applied to support the metadata aggregation and processing pipeline of the Connectome prototype, but they should undertake further evaluations of the ontology. The paper does not show how well the RESCS ontology answers the Ontology Competency Questions that were identified at the beginning of the paper, nor does it present any evaluation metrics for the design of the ontology.

Review #2
By Michael Färber submitted on 26/May/2021
Suggestion:
Reject
Review Comment:

In the article "The RESCS Ontology: linking Open Research Data from multiple sources to support interdisciplinary investigations," the authors propose the RESCS ontology to facilitate the discovery and exploration of resources in interdisciplinary research. A description of the ontology is made available at https://www.rescs.org.

Strong points:
* The article is clearly written and well structured.
* The authors of the article address the modeling of open research data---a timely and important research topic.

Weak points:
* The necessity of the ontology and the gap to related work remain unclear.
* The novelty of the ontology and its creation process seem to be very limited. No URI of the ontology is provided in the article.
* It is rather unclear in which specific applications and for which tasks the ontology could be used. The authors show neither any usage by third parties nor any substantial usage by themselves.

---

Detailed remarks:

1. Introduction
* Research question and research gap: As the authors write on a quite abstract level without specific examples and references, it remains unclear to the reader which specific research problem is tackled by the authors and why it is important and pressing. Consequently, when the authors write that the "construction of knowledge graphs" would solve this problem (which can be questioned because knowledge graphs might not necessarily solve all problems related to FAIR data and because a semantic web without ontologies is conceivable [W3C09]), the question remains what data the authors consider, why "new ontologies which can meet the needs of interdisciplinary research" are needed, and how this aspect is not covered in related work so far. In other words, a description of the necessity of the ontology and a description of the gap to related work are missing.
* Use cases: Similarly, the authors write that the ontology will increase the "retrievability and reuse of linked data for research," but it remains unclear in which applications and for which tasks the ontology can be used.
* Related work: A large variety of ontologies, knowledge graphs, and further schemas have been proposed for modeling scholarly data, such as publications (e.g., [F19][W21][JOFP19]), data sets (e.g., [FL21]), and repositories (e.g., [R21][WAWE17]). None of them is considered or referenced in the Introduction. Also, no related work section exists that would bring the proposed ontology into a larger context.
* Structure: Large parts of the Introduction describe the project Connectome, in which the ontology has been created. Instead of describing the project, it might be beneficial to focus on related work and on outlining important novel use cases, which are enabled by the new ontology.
* Maturity of work: According to the authors, the article presents "preliminary work." Given the SWJ review guidelines (http://www.semantic-web-journal.net/reviewers), the SWJ ontology description articles might not be the ideal venue for preliminary ontologies.

2. Methodology
* Use cases: Similar to the Introduction section, in Section 2.1 it remains unclear how "open linked research data" is defined by the authors and in which scenarios the ontology can be used (e.g., for recommender systems due to information overload?).
* Structure: Figure 1 does not seem to add any value and could be removed. Furthermore, Section 2.2.1 and 2.2.2 can be shortened in my view.
* Data quality dimensions: In Section 2.2, the authors outline a set of data quality criteria (e.g., coverage, long-term availability) that were used when creating the ontology. It remains unclear if this list of criteria is complete and how it emerged. It might be valuable to consider established frameworks regarding the data quality of ontologies and knowledge graphs (e.g., [FBMR18]) to cover a wide range of well-established data quality dimensions.
* Related work: In Section 2.2, the authors list several initiatives regarding the modeling of scholarly metadata. I would suggest using proper citations instead of URLs in footnotes. In addition, the listed ontologies were created for various use cases. Depending on the need for the proposed RESCS ontology (which remains unclear), other or additional ontologies and initiatives might be highly relevant (e.g., [F19][FL21][JOFP19][O21][PS20][R21][W21][WAWE17]).

3. The RESCS Ontology
* Structure: Section 3.1 addresses the aims of creating the RESCS ontology. In my view, this subsection is a repetition of the Introduction section and, thus, can be removed.
* Unique selling point: The authors mention that "domain-specific research methodologies" are important. However, it remains unclear how the authors address the various scientific disciplines with their ontology and, thus, ensure that the ontology can be used for interdisciplinary research particularly well.
* Related work: The proposed ontology, whose RDF files could not be found online, seems to be relatively small. All important entity types, such as dataset, research project, organization, and person, seem to be covered by existing ontologies on the Web (e.g., project information in [WAWE17], dataset information in [FL21], papers' metadata in [F19][JOFP19][W21]). If schema.org is used for some entity types, it would be obvious to use it for other entity types, such as person, as well.
* Real-world usage: It is unclear how the ontology is used as a schema for knowledge graphs. No information about knowledge graphs (i.e., instance data) is provided.

4. Prototyping through Linked Data Pipeline in Blue Brain Nexus
* Structure: The description of the Blue Brain Nexus functionalities seems to be irrelevant for the article. Instead, the reader would be interested in real-world applications and usage of the ontology by third parties. This information is not provided by the authors.

---

In the following, the article is evaluated by the criteria defined by the Semantic Web Journal (see http://www.semantic-web-journal.net/reviewers):

"(1) Quality and relevance of the described ontology":

As no URI of the ontology RDF/OWL file is provided, the quality of the ontology cannot be evaluated thoroughly. Given the description available at https://www.rescs.org and figures 2 and 3 in the article, the ontology is mainly a selection of specific classes and properties of existing ontologies. The added value seems to be low.

"(2) Illustration, clarity and readability of the describing paper, which shall convey to the reader the key aspects of the described ontology."

The article is written very clearly and is well structured. However, the authors fail to point out the key use cases for this new ontology, as well as the gap to related ontologies and how the proposed ontology fills the gap.

"Please also assess the data file provided by the authors under “Long-term stable URL for resources”. In particular, assess
(A) whether the data file is well organized and in particular contains a README file which makes it easy for you to assess the data,
(B) whether the provided resources appear to be complete for replication of experiments, and if not, why,
(C) whether the chosen repository, if it is not GitHub, Figshare or Zenodo, is appropriate for long-term repository discoverability, and
(D) whether the provided data artifacts are complete."

It seems that no long-term stable URI was provided for the ontology.

References:
[F19] Färber, M. (2019): The Microsoft Academic Knowledge Graph: A Linked Data Source with 8 Billion Triples of Scholarly Data. ISWC'19, pp. 113-129.
[FL21] Färber, M. and Lamprecht, D. (2021): Creating a Knowledge Graph for Data Sets, 2021, http://dskg.org/publications/DSKG_QSS2021.pdf
[FBMR18] Färber, M., Bartscherer, F., Menne, C., Rettinger, A. (2018): Linked Data Quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO. Semantic Web 9(1), pp. 77-129.
[JOFP19] Jaradeh, M., Oelen, A., Farfar, K., Prinz, M., D’Souza, J., Kismihók, G., Stocker, M., Auer, S. (2019). Open Research Knowledge Graph: Next Generation Infrastructure for Semantic Scholarly Knowledge. K-CAP'19, pp. 243–246.
[O21] Open Research Knowledge Maps, https://openknowledgemaps.org/
[PS20] Peroni, S., Shotton, D. M. (2020). OpenCitations, an Infrastructure Organization for Open Scholarship. Quant. Sci. Stud., 1(1), pp. 428–444.
[R21] re3data, http://re3data.org
[W21] Wikidata, https://wikidata.org
[WAWE17] Wang, J., Aryani, A., Wyborn, L., Evans, B. (2017). Providing Research Graph Data in JSON-LD using Schema.org. WWW'17 Companion, pp. 1213–1218.
[W3C09] W3C Semantic Web Frequently Asked Questions, https://www.w3.org/RDF/FAQ

Review #3
By Paul Groth submitted on 30/May/2021
Suggestion:
Major Revision
Review Comment:

This ontology paper describes the RESCS ontology, whose aim is to bridge research in multiple domains. I really like the aim of the ontology and I think offering such a bridging ontology would be helpful to the community. However, the ontology needs to be better documented within the paper, and the ontology publication details need to be checked, before this could be accepted.

# Ontology Quality
First, with respect to the quality of the ontology: I liked the use of competency questions, but these should be revisited by saying a bit more about how the ontology actually meets those questions. Also, many of the questions seem generic to research and not specific to the sort of interdisciplinary bridging used as motivation. It would be good to highlight which questions are specific to this problem. Secondly, I wondered about some properties defined in the ontology. For example, at https://rescs.org/prop-schemahasgrant.html the expected type is not given in the definition. Furthermore, the ontology uses schema.org/hasGrant, which doesn't exist. Instead I think you mean https://schema.org/fundedItem/. This was just a bit of clicking around, so it makes me wonder about the quality of the rest of the definitions.

# Related Work
The discussion of related work needs to be improved. In particular, there is a high overlap with the SPAR ontologies (http://www.sparontologies.net/), a widely used and well-defined set of ontologies for scholarship. How does RESCS relate to these ontologies? Likewise, there is quite some work on Research Objects (https://www.researchobject.org). How does this work relate to this packaging infrastructure? I was also interested in whether there was a role for Wikidata, which has been increasingly used for science and research [1].

Also with respect to related work, the authors use URLs instead of citations for many ontologies and initiatives (e.g. Researchgraph.org, Scholix, DCAT, Dublin Core, OAI, OpenAIRE, Freya PID Graph, PROV-O and Schema.org, FAIR principles). It is good practice to cite these resources.

# Ontology Publication

The publication of the ontology needs to be improved. I like the navigation of the ontology at rescs.org, but the ontology is missing some core things. First, ontology concepts need to be dereferenceable: for example, https://rescs.org/ResearchProject does not dereference to a human-readable or machine-readable representation. Likewise, some domains do not even exist, for example the one behind the prefix https://provshapes.org/datashapes/.
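
The dereferencing check described above could be scripted roughly as follows. This is only a minimal sketch using Python's standard library; the function names and the lists of media types are illustrative assumptions and are not taken from the RESCS documentation or the paper.

```python
from urllib.error import URLError
from urllib.request import Request, urlopen

# Assumed media types; a real check might accept more serializations.
RDF_TYPES = ("text/turtle", "application/rdf+xml", "application/ld+json")
HTML_TYPES = ("text/html", "application/xhtml+xml")


def build_request(uri, accept):
    """Build a GET request that asks for one specific representation."""
    return Request(uri, headers={"Accept": accept}, method="GET")


def classify_response(status, content_type):
    """Label a response by the kind of representation it returned."""
    if status != 200:
        return "not dereferenceable"
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in RDF_TYPES:
        return "machine-readable"
    if media_type in HTML_TYPES:
        return "human-readable"
    return "unknown representation"


def check(uri, accept, opener=urlopen):
    """Dereference uri; opener is injectable so the logic can be tested offline."""
    try:
        with opener(build_request(uri, accept)) as response:
            content_type = response.headers.get("Content-Type", "")
            return classify_response(response.status, content_type)
    except URLError:
        # Covers HTTP errors such as 404 as well as unreachable hosts.
        return "not dereferenceable"
```

Calling check("https://rescs.org/ResearchProject", "text/turtle") and then again with "text/html" would cover both the machine-readable and the human-readable case for a single concept URI.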

Secondly, I could not find a statement about the long-term durability of the domain. This is critical. How do we know that rescs.org will stay around? Good practice says to use a redirection service (purl.org, DOI, or w3id). Otherwise, I would expect a statement about the longevity of the resource. Also, where can I download the whole ontology?

I really like the use of SHACL shapes as part of the ontology. I think the paper should say more about this.

In summary, I think there is something interesting but a lot of details need to be checked and best practices need to be adopted.

Minor Comments
- Can you check the consistency of the title's casing? It just seems a bit odd.
- In the introduction, the link between the need for interoperability in research data, open standards, and knowledge graphs should be given more justification. There is a significant amount of work around FAIR data principles that provides this justification as well as various vocabularies in this direction.
- "On the basis of the RESCS (RESearch CommonS) ontology defined through an intensive exchange with various researchers, data providers and funders, we give a detailed description of the ontology." - this sentence is hard to understand. Maybe remove "on the basis of"?

[1] Waagmeester, A., Stupp, G., Burgstaller-Muehlbacher, S., Good, B. M., Griffith, M., Griffith, O. L., Hanspers, K., Hermjakob, H., Hudson, T. S., Hybiske, K., Keating, S. M., Manske, M., Mayers, M., Mietchen, D., Mitraka, E., Pico, A. R., Putman, T., Riutta, A., Queralt-Rosinach, N., … Su, A. I. (2020). Wikidata as a knowledge graph for the life sciences. ELife, 9. https://doi.org/10.7554/elife.52614