Linked Data for Science and Education

Paper Title: 
Linked Data for Science and Education
Carsten Keßler, Mathieu d’Aquin, Stefan Dietze
Sharing of resources and metadata is a central principle in scientific and educational contexts. With the emergence of the Linked Data approach as the most recent evolution of the Semantic Web, scientific and educational practitioners have started to adopt those principles. The communities working on Linked Data for science and education have since developed common schemas used for describing scientific or educational resources, substantial collections of structured data, bibliographic collections, domain-specific vocabularies capturing vast amounts of scientific domain knowledge, as well as baseline technologies used to expose and integrate linked datasets. In this paper, we give an overview of the current landscape related to the use of Linked Data in the academic sector. We look at the common challenges, prominent datasets, tools and applications, and conclude on the major directions for research in this area.
Submission type: 
Survey Article
Responsible editor: 
Pascal Hitzler
Reject and Resubmit

Submission in response to (editorial)

Review 1 by Peter Fox

The overall aim of this paper is to assess the state of the art and of research in Linked Data for Science and Education. It achieves this goal in some respects and fails in others.

Starting around Section 3, the paper becomes very good, highly quantitative and very well referenced. However, prior to that, starting with the abstract, the material is poorly introduced (the abstract is very poor), and many, many assertions are made to frame the work without references or supporting argumentation. It is in this area that the paper needs drastic improvement. Another deficiency is the closing phrase of the abstract, where major directions for research are promised but are not (at all) adequately delivered, presumably in Sect. 5?

Detailed comments:

Is Linked Learning a keyword?


Sharing of .. (the first sentence) must be supported by some references.

Bottom of P1, col 2, "Given the proven capabilities of LD technologies": no references are given, and some are needed. Then: "... scientific and educational practitioners have started to adopt those principles."
What principles? Data sharing and reuse, or LD principles? Again no references are given, and as such these assertions are hollow (I am not disputing that they are legitimate, but this is a journal paper, not a news article).

P2, top of col 1, first paragraph ending ... and integrate linked datasets. Again no references?

Sect. 2, first paragraph, no reference and many are needed in this opening paragraph.

2.1 Fragmented landscape of competing schemas? This again is an assertion that is not supported in the following text. Instead, several schemas are listed but no evidence is presented to support the "competing" claim. This paragraph needs substantial revision.

2.2 begins to get much better in presentation, arguments and references, but even this paragraph, especially about midway (e.g. lab info. sys.), could benefit from more examples/refs.

2.3 is okay. But are these (the three un-numbered sections beginning in BOLD) research challenges identified in the Abstract as what the paper would conclude with? I hope not.

3. mostly okay.

3.2 starts with another assertion. Transparency and reproducibility are core principles for scientific work.... is arguable, especially transparency, which is increasingly an attribute being placed upon scientists rather than being a central tenet. Indeed, the first half of this leading paragraph suffers from the Sect. 1 and 2 problem of not using examples sufficiently or providing adequate references. For example, which "disciplines have tried to tackle this challenge by setting up .."?

Rest of 3.2 is good.

3.3 is mostly good.

p7, col 2, 2nd paragraph. In contrast to all the other ontologies/vocabularies introduced, the choice of the "Ontology of units of Measure" is curious. Why not the Unified Code for Units of Measure (UCUM) or Units of Measure (UoM)? Especially as it is unpublished/in-press work.

4. First sentence: "For what concerns the publication of LD, in many cases, ..." - please reword this. E.g. "In many cases concerning the publication of LD, "

p9, 2nd paragraph. "Typical recommendation approaches ...". What approaches? Names, examples, references are needed.

5. The lead-in sentence is poorly worded. Move the [18] citation to the end of the first sentence and drop "As mentioned in". However, this section almost seems out of place and may be better placed between Sect. 2 and 3.

6. Again starts with an assertion about traction that is only backed up later in the paragraph. If the research challenges are embedded in this section they are so deeply embedded as to be invisible.
If they are not, and the research challenges are indeed those in 2.3, then they are not a novel or innovative contribution for this journal. The authors are encouraged to take a hard look at what the real challenges are and articulate them well in this closing section.

Review 2 by Vikram Sorathia

This is a well-written survey paper capturing the current state of the art in the area of Linked Data (LD) applications in an educational context. The authors make a commendable effort in compiling a comprehensive account of LD efforts by the education community across the world. The paper also identifies challenges and teaching/learning-centric applications for future work.
The material is easily comprehensible, benefiting both researchers and practitioners willing to explore this research topic. It also contains an extensive list of pointers to various LD-enabled educational, library and scientific datasets that may act as a single point of reference for researchers, practitioners, teachers, and the learning community. The authors also cover various aspects including standardization, tools, and practical challenges, which makes this paper relevant and interesting for the broader Semantic Web community. Overall, the presentation is clear, which makes the paper readable.
However, the paper also has some opportunities for significant improvement.
The first element missing, given that this is a survey paper, is a critical comparative evaluation of the various approaches discussed in the paper. In addition to critical discussion, this can be summarized by providing (i) a tabular comparison against identified features or (ii) a graphical classification of the various approaches in the form of taxonomies. This presentation approach is very effective not only in summarizing the various approaches, but also in helping to reveal gaps and opportunities for researchers considering exploring this topic further.
Table 1, Table 2 (page 6) and Table 3 (page 8) provide useful information by listing a selection of educational, library, and scientific datasets, respectively. This information is useful but not critical to the discussion, as it merely lists the sources, SPARQL endpoints and licensing information. It would be more appropriate to move these tables into an appendix. The same applies to Figure 1.
Section 4, which discusses tools and technologies, is relatively short and can be extended significantly to provide details on the various types of application scenarios, use cases and experiments that can be conducted in teaching/learning-centric systems. Here the authors can also provide a list of features and capabilities that are currently not supported by existing tools.
As this is a survey paper, it is to be expected that potential readers may not have sufficient technical familiarity with the various terms, standards, and techniques from all underlying communities (Semantic Web, Linked Data, technology-enhanced learning, library metadata and cataloging, etc.). It would be useful to define, explain, and provide more context on such terms in the introduction or when they are first introduced.