How to Create and Use a National Cross-domain Ontology and Data Infrastructure on the Semantic Web

Tracking #: 3142-4356

Eero Hyvonen

Responsible editor: 
Oscar Corcho

Submission type: 
Application Report
This application report presents a model and lessons learned for creating a cross-domain national ontology and Linked (Open) Data (LOD) infrastructure. The idea is to extend the global, domain-agnostic “layer cake model” underlying the Semantic Web with the domain-specific and local features needed in applications. To test and demonstrate the infrastructure, a series of LOD services and portals covering a wide range of application domains was created in 2002–2022 and is in use. They have attracted millions of users in total, suggesting the feasibility of the proposed model. This line of research and development is unique due to its systematic national-level nature and long time span of some twenty years.

Minor Revision

Solicited Reviews:
Review #1
Anonymous submitted on 04/Mar/2023
Minor Revision
Review Comment:

The paper describes a set of services, ontologies, and applications related to semantic applications in Finland. The author reflects on almost twenty years of different developments and the lessons learned. The paper also gives an overview of the SW infrastructure, treating “Human” as an important component.

1- Quality, importance, and impact

The paper is of good quality as a report on many years of applying the SW at a country level. This is clearly evidenced by the number of papers and by the services developed under the Semantic Computing group that are still running. It is impressive work over 20 years. Some of the tools, such as Skosmos, are widely used in many organizations to publish SKOS datasets.

2- Clarity and readability of the describing paper, which shall convey to the reader the key ideas regarding the application of Semantic Web technologies in the application.

The paper is easy to read, with clear references to the application of Semantic Web technologies.

3- “Long-term stable URL for resources”
The URL provided redirects to a research group's page on GitHub. There are many repositories, and it would be great to have a landing page listing all the resources, with links to the applications/services cited in the report. This would help the reader access the described resources directly, instead of checking them manually.

Detailed review
- The difference between “data models” and terminologies as “vocabularies” is not clear to the reader. It would be better to define these terms clearly to avoid confusion.
- What is the “cross-domain ontology”? Is it a single ontology, or a set or catalog of ontologies?
- What makes the proposed “national SW infrastructure” different from an application of the Semantic Web at the country level? Is the technology country-dependent, or is it rather that the practical applications of the Semantic Web at the national level are different and/or challenging?
- What does “be maintained in a more sustainable” mean? What do you expect is needed for that to be possible?
- “The ontologies in Fig. 1”: do you mean Table 1? How many of those vocabularies are indexed in other popular vocabulary catalogs such as LOV, BARTOC, or BioPortal?
- In Table 1, are you using the term “ontologies” to also cover thesauri expressed as SKOS vocabularies? I am curious about the use of “concepts” instead of classic metrics such as classes, object properties, or data properties.
- Please add URIs to the ontologies in Table 1.
- I suggest adding a timeline figure with the different projects and services developed.
- Have you considered having a Finnish LOD cloud? Are some of the datasets available in the LOD cloud? It would be great to mention that information in the report. Is FinnONTO playing this role?
- There are two mentions of the same footnote (36 and 37). Please remove one of the references.
- What is an RDFS ontology? An ontology developed using only RDFS axioms?
- The need to transform a SKOS thesaurus into an ontology, as described for FinnONTO, is not clear. Please explain the context and the goal of the conversion better.
- I like the term “ontologization” but I wonder if the concept exists.
- Please explain “the thesauri semantics were refined only a little using RDFS”. Do you follow any existing ontology development methodology?
- Regarding API services, have you tried using the LDP recommendation as another approach in the services developed and deployed? Any reflections on this?
- I suggest also adding an RDF version (JSON-LD, for example), since it is a valuable resource for access by other services.
- In “Sampo Model is an informal collection of principles”, does it mean there is no ontology? Please explain. Is there a chance to structure the model so as to describe the different portals developed in RDF?
- Reference 70 already seems to be 10 years old.
- For Sampo applications, what is the average time taken by the developers from day 0 to having a PoC running?
- In the discussion section, what could be the minimum KPIs at the different stages of the proposed SW infrastructure?
== Typos ==
- Rewrite “Tens of people”. Maybe just “Ten people”?
- s/publihied /published

Review #2
Anonymous submitted on 13/Apr/2023
Minor Revision
Review Comment:

This manuscript was submitted as 'Application Report' and should be reviewed along the following dimensions: (1) Quality, importance, and impact of the described application (convincing evidence must be provided). (2) Clarity and readability of the describing paper, which shall convey to the reader the key ideas regarding the application of Semantic Web technologies in the application. Please also assess the data file provided by the authors under “Long-term stable URL for resources”. In particular, assess (A) whether the data file is well organized and in particular contains a README file which makes it easy for you to assess the data, (B) whether the provided resources appear to be complete for replication of experiments, and if not, why, (C) whether the chosen repository, if it is not GitHub, Figshare or Zenodo, is appropriate for long-term repository discoverability, and (D) whether the provided data artifacts are complete. Please refer to the reviewer instructions and the FAQ for further information.