Quality Metrics For RDF Graph Summarization

Submitted by Mussab Zneika on 10/29/2017 - 23:59

Tracking #: 1755-2967

A new version of this paper is available

Authors:

Mussab Zneika

Dan Vodislav

Dimitris Kotzinos

Responsible editor:

Guest Editors IE of Semantic Data 2017

Submission type:

Full Paper

Abstract:

RDF Graph Summarization is the process of extracting concise but meaningful summaries from RDF Knowledge Bases (KBs) that represent the actual contents of the KB as closely as possible, in terms of both structure and data. RDF Summarization allows for better exploration and visualization of the underlying RDF graphs, optimization of queries or query evaluation in multiple steps, better understanding of connections in Linked Datasets, and many other applications. The literature reports several algorithms for extracting summaries from RDF KBs. These algorithms, however, produce different results when applied to the same KB, so a quality framework is needed to compare the produced summaries and to decide on their quality and fitness for specific tasks. In this work, we propose a comprehensive Quality Framework for RDF Graph Summarization that allows a better, deeper and more complete understanding of the quality of the different summaries and facilitates their comparison. We work at two levels: the level of the ideal summary of the KB, which could be provided by an expert user, and the level of the instances contained in the KB. For the first level, we compute how close the proposed summary is to the ideal solution (when this is available) by defining and computing its precision, recall and F-measure against the ideal solution. For the second level, we compute whether, and to what degree, the existing instances are covered (i.e. can be retrieved) by the proposed summary. Again, we define and compute its precision, recall and F-measure against the data contained in the original KB. We also compare the connectivity of the proposed summary with that of the ideal one, since in many cases (e.g., when we want to query) this is an important factor and, in general, the RDF datasets that are used are usually linked internally.
We use our quality framework to test the results of three of the best RDF Graph Summarization algorithms when summarizing KBs that are different (in terms of content) and diverse (in terms of total size and number of instances, classes and predicates), and we present comparative results for them. We conclude by discussing these results and the suitability of the proposed quality framework for gaining useful insights into the quality of the presented results.
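
The two-level idea in the abstract can be illustrated with a minimal set-based sketch. This is not the paper's formalization: the abstract does not give the metric definitions, so the sketch assumes the simplest reading, in which a summary is reduced to a set of class names (schema level) or a set of retrievable instance identifiers (instance level). The function `prf`, the `Scores` type, and all toy data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    precision: float
    recall: float
    f_measure: float


def prf(proposed: set, reference: set) -> Scores:
    """Set-based precision, recall and F-measure of `proposed` against `reference`."""
    overlap = len(proposed & reference)
    precision = overlap / len(proposed) if proposed else 0.0
    recall = overlap / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return Scores(precision, recall, f_measure)


# Schema level (assumed toy data): classes in an expert's ideal summary
# versus classes in a summary produced by some algorithm.
ideal_classes = {"Person", "Painter", "Painting", "Museum"}
proposed_classes = {"Person", "Painting", "Museum", "City"}
schema_scores = prf(proposed_classes, ideal_classes)

# Instance level (assumed toy data): instances in the original KB
# versus instances that can be retrieved through the proposed summary.
kb_instances = {"i1", "i2", "i3", "i4", "i5"}
retrieved_instances = {"i1", "i2", "i3", "x9"}
instance_scores = prf(retrieved_instances, kb_instances)

print(schema_scores)
print(instance_scores)
```

Using one function for both levels is a simplification made here for brevity; the paper's actual metrics may weight classes, properties and instances differently.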

Full PDF Version:

swj1755.pdf

Revised Version:

Quality Metrics For RDF Graph Summarization

Previous Version:

Quality Metrics For RDF Graph Summarization

Tags:

Reviewed

Decision/Status:

Minor Revision

Solicited Reviews:

Review #1

Anonymous submitted on 05/Nov/2017

Suggestion:
Minor Revision

Review Comment:

The authors have addressed all my concerns. I think the manuscript can be accepted after the following issues are fixed.

(1) The formalization requires more effort.
- In Definition 2, N_i and \lambda_i should also involve class URIs.
- In the definition of Types(x), not y but \lambda_i(y) should be used, because classes are not nodes but node labels according to Definition 2. Similarly, Definitions 3 and 4 should be corrected.
- In Definition 4, not y but x should be used to form property instances.

(2) The following recent papers should be cited and properly classified in the Related Work section.
- Marek Dudas et al., Dataset Summary Visualization with LODSight.
- Blerina Spahiu et al., ABSTAT: Ontology-driven Linked Data Summaries with Pattern Minimalization.
- Gong Cheng et al., HIEDS: A Generic and Efficient Approach to Hierarchical Dataset Summarization.

Minor issues:
- Page 4: We follow the Definition ?? of the Knowledge Pattern
- Page 8: In Eq. (4), '))' should be ')'

Review #2

By Melike Sah submitted on 09/Nov/2017

Suggestion:
Accept

Review Comment:

The authors have adequately revised the paper. In particular, with the provided example and the quality metric calculations for three different RDF Summarization algorithms, the contribution of the paper is easier to follow. I therefore suggest accepting the paper without any further revisions.

Review #3

By Dhaval Thakker submitted on 15/Jan/2018

Suggestion:
Accept

Review Comment:

The authors have satisfactorily addressed all my comments about originality/novelty and evaluation.
