Editorial Board

Editors-in-Chief
Krzysztof Janowicz

Managing Editors
Cogan Shimizu
Eva Blomqvist

Editorial Board
Mehwish Alam
Claudia d’Amato
Stefano Borgo
Boyan Brodaric
Philipp Cimiano
Oscar Corcho
Bernardo Cuenca-Grau
Elena Demidova
Jerome Euzenat
Mark Gahegan
Aldo Gangemi
Anna Lisa Gentile
Rafael Goncalves
Dagmar Gromann
Armin Haller
Aidan Hogan
Katja Hose
Eero Hyvönen
Sabrina Kirrane
Agnieszka Lawrynowicz
Freddy Lecue
Maria Maleshkova
Raghava Mutharaju
Axel Polleres
Guilin Qi
Marta Sabou
Harald Sack
Christoph Schlieder
Stefan Schlobach
Oshani Seneviratne
Cogan Shimizu
Ruben Verborgh
GQ Zhang

Former Editors-in-Chief
Pascal Hitzler

Editorial Assistants
Sanaz Saki Norouzi

The debates of the European Parliament as Linked Open Data

Submitted by Astrid van Aggelen on 11/09/2015 - 15:21

Tracking #: 1229-2441

A new version of this paper is available

Authors:

Astrid van Aggelen

Laura Hollink

Max Kemman

Martijn Kleppe

Henri Beunders

Responsible editor:

Natasha Noy

Submission type:

Dataset Description

Abstract:

The European Parliament represents the citizens of the member states of the European Union (EU). The accounts of its meetings and related documents are open data, promoting transparency and accountability, and are used as source data by researchers. However, the official portal of these documents provides limited search facilities. This paper presents LinkedEP, a Linked Open Data translation of the verbatim reports of the plenary meetings of the European Parliament. These data are integrated with a database of political affiliations of the Members of Parliament, and enriched with detected topics from the EU’s topic hierarchy as well as links to three other Linked Open Datasets. The results of this work are available through a SPARQL endpoint as well as a user interface with extensive browse and search facilities. It is now possible to combine in one query information about the time and topic of the debate, the spoken words - in any available translation - and information about the speaker uttering these, such as affiliations to countries, parties and committees. This paper discusses the design and creation of the vocabulary, data and links, as well as known uses of the data.

Full PDF Version:

swj1229.pdf

Revised Version:

The debates of the European Parliament as Linked Open Data

Previous Version:

The debates of the European Parliament as Linked Open Data

Tags:

Reviewed

Decision/Status:

Minor Revision

Solicited Reviews:

Review #1

By Konrad Höffner submitted on 11/Nov/2015

Suggestion:
Minor Revision

Review Comment:

This review refers to the revision of the dataset description of "The debates of the European Parliament as Linked Open Data" following an earlier review.

Most of the requested changes have been made, including:

- removing the graphs of the search logs and homepage visits
- compressing the analysis of page visits
- adding specific use cases
- adding the version date
- adding the version number
- adding update plans

I request minor revisions, however, because of two shortcomings:

(1) The conversion process is described very briefly and should be expanded.
(2) All figures have low quality when printed and also at some zoom levels in a PDF viewer. Please make sure that all images are high-quality vector images and not bitmaps.

Corrections: Please unify the capitalization of "LinkedPolitics" vs. "Linkedpolitics".

Review #2

By Alvaro Graves submitted on 16/Dec/2015

Suggestion:
Minor Revision

Review Comment:

The authors describe how they converted the transcripts of the debates of the European Parliament into RDF. They provide details on different aspects of the conversion process, from the creation of URIs and the vocabularies used to how the data has been published. The authors also report how many times the data was queried and provide use cases based on people who used the data.

There are a couple of issues with this paper. First, it is not clear how the RDF representation of the data makes it easier for interested parties (mostly non-semantic web experts) to consume and take advantage of this data. Second, the usage statistics only show that the data has been queried, but stating "7.5 thousand times" doesn't mean much on its own; I recommend giving some other measure for comparison. Based on the web interface available, I would suggest providing a form that lets political scientists and other researchers query the data without requiring knowledge of SPARQL.

It is also worth adding a few lines on how this dataset is going to be maintained in the future. Making the code available is something very valuable indeed. A few minor issues are also indicated below:

"The content and provenance of the data and vocabulary are described using the void, prov and omv vocabularies."

Citations?

"The metadata are collected in a single graph on the server and as a turtle file in the well-known directory."

Citation for the well-known directory?

"over 5.5 thousand times and the dataset was queried through our service about 7.5 thousand times, of which 3,654 times"

Please be consistent in how you present numbers. Also, write something like "more than 5 thousand"; decimals look odd in that context.

"Dataset quality One way to describe the quality of a Linked Dataset is the star system by Berners-Lee [2]. LinkedEP is a five-star collection"

Please remove that. The 5-star classification does not describe the quality of the data itself, only the format and possibly the use of common vocabularies. People can still publish trash data using a 5-star scheme.

Review #3

By Adegboyega Ojo submitted on 21/Dec/2015

Suggestion:
Accept

Review Comment:

In my first review, I pointed out three basic shortcomings of the work, which I encouraged the authors to address for completeness. These concerned elaboration on the specific patterns employed in the publishing process; information on the stability, updates, and maintenance of the dataset; and information on the shortcomings of the dataset. I also suggested providing concrete user stories for practical use of the dataset.
In the revised article, the authors provide information on the type of ontology pattern used (end of Section 4). They have also indicated the frequency of updates for the dataset in Section 5 and elaborated on the shortcomings of their work in Section 7. The suggestion to provide concrete user stories has also been addressed in Section 6 through the discussion of use case patterns 1 and 2. These use cases are more interesting than the ones given in the first version of the article. Given that the authors have satisfactorily addressed the issues raised about the earlier manuscript, I recommend that the article be accepted; it is publishable in its current form.
