JRC-Names: Multilingual Entity Name variants and titles as Linked Data

Submitted by Maud Ehrmann on 01/26/2016 - 08:14

Tracking #: 1307-2519

Authors:

Maud Ehrmann

Guillaume Jacquet

Ralf Steinberger

Responsible editor:

Philipp Cimiano

Submission type:

Dataset Description

Abstract:

Since 2004, the European Commission's Joint Research Centre (JRC) has been analysing the online versions of printed media in over twenty languages and has automatically recognised and compiled large numbers of named entities (persons and organisations) together with their many name variants. The collected variants include not only standard spellings in various countries, languages and scripts, but also frequently found spelling mistakes and lesser-used name forms, all occurring in real-life text (e.g. Benjamin/Binyamin/Bibi/Benyamín/Biniamin/Беньямин/بنيامين Netanyahu/Netanjahu/Nétanyahou/Netahny/Нетаньяху/نتنياهو). This entity name variant data, known as JRC-Names, has been available for public download since 2011. In this article, we report on our efforts to render JRC-Names as Linked Data (LD), using the lexicon model for ontologies (lemon). Besides adhering to Semantic Web standards, this new release goes beyond the initial one in that it includes titles found next to the names, as well as the date ranges in which the titles and the name variants were found. It also establishes links to existing datasets, such as DBpedia and Talk-Of-Europe. As a multilingual linguistic linked dataset, JRC-Names can help bridge the gap between structured data and natural languages, thus supporting large-scale data integration (e.g. cross-lingual mapping) and web-based content processing (e.g. entity linking). JRC-Names is publicly available through the dataset catalogue of the European Union's Open Data Portal.

Full PDF Version:

swj1307.pdf

Previous Version:

JRC-Names: Multilingual Entity Name variants and titles as Linked Data

Tags:

Reviewed

Decision/Status:
