DBpedia - A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia

Tracking #: 558-1764

Jens Lehmann
Robert Isele
Max Jakob
Anja Jentzsch
Dimitris Kontokostas
Pablo N. Mendes
Sebastian Hellmann
Mohamed Morsey
Patrick van Kleef
Sören Auer
Christian Bizer

Responsible editor: 
Krzysztof Janowicz

Submission type: 
Tool/System Report
The DBpedia community project extracts structured, multilingual knowledge from Wikipedia and makes it freely available on the Web using Semantic Web and Linked Data technologies. The project extracts knowledge from 111 different language editions of Wikipedia. The largest DBpedia knowledge base, which is extracted from the English edition of Wikipedia, consists of over 400 million facts that describe 3.7 million things. The DBpedia knowledge bases that are extracted from the other 110 Wikipedia editions together consist of 1.46 billion facts and describe 10 million additional things. The DBpedia project maps Wikipedia infoboxes from 27 different language editions to a single shared ontology consisting of 320 classes and 1,650 properties. The mappings are created via a world-wide crowd-sourcing effort and enable knowledge from the different Wikipedia editions to be combined. The project publishes regular releases of all DBpedia knowledge bases for download and provides SPARQL query access to 14 out of the 111 language editions via a global network of local DBpedia chapters. In addition to the regular releases, the project maintains a live knowledge base which is updated whenever a page in Wikipedia changes. DBpedia sets 27 million RDF links pointing into over 30 external data sources and thus enables data from these sources to be used together with DBpedia data. Several hundred data sets on the Web publish RDF links pointing to DBpedia themselves and thus make DBpedia one of the central interlinking hubs in the Linked Open Data (LOD) cloud. In this system report, we give an overview of the DBpedia community project, including its architecture, technical implementation, maintenance, internationalisation, usage statistics and applications.

Solicited Reviews:
Review #1
By Prateek Jain submitted on 17/Nov/2013
Review Comment:

The authors seem to have addressed the comments raised in the previous round of reviews. In some sections I had asked for changes or clarifications. It looks like the affected passages have been removed rather than clarified or changed (for example, the questions regarding AST and Official DBpedia Chapters).

Overall, I think the paper looks good for acceptance.

Review #2
By Aba-Sah Dadzie submitted on 29/Nov/2013
Minor Revision
Review Comment:

The paper is much easier to read and my main concerns have been addressed. I've only a few minor points/questions.

The text says "Table 6 lists the 10 data sets … along with the used link predicate …" - Table 6 has only the counts, not the predicates.

Table 8 - incoming links to what?

Fig. 9 - for consistency - caption should state units for both bars or none - as is, implies counts are also in GB

A few grammatical errors - should be picked up with a proof-read or auto-check.

******* Questions about responses to review

Point about improvement with use of Scala - response says rewritten - the paragraph is identical to the previous version (bar the footnote pointing to the timeline).

Re - status code 509 - my point appears to have been misunderstood - the issue isn't the code, but that it would be useful to mention in the text why this is appropriate. Only providing a URL pointing to code definitions just increases the reader's load. This is something authors (myself included) are often guilty of - forgetting that what is obvious to you, especially when you're very involved with your work, is not necessarily so to your readers.
FYI - HTTP status codes at w3 - http://www.w3.org/Protocols/HTTP/HTRESP.html

Fig 2 - my question was why the mapping example for Greek contains BOTH Greek and English labels for the mappings - I would expect them all to be Greek?

Table 7 is referenced before 6 - should therefore come before it. - response says this has been addressed - it's still the same.


Suggested references for [1] FREyA - rather than URL

Danica Damljanovic, Milan Agatonovic, Hamish Cunningham. (2010) Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and Ontology-Based Lookup through the User Interaction. ESWC (1) 2010: 106-120

Danica Damljanovic, Milan Agatonovic, Hamish Cunningham. (2011) FREyA: An Interactive Way of Querying Linked Data Using Natural Language, In The Semantic Web: ESWC 2011 Workshops, LNCS Volume 7117, pp 125-138

Review #3
By Oscar Corcho submitted on 06/Dec/2013
Review Comment:

This new version of the paper is a large improvement over the previously submitted version, resulting in a more readable and balanced description of the DBpedia effort, and addressing most of the comments that I made on the initial version.

As a system paper, I am happy with it and consider that it can be accepted as is.