Editorial Board

Editors-in-Chief
Cogan Shimizu
Eva Blomqvist

Editorial Board
Mehwish Alam
Claudia d’Amato
Stefano Borgo
Boyan Brodaric
Philipp Cimiano
Michael Cochez
Oscar Corcho
Bernardo Cuenca-Grau
Elena Demidova
Jerome Euzenat
Mark Gahegan
Aldo Gangemi
Dagmar Gromann
Armin Haller
Pascal Hitzler
Aidan Hogan
Katja Hose
Eero Hyvönen
Sabrina Kirrane
Agnieszka Lawrynowicz
Freddy Lecue
Maria Maleshkova
Raghava Mutharaju
Axel Polleres
Guilin Qi
Marta Sabou
Harald Sack
Angelo Salatino
Christoph Schlieder
Stefan Schlobach
Cogan Shimizu
Blerina Spahiu
GQ Zhang
Rui Zhu

Former/Founding Editors-in-Chief
Krzysztof Janowicz
Pascal Hitzler

Editorial Assistants
Michael McCain

Lost and Found: Enriching Knowledge Graphs with “NIL” Persons from Historical Documents

Submitted by Enrico Daga on 03/11/2026 - 05:22

Tracking #: 4044-5258

This paper is currently under review

Authors:

Arianna Graciotti

Nicolas Lazzari

Enrico Daga

Valentina Presutti

Responsible editor:

Guest Editors 2025 OD+CH

Submission type:

Full Paper

Abstract:

Vast community-driven knowledge graphs (KGs), such as Wikidata, are the primary reference data sources for Entity Linking (EL) applications. However, they exhibit significant coverage bias towards information that is widely popular on the Web, leading to underrepresentation of long-tail entities, particularly from non-contemporary contexts. Concurrently, the ongoing mass digitisation of cultural heritage resources reveals numerous named entities and associated knowledge that are currently missing from general-purpose KGs. Enriching such KGs with these “NIL” entities offers an opportunity to improve completeness and mitigate biases, such as gender disparities in the representation of historical figures. In this article, we investigate an approach based on retrieval-augmented generative AI to capture information about NIL entities and generate structured KGs suitable for integration into Wikidata. The approach is applied to the case of persons unknown to Wikidata who are mentioned in a collection of 19th-century musical periodicals. We empirically select six properties from Wikidata for entities of that type and create a manually annotated NIL-entities KG as the gold standard for evaluation. Through comprehensive experiments, we evaluate six state-of-the-art Large Language Models (LLMs) from different vendors, combined with six different state-of-the-art retrievers. Our results demonstrate significant variations in performance across model-retriever combinations, with high accuracy for gender identification and family name, promising results for occupation and country of citizenship, and low accuracy for date of birth. We report a detailed error analysis and discuss the potential of our approach to mitigate historical bias in Wikidata.

Full PDF Version:

swj4044.pdf

Previous Version:

Lost and Found: Enriching Knowledge Graphs with “NIL” Persons from Historical Documents

Tags:

Under Review

Long-term Stable Link to Resources:

https://github.com/arianna-graciotti/KG_Construction_Historical_NIL_Entities
