On assessing weaker logical status claims in Wikidata cultural heritage records

Tracking #: 3569-4783

This paper is currently under review
Alessio Di Pasquale
Valentina Pasqual
Francesca Tomasi
Fabio Vitali

Responsible editor: Guest Editors Wikidata 2022

Submission type: Full Paper
This work analyses how different representation methods are used in Wikidata to encode information with weaker logical status (WLS, e.g. uncertain information, competing hypotheses, temporally evolving information, etc.). The study examines four main approaches: non-asserted statements, ranked statements, non-existing valued objects, and statements qualified with the properties P5102 (nature of statement), P1480 (sourcing circumstances) and P2241 (reason for deprecated rank). We analyse their prevalence, success, and clarity in Wikidata. The analysis is performed over cultural heritage artefacts stored in Wikidata, divided into three subsets (visual heritage, textual heritage and audio-visual heritage), and compared with astronomical data (star and galaxy entities). Our findings indicate that (1) the representation of weaker logical status information is limited, with only a small proportion of items reporting such information, (2) the representation of WLS varies significantly between the two datasets, and (3) precise assessment of WLS statements is complicated by the ambiguities and overlaps between WLS and non-WLS claims that the chosen representations allow. Finally, we list a few proposals to simplify and standardize the representation of this type of information in Wikidata, with the hope of increasing its accuracy and richness.
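As an illustration only, the four approaches named above can be sketched as checks on a single statement in the Wikidata JSON data model. This is not the authors' method. The function name `wls_markers`, the marker labels, and the mapping of each approach to a JSON field are assumptions made for this sketch: "ranked" is read as any non-normal rank, "non-asserted" as a statement that is not best-ranked for its property, and "non-existing valued objects" as the `somevalue` and `novalue` snak types.

```python
# Qualifier properties named in the abstract.
WLS_QUALIFIERS = {"P5102", "P1480", "P2241"}


def wls_markers(statement, sibling_ranks=()):
    """Return the WLS markers found on one Wikidata JSON statement.

    `sibling_ranks` lists the ranks of the other statements that use the
    same property on the same item (hypothetical helper argument).
    """
    markers = set()
    rank = statement.get("rank", "normal")

    # Assumption: a "ranked" statement is one with a non-normal rank.
    if rank in ("preferred", "deprecated"):
        markers.add("ranked")

    # Assumption: "non-asserted" means not best-ranked, i.e. deprecated,
    # or normal while a preferred sibling exists.
    if rank == "deprecated" or (rank == "normal" and "preferred" in sibling_ranks):
        markers.add("non_asserted")

    # Assumption: "non-existing valued objects" are unknown-value or
    # no-value main snaks.
    snaktype = statement.get("mainsnak", {}).get("snaktype")
    if snaktype in ("somevalue", "novalue"):
        markers.add("non_existing_value")

    # Statements qualified with P5102, P1480 or P2241.
    for pid in WLS_QUALIFIERS & set(statement.get("qualifiers", {})):
        markers.add("qualifier:" + pid)

    return markers


# Hypothetical statement: a deprecated creator claim with an unknown value.
example = {
    "mainsnak": {"snaktype": "somevalue", "property": "P170"},
    "rank": "deprecated",
    "qualifiers": {"P2241": []},
}
print(sorted(wls_markers(example)))
```

A single statement can carry several markers at once, which is one way to see the overlap between representations that the findings mention.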