Metadata Extraction from Tables and Charts in Scientific Publications: A Systematic Literature Review

Tracking #: 4000-5214

This paper is currently under review
Authors: 
Erick Cedeño
Daniel Garijo
Oscar Corcho

Responsible editor: 
Guest Editors ML and KR 2025

Submission type: 
Survey Article
Abstract: 
Extracting metadata from tables and charts in scientific publications is essential for structured knowledge representation, automated retrieval of experimental results, and large-scale evidence synthesis. Such metadata goes beyond structural parsing: it captures the roles and relationships of elements such as headers, units, variables, legends, and axis labels. Although numerous methods have been proposed for table extraction and chart understanding, prior work remains fragmented: most approaches are developed and evaluated independently for a single modality and therefore lack mechanisms for connecting information across tables and charts. This modality-specific focus has led to heterogeneous processing approaches and inconsistent evaluation practices, limiting comparability across studies. In this systematic literature review, we analyze 68 peer-reviewed studies using a unified evaluation framework designed to examine metadata extraction capabilities in both tables and charts. Guided by explicit research questions, we compare these systems in terms of their task definitions, model architectures, metadata outputs, and reporting practices. Rather than reproducing implementations, our analysis evaluates the extent to which each method supports metadata identification, variable interpretation, and multimodal alignment. The findings highlight unresolved challenges in linking related information across modalities (e.g., associating table headers with chart axes), interpreting variables beyond their surface-level textual labels, and establishing standardized benchmarks that measure correctness at the metadata level.
Tags: 
Under Review