Abstract:
The choice of representation for the inputs and outputs of generative pre-trained language models (PLMs) can impact their fine-tuning on a new task.
This article focuses on the linearization and fine-tuning processes used to generate facts extracted from text. On a restricted relation extraction (RE) task, we challenged five encoder-decoder models (BART, T5, CodeT5, FlanT5 and PileT5) by fine-tuning them on 13 linearizations, including standard RDF syntaxes and variations thereof. Our benchmark covers the validity of the produced triples, the models' performance, the training behaviour and the resources needed. We show that these PLMs can learn some syntaxes more easily than others, and we identify a promising ``Turtle Light'' syntax supporting the quick and robust learning of the RE task.