Abstract:
Consistency, i.e., the degree to which a system, process, or its results produce similar outcomes when repeated under identical or different conditions, is a critical concern in knowledge engineering (KE). This is particularly the case given the increasing reliance on Large Language Models (LLMs) in a variety of tasks. This paper introduces CoLLM, a framework designed to assess whether a system or process produces consistent results in LLM-based KE tasks through three tests: (1) the LLM Repeatability Test, which evaluates the level of stochasticity or non-determinism of LLMs in existing studies; (2) the LLM Update Impact Test, which examines the effect that LLM updates may have on results; and (3) the LLM Replacement Test, which explores the effect of using alternative LLMs to perform the same study. Through 59 different experiments taken from five
separate, recent studies, and leveraging various LLMs and datasets, we investigate the consistency of the results to empirically validate the reliability of the original findings for each study. Our investigation shows that in the majority of cases (81.4%), consistent behaviour with respect to the original studies can be observed, despite some variability across individual outputs. Additionally, in some cases, changing the choice of LLM can result in a consistent improvement across different metrics. These results demonstrate the general viability of the proposed framework for assessing the consistency of LLM-based KE tasks.
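To make the first test more concrete, the sketch below shows one minimal way a repeatability check could be operationalised: issue the same prompt to the same model several times and measure how often the outputs agree. The query_llm callable, the stub model, the agreement metric, and the example prompt are illustrative assumptions, not the protocol used in the paper.

from collections import Counter
from typing import Callable, List

def repeatability_score(query_llm: Callable[[str], str], prompt: str, runs: int = 5) -> float:
    """Query the same model with the same prompt several times and return the
    fraction of runs agreeing with the most frequent output (1.0 = fully repeatable)."""
    outputs: List[str] = [query_llm(prompt).strip().lower() for _ in range(runs)]
    top_count = Counter(outputs).most_common(1)[0][1]
    return top_count / runs

# Hypothetical stand-in for a real LLM client; a deterministic stub yields a score of 1.0.
def stub_llm(prompt: str) -> str:
    return "dbpedia:Barack_Obama"

if __name__ == "__main__":
    print(repeatability_score(stub_llm, "Link the mention 'Obama' to a KG entity."))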