Decoding Deception with TAXODIS - a Taxonomy of Disinformation Cues for Fine-grained Text Labeling

Tracking #: 3792-5006

Authors: 
Isabel Bezzaoui
Pavlos Fafalios
Jonas Fegert
Achim Rettinger
Konstantin Todorov

Responsible editor: 
Philipp Cimiano

Submission type: 
Ontology Description
Abstract: 
The ubiquity of disinformation on digital platforms poses a threat to democracy and social cohesion. Despite significant developments in machine learning for disinformation detection and more specific related tasks (such as fact-checking, check-worthiness detection, claim linking, propaganda and rumor detection), effectively applying empirical knowledge during the training of such models in a standardized and transparent way remains a challenge. In this paper, following the semantic web principles, we propose TAXODIS, a first-of-its-kind, openly available Taxonomy of Online Disinformation. It structures an interdisciplinary set of well-defined and analyzed linguistic features of online disinformation discourse and is meant to help annotate training data to nourish machine learning and computational models that deal with the above-mentioned tasks. The systematic clustering of linguistic features into a comprehensive and publicly available framework provides a basis for the empirically grounded training of models and enhances the understanding of disinformation on a textual and linguistic level. Demonstrating and evaluating the artifact, we find that it facilitates data labeling processes by offering annotators a compact yet empirically informed guide to identifying textual indicators of disinformation. This paper, proposing a structured taxonomy as a valuable tool for automated detection systems, contributes to disinformation detection by mapping nuanced linguistic characteristics in disinformation content.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
Anonymous submitted on 01/Apr/2025
Suggestion:
Major Revision
Review Comment:

--------
Summary
--------

This manuscript describes TAXODIS, a semantic taxonomy of disinformation cues aimed at supporting fine-grained textual annotation of online misinformation. TAXODIS was constructed through a systematic review of interdisciplinary literature, identifying linguistic, psychological, stylistic, and structural features that may signal deceptive or misleading content. The taxonomy is implemented in SKOS and released publicly via Zenodo for reuse in annotation tasks and potential integration into disinformation detection pipelines. The submission fits the Ontology Description track of the Semantic Web Journal, and the paper’s main focus is the conceptualization, structure, and potential usage of the TAXODIS ontology.

---------
Strengths
---------

• Relevant and Needed Resource: TAXODIS addresses an important and timely topic: the linguistic characterization of disinformation. With the prevalence of misinformation on digital platforms, a formalized and well-structured vocabulary for annotation and analysis is both relevant and needed.
• Systematic and Interdisciplinary Grounding: The taxonomy is the product of a structured and interdisciplinary synthesis, drawing from a wide swath of literature in NLP, media studies, journalism, and psychology. The coverage appears comprehensive, and the categories are arranged in a meaningful hierarchy spanning stylistic, psychological, content-based, and veracity-related cues.
• Well-Organized and Readable Paper: The manuscript is clearly written, logically structured, and benefits from helpful illustrations such as taxonomy tables and examples. Definitions for categories are consistently provided, and the paper communicates the ontology’s scope and rationale effectively.
• Openly Available and Standards-Compliant Ontology: The TAXODIS ontology is published as a SKOS vocabulary via Zenodo, with a stable DOI and resolvable namespace. The open license (CC BY 4.0) and use of Semantic Web standards facilitate reuse and integration.

-----------
Weaknesses
-----------

• Lack of Novelty Relative to Prior Work: A central concern with this submission is the significant overlap with the authors’ earlier work, specifically:
o The TAXODIS taxonomy appears closely related — and in large part identical — to the schema used in “DeFaktS: A German Dataset for Fine-Grained Disinformation Detection through Social Media Framing” [1].
o The methodological account of the systematic literature review process is largely similar to that in the earlier paper “Truth or Fake? Developing a Taxonomical Framework for the Textual Detection of Online Disinformation” [2].
o The empirical evaluation, namely the annotation study where participants used the taxonomy to label real social media content, was also conducted and presented in the DeFaktS paper.
Given this, the manuscript does not appear to contribute new empirical validation, nor does it substantially expand the taxonomy’s structure or scope beyond what has already been published. While it is appropriate for an ontology description paper to consolidate prior work, the manuscript currently does not clearly distinguish itself or articulate what is newly contributed. This raises concerns about redundant publication, which should be avoided.
• Missing Explicit Discussion of Reuse and Extensions: The paper does not sufficiently acknowledge that the empirical study and schema of the taxonomy were previously published, nor does it clearly articulate how this submission goes beyond those works. For example, is TAXODIS simply a formalized SKOS version of the earlier schema? Were there updates to any categories, definitions, or structure? Has the ontology been extended, or applied in new settings? These questions are left unaddressed, weakening the case for a distinct and publishable contribution in this track.
• Distinction and Overlap Between Categories: TAXODIS contains many features whose meanings may overlap or be difficult to distinguish without detailed annotation guidance. For instance, the distinction between “vagueness of phrasing” and “level of informality of language,” or between “emotional polarization” and “sensationalism,” is conceptually subtle and may result in inconsistent labeling. Similar concerns apply to categories like “propaganda” versus “one-sided content” or “conspiracy theory” versus “false context,” which are not always mutually exclusive. Furthermore, the relationship between content type and veracity grade is underexplained – for example, how satire or parody should be classified is unclear. While the authors provide short definitions for each category, the lack of disambiguation guidelines may hinder consistent annotation. It is also not explicitly stated whether multiple labels can be assigned from each dimension to a single content item (e.g., can something be both “clickbait” and “propaganda”?). Providing more extensive annotation guidance and clarification of category boundaries would greatly improve the taxonomy’s usability.
• Limited Evaluation of the Ontology as a Standalone Resource: Because the empirical validation stems from prior work and is only briefly summarized here, this manuscript does not contain an evaluation of the SKOS implementation or user interaction with the ontology per se. For an Ontology Description track submission, it would be beneficial to include some assessment — even qualitative — of how the formalized resource was used, understood, or adopted. As written, the manuscript treats the ontology’s practical utility as established, but without showing new evidence of reuse or uptake.
• Repository Completeness and Documentation: While the Zenodo repository includes the ontology file and human-readable documentation (alphabetical and hierarchical PDFs), it lacks a clear README file describing the purpose, structure, and usage of the files. Additionally, the annotation guidelines, data, or outputs from the evaluation study are not included. This limits the ability to reproduce or build upon the prior empirical work and restricts the self-contained utility of the archive.
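
The concern about overlapping categories raised above can be made measurable. As a minimal sketch, and not something taken from the manuscript or the DeFaktS study, the following Python computes Cohen's kappa for two annotators who labeled the same items. The function name and the label sequences are invented for illustration; only the two category names come from the taxonomy as quoted in this review.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: derived from each annotator's own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for six posts (illustrative data only).
S, E = "sensationalism", "emotional polarization"
annotator_1 = [S, S, E, E, S, E]
annotator_2 = [S, S, E, S, S, E]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Reporting such a score per pair of closely related categories would show where disambiguation guidance is most needed; kappa values well below 1 for a pair would suggest that its boundary is unclear to annotators.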

---------------------------
Suggestions for Improvement
---------------------------

1. Clearly Acknowledge and Cite Prior Work: The manuscript should explicitly acknowledge that the taxonomy and annotation study were introduced and evaluated in prior work — particularly the DeFaktS paper [1] and Truth or Fake [2]. These papers should be cited, and a paragraph should be added clearly describing how this submission relates to them. For instance, if the main contribution here is the SKOS formalization and open release of the previously defined taxonomy, that should be made transparent.
2. Highlight Novelty (if any): If the TAXODIS ontology differs in any way from the previous taxonomy — e.g., through refined structure, new categories, updated definitions, or technical implementation — the manuscript should detail these differences. A table comparing TAXODIS with the earlier taxonomy, or a short changelog, would help readers understand what is new.
3. Clarify Contribution in the Ontology Description Context: Since this is an Ontology Description track submission, it is acceptable to consolidate and formalize earlier work — but the manuscript must justify this and position it accordingly. The authors should focus on the value of formal representation, reusability, semantic interoperability, and potential for integration into broader knowledge graphs. If possible, include a discussion of potential alignments (e.g., with schema.org, DBpedia, Wikidata), even if tentative.
4. Clarify Usage Guidelines and Annotation Procedures: The paper would benefit from additional guidance on how to apply the taxonomy in practice, especially in light of subtle distinctions and overlapping categories. The authors should consider including a dedicated appendix, short annotation guide, or example-driven explanation that elaborates:
• How to decide between closely related features (e.g., “emotional polarization” vs. “sensationalism”).
• Whether and when multiple labels can be applied per dimension (e.g., both “propaganda” and “clickbait” as content types).
• How to handle ambiguous or edge cases (e.g., satire, mixed-content items).
• Possibly a visual workflow, decision tree, or checklist to help annotators choose appropriate labels consistently.
Such annotation support would significantly increase the practical utility of the ontology for both manual and semi-automated annotation tasks. It would also strengthen the claim that TAXODIS enables consistent and fine-grained labeling across annotators, by making the procedure reproducible and transparent.
5. Improve Repository Documentation: Add a README file to the Zenodo archive summarizing the purpose of TAXODIS, its format and version, usage examples, and a citation guideline. If feasible, include sample annotations or annotation guidelines used in the empirical study — even if only for illustrative purposes.
6. Avoid Overstatement of New Evaluation: Any statements implying that the user study or annotation workshop is new to this manuscript should be revised. Instead, refer to the prior publication and briefly summarize the evaluation outcomes as context, without framing it as a new result.
7. Consider a Forward-Looking Perspective: If no new empirical or structural contributions are added, the authors might consider discussing plans for ontology maintenance, community feedback, or potential applications — e.g., plans to integrate TAXODIS into disinformation detection tools, annotation platforms, or academic corpora. This would support the ontology’s long-term value and fit with the SWJ track.
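
To illustrate the kind of explicit labeling policy requested in suggestion 4, the following Python is a minimal sketch of a check that an annotation uses known labels and respects a stated single-label or multi-label rule per dimension. Everything in it is an assumption made for illustration: the dimension names (`content_type`, `cue`), the excerpted label sets, the policy values, and the `validate` helper are hypothetical and do not reflect the actual TAXODIS structure or the authors' guidelines.

```python
# Hypothetical excerpt of a taxonomy: the label names are those quoted in this
# review, but the grouping into dimensions is assumed, not taken from TAXODIS.
TAXONOMY = {
    "content_type": {"clickbait", "propaganda"},
    "cue": {"emotional polarization", "sensationalism"},
}

# Assumed policy, for illustration only: whether several labels may be
# assigned within one dimension. The manuscript leaves this unstated.
MULTI_LABEL = {"content_type": False, "cue": True}


def validate(annotation, taxonomy=TAXONOMY, multi_label=MULTI_LABEL):
    """Return a list of problems found in one annotation (empty if valid)."""
    problems = []
    for dimension, labels in annotation.items():
        if dimension not in taxonomy:
            problems.append(f"unknown dimension: {dimension}")
            continue
        unknown = sorted(set(labels) - taxonomy[dimension])
        if unknown:
            problems.append(f"{dimension}: unknown labels {unknown}")
        if len(set(labels)) > 1 and not multi_label[dimension]:
            problems.append(f"{dimension}: only one label allowed")
    return problems


ok = validate({
    "content_type": ["clickbait"],
    "cue": ["sensationalism", "emotional polarization"],
})
too_many = validate({"content_type": ["clickbait", "propaganda"]})
```

Whatever policy the authors choose, stating it in machine-checkable form alongside the annotation guide would make the labeling procedure easier to reproduce.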

--------------
Recommendation
--------------

Major Revision

While TAXODIS is a valuable and well-founded ontology, the manuscript in its current form lacks sufficient novelty and differentiation from previously published work by the authors. To merit publication in the Ontology Description track, the submission must clearly articulate its unique contribution — whether in terms of formalization, accessibility, interoperability, or reuse — and must acknowledge the foundational prior work on which it builds.

I recommend major revisions that include:

• Clearer positioning of the work with respect to prior publications;
• Transparency about reused content (especially the evaluation);
• Justification for the distinct value of this submission;
• Enhanced documentation and completeness of the public archive.

With these changes, the paper would more effectively fulfill the aims of the SWJ Ontology Description track and would provide a meaningful, clearly scoped contribution to the community.

----------
References
----------

[1] Ashraf, S., Bezzaoui, I., Andone, I., Markowetz, A., Fegert, J., & Flek, L. (2024, May). DeFaktS: A German Dataset for Fine-Grained Disinformation Detection through Social Media Framing. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) (pp. 4580-4591).
[2] Bezzaoui, I., Fegert, J., & Weinhardt, C. (2022). Truth or Fake? Developing a Taxonomical Framework for the Textual Detection of Online Disinformation. International journal on advances in internet technology, 15(3/4), 53-63.

Review #2
Anonymous submitted on 15/Jul/2025
Suggestion:
Minor Revision
Review Comment:

In this paper, the authors introduce TAXODIS, an openly available Taxonomy of Online Disinformation that aims to support the understanding of disinformation on a textual and linguistic level and to help systems detect it.

It would be interesting to reimplement the approach described in reference [45], but using the categories of TAXODIS instead of LIWC.

How could this taxonomy be enriched to include, apart from linguistic cues, disinformation features of multimodal posts? It would be interesting if this could be mentioned in the final version of the paper because, especially during tragic events such as the recent floods in Valencia, the viral spread of multimodal posts was shown to pose a threat and to trigger negative emotions (Arcos et al., 2025).

The paper is quite interesting, although it needs to be proofread to fix several minor issues, such as: "In section 2" -> "In Section 2", etc. (page 2); Table 4.1.2 does not exist (page 7); "fake articles" (pages 7, 8, etc.) should be avoided because the articles are not fake: they exist (it may be better to say, for instance, "articles containing false information"); capitalization (e.g., page 12): "Obtaining feature values" -> "Obtaining Feature Values".

References should be double-checked: for capitalization in titles; instead of arXiv preprints, the peer-reviewed versions of the papers should be cited ([9], [26], [33], [56], [76]); many entries lack information on where they were published ([10], [11], [12], [20], [29], [30], [34], [38], [55]).

Arcos, I., Rosso, P., & Salaverría, R. (2025). Divergent Emotional Patterns in Disinformation on Social Media? An Analysis of Tweets and TikToks about the DANA in Valencia. In ICAART-2025, Proc. 17th Int. Conf. on Agents and Artificial Intelligence, Feb. 23-25. https://www.scitepress.org/Papers/2025/133928/133928.pdf