Abstract:
We present AIDOC-AP, an application profile for representing technical documentation of AI systems in accordance with technical documentation requirements of the EU AI Act. Our methodology adapts the NeOn-GPT pipeline, an LLM-powered ontology engineering framework, and extends it with two steps: (i) LLM-assisted alignment to reference ontologies and (ii) LLM-based iterative coverage evaluation of competency questions. The resource comprises an OWL ontology for AI system lifecycle documentation and an ontology of Annex IV requirements and competency questions. We validate AIDOC-AP in a real-world use case from the CERTAIN project and publish all artifacts under open licenses.
License: CC BY 4.0
DOI: 10.5281/zenodo.17787787
URI: https://w3id.org/aidoc-ap
Comments
state of the art links
Hi Sebastian, all. Thanks for the work (and that this is in a Horizon project so I look forward to more work). Always good to see more semantics used in legal informatics for data/AI etc. My comments are more about grounding the work in what exists and to share lessons learnt from our work so that it helps approaches like yours.
1) AIDOC-AP: the naming is quite similar to the "-AP" style used in SEMIC. I don't know if this was intentional, as the paper does not mention DCAT-AP, which is used by SEMIC and which has been extended as MLDCAT-AP for AI models. In particular, MLDCAT-AP is a specification that also models several of the concepts and is something to take a look at for reuse of concepts.
2) The paper mentions use of AIRO and VAIR. Of these, in table 1, AIRO is stated as not having regulatory grounding, which is incorrect. It was in fact the first ontology to model AI Act concepts! We were following the AI Act text from its initial drafts up to the final publication, and you can check the definitions of concepts referencing specific clauses in the AI Act. The table also states that both AIRO and VAIR lack data requirements and dataset concepts. It is unclear to me how this was determined, because AIRO does have concepts such as Data and quality characteristics, and VAIR has concepts for input/output modelling. Maybe it's a partial match. The criteria weren't clear to me.
3) The paper doesn't include Data Privacy Vocabulary (DPV) https://w3id.org/dpv which IMHO is the most comprehensive vocabulary for EU laws (GDPR, AI Act, etc.). We integrated AIRO and VAIR I think in version 2.1 or 2.2 last year, and now we've released v2.3. Please take a look at the TECH extension https://w3id.org/dpv/2.3/tech/ for a general modelling of technologies, AI extension https://w3id.org/dpv/2.3/ai/ for AI-specific concepts, and most importantly the AI Act extension https://w3id.org/dpv/2.3/legal/eu/aiact/ for AI Act-specific concepts. Most of what I see in AIDOC is present in these extensions, and in the interest of interoperability IMO it's better to use DPV where available. We have been maintaining this work for 8 years now with updates, and we also cover other laws besides the AI Act so the concepts work across multiple regulations. In the sustainability part of the paper, it is mentioned that you plan to continue the work within the lifetime of the project. I invite you and colleagues to add this work to the DPV vocabularies, where we also discuss creating guides -- and where how to represent technical documentation as a specification would be in scope.
4) In terms of related work, in a previous paper (and in Delaram's PhD) we already looked at creating documentation for AI Act, including the technical document, see https://osf.io/preprints/osf/43rq2_v1 section 4.3. You can also take a look at Delaram's thesis which should have more details/resources.
5) In terms of modelling, I see several concepts as properties, e.g. intendedPurpose is a property with a string as range/value. This is bad practice IMHO, and I also see this is the case in MLDCAT-AP. When modelling things as semantics, we should strive to generate sets/classes/concepts where we can because that's what creates value in the use of semantics. For example, for two intendedPurposes A and B, if both are strings, we cannot express any relationship between them, whereas if we express them both as instances of class IntendedPurpose, we can do all sorts of stuff like make links, use skos:broader/narrower, or even create a subclass/SKOS hierarchy. We've done this for GDPR's purpose concepts (see https://w3id.org/dpv/modules/purposes). Additionally, the concept Intended Purpose in the AI Act is going to be doing a lot of heavy lifting in determining obligations, including tracking where it has changed. This implies a need to compare two concepts, which is not as rich when using just strings. We have this concept in DPV's AI Act extension, and also have discussions on the use of it in compatibility assessments (https://github.com/w3c/dpv/issues/300). The same applies to other concepts (which are also likely to be present in DPV vocabularies).
6) The paper only focuses on technical documentation as Annex IV. However, Article 11 of the AI Act is the primary obligation for the documentation, and you can see specific criteria in para 1. Even if it overlaps with Annex IV, the reference to the documentation should arise from Article 11 in the paper first, and then use the information list in Annex IV. Separately, Article 11 also notes that Annex IV may be modified in the future, which requires resources to be adaptable -- something to mention/consider in the work as well.
7) I understand the need to have a rigid scope -- in this case only modelling AI Act technical documentation. But at the same time, this same information is necessary elsewhere, and one question I always have with hyper-focused ontologies is how they will work with these other compliance/governance approaches. That's also why I suggest using DPV (or something similar) whose concepts are intended to be used across many laws and obligations, so someone else can target another obligation and there will be interoperability between (your) technical documentation and other uses of DPV.
8) For the project, I definitely think you should discuss working to enrich DPV vocabularies, especially since it's now being used in HealthDCAT-AP for EHDS and MLDCAT-AP for AI Act (but AI in general). All these cases help us make the case stronger for why semantic web and why interoperability (essentially to convince non-semweb use cases to adopt this).
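To make point 5 concrete, here is a minimal sketch of the difference between string-valued and class-based intended purposes. It uses plain Python tuples as triples rather than an RDF library, and every name under the `ex:` prefix (the systems, the purposes, `ex:hasIntendedPurpose`) is invented for illustration; none of it is taken from AIDOC-AP or DPV.

```python
# Hypothetical sketch: all "ex:" names are invented for illustration only.
# Triples are plain (subject, predicate, object) tuples; no RDF library is used.

# String-valued modelling: each purpose is an opaque literal.
string_purposes = {
    "ex:systemA": "CV screening for recruitment",
    "ex:systemB": "Recruitment support",
}

# Class-based modelling: each purpose is a node that can carry its own links.
concept_triples = {
    ("ex:CvScreening", "rdf:type", "ex:IntendedPurpose"),
    ("ex:Recruitment", "rdf:type", "ex:IntendedPurpose"),
    ("ex:CreditScoring", "rdf:type", "ex:IntendedPurpose"),
    ("ex:CvScreening", "skos:broader", "ex:Recruitment"),
    ("ex:systemA", "ex:hasIntendedPurpose", "ex:CvScreening"),
    ("ex:systemB", "ex:hasIntendedPurpose", "ex:Recruitment"),
}


def broader_closure(concept, triples):
    """Return the concept plus everything reachable from it via skos:broader."""
    seen = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "skos:broader" and obj not in seen:
                seen.add(obj)
                frontier.append(obj)
    return seen


def purposes_related(a, b, triples):
    """True if the two purposes are equal or one is narrower than the other."""
    return b in broader_closure(a, triples) or a in broader_closure(b, triples)


# With strings, the only available comparison is literal equality.
strings_match = string_purposes["ex:systemA"] == string_purposes["ex:systemB"]

# With concepts, the hierarchy shows that systemA's purpose is narrower than systemB's.
concepts_match = purposes_related("ex:CvScreening", "ex:Recruitment", concept_triples)
```

In this toy data the string comparison fails even though the two purposes are clearly related, while the concept-based check finds the link through `skos:broader`. That is the kind of comparison I would expect to matter when tracking whether an intended purpose has changed.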
Thanks and happy to answer any questions you may have -- feel free to reach me at harshvardhan.pandit@adaptcentre.ie
Regards,
Harsh
response on state of the art links
Dear Harsh,
Thank you for the constructive comments! They help us ground the work in the broader semantic web ecosystem on AI regulations, which clearly has more depth than we initially acknowledged in the paper.
On the naming and MLDCAT-AP: The -AP suffix was indeed intentional, following the SEMIC convention. We should have referenced MLDCAT-AP explicitly, and we will do so in a revised version. The relationship we see is complementary: MLDCAT-AP covers AI/ML models and datasets as catalog entries, while AIDOC-AP targets the structured representation of Annex IV lifecycle documentation. A system could reasonably use both, and making that interoperability explicit is worthwhile.
On the AIRO characterization in Table 1: You are of course correct: the AIRO ontology explicitly sources specific AI Act articles and was indeed the first to model AI Act concepts. Our "regulatory grounding" dimension was meant to capture explicit anchoring to Annex IV documentation requirements specifically, not regulatory grounding in general. We communicated this poorly in the paper, and the checkmark assignment for AIRO should be stated differently. We will correct the table and clarify the criterion.
On DPV: Thank you for the detailed pointers to the DPV extensions. Your point on intendedPurpose (and similar concepts) as a string property is well-taken. Modeling it as a string property was a pragmatic choice, given how providers currently document it (in particular within the project pilots we are working with). But you are right that this forfeits the semantic value that class-based modeling provides, especially for the compliance use case you describe in issue #300.
On your interoperability concern more broadly: using DPV where available is the most direct answer to the "hyper-focused ontology" problem you raise. AIDOC-AP's contribution is the Annex IV structural depth, including the competency questions, lifecycle stage mappings, and the tooling for iterative coverage evaluation, which DPV does not currently provide at that level of granularity. If the Annex IV-specific terms are grounded in DPV classes where possible, the result would plug into the broader DPV ecosystem, giving exactly the interoperability you describe in point 8.
This also connects to what we consider an underemphasized contribution of the paper: the methodology itself. The iterative LLM-assisted pipeline (terminology extraction, alignment to reference ontologies, and repeated coverage evaluation against competency questions) is designed as a maintenance mechanism, not a one-time engineering exercise. When Annex IV is modified (as Article 11 explicitly anticipates, per your point 6), the competency questions can be updated and the coverage evaluation re-run to systematically identify gaps. The same pipeline could be applied when DPV evolves, to check alignment drift. This makes the adaptability concern in point 6 not just an architectural note but a methodological answer: Article 11's flexibility is one reason the paper's primary contribution is a process as much as an artifact.
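As a rough illustration of the re-run idea, the following sketch checks which competency questions are no longer covered after the question set changes. It is not the AIDOC-AP tooling: the pipeline described in the paper uses an LLM to judge coverage, whereas this sketch swaps in simple set matching on term names, and the question IDs, term names, and the added question are all hypothetical.

```python
# Hypothetical sketch: question IDs, term names, and the added question are invented.
# Coverage is approximated by set matching, standing in for the LLM-based evaluation.

def coverage_gaps(competency_questions, ontology_terms):
    """Map each uncovered question ID to the sorted list of terms it still needs."""
    available = set(ontology_terms)
    gaps = {}
    for question_id, required_terms in competency_questions.items():
        missing = sorted(set(required_terms) - available)
        if missing:
            gaps[question_id] = missing
    return gaps


def coverage_ratio(competency_questions, ontology_terms):
    """Fraction of questions whose required terms are all present in the ontology."""
    if not competency_questions:
        return 1.0
    uncovered = len(coverage_gaps(competency_questions, ontology_terms))
    return 1 - uncovered / len(competency_questions)


# Terms currently defined in the (toy) ontology.
ontology_terms = {"AISystem", "IntendedPurpose", "TrainingDataset"}

# Competency questions before a hypothetical change to the requirements.
questions_before = {
    "CQ1": ["AISystem", "IntendedPurpose"],
    "CQ2": ["AISystem", "TrainingDataset"],
}

# After the change, one new question needs a term the ontology does not yet have.
questions_after = dict(questions_before, CQ3=["AISystem", "EnergyConsumption"])

gaps_before = coverage_gaps(questions_before, ontology_terms)
gaps_after = coverage_gaps(questions_after, ontology_terms)
ratio_after = coverage_ratio(questions_after, ontology_terms)
```

Re-running the same check after updating the question set surfaces exactly the terms that need to be added, which is the maintenance loop we have in mind. The same comparison could in principle be pointed at a new DPV release to look for alignment drift.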
On Article 11 (point 6): Agreed. The paper should ground the documentation obligation in Article 11(1) first, with Annex IV as its current specification. We will revise the paper accordingly.
We are very open to contributing to DPV where AIDOC-AP adds Annex IV-specific depth, and we appreciate the invitation.
Best regards,
Sebastian