Generation of Semantic Knowledge Graphs from Maintenance Work Order Data

Tracking #: 3658-4872

Authors: 
Farhad Ameri
Renita Tahsin
Yunqing Li
Mohammad Sadeq Abolhasani

Responsible editor: 
Guest Editors KG Gen from Text 2023

Submission type: 
Full Paper
Abstract: 
Industrial maintenance activity data is typically stored in unstructured form within the databases of maintenance management systems. For this reason, effectively exploring the data and uncovering valuable patterns concealed within it is often highly challenging. Consequently, historical maintenance data is seldom analyzed or reused for purposes such as failure prevention, maintenance history reconstruction, or maintenance diagnostics. If the knowledge embedded in maintenance data is liberated and formalized, it can significantly improve the intelligence of maintenance management systems by enabling knowledge reuse. This research aims to help advance the progression from data to information and knowledge through data-driven creation of knowledge graphs built from the unstructured data available in maintenance work orders. A Simple Knowledge Organization System (SKOS) thesaurus is used to support automated entity extraction from text. The thesaurus is extended with the aid of a fine-tuned Large Language Model (LLM). A formal ontology provides the semantics of the knowledge graph. A software tool is developed to streamline the semi-automated text-to-graph translation process. The proposed framework was validated based on 100 work orders extracted from the computerized maintenance management system of a construction equipment manufacturer. The experimental validation proved that graph-based representation of work order data could effectively enhance information retrieval, analysis, and pattern extraction, particularly if it is supported by formal ontology and rule-based reasoning methods.
Tags: 
Reviewed

Decision/Status: 
Reject

Solicited Reviews:
Review #1
Anonymous submitted on 01/Jul/2024
Suggestion:
Reject
Review Comment:

The article reports on generating a Knowledge Graph from textual maintenance work orders. To achieve this goal, it proposes a taxonomy, an ontology, and an open-source Java-based tool. In my opinion, the work and its presentation are not mature enough to be accepted.

First, the authors do not position the proposed work with respect to related work, and they fail to clarify what the novelty of the proposed approach is. The concluding paragraph of *related work* states some limitations of the state of the art without explicitly reporting which of these are addressed by the proposed contribution.

An explicit *research question* needs to be included, making it easier for readers to understand why this solution should be better than related work.

A stronger *evaluation* should be documented. The evaluation is only marginally documented in the last paragraph of Section 6 and lacks a comparison with related work, a usability test of the software, and a performance evaluation of the proposed approach.

I could not find a way to access the open-source framework. Moreover, I would expect a link to the proposed ontology. Contributions should be described in more detail, for example by detailing the architecture of the Java-based framework, the expected interaction model, and the design process of the ontology. A maintenance plan for the ontology and the tool would be appreciated.

Finally, the authors should clarify why they refer to knowledge graphs generated from text while the introduction and the abstract state that most maintenance data is stored in databases. Do they mean that the proposed approach can structure work orders that are usually stored within maintenance databases?

As minor comments, the distinction between the first and third contributions is unclear to me. Regarding the article's structure, having a section as short as the third one is awkward. Tables should not be split across consecutive pages, especially if they are as short as Table 2.

Review #2
Anonymous submitted on 01/Aug/2024
Suggestion:
Major Revision
Review Comment:

The authors of this manuscript worked on extracting knowledge from maintenance work order data. This usually comes as a short textual description, containing terminology related to the domain. However, as the authors note, the top-level scheme is the same in all domains and only term specializations change depending on the domain, which means that this work can be more generally applicable with small adjustments.
Having a way to (semi)automatically extract valuable knowledge from existing maintenance work order text has practical applications in many industries and is worth exploring further.
The authors claim four main contributions in their work: the development of KnoWo, the LLM-assisted thesaurus extension, the implementation of WOO, and the RDF extraction software.
I found the process/workflow the authors describe very practical, with presumably good results. I have the following remarks.

If the authors were not the creators of the tools "Nestor" and "SKOSTools", the description of those tools could be moved to another section, to make the contribution of this work clearer.
In section 4.3, providing additional details on the evaluation dataset and the way the "un-tuned" models were invoked could help the reader understand the process better. In addition, GPT is a proprietary model; including some non-proprietary models like Mistral AI and/or LLAMA [1] would offer the reader more well-rounded insights.
Regarding the tool KnoWo, providing more details on how this tool works would be valuable to the reader. Currently, it is not very clear whether the tool is just performing label lookup or some additional NLP/LLM process for extracting the RDF triples. Either way, an evaluation table with precision/recall/F1 scores would help us better understand how good the step is.
Did the authors consider using some relatively recent advancements in LLMs named "function calling" / "data extraction"? In that case, the user specifies (in their prompt) what data structure they want to extract from an input, and the LLM tries to extract it from the text. This appears to have some overlap with KnoWo. Some discussion, if not a comparison, of KnoWo with LLM data extraction would bring the last step more up-to-date.
WOO appears to be a good addition to this work, and the reasoning capabilities the authors added can help enrich existing data. Some discussion on how generally applicable this ontology can be in different domains could also help the readers.
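
As an illustration of the precision/recall/F1 evaluation requested above for the triple-extraction step, a minimal sketch follows. It assumes exact-match comparison of (subject, predicate, object) triples against a manually curated gold set; the function name `triple_prf`, the `ex:` identifiers, and the sample triples are hypothetical and do not come from the manuscript or from KnoWo.

```python
from typing import Dict, Iterable, Tuple

# A triple is modelled as a plain (subject, predicate, object) tuple of strings.
Triple = Tuple[str, str, str]


def triple_prf(predicted: Iterable[Triple], gold: Iterable[Triple]) -> Dict[str, float]:
    """Exact-match precision, recall and F1 between two sets of triples."""
    pred_set = set(predicted)
    gold_set = set(gold)
    true_positives = len(pred_set & gold_set)

    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example: four gold triples, three extracted, two of them correct.
gold_triples = [
    ("ex:wo1", "ex:hasAsset", "ex:pump"),
    ("ex:wo1", "ex:hasProblem", "ex:leak"),
    ("ex:wo1", "ex:hasAction", "ex:replaceSeal"),
    ("ex:wo2", "ex:hasAsset", "ex:motor"),
]
extracted_triples = [
    ("ex:wo1", "ex:hasAsset", "ex:pump"),
    ("ex:wo1", "ex:hasProblem", "ex:leak"),
    ("ex:wo2", "ex:hasAsset", "ex:valve"),  # wrong object, counts as an error
]

scores = triple_prf(extracted_triples, gold_triples)
print(scores)
```

Exact matching is the strictest choice; a real evaluation might also report scores per entity type or allow partial credit when only the object differs.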

Review #3
By Fatima Zahra submitted on 15/Oct/2024
Suggestion:
Accept
Review Comment:

This article presents a framework for generating semantic knowledge graphs from unstructured maintenance work order data. The approach integrates large language models for thesaurus expansion with formal ontologies, enhancing automation and scalability in KG development. The article also introduces the KnoWo tool, validated with real-world maintenance data, demonstrating its practical application in industrial settings.

- The article focuses on semantic KG generation, formal ontologies, and rule-based reasoning.
- LLMs are used for thesaurus expansion, advancing automation in KG development.
- The KnoWo tool is validated with real-world maintenance data, highlighting its usability.
- The hybrid method balances human expertise with automated ontology and thesaurus creation.
- Generally the article is well written. However, improvements are needed in sentence clarity and brevity, as some long and complex sentences make technical details difficult to follow. Paragraphs in sections like the methodology could be more focused, and smoother transitions between sections would enhance readability.

- The contribution would benefit from stronger quantitative evaluation metrics and a more detailed discussion on scalability.
- A direct comparison with other knowledge graph generation methods would better highlight the article’s contributions.
- Expanding on how the proposed methods outperform traditional approaches, especially in complex cases, would enhance the technical depth.

The article makes noteworthy contributions to the fields of industrial maintenance and knowledge graph generation. Its hybrid methodology, integration of large language models for thesaurus expansion, and the creation of the KnoWo tool are commendable achievements. Nevertheless, enhancements in scalability, comprehensive evaluation metrics, and automation are necessary to strengthen its impact. By addressing these areas, this framework could serve as a vital resource for industries aiming to harness historical maintenance data more efficiently.