Abstract:
Answering natural language questions over knowledge graphs is challenging because of the vast number of facts, which are difficult to process and navigate. One potential solution is to use subgraphs related to a query. This research presents a method for extracting subgraphs related to entity candidates from a question-answer set, where the candidates are obtained by running inference with a large language model and the subgraphs are built via shortest-path calculations between entities. The proposed approaches combine rich features extracted from the subgraphs with reranking models that select the most probable answers from a list of candidates. Experiments were conducted on the Wikidata knowledge graph to evaluate the effectiveness of the proposed methods. We tested all main features that can be extracted from subgraphs and analyzed in detail the combinations of features and reranking methods. In addition, we developed a public web application for studying the graph space between question and answer entities. It visualizes the extracted subgraph and automatically generates natural-language text describing it.
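To make the idea of shortest-path subgraph extraction concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes a toy list of (head, relation, tail) triples with made-up labels instead of Wikidata identifiers, treats edges as undirected when searching, and keeps every triple whose endpoints both lie on a shortest path. The names `TOY_EDGES`, `shortest_path`, and `extract_subgraph` are hypothetical.

```python
from collections import deque

# Toy triples (head, relation, tail); illustrative labels, not real Wikidata IDs.
TOY_EDGES = [
    ("Paris", "capital_of", "France"),
    ("France", "part_of", "Europe"),
    ("Berlin", "capital_of", "Germany"),
    ("Germany", "part_of", "Europe"),
]


def shortest_path(edges, source, target):
    """Return one shortest path (list of nodes) from source to target, or None."""
    # Assumption: edges are treated as undirected for path search.
    neighbours = {}
    for head, _relation, tail in edges:
        neighbours.setdefault(head, set()).add(tail)
        neighbours.setdefault(tail, set()).add(head)

    parents = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


def extract_subgraph(edges, question_entities, candidate):
    """Collect triples among nodes on shortest paths from question entities to a candidate."""
    nodes = set()
    for entity in question_entities:
        path = shortest_path(edges, entity, candidate)
        if path:
            nodes.update(path)
    # Keep every triple whose two endpoints were both reached.
    return [triple for triple in edges if triple[0] in nodes and triple[2] in nodes]


if __name__ == "__main__":
    for candidate in ("Paris", "Berlin"):
        subgraph = extract_subgraph(TOY_EDGES, ["France"], candidate)
        print(candidate, len(subgraph), subgraph)
```

In this toy setting, the subgraph for the candidate "Paris" contains a single triple, while the subgraph for "Berlin" contains three. Size differences of this kind are one example of a subgraph feature that a reranking model could use.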