AXIOpedia: Enriching DBpedia with OWL Axioms from Wikipedia

Tracking #: 1615-2827

Authors: 
Lara Haidar-Ahmad
Amal Zouaq
Michel Gagnon

Responsible editor: 
Philipp Cimiano

Submission type: 
Full Paper
Abstract: 
The Semantic Web relies on the creation of rich knowledge bases which link data on the Web. Having access to such a knowledge base enables significant progress in difficult and challenging tasks such as semantic annotation and retrieval. DBpedia, the RDF representation of Wikipedia, is considered today the central interlinking hub for the emerging Web of data. However, DBpedia still displays some substantial limitations such as the lack of class definitions and the lack of significant taxonomical links. The objective of this work is to enrich DBpedia with OWL-defined classes and taxonomical links using open information extraction from Wikipedia. We propose a pattern-based approach that relies on SPARQL to automatically extract axioms from Wikipedia definitions. We run the system on 12,901,822 Wikipedia pages (including disambiguation pages). The resulting knowledge base, AXIOpedia, benefits from a rich and consistent ontology with complex axioms, rdfs:subClassOf and rdf:type relations.
Full PDF Version: 
Tags: 
Reviewed

Decision/Status: 
Reject

Solicited Reviews:
Review #1
By Heiko Paulheim submitted on 20/Jun/2017
Suggestion:
Major Revision
Review Comment:

The paper discusses an approach for extracting ontological axioms and class definitions from definitional sentences in Wikipedia, using a set of hand-crafted patterns. While the paper tackles an interesting problem, I miss a comparative evaluation against state-of-the-art systems and baselines.

The related work section seems to be incomplete, both for surveys of the area (e.g., [1,2]) as well as for recent approaches (e.g., [3-5]). A crisper problem definition and clearer distinction from the state of the art would be beneficial. Furthermore, related works should be taken into account for the evaluation.

Remarks w.r.t. the methodology:
* As far as the heuristic for telling instances from classes is concerned: the current method used in DBpedia's heuristic typing looks resource labels up in WordNet; if the label contains a non-capitalized word, the resource is assumed to be a class, otherwise it is assumed to be an instance (see the sketch after this list). This would be a baseline to compare against.
* What I miss is a unification of equivalent relations. The extracted relations seem to be quite straightforward translations of the original words; as a result, properties like authorOf and writerOf seem to be treated as separate properties, without any attempt to unify them.
* At least in the outlook section, I miss a discussion of extending the approach to multi-lingual settings.
* As far as the corpus is concerned: I would assume that Wiktionary is a better source of definitional sentences than Wikipedia.
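
A minimal sketch of the capitalization/WordNet baseline mentioned in the first remark above (my reading of that heuristic, not the actual DBpedia implementation; the function name and the use of NLTK's WordNet are illustrative choices):

    from nltk.corpus import wordnet as wn  # assumes nltk with the WordNet corpus downloaded

    def is_class_label(label: str) -> bool:
        """Treat a label as a class if it contains a non-capitalized WordNet noun."""
        for token in label.split():
            if token[0].islower() and wn.synsets(token, pos=wn.NOUN):
                return True  # a lower-cased common noun suggests a class
        return False  # otherwise assume an instance

    print(is_class_label("Polar bear"))    # True  -> class
    print(is_class_label("Barack Obama"))  # False -> instance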

Remarks w.r.t. the evaluation:
* I miss baselines, even trivial ones. For example, for type extraction, one could use every noun phrase that co-occurs with the subject at hand (a minimal sketch of such a baseline follows this list).
* I miss comparisons to state of the art approaches, e.g., FRED or ReVerb for the relation extraction, the approach by Zirn et al. for distinguishing classes and instances, Tipalo for type prediction, etc.
* Unclear: does Table 5 only refer to mapping-based types or also to heuristic types?
* What does Table 3 actually depict? I am unsure about the semantics of "Precision/Recall".
* F-measure should always be reported along with recall and precision to foster easier comparison against state-of-the-art systems (the standard formula is recalled after this list).
* I have some doubts about measuring the number of axioms correctly found. In a complex construct, a single missing axiom might invalidate the whole definition. Consider, e.g., Elephant subClassOf livesIn(Africa or Asia). If we remove the term Asia, this would not just be a worse class definition, but a wrong one.
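
A minimal sketch of the trivial type-extraction baseline suggested in the list above (illustrative only; the use of spaCy and the en_core_web_sm model is my own assumption, not anything from the paper):

    import spacy

    nlp = spacy.load("en_core_web_sm")  # assumes the model is installed

    def candidate_types(definition: str, subject: str):
        """Every noun chunk co-occurring with the subject becomes a candidate type."""
        doc = nlp(definition)
        return [chunk.text for chunk in doc.noun_chunks
                if subject.lower() not in chunk.text.lower()]

    print(candidate_types("Jules Verne was a French novelist, poet and playwright.",
                          "Jules Verne"))
    # e.g. ['a French novelist', 'poet', 'playwright'], depending on the parser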
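
For reference, the standard combination of precision P and recall R is the harmonic mean:

    F_1 = \frac{2 \cdot P \cdot R}{P + R}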

Minor remarks:
* the paper mixes set theory notation (rounded symbols) and DL notation (rectangular symbols). Please be consistent.
* for the example in section 1: "the term country is defined" -> this should be "island"

In general, the paper, in its current state, is too weak for publication in SWJ.

[1] Biemann, Chris. "Ontology learning from text: A survey of methods." LDV forum. Vol. 20. No. 2. 2005.
[2] Barforush, Ahmad Abdollahzadeh, and Ali Rahnama. "Ontology learning: revisited." Journal of Web Engineering 11.4 (2012): 269-289.
[3] Navigli, Roberto, and Paola Velardi. "Learning word-class lattices for definition and hypernym extraction." Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2010.
[4] Tamagawa, Susumu, et al. "Learning a large scale of ontology from Japanese Wikipedia." Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on. Vol. 1. IEEE, 2010.
[5] Weichselbraun, Albert, Gerhard Wohlgenannt, and Arno Scharl. "Refining non-taxonomic relation labels with external structured data to support ontology learning." Data & Knowledge Engineering 69.8 (2010): 763-778.

Review #2
By Basil Ell submitted on 07/Oct/2017
Suggestion:
Major Revision
Review Comment:

The authors provide an approach that derives the types of entities mentioned in text and extracts OWL axioms from Wikipedia pages. The axioms make use of DBpedia classes. Definitions are dependency parsed, and the dependency parses are then represented as RDF graphs which are manipulated in several steps. A set of manually developed patterns is presented that can be matched against this RDF graph representation to generate the axioms.
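
To illustrate how such a pipeline can work in principle (a hypothetical sketch with an invented parse vocabulary and namespace, not the authors' actual patterns or data), a SPARQL CONSTRUCT pattern over an RDF-encoded dependency parse could rewrite a copular "X is a Y" structure into a subclass axiom:

    from rdflib import Graph, Literal, Namespace

    DEP = Namespace("http://example.org/dep#")  # hypothetical parse vocabulary

    g = Graph()
    # Toy parse of "An island is a landmass surrounded by water."
    g.add((DEP.t_landmass, DEP.lemma, Literal("landmass")))
    g.add((DEP.t_island, DEP.lemma, Literal("island")))
    g.add((DEP.t_landmass, DEP.nsubj, DEP.t_island))  # nominal subject of the copular clause
    g.add((DEP.t_landmass, DEP.cop, DEP.t_is))        # copula "is"

    PATTERN = """
    PREFIX dep:  <http://example.org/dep#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    CONSTRUCT { ?sub rdfs:subClassOf ?super . }
    WHERE {
      ?head dep:cop ?cop ;
            dep:nsubj ?subj ;
            dep:lemma ?superLemma .
      ?subj dep:lemma ?subLemma .
      # Class URIs built naively from lemmas; normalization/capitalization omitted.
      BIND(IRI(CONCAT("http://example.org/axiopedia#", ?subLemma))   AS ?sub)
      BIND(IRI(CONCAT("http://example.org/axiopedia#", ?superLemma)) AS ?super)
    }
    """

    for s, p, o in g.query(PATTERN):
        print(s, p, o)
    # prints one triple: ...#island rdfs:subClassOf ...#landmass (as full IRIs)

The actual patterns in the paper are presumably far richer (restrictions, conjunctions, etc.); the sketch only shows the rewrite mechanism the review refers to.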

As I detail in the comments below, several aspects of the approach are unclear. Moreover, it would be important to have a concrete use case where the axioms improve the capability of a tool to reach a certain goal. Given the comparison to LExO, which tackles the same problem but creates different axioms, it is not clear which axioms are better, since "better" should mean more suitable and helpful in a certain context. What exactly this context is remains unaddressed in the paper and needs to be discussed.

I was not able to check the material provided online since the page http://www.westlab.polymtl.ca/AXIOpediaWebApp cannot be found (Error 404 - October 7th, 2017).

The problem of ontology extraction from text is a difficult and relevant problem. Therefore, an approach in this direction is a valuable contribution. However, the authors do not say how difficult it was to manually create the patterns, and the approach is not very original. An approach that learns the patterns automatically would, in my opinion, be more original. The quality of writing is very good, but some details need more explanation. Also, the discussions could be extended. I believe that an ontology of definitions of DBpedia classes would be a nice contribution to the research community as well as to practitioners.

However, I consider this work in its current form not mature enough and thus propose major revision.

MAJOR COMMENTS:

1. It is not explained what you mean by a Wikipedia definition. I guess it is a part of a Wikipedia article, but how do you extract it? Or is it the abstract, i.e., the first paragraph of an article? Probably, since you manually developed the patterns, a text is considered to contain a definition whenever a pattern matches it? On p7 ("we process the sentence") it sounds like a definition is a single sentence?

2. Section 3 "Overview and Motivation of this Work" has a strong focus on typing entities. The need for axioms defining classes could be motivated more strongly.

3. The page located at http://www.westlab.polymtl.ca/AXIOpediaWebApp cannot be found (404 - October 7th, 2017). Thus I could not have a look at how the axioms look like and what the webservice returns.

4. What kind of axioms do you create? In the abstract you mention rdf:type and rdfs:subClassOf relations. Which other OWL terms may appear in the axioms that you derive? Do you create restriction classes? It would be helpful if you could clearly state which OWL terms you are using in your patterns so far. Or are there conceptual limitations such that some terms that could make sense for class definitions are not used?

5. There are some issues regarding properties that are not mentioned in the paper.
a. Properties are created but they are not matched to DBpedia properties.
b. It needs to be discussed that many identical properties can be created; this then has to be dealt with by anyone who wants to use the axioms created by your approach.
Furthermore, p3. "it also creates URIs for classes and resources and links them with taxonomical relations, properties and restrictions". Does that mean no URIs are created for properties? What does linking to restrictions mean? Linking to restriction classes?

6. In the related work section, beyond ontology extraction from Wikipedia you should also take into account ontology extraction from text and ontology learning from text in general.

7. What is specific in your approach about Wikipedia? Although it would be out of scope to analyze how these patterns help to derive axioms from other text sources, it would be nice if you would mention that as part of conclusions / future work, or comment on that elsewhere in the text.

8. Which namespace do you use for classes that you create? This should not be any of the DBpedia namespaces, since you cannot then ensure that the URIs are resolvable.

9. p2. "none of the existing works try to extend DBpedia with a more expressive ontology" -> What about the links that were created to YAGO/Wikidata/Opencyc etc.? Doesn't linking to these ontologies help to have a more expressive ontology?

10. p3. At the end of the page you state that you are enriching the layer of taxonomical relations, the layer of axioms, and the layer of synonyms. Don't you also enrich the layer of relations?

11. p4. "FrenchPoet rdfs:subClassOf Poet". Why not also FrenchPoet rdfs:subClassOf French?

12. p9. Pattern 1: one could also create the class NotAvailableInMarket in addition to Available and NotAvailable. It would be interesting to discuss in the paper why this is not done. Maybe the pattern would become too specific? This would be a valid reason.

13. What exactly is the role of disambiguation pages?

14. p11, Section 4.4. Why didn't you use an existing tool for Named Entity Recognition and Classification? Also, you could make use of the links between Wikipedia pages.

15. When turning tokens in the RDF graph representing the dependency parse into classes, wouldn't it make sense to perform lemmatization beforehand? Otherwise, deduplication becomes necessary later. Another related problem: tokens are turned into classes via the construct queries, yet a term that represents a class can still have dependency relations with other tokens, which does not make sense for the class definition. But maybe this intermediary graph is not stored and only a certain subgraph of it is. How is this done?
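
As a small illustration of the lemmatization suggestion (hypothetical; spaCy is used here for brevity rather than the authors' CoreNLP pipeline):

    import spacy

    nlp = spacy.load("en_core_web_sm")  # assumes the model is installed

    def class_local_name(term: str) -> str:
        """Lemmatize and capitalize each token before minting a class name."""
        return "".join(tok.lemma_.capitalize() for tok in nlp(term))

    print(class_local_name("French poets"))  # 'FrenchPoet' -- avoids a duplicate 'FrenchPoets' class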

16. p14. It could be made clearer what you mean by "filtering". I understand it as deciding what to add to the gold standard and what not. Also, why is the header of Table 3 "Precision/Recall" and not "Precision"? Below the table you refer to Table 4, but should refer instead to Table 3. You could also show absolute values.

17. It is not clear to me how recall is measured. Do evaluators manually create the complete axioms?

18. p15. For the long sentence ("Moorland and moore...") it would be interesting to see the complete axiom as well as the axiom created via your approach.

19. It is good that there is a system, LExO, which you can compare the axioms built by your system to. Although limitations of both systems are mentioned, the section leaves me with some questions. Is it better to have an axiom that defines an equivalence (Data \equiv ...) rather than an axiom that defines a subset (Data \subseteq ...)? In which situation would one or the other form be preferable? Is "2by" (p16, axiom 6) a good way of expressing the cardinality? Is it better to have the class YoungFish as a subclass of Fish instead of the intersection between the classes Young and Fish? Wouldn't it make sense not to create new classes (YoungFish) and instead reuse existing classes (Young and Fish), or to prefer creating classes that can be reused more often (such as Young and Fish)? Why exactly is the axiom built by your approach for sentence 9 more favourable? Wouldn't it be necessary to have some concrete use case where axioms would help, as well as some quality criteria and metrics related to the use case? One could then show that the performance of a system applied to this use case benefits from these axioms according to the metrics.
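
For concreteness, the two modelling options questioned above can be written side by side (illustration only):

    YoungFish \sqsubseteq Fish                 % new named class, subsumption only
    YoungFish \equiv Young \sqcap Fish         % definition reusing the classes Young and Fish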

20. p18. Below Table 6 you mention Table 7, but that might rather be Table 6. In the discussion of the table you note that certain patterns were excluded. Why is that the case and what difference would it make?

21. p18, Section 5.3.3: Did the evaluators only consider those missing types that exist in DBpedia? Also, there could be cases of missing types such as: the classes American and Actor were found, but the class AmericanActor is missing. Would it make sense to regard the class AmericanActor as missing?

22. Why do the kappa values differ so strongly for the three datasets? Maybe the evaluation methodology / the task is not specified clearly enough and thus leads to different evaluation behaviour for each dataset? Below Table 12, p20, you explain where the disagreement comes from. It really seems like it is necessary to create clear evaluation guidelines and to repeat the evaluation.

23. The conclusion could contain some more discussion about the axioms.

MINOR COMMENTS:

1. p1. It should be pieceOf instead of pieaceOf

2. p2. "building a rich ontology [...] requires a thorough stody in those four fields". I do not understand why all of these fields are relevant. Maybe you could explain that point further.

3. p2 "this work is an extension of the second task of the 2016 OKE challenge". Rather, you do not extend the task but provide an approach ...

4. p2. You mention a set of works [8, 9, 10, 11] but only discuss two of them. It would be great if you could discuss all works mentioned.

5. p4. it should be dbr:Jules_Verne dbo:occupation dbo:Poet instead of dbo:Poet dbo:occupation dbr:Jules_Verne.

6. p4. Rather rephrase "which are similar to [28]" to "which are similar to those presented in [28]"

7. p6, Figure 2: Instead of "rdf:type", the outcome of the Instances box could be named "Type Statements". The classes are at that point only natural language terms, right? Also, in the figure you mention Wikipedia pages and definitions. Are they the same? This relates to major comment #1 above.

8. p13, Section 4.6 "Axiom and Type Disambiguation". This title is a bit misleading, since the section is not about axiom disambiguation.

9. p13. The end of Section 4.5 is not easy to read. Maybe this could be better explained in terms of pseudocode or via a flow chart.

10. p19. What is the "mathematic difference"?

11. p20. "We also generated a complete ontology": I recommend to remove the word "complete".

Review #3
Anonymous submitted on 10/Oct/2017
Suggestion:
Reject
Review Comment:

The paper under review presents an approach to enrich DBpedia with OWL-defined classes and taxonomical links using open information extraction from Wikipedia. The authors present two objectives: to extract AXIOpedia, a knowledge base built by learning axioms defining classes and taxonomical links; and to provide a web service for the automatic extraction of OWL axioms from Wikipedia. The paper builds on, and extends, previous works by the authors (e.g., [4], [5]).

** Dataset? **

Despite the title, no description of AXIOpedia is provided in the paper, not even a quantitative description of its content (e.g., the number of axioms). I would have expected some quantitative analysis of the resource produced, with information such as:
- how many axioms, split into TBox and ABox assertions;
- how many axioms of a certain type (e.g., simple subClassOf axioms, universal / existential restrictions);
- number of new concepts with respect to DBpedia;
- number of new axioms (split into TBox and ABox assertions) with respect to DBpedia;
- classes defined in AXIOpedia not having a page in Wikipedia, or pages having multiple concepts / instances associated (see the "natural satellite or moon running" example in the paper);
- how axioms are distributed (per pattern) in the knowledge base;
- ...

It is not even clear how AXIOpedia was built. The authors write that they applied the approach on ~13M Wikipedia pages... How were "Wikipedia definitions" identified within the textual Wikipedia page? Which text was used to create the axioms? The whole Wikipedia page? Just the first sentence(s) or the first paragraph(s)?

Furthermore, none of the links to the resource (footnotes 3 and 7 in the paper) were working at the time of reviewing the paper (tried for a few days, both at the beginning of September 2017 and at the beginning of October 2017).

** Approach **

My main concern with the paper is, however, the scientific novelty of the proposed approach. Several sub-tasks are performed:

4.1. Identification of Instances and Classes in Wikipedia
Here some elementary filtering heuristics are applied.

4.2. Processing the Textual Definitions
Stanford dependency parsing is applied, and its output is translated into an RDF graph; here the authors propose a specific vocabulary for representing the Stanford output, but they could have leveraged NIF (NLP Interchange Format [Ref1]) for this.

4.3. Processing the RDF Graph to Extract Axioms for Classes
The approach here is basically the same as the one applied by LExO: pattern-based rules are applied on the output of a dependency parser to generate the axioms. While the technicalities (LExO uses Minipar for dependency analysis and applies rules to its output, whereas AXIOpedia uses Stanford CoreNLP and defines rules as SPARQL CONSTRUCT queries) and the defined rules/patterns may be slightly different/better/worse, the approach is basically the very same.

4.4. Processing the RDF Graph to Extract rdf:type Triples for Instances
This part was already presented in [5].

4.5 Hearst Patterns
Hearst patterns have long been applied for extracting taxonomical relations from text.
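
For readers unfamiliar with them, a classic Hearst pattern can be sketched as a simple regular expression over plain text (illustrative only, unrelated to the authors' SPARQL-based implementation):

    import re

    # "X(s), such as Y" -> Y is a hyponym (subclass/instance candidate) of X
    SUCH_AS = re.compile(r"(?P<hyper>\w+(?: \w+)?),? such as (?P<hypo>\w+)")

    def such_as_pairs(sentence: str):
        """Yield (hyponym, hypernym) candidates matched by the 'such as' pattern."""
        for m in SUCH_AS.finditer(sentence):
            yield m.group("hypo"), m.group("hyper")

    print(list(such_as_pairs("European countries such as France joined the union.")))
    # [('France', 'European countries')]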

4.6 Axiom and Type Disambiguation
As observed by the authors, a rather elementary (and error-prone) technique is applied.
I'm also wondering about the semantics implied by the alignment to DBpedia. Even looking at the example of the authors, it ends up with the disjunction between dbo:Planet (i.e., the class in the DBpedia ontology) and dbr:Minor_Planet (i.e., an instance in the DBpedia dataset). I think even worse situations could arise by freely mixing ontological classes and instances. And as the (main) purpose of injecting axioms is actually to use them to perform reasoning, I'm wondering what meaningful implications can be derived from an alignment performed according to this strategy.

Summing up, it is difficult to grasp what the scientific novelty and value of the work described in the submission are, and whether it actually advances the state of the art in ontology learning.

** Evaluation **

The authors evaluated some of the above subtasks against a manually annotated gold standard built by 3 pairs of evaluators.

5.1. Evaluation of the process of distinguishing Instances and Classes
Since the evaluated resources were selected as 25 classes and 25 instances according to the system, it would be useful to add in Table 3 the actual number of classes and instances in the resulting gold standard for each dataset.

5.2. Evaluation of the Axiom Extraction
The authors proposed a measure to evaluate precision and recall of extracted axioms. It would be interesting to know how many axioms overall were actually perfectly extracted (i.e., with 100% precision and recall).
Considering the processing of adjectives, on which the authors comment in the paper, the choice of creating taxonomic relations from them is quite tricky and can easily lead to undesired effects (think, for example, of "hot dog" [Ref2], which is definitely not a subclass of dog...). The limitations of such choices should be discussed.

5.3. Type Extraction Evaluation
I wonder why the "statistics based on DBpedia" part was done only on 1000 instances, instead of the whole set of instances from Wikipedia.

** Related Work **

It may be good to compare the proposed approach with other recent related works such as [Ref3] and [Ref4].

Refs:

[Ref1] Sebastian Hellmann, Jens Lehmann, Sören Auer, and Martin Brümmer. "Integrating NLP using Linked Data." 12th International Semantic Web Conference, 21-25 October 2013, Sydney, Australia, 2013.
[Ref2] https://en.wikipedia.org/wiki/Hot_dog
[Ref3] H. Safwat, N. Gruzitis, B. Davis, and R. Enache. "Extracting Semantic Knowledge from Unstructured Text Using Embedded Controlled Language." 2016 IEEE Tenth International Conference on Semantic Computing (ICSC), pp. 87-90, 2016.
[Ref4] Giulio Petrucci, Chiara Ghidini, and Marco Rospocher. "Ontology learning in the deep." Knowledge Engineering and Knowledge Management: 20th International Conference, EKAW 2016, Bologna, Italy, November 19-23, 2016.
