Review Comment:
This paper represents an extension of a paper published at the Extended Semantic Web Conference (ESWC) 2013. I also performed a review of a (preliminary) version of the framework in 2012.
The main technical contributions and the basis of the framework seem to have already been published in previous papers by the authors. The paper does, however, contain an extended evaluation with respect to the ESWC 2013 paper.
The topic of the proposed paper is very interesting as there is an increasing demand for tool support to involve the domain expert within the matching process as well as for a more iterative matching process. However, I have the following concerns about the current status of the paper (I believe some of the review comments I wrote back in 2012 are still valid):
- The paper should include an extended preliminaries or background section to introduce the techniques and definitions used throughout the paper (e.g. segments of an ontology).
- The workflow of the framework (Figure 2) is a bit confusing, and in my opinion it could be presented better in the paper. Providing concrete scenarios as examples might help, e.g. cases where the three types of sessions complement each other, or where only two of them take place.
- The framework, as the paper title states, is intended to deal with large ontologies. However, the experiments only involved the medium-sized ontologies from the OAEI's anatomy track. Larger ontologies like SNOMED or FMA from the OAEI's LargeBio track may have an important impact on the computation phase (e.g. billions of candidates/suggestions) and on the validation phase (a very large number of questions for the user to validate). A session-based approach will definitely help in matching large ontologies, but the framework should also aim to ask the user as few questions as possible.
- The proposed approach tries to reduce the search space by partitioning the ontology. However, I have the feeling that the search space may be reduced too much by the method currently used, especially when the ontologies are structurally poor or have disparate classifications.
- In Section 4.3 the authors state: "After validation a reasoner is used to detect conflicts in the decisions". Which kinds of conflicts are detected? Is a complete reasoner used? Complete reasoning may be time-consuming or infeasible for rich and large ontologies. Fortunately, there are (approximate) mapping repair techniques that can do the work (e.g. Alcomo [1], AML [2] or LogMap [3]).
- The evaluation is very comprehensive with respect to the recommendation and computation sessions, perhaps too comprehensive: it is not easy to follow and interpret all the results provided in the tables. I suggest focusing (and commenting) on the most important results and moving the rest of the tables/results to an appendix.
- On the other hand, the paper lacks an evaluation of how many questions or how much feedback the user is required to provide, and of their impact (how many mappings are affected/validated). The paper also lacks information about how questions are presented to the user. Are questions ordered according to a given heuristic or just according to availability? I think the use of sessions is a very interesting way to involve the user; however, the paper does not state how important the potential user involvement actually is.
- It would also be very interesting to test whether a recommendation based on one matching task (e.g. the OAEI anatomy track) could be applied to another matching task (e.g. the OAEI's LargeBio track).
- Framework vs. system. The presented approach seems to fall in between a general framework and a fully fledged system. On the one hand, if the authors aim to provide a general framework, it would be very interesting if state-of-the-art matching algorithms/systems (e.g. OAEI ones) could be plugged in. On the other hand, this session-based SAMBO could also be seen as a ready-to-use system that could participate in the OAEI campaign, especially in the Interactive track, where there is a validation routine or Oracle.
- A link to the implemented system is missing. I could only find the following link, which is not available: http://www.ida.liu.se/~iislab/projects/SAMBO/online.html
[1] http://web.informatik.uni-mannheim.de/alcomo/
[2] http://somer.fc.ul.pt/aml.php
[3] https://code.google.com/p/logmap-matcher/