Review Comment:
Overall evaluation
Select your choice from the options below and write its number below.
== 3 strong accept
== 2 accept
== 1 weak accept
== 0 borderline paper
== -1 weak reject
== -2 reject
== -3 strong reject
1
Reviewer's confidence
Select your choice from the options below and write its number below.
== 5 (expert)
== 4 (high)
== 3 (medium)
== 2 (low)
== 1 (none)
4
Interest to the Knowledge Engineering and Knowledge Management Community
Select your choice from the options below and write its number below.
== 5 excellent
== 4 good
== 3 fair
== 2 poor
== 1 very poor
5
Novelty
Select your choice from the options below and write its number below.
== 5 excellent
== 4 good
== 3 fair
== 2 poor
== 1 very poor
4
Technical quality
Select your choice from the options below and write its number below.
== 5 excellent
== 4 good
== 3 fair
== 2 poor
== 1 very poor
4
Evaluation
Select your choice from the options below and write its number below.
== 5 excellent
== 4 good
== 3 fair
== 2 poor
== 1 not present
3
Clarity and presentation
Select your choice from the options below and write its number below.
== 5 excellent
== 4 good
== 3 fair
== 2 poor
== 1 very poor
3
Review
Please provide your textual review here.
The paper proposes a novel feedback model for ontology matching in which a community of users provides feedback on the mappings incrementally: the users are not requested to validate all the mappings generated by an alignment system, only some of them, and the process can be stopped at any point in time.
The multi-user validation method extends the AgreementMaker alignment system, and the paper presents a comprehensive evaluation on the OAEI benchmark datasets showing how the system's F-measure and error tolerance evolve as the iterations of the algorithm progress.
The approach presented in the paper is sound and well motivated. The scores concerning the clarity of presentation and the evaluation are motivated by some issues that could easily be addressed in an extended journal submission.
The paper is clearly written, and yet, given the very nature of the proposed method, its presentation could do more to help the reader keep track of the various measures (AMA, CSQ, SSE, CON, PI, DIA, REV) and their characteristics. Some formulations are also somewhat counterintuitive: for example, CSQ is defined as 1 minus the ratio between the sum of all similarity scores \sigma_{i,j} in the same row and column and the maximum sum of scores per dimension in the matrix, yet it is CSQ^-, i.e. 1 - CSQ, that is then used; why not define that quantity directly? A sketch of my reading follows.
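For concreteness, here is the definition as I read it, in LaTeX; the symbol S_{\max} for the maximum sum of scores per dimension is my notation, not the paper's, and the exact range of the sums is my assumption:

    \mathrm{CSQ}_{i,j} \;=\; 1 - \frac{\sum_{k} \sigma_{i,k} + \sum_{k} \sigma_{k,j}}{S_{\max}},
    \qquad
    \mathrm{CSQ}^{-}_{i,j} \;=\; 1 - \mathrm{CSQ}_{i,j} \;=\; \frac{\sum_{k} \sigma_{i,k} + \sum_{k} \sigma_{k,j}}{S_{\max}}.

If CSQ^- is the quantity the algorithm actually consumes, defining it through its complement adds an indirection the reader has to undo.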
The notion of propagation is also somewhat unintuitive. The similarity of validated mappings (i.e. 0 or 1) is propagated to the mappings that are most similar to the mapping just validated, where similarity between mappings is measured over their signature vectors. Does this mean that the validation is propagated to those mappings for which the different matching algorithms provide similar scores? Or is it just capturing the fact that some mappings are assigned similar levels of confidence by the matchers? A sketch of my understanding follows.
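A minimal sketch of the propagation step as I understand it; the function name, the Euclidean distance, and the neighbourhood size k are my assumptions, not the paper's:

    import numpy as np

    def propagate(label, validated_sig, signatures, similarities, gain=0.5, k=5):
        # Propagate a user validation (label in {0, 1}) to the k mappings
        # whose signature vectors are closest to the validated mapping's.
        # The distance metric and k are assumptions for illustration only.
        dists = np.linalg.norm(signatures - validated_sig, axis=1)
        nearest = np.argsort(dists)[:k]
        for m in nearest:
            # move each neighbour's similarity towards the validated label,
            # scaled by the propagation gain
            similarities[m] += gain * (label - similarities[m])
        return similarities

Under this reading, the "most similar" mappings are exactly those on which the individual matchers behave alike, which blurs the two interpretations raised above.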
The function g used to calculate the propagation gain should be defined more precisely.
The experiments evaluate the approach by showing its behaviour in terms of F-measure and robustness (error tolerance) as the number of iterations grows. The evaluation randomly simulates the labels assigned by the users, but it is not clear whether the experiments are repeated or run only once; repetition would eliminate any bias intrinsic to the randomisation of the feedback simulation. It is also not clear whether the experiments take into account the distribution of the labels (majority of 1s vs. 0s and vice versa), and whether this is a factor that could affect performance. A simple repeated-simulation harness, sketched below, would address both points.
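A minimal sketch of such a harness; run_matcher_with_feedback and all parameter values are hypothetical stand-ins so the sketch is self-contained:

    import random
    import statistics

    def run_matcher_with_feedback(labels):
        # Hypothetical stand-in for the system under test; returns a
        # placeholder score so the harness is runnable on its own.
        return sum(labels) / len(labels)

    def simulate(seed, error_rate=0.1, positive_share=0.5, n=100):
        # One run: draw labels with a chosen share of positives, then flip
        # each with probability error_rate to model user mistakes.
        rng = random.Random(seed)
        labels = [int(rng.random() < positive_share) for _ in range(n)]
        noisy = [l ^ int(rng.random() < error_rate) for l in labels]
        return run_matcher_with_feedback(noisy)

    # Repeating over seeds removes the bias of a single random draw, and
    # varying positive_share probes sensitivity to the label distribution.
    scores = [simulate(seed=s) for s in range(30)]
    print(statistics.mean(scores), statistics.stdev(scores))

Reporting mean and standard deviation over such repetitions would make the F-measure and error-tolerance curves considerably more convincing.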
The parameters of the 12 chosen configurations should also be motivated in more detail.
Regarding conclusion 2, it seems to presuppose that one can assess the error rate at every iteration in order to decide whether revalidation is needed; is this actually practical?
A final issue is whether the method assumes that the users are always willing to provide feedback, or whether there could be a point at which a user loses concentration and stops providing feedback. Is this kind of trend plausible for the user community, and would such a loss of concentration affect the system?