Review Comment:
This paper introduces a formalised approach to developing aesthetically pleasing semantic web interfaces based on sound research. Considering that there has been no thorough study of the usefulness and potential benefits of more aesthetically designed interfaces, I think this paper clearly scores on originality and would be a welcome addition to the literature on Semantic Web Interfaces.
A useful set of guidelines for building Linked Data interfaces was extracted from the literature identified in the tables. This will definitely be very useful. My only comment is that the tables do not read very well (which is a little ironic considering the aim of the research). The tables could be produced much more clearly with good LaTeX formatting. If this paper is to be distributed in standalone form, I would recommend redoing the tables to be more aesthetically pleasing and readable. I do note that typography was not part of this study and that it will be explored in future work, which is comforting.
Regarding the design decisions on color use in pie charts (for graph nodes), I do not think this would scale to a large graph with many nodes (containing many concepts): the more colors there are, the harder it gets (for me anyway) to make sense of what is on the screen (e.g. some colors might become too similar, leading to correlation confusion and overall clutter, which goes against your own design principle 7 of Table 1). However, I think that when focusing (or zooming), these color design decisions are beneficial (as shown in Fig. 5). At a large scale, one idea to explore could be to add the ability to choose (say, by clicking or check-marking nodes) which nodes to "colorize", providing a quick glimpse of regions of interest.
Looking at many of the interesting comments from users in the evaluation, I was left with the impression that users were fully aware that the goal of the experiment was to evaluate the Affective Graph approach. Now, maybe those comments were included "because" the paper is about Affective Graph and they were not necessarily representative of all comments, but this was not clear to me. However, if users were aware, this in itself will result in bias; it has been demonstrated by several researchers that participants have a tendency to forge the results (so to speak) in favour of the people doing the hard research work (response bias). I'm not saying this was the case, just that this is the impression I was left with. A simple line like "users were completely unaware of which system was being compared to others" would help; even better, the authors could introduce several control groups (e.g. some led to believe they were evaluating NLP interfaces, others the graphical approaches). I note that it was a test leader who ran the experiment, but that in itself is not enough to rule out all types of bias.
Another evaluation that would be interesting is re-doing the second evaluation (described in 10.2) after the learnability evaluation (described in 10.3). In other words, it would be nice to know if the results of the usability of Affective Graph in comparison to other approaches would change once users have more experience with the systems. Of course, to be fair, the same amount of experience should be given to other querying approaches as well. Just a thought.
Presentation and quality of writing were excellent. My only (minor) comments are the following:
- In 10.1 a cross-reference link is missing (see Section ??).
- In 10.2.2 "graphical highly approach" would read better as "highly graphical approach".
- Tables 1 and 2 could be clearer. Proper use of LaTeX would do a better job.
- The dash symbols do not seem to be real dashes; they look like minus or hyphen symbols.
If this paper had been available when we wrote our systematic mapping study (Hachey and Gasevic, 2012), it would have scored very high on the quality assessment criteria evaluation. The quality assessment criteria are based on the Gold Standard for Evidence-Based Medicine and adapted to UI evaluation in software engineering. The following points could be strengthened further:
- Provide alternative research (experiment) designs and justify why the chosen method is better suited to address the aim.
- Explain why the chosen sample of participants was the most appropriate for extracting the information sought.
- Is the sample size large enough? If so, justify it clearly.
- There is no explicit control group (if there is one, I could not find it).
- Maybe take more contradictory data into account (if any exists).
- Include statistical quality control data (e.g. Kappa, ICC, Cronbach's alpha, etc.).
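To illustrate the kind of reliability figure meant in the last point, here is a minimal sketch of Cronbach's alpha in Python. It assumes the evaluation collected multi-item questionnaire scores, one row per participant and one column per item; that layout and the function name `cronbach_alpha` are my own assumptions, not something taken from the paper.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of questionnaire scores.

    scores: list of rows, one row per participant, one numeric score per item.
    (Hypothetical input layout, assumed for illustration only.)
    """
    if len(scores) < 2:
        raise ValueError("need at least two participants")
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every participant must answer every item")

    # Sample variance of each item (column) across participants.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
```

Reporting such a value alongside the questionnaire results would let readers judge whether the items measure a single construct consistently.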
In summary, this is without a doubt an original paper with a high-quality presentation. The significance of the results is, in my opinion, extremely important to the semantic web community and beyond. There are some areas that could be strengthened, but I would recommend including this paper in the Semantic Web Interfaces Special Issue, preferably with minor changes, but even without changes the paper certainly helps raise the standard.
Hachey, G. and Gasevic, D. Semantic Web User Interfaces: A Systematic Mapping Study. Semantic Web Journal, July 2012. http://www.semantic-web-journal.net/sites/default/files/swj316.pdf