Review Comment:
This manuscript was submitted as 'Survey Article' and should be reviewed along the following dimensions: (1) Suitability as introductory text, targeted at researchers, PhD students, or practitioners, to get started on the covered topic. (2) How comprehensive and how balanced is the presentation and coverage. (3) Readability and clarity of the presentation. (4) Importance of the covered material to the broader Semantic Web community. Please also assess the data file provided by the authors under “Long-term stable URL for resources”. In particular, assess (A) whether the data file is well organized and in particular contains a README file which makes it easy for you to assess the data, (B) whether the provided resources appear to be complete for replication of experiments, and if not, why, (C) whether the chosen repository, if it is not GitHub, Figshare or Zenodo, is appropriate for long-term repository discoverability, and (D) whether the provided data artifacts are complete. Please refer to the reviewer instructions and the FAQ for further information.
* Making Linked-Data Accessible: A Review
In this paper the authors present a review, or survey, of user interfaces for accessing Linked Data, or RDF databases, on the Web. The authors present a list of criteria for comparing these interfaces and an exhaustive list of existing systems, describing and classifying them according to those criteria. At the end of the paper the authors provide a summary of the methods used to evaluate the interfaces and some challenges that arise from their analysis.
*** Strengths
- S1 Exhaustive list of systems (although some well-known ones are missing, such as the QA system Aqua or SemFacet)
- S2 Clear criteria for comparing the systems
*** Weaknesses
- W1 Some tool analyses seem shallow; it appears the authors did not examine the actual interfaces but only the papers describing them.
- W2 The paper does not engage with user interface theory, such as the relation between mental models and user interfaces, or how users actually interact with an interface. That would have enabled a really interesting set of conclusions.
Detailed review:
Regarding the shallowness of the systems analysis, several questions about specific systems remain unanswered, which makes me think the analysis is indeed shallow:
- Konduit VKB: why is the system considered to focus on expert and technical users? Why is it potentially too cryptic for lay users? Did the authors ask the users? I see no evidence for these claims, especially after looking at the Konduit paper.
- BioGateway is a plugin for Cytoscape, a system for biological data visualization. This is missing from the paper.
- VizQuery: this system seems to be a student project; it just asks for a vocabulary property, and the interface does not check anything else. I do not understand why this interface is in the list. It could be that the system is important for historical reasons (i.e., it was the first to do X), but I am not sure about that.
- SPARQLFilterFlow: I miss what data it is possible to visualize, i.e., what data can you load into SPARQLFilterFlow? Regarding its evaluation: what tasks did the users work on? Were they hard? It seems users looked at the interface and quickly used the first result it offered (Food).
- Are tabular query interfaces faceted browsing systems?
- Where is the linked data in these systems? Most of them query only a single SPARQL endpoint or dataset, and thus access a single dataset without linking its data to others (see the sketch after this list for the kind of cross-dataset query I have in mind).
- ExConQuer seems to me to be a faceted browsing system; shouldn't faceted browsing be a category of its own, given how widely it is used in commercial systems [1,2]? Also, there is neither code nor a demo.
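To make the point about linked data concrete: none of the reviewed tools appears to support a query like the one sketched below, which joins two datasets by following owl:sameAs links across endpoints. This is only a minimal illustration under my own assumptions (the choice of endpoints, the SPARQLWrapper client, and the sameAs-based join are mine, not taken from the paper):

#+BEGIN_SRC python
# Minimal sketch: a federated SPARQL query that starts at DBpedia and follows
# owl:sameAs links into Wikidata via a SERVICE clause. This is the kind of
# cross-dataset access that "Linked Data" implies; querying a single endpoint
# never exercises it. Endpoints and the join are illustrative assumptions.
from SPARQLWrapper import SPARQLWrapper, JSON

FEDERATED_QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?city ?wikidataCity ?population WHERE {
  ?city a dbo:City ;
        owl:sameAs ?wikidataCity .
  FILTER(STRSTARTS(STR(?wikidataCity), "http://www.wikidata.org/entity/"))
  # The SERVICE clause crosses the dataset boundary -- the "linked" part.
  SERVICE <https://query.wikidata.org/sparql> {
    ?wikidataCity wdt:P1082 ?population .   # P1082 = population
  }
}
LIMIT 10
"""

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery(FEDERATED_QUERY)
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["city"]["value"], row["population"]["value"])
#+END_SRC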
Regarding graph-based systems, I also have comments about some of them:
- GQL is an old system with neither code nor a demo. What is the LD ontology the authors are referring to? Also, the authors write "The authors thus claim that the tool can be used to create very complex queries": can you verify that? Here you are only restating what is in the paper, whereas in a survey I would expect some verification; without it, I still have to go to the original paper and read it through.
- SPARQLinG: in my view the contribution of this work is that it defines a complete visual query language, rather than a user interface (which is what their demo is). I think this is one of the few exceptions that actually provides a language; it should be in the paper for that reason, not for providing a working system, since I think none is available. Also, was this paper cited by later works that also provide a language?
- QueryVOWL: the tool clearly states that "The web demo provides a prototypical implementation of QueryVOWL. It has mainly been developed to demonstrate the QueryVOWL approach and should not be considered a mature tool (e.g., it contains some known bugs).". The authors should highlight that.
- RDFExplorer: the process the authors describe for drawing queries is not accurate. The original paper says "the user must then start by adding a new node, be it a variable node (η(G)) or a constant node (η(G, x))", and it is possible to query DBpedia and other datasets too (https://dbpedia.rdfexplorer.org/). Also, does the difficulty of using the interface come from the data or from the interface itself? The authors claim at the beginning of the paper that it is due to the data, yet here they claim that "any difficulty using the tool initially tends to decrease over time". In the end, the most important question to me is whether the difficulty comes from the data, and why. Note also that the stated idea of RDFExplorer is "to simultaneously navigate and query knowledge graphs". Finally, this system raises a broader question: do the other systems scale with the data?
NLP-based systems
Regarding these systems, why do the authors not classify them by whether they use modern NLP architectures (i.e., neural networks) versus the rest? The accuracy of these techniques is much higher, and if no surveyed system uses them, that signals a possible improvement over the state of the art, which is what I personally look for in this type of paper. Also, I would guess that the more accurate a tool is, the more satisfied its users will be.
Alternative User Interfaces
In this section I do not understand why YASGUI is compared with WQS, since WQS also appears in another section. YASGUI is the most popular user interface for SPARQL endpoints, since it ships with Apache Fuseki and is part of WQS. In my view, either YASGUI should be in another section, or WQS should be in this one.
Linked Data browsers
Again, why do the authors not separate the systems that use neural networks, which offer better identification of sentence components, from the rest? The same goes for Question Answering systems, which are also part of virtual assistants.
Why are Web APIs present in this paper? They are not user interfaces. I think this section belongs in another paper.
User study
From my point of view this is the most important category in the paper, since it makes it possible to understand why an interface is usable, i.e., why users actually use it. Rather than listing whether a specific technique was used to evaluate each system, it would be great to understand whether the evaluation helps in that regard. For instance, NASA-TLX helps in understanding the user's workload when using an interface (a sketch of how its score is computed follows below). Which interfaces measured that, and how do they compare with the ones that did not? Furthermore, only a few interfaces were actually evaluated, so the comparison can only be brief. The same applies to the other evaluation techniques.
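For readers unfamiliar with NASA-TLX, the workload score is simple to compute, which is what makes it easy to compare across studies. A minimal sketch (the subscale ratings below are hypothetical, purely for illustration):

#+BEGIN_SRC python
# NASA-TLX workload: six subscales, each rated 0-100 by the participant.
SUBSCALES = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

def raw_tlx(ratings):
    """Raw TLX (RTLX): unweighted mean of the six subscale ratings."""
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)

def weighted_tlx(ratings, weights):
    """Classic TLX: weights come from 15 pairwise comparisons and sum to 15."""
    assert sum(weights.values()) == 15
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15

# Hypothetical ratings for one participant after a query-building task.
ratings = {"mental": 70, "physical": 10, "temporal": 40,
           "performance": 35, "effort": 60, "frustration": 55}
print(raw_tlx(ratings))  # -> 45.0
#+END_SRC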
Findings and discussion
This section is more of an aggregate summary of what the interfaces do. It is nice to have, but I do not see many findings or much discussion. Also, the term "user-friendly" should be replaced by the term "usable" [3].
Summary
In general this is a very interesting work: it compiles in a single place many of the user interfaces proposed so far for accessing RDF data (not Linked Data, since none of the reviewed interfaces accesses more than a single dataset; this should be corrected in the paper).
However, I think the system descriptions are too shallow (as described above). What I would like to learn from a survey are the problems the surveyed interfaces have, and why. Statements like "the authors of paper X say..." do not belong in a survey paper; there should be a bit more analysis.
Regarding the systems in general, I miss knowing which datasets they can access, whether I can load DBpedia or Wikidata into them (or point them at these datasets), and whether there are limitations on the amount of data they can visualize. I would also like to know whether the systems involved end users during the interface design phase, i.e., whether they actually solve someone's problem, and whether users provided feedback on the interface.
Also, the authors assume that the difficulty of accessing and querying RDF data is due to SPARQL and the intricate structure of the data. Do any interfaces address these weaknesses, specifically how to overcome the intricate data structure? Did any of the studies described ask users about the difficulty of accessing and understanding the data?
And last but not least, how many of these systems are available online? Do they provide any code? Under what license? User interfaces are meant to be used, so: can I use them?
Another useful perspective would be to present the systems in their "historical context", showing how the interfaces have evolved over the last 15 years.
[1] Marti Hearst, Ame Elliott, Jennifer English, Rashmi Sinha, Kirsten Swearingen, and Ka-Ping Yee. 2002. Finding the flow in web site search. Communications of the ACM 45(9), 42–49. https://doi.org/10.1145/567498.567525
[2] Marti Hearst. 2006. Design recommendations for hierarchical faceted search interfaces. In Proceedings of the ACM SIGIR Workshop on Faceted Search, 1–5.
[3] Jakob Nielsen. 1996. Usability metrics: Tracking interface improvements. IEEE Software 13(6), 1–2.