Modeling visualization tools and applications on the Web

Tracking #: 946-2157

Authors: 
Ghislain Atemezing
Raphael Troncy

Responsible editor: 
Guest editors linked data visualization

Submission type: 
Ontology Description
Abstract: 
Publishing data on the Web is gaining more and more adoption with the release of datasets by governments as Open Data. Governments and local authorities are giving access to their data free of charge, although sometimes in heterogeneous formats. Linked Data principles enable the use of a common model to make things interlinked on the Web. While this is true for publishers, consumers are involved in this loop of publication by building visualizations (re)using existing tools from the information visualization community (InfoVis). However, there is a need to describe applications built with those tools in order to easily discover applications on the Web. Besides, Open Data events usually contain application contests helping to showcase the usefulness of data, without structured data available in the web pages. The contribution of this paper is three-fold. First, a review of tools for creating applications is provided, with a classification based on domains. Second, the DVIA vocabulary is proposed to annotate applications with a common semantics. Third, the design and implementation of a universal plugin to generate RDF data from application contests is described in detail.
Tags: 
Reviewed

Decision/Status: 
Reject

Solicited Reviews:
Review #1
By Jan Polowinski submitted on 16/Jan/2015
Suggestion:
Major Revision
Review Comment:

Formal aspects

Regarding the type of paper, the paper seems to be a bit more than an „Ontology description“. It has a survey part (4 pages), incl. a review of existing surveys, an ontology and modeling description part (6 pages) and a tool (more than a prototype) description part (4 pages). The overall length is 17 pages, which seems to me to go beyond a „short paper“ as required for ontology descriptions. Since there is no more suitable paper category, though, the authors should possibly try to shorten the paper a bit. Splitting off one of the parts completely is also not desirable in my opinion.

Some options to shorten the paper could be to cut details, e.g. in:
… 2.1.1 - details on Google App Engine usage and alternatives
… 2.1.4 - abstract from some details like „using output=gvds“ ?

Summary and Structure

The authors describe web-based RDF modeling approaches (DVIA vocabulary + improvements to „Apps for Europe“ vocabularies) and tooling for web-based visualization tools for (linked, RDF) web data. Although good reasons are given (ease reuse to stop reinventing the wheel, and share and save resources and effort), such projects tend to be hard for outsiders to understand due to their meta-character. I therefore consider a very clear structure and style all the more important to convey the usefulness and relevance of the modeling approach. I do not doubt its usefulness; on the contrary, I'd second that many similar visualization tools are being invented. However, while the structure seems to follow a certain logic, this logic is not directly reflected in the abstract or in the conclusion, which confused me. I was missing an overview of how the various parts (DVIA, Apps for Europe, the JavaScript plugin) are related. This was not clear on first reading. A different structure of sections and subsections could better support the understanding of the content. In the abstract, the authors clearly announce the following three parts: (1) review of tools, (2) DVIA vocabulary, (3) plug-in design. This is repeated similarly in the conclusion. However, the main section structure is not three-fold. I’ll try to summarize what I think the three parts comprise and what problems I see:

1. Introduction
#####################################
### FIRST PART - review of tools ###
#####################################
2. Survey on Visualization Tools
2.1 Tools for vis. structured data (general structured data ; non-RDF)
2.2 Tools for visualizing RDF data (used synonymously with „Linked Data“ and „graph data“ (Conclusion) ; problematic: are RDF graphs not structured data? Do you rather mean „tabular data“?)
2.3 Related work (should maybe not be called „related work“, since this suggests that alternative approaches for modeling vis. tools are discussed here)
### END OF FIRST PART ###
3. Describing Applications on the Web (This should be renamed so as not to be confused with the title of Section 4)
3.1 Motivation (for what? -> be more explicit?)
3.2 Catalogs of Applications (this is a bit like related work ; you conclude at both the end of Section 3.2.1 and 3.2.2 -> combine conclusion and move to the end?)
3.3
4. Describing and Modeling Applications
4.1 Typology of Applications
4.2 On Reusable Applications
#####################################
### SECOND PART - DVIA vocabulary ###
#####################################
4.3 DVIA …
#####################################
## END OF SECOND PART ###############
#####################################
5. Improving the Discovery of Applications in Open Data Events
5.1 Background (here the first time „Apps for Europe“ is mentioned, but not really introduced. In the following it is hard to understand how this is related to your work.)
5.2 Modeling Approaches (The modeling seems not to be limited to DVIA, but also includes an improvement of the Apps for Europe modeling - why is that not stated in the abstract/summary?)
#####################################
### THIRD PART - the JS plugin #####
#####################################
5.3 Implementation and Application (I suggest to separate this from the modeling sections)
#####################################
6. Conclusion (very short, actually not a conclusion, but a summary).

Content - Quality and relevance of the described ontology

The ontology is well documented, available on the web and reuses four existing vocabularies. It’s an (intentionally) small ontology with 3 classes. The number of properties (18), however, could have been reduced. Why is there a new property for relations like „url“ and „author“? Aren’t there existing properties for keyword and license as well? (There is no standard yet to formally suggest the use of existing properties in a vocabulary, but it could be done in natural language.)
(Not influencing the review, just as an idea: Concerning the property „view“ (range = String now) - if you think about (one day) using an object property here instead, the class viso:Graphic_Representation [1] may be of interest for reuse ; Disclaimer: I’m one of the authors)
The relevance of the ontology has been illustrated by use cases and by its usage in the JavaScript tooling described in the last section, but not by external usage by third parties (which I consider optional).

[1] https://github.com/viso-ontology/viso-ontology/blob/master/src/main/reso...

Illustration, clarity and readability of the describing paper

Besides the problems arising from the structure (mentioned above) and the too large number of minor issues (see below), the paper describes the DVIA vocabulary well. More difficult to read is the description of the modeling of the apps4eu vocabulary. For example, it remains unclear what the difference is between the odapps: and apps4x namespaces. The names of the properties connectedApp and connectedEvent sound like very general, weak names (similar to „relatedApp“). Maybe there is a more specific name? In Fig. 5, the usage of prize and wonPrize needs to be checked.

Further content-related notes:

- If possible avoid using data and information as synonyms (in one sentence ; Admittedly, the difference is sometimes difficult). Example: 2.1.2

- 4.1 Application typology - Is IsaViz really a vocabulary-specific application? If it’s called vocabulary-specific because it works on RDF, then Tabulator is as well. (A look at the referenced blog post showed that you must have mixed something up - IsaViz does not show up in this list. Check all references?)

- In section 5.2.2 some repetition happens (notes on the „submission“ class). I suggest not to create a new sub-section, rather let 5.2 be one section without subsections at all.

Minor Issues

Since there are still quite a lot of smaller issues (typos, English, avoidable errors made in haste), they are annotated directly in the paper. I can send a scanned PDF if desired. I would recommend asking a native speaker for a review, though, since I am not one either.
- Intro: possibly better: „… classifying data into quantitative and categorical …“ (by the way - what about ordinal? It is not really either quantitative XOR categorical)
- „nice visualizations“ sounds very unscientific - what are their needs in more concrete terms?
- In order to avoid very long sentences, try not to concatenate sentences that do not really share a topic, but use a second sentence. Example: 2.2.4 - „ … and an understanding of RDF, and runs the query on the server side.“ I sometimes needed to re-read sentences because of that. Further example: 3.2.2, first sentence.
- Check if you are consistent with which URLs to put as a footnote and which not. Maybe use a footnote all the time?
- I had severe problems understanding the following sentences:
- 3.1 : „How are the types of information that can help create a vocabulary for annotating web-based visualizations online?“ -> „What are …“? Still, I don’t fully understand it. I suggest rewriting the last third of 3.1 so as not to lose the reader during the motivation!
- Reduce the font size of listings a bit to avoid too many line breaks
- Be consistent with names like „JavaScript“, upper vs. lower case
- Is „algorithm“ the best term for the installation instructions? Of course, it's correct, but maybe still misleading. Another question is whether this or the whole Appendix A shouldn’t be moved completely, e.g., to your GitHub Wiki. This is not required to be part of the final paper, I think. But maybe this wasn’t intended?
- Please check the literature references; there is at least one duplicate and some cleanup is required
- One example of avoidable errors can be found on page 6 - the last sentence of 2.3 is incomplete. I conclude from this that the paper is in quite a raw shape and the authors may have been able to fix many issues on their own given more time (which, I know, is often running out quickly, but so is the reviewer’s ;) ). Similar issue on page 7, last sentence.

Review #2
By Bernhard Schandl submitted on 02/Feb/2015
Suggestion:
Reject
Review Comment:

This paper attempts to make three contributions that relate to the discovery of applications on the Web: a review of existing tools, a vocabulary for describing applications, and a prototypical implementation of a Web CMS extension to publish application descriptions according to Linked Data principles.

The whole paper is rather a mess. The intention and scope of the paper are not clear at all. The first part (Section 2) is a rather random collection of data visualization libraries, Linked Data libraries, and RDF renderers. The point of this part is not clear, and the relation to the works presented in subsequent sections remains unspecified.

The following sections (3 and 4) introduce a vocabulary for the description of "visualization applications" -- to what extent such applications differ from "normal" applications remains an open question. The vocabulary description is clear, but it is not at all clear whether the vocabulary fulfils its requirements, mostly because these requirements are not presented or discussed. As such, it is very hard to judge whether the vocabulary is of any particular use. I strongly suggest describing the intended use case and deriving a clear list of requirements based on an analysis of this use case. Further, lots of space is dedicated to an example description using Turtle syntax, which is of no particular use to the reader (the URL of an example is more than sufficient). I suggest using this space instead for presenting concrete examples of usage (e.g., search queries) of data based on that vocabulary.

The final Section 5 describes the implementation of a plugin for a Web CMS, but it does not at all refer to the contents of previous sections. In particular, suddenly a completely different vocabulary is being used, and some (unmotivated) changes that were made to this vocabulary are being described.

There is no user evaluation or anything else that would demonstrate the feasibility of the presented approach.

On top of that, the paper has lots of typos and grammar errors. There are also several layout issues (e.g., bottom of page 7).

In total, I suggest completely rewriting the paper. The authors should focus on one particular problem that they are trying to solve, and demonstrate in the paper that their approach actually solves that particular problem. As it currently stands, this paper is not sufficiently mature for publication.