Reality Mining on Micropost Streams
Submission in response to http://www.semantic-web-journal.net/blog/special-issue-semantics-microposts
Revised paper submitted after an "accept pending minor revisions" decision and again accepted with minor revisions. The first-round reviews appear beneath the second-round reviews.
Solicited review by Jelena Jovanovic:
This is the second time I am reviewing this manuscript and I am pleased to see that the authors have done an excellent job in revising and improving the manuscript. I would also like to thank the authors for the detailed answers to the questions and comments I had in the first review cycle.
As I stated in my initial review, the research work and the technical solution presented in the paper are highly interesting and relevant. The authors' work demonstrates how semantic technologies (both text analysis and opinion mining, and semantic web technologies) can be applied in practice and offer real benefits to end-users. Since my comments in the previous review cycle were related primarily to the presentation of the research work and the authors have improved the presentation, I would suggest accepting the paper for publication.
There are just a few very minor things I think need to be corrected (mostly typos):
- The order of mentioning figures in the text and the order of their appearance in the paper are not always aligned. For instance, Figure 2 is referenced before Figure 1 (p.2); Figure 4 before Figure 3 (p.3); Figure 3(c) before Figure 3(a) and (b).
- p.3: "tries to parse the text of the tweet" => "… of a tweet"
- p.3, Figure 4: "sentiment expressed in message" – since you opted for using the term "opinion" instead of "sentiment" throughout the paper, it would be good to substitute "sentiment" with "opinion" here, as well
- p.7: "Each plug-ins involved" => "… plug-in …"
- p.7: "SLD SERVER that continuously evaluate a network of C-SPARQL queries" – I would suggest adding here a reference to Section 5.2, since here it is not clear what this sentence means, while after reading Section 5.2 it becomes fully clear.
- p.9: What are WINDOWERs? Are they part of the SLD Server? Their position and role in the system's architecture are not fully clear.
- p.9: "required as input by machine learning approaches including SUNS" => "…included in…"
- p.15: "displays a significant better performance" => "…significantly…"
Solicited review by José Morales del Castillo:
The vocabulary used in this new version of the paper is less confusing and the content is better explained. The role of different elements in the system has been clarified (as in the case of the ontology) and the authors provide an extensive evaluation of the performance and scalability of the system.
Nevertheless, there are some minor corrections that should be carried out:
1.- To keep the vocabulary coherent throughout the paper: in Figure 4 (internal architecture of BOTTARI's opinion miner) there is a box that reads "sentiment expressed in message". Since "sentiment" is no longer used or defined in the paper, I'd suggest using the term "opinion" instead.
2.- Page 7 / Sec 4.3 / Para 1 -> Each plug-ins involved -> Each plug-in involved
3.- Page 7 / Sec 4.3 / Last Para -> ...while in the Popular case its spans three years -> its span is three years
4.- Page 16 / Para 2 ->
Even if BOTTARI is currently just a research prototype, its potential as a commercial product is clear and encouraging and Saltlux decided to continue the BOTTARI development for its Korean customers.
I'd suggest: Even though BOTTARI is currently just a research prototype, its potential as a commercial product is clear and encouraging. Furthermore, Saltlux has decided to continue the BOTTARI development for its Korean customers.
5.- Page 16 / Para 3 -> The acronym for Linked Open Data has not been properly introduced -> Linked Open Data (LOD)
First round reviews:
Solicited review by Jelena Jovanovic:
The paper presents BOTTARI, an Android application which provides users with location-aware and personalized recommendations of restaurants located in one district of Seoul, Korea. The application makes use of several Web services to collect data about restaurants, as well as microposts originating from Twitter to gather users' opinions about those restaurants. Natural language processing and opinion mining techniques are used to identify microposts that contain restaurant reviews as well as to differentiate between positive, negative and neutral opinions expressed in those posts. Deductive and inductive stream reasoning technologies, developed in the scope of the EU FP7 LarKC project, are used for generating personalized recommendations. The application also makes use of Augmented Reality technology to display the recommended restaurants, though this aspect of the application is out of the scope of the paper. In a nutshell, BOTTARI is a very nice demonstration of how semantic technologies – especially the latest developments in the area of large scale stream reasoning – can be applied in practice and offer benefits to end-users. The fact that it won the Semantic Web Challenge at the latest International Semantic Web Conference (ISWC 2011) is another guarantee of the quality of the technical solution it is based upon.
The paper is well organized and well written; it is easy to read and follow. References are fully sufficient.
I have no major comments; the majority of comments and questions listed below are aimed at improving the presentation of this research work, either by clarifying parts of the text that are not clear enough or by providing additional explanations of aspects of the presented work that were not covered sufficiently. There are also a few questions/comments related to some design decisions. In what follows, I list my comments in the order of their appearance in the text.
Presentation-related comments and questions:
How do you see the difference (if any) between Semantic Web and Linked Data? I ask this as I've noticed that you refer to them as separate entities. For example, I see Linked Data as a part of (i.e., a kind of precondition for) Semantic Web (as it was initially envisioned). How do you perceive the connection between them? In any case, I would suggest that you either clarify the meaning of these terms or use one of them consistently in the paper.
I would suggest rewriting Section 2.3 to improve its readability – currently, the flow of narration is broken with almost each sentence.
In Section 2.4, it is stated that "Saltlux's proprietary sentiment analysis solution is applied to microposts"; could you provide some kind of reference to this solution? For instance, URL of the web site that gives more information about it or a reference to some technical documentation?
In Section 4, it is stated that BOTTARI collects data about POIs from popular Web services such as Yelp and Yahoo! Local. All these services offer data about users' ratings of restaurants; however, according to the description of the recommendations offered by BOTTARI, these ratings are not used. Is that true, or is it my misunderstanding? It would be beneficial if you could cover that part: if that data is not used, why not, since it is a rich source of users' opinions? If it is used, how is it used when generating recommendations?
In Section 4, page 6, the authors make a comparison between traditional tourist guide books and BOTTARI; I'm not sure that this is a fair comparison. The list of restaurants in a travel guide is a list of recommended restaurants, not all restaurants in a certain city; it is the Yellow Pages, not a travel guide, that gives a detailed list of all the restaurants. I'm sure that when recommending restaurants in BOTTARI, you are not recommending all 319 restaurants you have in the data store. I understand the point you want to make here but still think that this comparison is not adequate.
The authors also make a comparison with travel-oriented Web sites (like Trip Advisor) and point out the small number of restaurants covered by these sites when compared to BOTTARI; however, this is again to be expected, since BOTTARI is focused on one particular area within one particular city, whereas Web travel sites tend to be very broad in coverage. So, again, I think this is not a fair comparison. In my opinion, a better comparison would be with some of the location-aware mobile apps for recommendation of restaurants.
Page 7 - how is incompleteness different from high sparsity? Based on their descriptions given in the text, I cannot identify the difference.
Page 7 – regarding multiple ratings, what is the rationale behind the decision to treat multiple ratings in the given way? Why were neutral ratings valued as slightly positive?
The ontology shown in Figure 6 is for the representation of tweets, but what about POIs - how are they modeled? I suppose that you use an ontology for modeling restaurants as well; however, that was not mentioned in the paper.
As a part of the explanation of the SPARQL query given in Listing 3, the following is stated: "the query matches also POIs that are described with a sub-category of InterestingForForeigners, because the required category can be deduced;" what kind of deduction is done here? Could you explain that better?
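One possible reading of the quoted statement, offered only as an assumption since the text does not say which reasoning is applied, is RDFS-style subclass inference: a POI typed with a sub-category is also treated as an instance of every ancestor category. The sketch below illustrates that idea in plain Python. Apart from InterestingForForeigners, all category names and function names are hypothetical.

```python
# Hypothetical category hierarchy: child category -> list of direct parents.
# Only "InterestingForForeigners" comes from the quoted text; the rest is invented.
SUB_CLASS_OF = {
    "KoreanTraditionalRestaurant": ["InterestingForForeigners"],
    "TempleFoodRestaurant": ["KoreanTraditionalRestaurant"],
}


def superclasses(category, sub_class_of):
    """Return every ancestor of `category` by following subclass links transitively."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def matches(poi_category, required, sub_class_of):
    """True if the POI's category is the required one or a (transitive) sub-category of it."""
    return poi_category == required or required in superclasses(poi_category, sub_class_of)


# A POI tagged with a sub-sub-category still matches the required category.
is_match = matches("TempleFoodRestaurant", "InterestingForForeigners", SUB_CLASS_OF)
```

Under this assumption, a query for InterestingForForeigners would return POIs tagged only with a descendant category without the query author enumerating the descendants.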
Regarding recommendations For Me:
- what kind of data is used for training the model of the inductive reasoning component? It is just said that the data is in RDF format, but it is not clear which data is used (i.e., what that data is about).
- it is said that opinions of other users are considered for generating recommendations - is some kind of collaborative filtering applied (i.e., considering only the opinions of other users who are "similar" to the given user according to some metric), or are the opinions of all other users considered?
- does the system keep some kind of user model where it stores the user's interaction with the system, preferences and the like? Or is everything generated on the fly?
General level comments and questions:
Why was Twitter chosen as the primary source of users' likes and dislikes (i.e., positive and negative opinions) about restaurants? Why not a more review-focused service/application? You would have much less noise in the data and less need for data processing (the stream of data is not as intensive as it is on Twitter).
Any plans for the inclusion of microposts from services/apps that are more location-aware and/or are more typical for sharing restaurant reviews/ratings, like Foursquare? I would have expected some future developments along this line, but nothing was mentioned in the Conclusion.
You compared BOTTARI with traditional travel guides (paperback) and travel web sites like Trip Advisor; what about the new location-aware restaurant recommendation mobile apps (e.g., http://www.likeness.com/); how do you see BOTTARI compared to those applications?
Related to the previous question, it would be good to list the advantages of using semantic web technologies in a system like BOTTARI; what was made possible or easier or more efficient thanks to their usage? As a long-time semantic web enthusiast, I believe that such advantages do exist and it would be great to make them explicit.
A few very minor comments:
- Figure 2: the output is labeled "Twitter message reputation"; I wouldn't say the output is the reputation of a message, but the sentiment expressed in it
- Figure 3 - I would suggest extending the figure caption with a short description of each screenshot.
- The order of figures 4 and 5 should be changed - figure 5 is referenced in the text and appears in the paper before Figure 4. The same comment applies to listings 9 and 10
- Table 1 - I would recommend extending the table caption to explain the meaning of the presented data
- Page 7, col. 1: in average -> on average
- The legend of Figure 12 should be corrected; according to the description in the text, what the figure presents is the number of rated POIs, not number of ratings per POI
Solicited review by Pablo Mendes:
The authors present an excellent read describing an augmented reality application for localized recommendation of points of interests (POIs). The work is well motivated and relevant to the Microposts 2011 Special Issue.
The evaluation presented by the authors demonstrates that the results beat the baselines. However, the claim of "very good quality of the produced recommendations" needs further justification in the evaluation section. Most nDCG values in Table 3 are very close to zero. It would therefore be helpful for readers if the authors explained the low absolute values, thereby smoothing the connection to the interpretation of high-quality recommendations.
Furthermore, in Section 8.4, it was not directly clear to me how the authors arrived at what they called "the correct interpretation". It would be very helpful to work through the rationale that led them to the number of 87,500 areas. Also, recalling in that section what it means to have 700 tweets/second for Insadong in terms of the observed number of tweets in practice, the number of opinions and the number of mentions of POIs could help to put the numbers in perspective.
For other researchers interested in comparing their own approach with the one presented in this article, it would be useful to give clear statements of which parts of the system are available (freely or otherwise), open source, under which licenses, where to obtain the software, etc.
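To put such values in perspective, the sketch below shows one common formulation of nDCG (relevance divided by a log2 rank discount, normalized by the ideal ordering). It is an illustration only: the function names are hypothetical, and the exact nDCG variant and cutoff behind Table 3 are not given here, so the authors' computation may differ.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance scores (rank 1 first)."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering; 0.0 if nothing is relevant."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# A single relevant POI ranked first versus ranked last in a list of 50 candidates.
top = ndcg([1] + [0] * 49)
deep = ndcg([0] * 49 + [1])
```

With very sparse rating data, a user may have only one or two held-out relevant items among hundreds of candidates, so even a ranking that beats the baselines can yield small absolute scores, as `deep` illustrates relative to `top`.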
Writing, typos, etc.
...such as found in Twitter is the basis of novel... -> ...such as those found in Twitter is the basis of many novel... ?
31,000 user -> 31,000 users
citation to "bandwagon effect" or brief explanation
claims that "SUNS significantly outperformed all baselines" but does not report significance tests.
"the best ranking ever" -> the best ranking overall
catched MOSTLIKED referred to -> caught MOSTLIKED with regard to
was very closed -> was very close
referred to -> with regard to
while the time window being -> with a time window of
Listing 2 says the query is in SPARQL, but uses non-standard WITH PROBABILITY and ENSURE PROBABILITY. It would be clearer if the caption mentioned "extended SPARQL" or something along these lines.
"Here, we attempt to interpret the results of these scalability experiments." -> suggest removing sentence
tweets per seconds -> tweets per second
Conclusions and Future Works -> Future Work
Throughout the text, bracketed citations "" are used as part of the text. This reviewer personally finds it more comfortable to read papers where the cited author names are part of the text and brackets are merely pointers. For example: In AuthorName et al. , bla bla.
This comment does not constitute a recommendation, merely an expression of opinion.
Solicited review by José Morales del Castillo:
The authors present an AR application that recommends POIs in a restricted area. The paper presents a thorough description and evaluation of the prototype.
I only miss a more detailed reference to "sentiment" analysis since in figure 3(d) the trend of people's sentiment is not only provided in terms of positive and negative comments, but in terms of taste, comfort and service. I'd suggest a brief comment about this point.