Semantically enriched microposts in the enterprise: A practical approach
Solicited review by Uldis Bojars:
The paper is an application report describing a framework for semantically-enriched microblogging for knowledge management in the enterprise. The Mikrow system allows employees to post microblog entries which are enriched using the enterprise ontology, Linked Open Data (via the OpenCalais service) and information from other enterprise information systems. This information is used to present relevant matches when a user initiates information searching by posting a new status update.
It is an important topic and very relevant to this special issue of the journal. However, the paper has shortcomings, described below, which would require major improvements before it can be accepted for publication.
An earlier version of the paper was presented at the In Use session of the ESWC 2011 conference. The submitted journal paper is almost a verbatim copy of the ESWC paper with some small changes. While previously published articles are not expressly forbidden, one would normally assume that a significantly revised version is submitted to a journal. What would be the value of publishing this paper as is, considering that the ESWC 2011 paper is publicly available on the Web and that this paper does not add anything to it?
The CFP says that application reports should be brief and pointed, indicating clearly in what sense and to what extent semantic technologies have been used in the application. Among other things, they are evaluated based on the impact of the described application, for which convincing evidence must be provided. This paper is not brief or pointed. Its length is fine, but I would suggest that the authors think about how to make it more concrete, providing more insight into how the application works and illustrating its impact (e.g. via an expanded evaluation section).
The following paper, in my view, provides a good level of detail describing a deployed application: http://icwsm.org/papers/2--Passant.pdf
There is a nice video on the Mikrow site (good work!) showing the application and a live demo may have been done at the ESWC conference. Yet a journal paper would need to describe the application in more detail as the readers won't have a live demo. Try to compress the existing content and add more detail re inner workings of the application.
The information to add to the paper would be a concrete walk-through of two processes: (1) posting an update and what follows after that (examples throughout the paper already cover some bits of it), including what exactly is stored in the data store; (2) searching for relevant information (via posting an update), including what information is discovered using each of the knowledge boosting techniques. If the semantic engine is exposed via a REST API, how does the API work and what information does it provide?
The screenshot in Fig. 5 should be made larger and, preferably, use English. Would it be useful to show a screenshot of an expert/user page as well?
When the need for a precisely modelled knowledge base is mentioned (and that it can be semi-automatically enhanced with new concepts) it looks like there's quite a bit of manual work involved and that while easier for end users the system requires the up-front cost of providing the information for the system to start working. What if you don't have an ontology describing your enterprise?
The evaluation part needs to be enhanced. While a comprehensive user study may be a subject of another paper, the evaluation needs to include things learned from how the application is implemented and used. The evaluation should shed some light on how the theoretical contribution works in practice. Things that would be useful to know include overall usage statistics (how many users are using it; keep using it day-to-day; the number of messages posted; ...), user feedback (which features were rated as helpful and which were not), observations of user behavior (how often users followed links to suggested experts or microposts, etc.).
E.g., the search process is launched when posting a new status update. Is it more useful than a search box (based on user feedback)? To me, searching and posting are two distinct activities and I would be interested to know whether requiring users to post an update before searching for entities is better than just searching.
This application is positioned as an expert-finding tool. There is related research in expert finding, and some references and an analysis of how your approach relates to them would be useful. In particular, it appears that the number of posts on a topic is used as a key measure to recommend experts ("the one who talks a lot about something must be an expert"). How well does it perform? Do you also use the information from other systems (e.g., who authored what documents) and the information on who follows whom in order to rank experts?
The paper is well-written but its clarity suffers from the paper not being "brief and pointed".
Additional questions:
- how are linked data used in the application? are they used just for showing more information about an entity (a video showed a DBpedia screen) or does the system make use of linked data in more ways than one?
- when a list of recognized entities is shown, does it contain links to pages with more information about these entities?
- is information on hashtags and reply thread structure used?
-- people seem to be fine with adding hashtags and other information when they see it as worthwhile, and there may be value in letting them add some semantics at the post creation time (e.g., a version of what SMOB does) to lead to a more precise retrieval later on.
-- what about cases when an expert helps other employees with answers on how to do something but does not mention the entity in the reply (it is mentioned in the original post asking the question)? how do we discover that this person is an expert?
- is the application open source?
- when the term "semantically indexed" is mentioned in 3.1., what does it really mean? the term is quite vague and requires an explanation in order to understand what exactly is meant there.
- could figures 1-4 be merged into one or two figures? they show parts of the process and look redundant.
Summary: an interesting application but it needs to be described in more detail.
Solicited review by Mischa Tuffield:
The paper is well written, and goes to some length to describe a Microblogging System designed to aid knowledge management within organisations.
The main drawback of the paper is that it doesn't provide any evidence, other than anecdotal, to support the proposed system design. It does seem that "semantically enriching" microposts could be a good idea, but there is no data to support this in the paper.
When context/semantic tags are extracted from the microposts, how many of them are correct, and what is the overhead required to perform this analysis? How do these new indexes/semantic annotations help the task of recall?
I think that the paper presents what seems to be a solid system, but I feel that before this should be considered for journal publication the authors need to present some data regarding how well their system works, something which would make another reader think that they might want to adopt a similar technique for the task of enriching microposts.
Solicited review by Alexandre Monnin:
This paper presents an interesting perspective on microblogging solutions for enterprise.
The paper is very clear, the solution straightforward...
Some statements need to be grounded though, like: "Knowledge and information management are not scalable unless formalisms are adopted." This should receive some justification.
Also, I found that the part dealing with the domain ontology and how it should be conceived was a bit weak.
Should the curation of the ontology be done by the engineer? Is he the one best able to judge which concepts should enrich the domain ontology? Maybe the authors should look at systems like SRTag that have already tackled the issue.

