Publishing Bibliographic Data on the Semantic Web using BibBase
This is a revised submission following a "reject and resubmit", which has now been accepted. The reviews below are for the resubmission, followed by those of the original submission.
Reviews for the resubmission:
Solicited review by Kai Eckert:
I still think that BibBase needs much more conceptual work as well, and especially a clearer message to the user about its goals. However, this is an interesting approach, and hopefully the publication not only motivates you but also gives you additional feedback that helps you to optimize BibBase. All in all, I would accept the paper now.
Solicited review by Antoine Isaac:
I am happy with the modifications the authors made to the first submission.
There is still no evaluation report, and the wiki page of the project still presents old material. But it is good to see that the authors are still working on and supporting their system, and are willing to advertise it on community pages such as http://www.w3.org/wiki/ConverterToRdf
Reviews for the original submission:
Solicited review by Kai Eckert:
The authors present BibBase, a web application that fetches bibliographic entries (Bibtex) from external URLs, transforms the data into linked data, and provides several views on the data: RDF, Bibtex and HTML pages. The HTML pages are intended for integration into users' own webpages. A SPARQL endpoint is provided. Fetching from Mendeley is announced, but did not work at the time I tested the service.
I would like to start this review in an unusual way: BibBase works as described, the description is solid, the topic relevant, so from a formal point of view, I would accept the publication. It is appropriate for a tools description and certainly not too long.
My concerns are with (the presentation of) BibBase rather than with the article under review. I would strongly recommend addressing them before the final publication of the article:
1. The welcome page is not very welcoming and does not explain much. Besides the fact that Mendeley did not work, it is unnecessary to make potential users create a URL instead of just letting them enter the URL of their Bibtex file in a form field.
2. At first I missed a possibility to upload a Bibtex file. I understand now that the explicit strength of BibBase is to work with external sources, and as such, the missing feature to host a Bibtex file is probably intended. Nevertheless, as the files are cached on your server anyway, I would provide this possibility; it probably would not hurt.
3. A nice add-on would be to assist Bibsonomy users in selecting Bibsonomy Bibtex exports. This is also not a big deal, but it would welcome Bibsonomy users nicely (as seems to be intended for Mendeley users).
4. I don't quite understand the effect of the login. Is it really only that you value votes and corrections higher? That is confusing. When I log in, I expect that I can at least do something with "my" data, but I could not find a difference: "my" data is just cached Bibtex files, like everyone else's. Here I see a huge potential to help logged-in users curate their own data, e.g. by selecting the desired versions of the various Bibtex representations, maybe even merging them further, and allowing users to get back an enriched Bibtex file. If this is not desired (e.g. to keep the service simple and strictly focussed on the external sources), then I would try to communicate this in a clear way and at least let logged-in users select their own sources and provide direct hints on how to enrich these sources.
6. The Linked Data HTML representation should present the automatically derived links that are available in RDF (and described in your paper), as these might be interesting for just-browsing visitors.
7. A major concern: Please change the data license to a real open data license, at least without commercial restriction, better and more practicable would be a public domain (CC0 or PDDL) waiver. At least the users should have the choice to select an open data license (maybe directly in the Bibtex file?).
I would like to read even more about the technical details that you implemented behind the scenes, especially for automatic linking and duplicate detection. This is not necessary for a tools presentation, but properly described it would make a nice full paper. To summarize, BibBase is a nice and clean approach to publishing and linking bibliographic data, but it does not yet use its full potential, especially with respect to the ease of use which the authors claim to be their main priority. I hope we will see further improvements soon. As a first measure, I would concentrate on the presentation and documentation of the whole project and make it as intuitive, appealing and usable as possible for new users.
Solicited review by Jan Brase:
The paper describes an interesting system to publish bibliographic information as linked data. The paper is clearly written and includes a good overview of the current state of the art. The whole system is very ambitious, and it remains to be proven whether it can fulfill its expectations. This applies, for example, to duplicate detection of author names, which is still a very open problem for most bibliographic data systems. It is also not clear how the stability of the links in BibBase can be guaranteed.
Generally, the paper is acceptable. It could be discussed whether a revised version of the paper, with more actual user experience and a proof of concept, should be considered at a later stage.
Solicited review by Antoine Isaac:
The paper is well written, the BibBase functionality described can be very useful to researchers, and the system seems to be working.
One of the first issues, however, is the positioning of BibBase with respect to existing work.
At the technical level, it would be useful to acknowledge the considerable Bibtex-to-RDF work done before, and explain what the differences are with Bibtex. To name a few:
Also, it is good to relate to established ontologies like BIBO. Further, the connection does not seem to be implemented in a proper way. For instance, I don't see that hasTitle is mapped to anything at http://zeitkunst.org/bibtex/0.1/bibtex.owl .
Along the same lines, it could be interesting to see whether connections can be established with the more recent CiTO http://purl.org/net/cito/ from the SPAR ontologies.
More worrying is the lack of reference to the increasingly visible VIVO project. VIVO focuses on researchers, but of course their publications are a key aspect of it, and VIVO serves linked data for it...
Another source of frustration is the part on duplicate detection and matching (e.g., to DBpedia). These are interesting; in fact, they could be a key contribution of the paper. Unfortunately, there is not much technical description of (or links to) how they have been implemented, and no evaluation of the precision of the approaches.
Also, before the paper is accepted it should be rewritten to really show what has been done and what has not. There is not much point in accepting a paper if more functionalities are coming in the next months and are described in the main body of the paper (as opposed to the usual "future work"). For example, it seems that users are NOW able to vote for identified duplicates. But I have not checked other "will be able"-like sentences. And no item has been published on the "news" part of wiki.bibbase.org...
I also find it worrying that the resulting RDF is licensed as Creative Commons Attribution-Noncommercial-Share Alike 2.5. This means that BibBase claims rights over data that was contributed by users. This may prove a strong deterrent to the field starting to use BibBase...
- I'm not sure I understand the difference between "offline" and "online" in section 4.