Review Comment:
> Summary of the paper
The paper describes two tools for completing data on Wikidata, both of which rely on Wikipedia. The first tool, Wwwyzzerdd, is a browser add-on that provides a UI for adding Wikidata statements directly from the Wikipedia web interface. The second tool, Psychiq, is a prediction model that suggests Wikidata type and subclass statements based on Wikipedia articles.
> Feedback on the paper
>> Title and Abstract
- The name Wwwyzzerdd seems unusual. Just asking: is there a specific reason behind the naming?
- "Hundreds of thousands of articles on English Wikipedia have zero or limited meaningful structure on Wikidata." -> How important are these articles/items? How often are these articles/items read by people, or used in apps?
- "Wwwyzzerdd has been used to make over 100 thousand edits to Wikidata." -> Any (more detailed) quantitative as well as quality analysis on the edits made?
- The abstract could be clearer about the relationship between the two proposed tools and how they complement each other.
>> Introduction
- Typo: ".. widely used as a source of structured data[1]." -> ".. structured data [1]."
Also, please do check other places where similar issues as above appear.
- In general, the introduction shows that the author is knowledgeable about the background and open issues in Wikidata.
- "526,297 (54%) edits in the 24-hour period were made using QuickStatements which is a tool for bulk editing." -> There should be a comparison of the proposed tools to QuickStatements.
- The problem presented is well motivated.
- "Psychiq is a machine learning model that is integrated into the Wwwyzzerdd UI that suggests, on the basis of Wikipedia’s content, new statements to be added to Wikidata." -> It's a bit misleading as this might suggest that Psychiq is already able to add arbitrary statements, while at the moment it focuses still on type and subclass information.
- The caption and in-text description of Fig. 1 could be improved. The current wording is misleading: it suggests there are two curves, one for the growth in the number of Wikidata items and one for the number of active users.
>> Related Work
- At the end of the related work, I would expect a summary of the remaining gaps and requirements that motivates the need for the proposed tools.
- Typo: ".. the Wikidata Distributed Game framework ^6" -> The footnote numbering should follow directly the 'framework' without a space.
>> Implementation
- "Wwwyzzerdd is a manifest V2 .." -> Could it be clarified a little bit on manifest V2?
- Punctuation: a comma separating the clauses would improve readability: "If the author is notable enough to have an English Wikipedia article, their name in the book’s article will almost always be a link to the author’s article."
- There seems to be an issue with the scalability of Wwwyzzerdd, in particular since the author mentions the limited number of edits made by humans. Any thoughts on this?
- Fig. 3 gives only a limited view of how the tool really works. What about adding a short video of the tool in use?
- What (web) technologies are used to develop Wwwyzzerdd? This could be made more explicit in the paper.
- "If there is a property connecting the Wikidata item for the article to the item linked, the corresponding orb is green." -> This could be problematic if there are > 1 properties between A and B. It could create, say, a false positive (seemingly green, but actually for another property.
- More use cases (motivating scenarios/running examples) could be added to the paper to show how the proposed tools can be used in different scenarios.
- A diagram of Psychiq's architecture would help improve readability.
- "In the future the first sentence of the article or category hierarchy information may also be added as a feature to the model." -> Why not the whole article text also?
- The machine learning and language model behind Psychiq could be described in more detail; more introduction and motivation would be helpful.
- The training and test sets are somewhat poorly described. How were they created and labeled? Furthermore, where did the 5.6 million examples come from?
- Why train for only a single epoch?
- Fig. 5 actually shows an inherent problem of Wikidata: ambiguity in class naming. In the figure, several classes could validly apply; which one is the best (linguistically, practically, according to the community, or by any other criterion)?
- Fig. 4 does not really explain how "Guessing" works. Moreover, could a slightly stronger baseline than guessing be used?
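
To make the multiple-property concern above concrete, here is a minimal sketch (my own illustration, not from the paper) that lists every direct property connecting two Wikidata items via the public SPARQL endpoint. The pair Germany (Q183) and Berlin (Q64) is only an assumed example of two items that are plausibly connected by more than one property; a single green orb would not reveal which of those connections it represents.

```python
# Reviewer's illustrative sketch (not from the paper): list all direct
# properties that link item A to item B on Wikidata, showing that a single
# "green orb" may stand for more than one connecting property.
import requests

ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed example pair: Germany (Q183) -> Berlin (Q64).
QUERY = """
SELECT ?prop ?propLabel WHERE {
  wd:Q183 ?p wd:Q64 .
  ?prop wikibase:directClaim ?p .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

response = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "wwwyzzerdd-review-example/0.1"},
    timeout=30,
)
response.raise_for_status()

# Each row is one property connecting the two items; more than one row
# means a green orb is ambiguous about which link it reflects.
for row in response.json()["results"]["bindings"]:
    print(row["prop"]["value"], "-", row["propLabel"]["value"])
```

If such a query returns several properties, the UI might need to indicate which specific property the orb represents.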
>> Discussion
- As per my previous comment, the 111,655 Wwwyzzerdd edits could be investigated further for insights and issues.
- A discussion of future work would be nice.
> Overall
I believe the tools contribute to completing Wikidata. However, I would suggest some revisions to the current version of the paper.