Review Comment: 
 > Summary of paper 
The paper describes two tools for completing data on Wikidata, both of which rely on Wikipedia. The first tool, Wwwyzzerdd, provides a browser add-on for adding Wikidata statements from the Wikipedia web interface. The second tool, Psychiq, is a prediction model that suggests Wikidata type and subclass statements based on Wikipedia articles.
> Feedback of paper 
>> Title and Abstract 
- The name Wwwyzzerdd seems unusual. Just asking: Is there any specific reason behind the naming? 
- "Hundreds of thousands of articles on English Wikipedia have zero or limited meaningful structure on Wikidata." -> How important are these articles/items? How often are these articles/items read by people, or used in apps? 
- "Wwwyzzerdd has been used to make over 100 thousand edits to Wikidata." -> Any (more detailed) quantitative as well as quality analysis on the edits made? 
- The abstract could state more clearly how the two proposed tools relate to and complement each other.
>> Introduction 
- Typo: ".. widely used as a source of structured data[1]." -> ".. structured data [1]." 
Please also check other places where similar issues appear.
- In general, the author appears knowledgeable about the background and issues in Wikidata (based on the introduction section).
- "526,297 (54%) edits in the 24-hour period were made using QuickStatements which is a tool for bulk editing." -> There should be a comparison of the proposed tools to QuickStatements. 
- The problem presented is well motivated. 
- "Psychiq is a machine learning model that is integrated into the Wwwyzzerdd UI that suggests, on the basis of Wikipedia’s content, new statements to be added to Wikidata." -> It's a bit misleading as this might suggest that Psychiq is already able to add arbitrary statements, while at the moment it focuses still on type and subclass information. 
- The caption and in-text description of Fig. 1 could be further improved. The current wording is misleading, as it suggests there are two lines/curves: one for the growth in the number of Wikidata items and one for the number of active users.
>> Related Work 
- At the end of related work, I would expect a discussion on the summary of missing gaps and requirements in order to motivate the needs of the proposed tools. 
- Typo: ".. the Wikidata Distributed Game framework ^6" -> The footnote numbering should follow directly the 'framework' without a space. 
>> Implementation 
- "Wwwyzzerdd is a manifest V2 .." -> Could it be clarified a little bit on manifest V2? 
- Typo: A comma between the clauses would improve understandability: "If the author is notable enough to have an English Wikipedia article, their name in the book’s article will almost always be a link to the author’s article."
- There seems to be a scalability issue with Wwwyzzerdd, particularly since the author has noted the limited number of edits made by humans. Any thoughts on this?
- Fig. 3 seems limited in showing how the tool really works. What about adding a short video of the tool in use?
- What (web) technologies are used to develop Wwwyzzerdd? This could be made more explicit in the paper. 
- "If there is a property connecting the Wikidata item for the article to the item linked, the corresponding orb is green." -> This could be problematic if there are > 1 properties between A and B. It could create, say, a false positive (seemingly green, but actually for another property. 
- More use cases (motivating scenarios/running examples) could be added to the paper to show the various ways the proposed tools can be used.
- A diagram of Psychiq's architecture would help improve readability.
- "In the future the first sentence of the article or category hierarchy information may also be added as a feature to the model." -> Why not the whole article text also? 
- The machine learning and language model underlying Psychiq could be described in more detail; more introduction and motivation would be helpful.
- The training and testing sets are only sparsely described. How are they created? How are they labeled? Furthermore, where do the 5.6 million examples come from?
- Why was the model trained for only a single epoch?
- Actually, Fig. 5 shows an inherent problem of Wikidata: ambiguity in class naming. In the figure, several classes could validly apply. Which one is best (linguistically, practically, according to the community, or by any other criteria)?
- Fig. 4 does not really explain how "Guessing" works. Moreover, could a somewhat stronger baseline than guessing be provided?
>> Discussion 
- As per my previous comment, the 111,655 edits made with Wwwyzzerdd could be investigated further for insights/issues.
- A discussion of future work would be nice.
> Overall 
I believe the tools contribute to completing Wikidata. However, I would suggest some revisions to the current version of the paper.