NonFoodKG: A Non-Food to Food Product Knowledge Graph for Consumer Applications in Retail Environments

Tracking #: 3180-4394

Authors: 
Michaela Kümpel
Michael Beetz

Responsible editor: 
Guest Editors Interactive SW 2022

Submission type: 
Tool/System Report
Abstract: 
The Web offers plenty of product information that is valuable for supporting customer decision processes. On the one hand, food product knowledge has been shown to be useful for consumer applications like recipe or dietary recommendation, for example. On the other hand, product knowledge can be extremely useful for applications in retail stores when it is linked to a semantic environment map. That way, it can be used to help customers find products they are searching for or highlight interesting product information like awarded labels or contained ingredients and allergens. Unfortunately, the Semantic Web lacks a semantically enhanced knowledge graph unifying non-food product information while the existing shopping applications lack a standardised connection to environment information. This work proposes NonFoodKG, an open-source non-food to food product knowledge graph integrating modular product information from the Web as well as accurate environment information from sensor data that can be customised for different applications and used devices. We describe the design process and modularity of the knowledge graph as well as example applications of it, including an Augmented Reality shopping assistant to highlight all products containing a given preference like an ingredient/ allergen or label and a dietary recommendation. We show how a routing application to find product destinations can easily be used by different agents like smartphone or robot. Thus, NonFoodKG can be seen as an example of one point of data access in digitisation of retail stores towards omnichannel applications. NonFoodKG links to environment information, which we create using the KnowRob knowledge processing framework. This connection enables agents to link product to action knowledge and thereby generalise the product information to be used in different domains while the modular ontologies enable personalisation of applications. 
NonFoodKG is publicly available and will be maintained and extended over time in order to facilitate various applications such as in the retail and household domain.
Tags: 
Reviewed

Decision/Status: 
Reject

Solicited Reviews:
Review #1
By Michael Brückner submitted on 27/Aug/2022
Suggestion:
Major Revision
Review Comment:

General. The paper is titled "NonFoodKG: A Non-Food to Food Product Knowledge Graph for Consumer Applications in Retail Environments" and proposes a knowledge graph (KG) employed by a semantic digital twin leading to single data access for food and non-food products in a brick-and-mortar store. The results of the KG can be used by a human via a digital device or, alternatively, by a robot guiding a human to products at a fixed location in the store. Data gathered through the KG include such product information as nutritional facts, food and non-food ingredients, and price.
The paper presents a work in progress in its infancy. It mentions many data sources, semantic applications, and data presentation tools (map building, AR), but it is not clear to the reader how these elements work together in reality. As an example, products have to be found by position in shelves, but what happens if a product has been misplaced? After all, the AR application seems to use markers to recognize product information. Another example is stock management. Is the stock updated by robot stacking/monitoring or by cashier information?
These examples lead to the general problem of which data from the environment (robot, stock) is actually shared with the digital twin and how. The authors should consider providing an overview of the data flow to/from the robot.
Originality. Digital twins of supermarkets have been known for some time, but the interconnection between food and non-food data is a promising development.

Significance of results. Unfortunately, I could not access any of the SPARQL example queries linked from the GitHub site. On p. 8, l. 2, ontology terms in German are mentioned. Does that mean the ontologies are multilingual? In the European context, this would certainly be a plus.

Quality of writing. Although the paper is easy to read, some flaws make it advisable to consult a native speaker to go over the usage of English.
Ex.: Abstract (l. 26)/p. 2, l. 7 "used devices" is most probably "devices used";
p. 3, l.4-5 Consider rewriting the sentence.
p. 7, l. 41: is comprised of -> comprises
p. 10, l. 31: shelve -> shelf

Review #2
Anonymous submitted on 12/Sep/2022
Suggestion:
Major Revision
Review Comment:

NonFoodKG: A Non-Food to Food Product Knowledge Graph for Consumer Applications in Retail Environments

This paper presents “NonFoodKG”, which the authors claim to be an open-source non-food to food product knowledge graph integrating modular product information from the Web. According to the authors, it also contains accurate environment information from sensor data that can be customised for different applications and used devices.

The reviewer finds this article interesting and suggests considering the following for further improvement:

1- The real contribution of this work should be highlighted earlier in the paper as concrete bullet points, as is done in Section 5 “Discussion”:
a. NonFoodKG offers: – A non-food product taxonomy that can be reused for applications in different stores. – A framework of modular ontologies that can be accessed for different consumer needs and applications. – Connection to existing ontologies like wikidata, FoodOn or ChEBI, allowing for further applications based on the contained knowledge. – Connection to exchangeable, standardised Digital Twin environment models, thereby enabling environment independent applications.

2- In its current form, the contribution appears to be more of an ontology/data source integration and mapping activity that leads to the development of a number of potential applications, as described in Section 4.

3- Related work is completely missing, i.e. what were the gaps in the current state of the art that led to the development of NonFoodKG? The authors cite a number of articles, but a dedicated “Related Work” section would be beneficial for examining the current state of several aspects of the contribution, for example “Ontologies”, “Web Resources”, “Current applications”, etc.

4- The current presentation and language make the paper difficult to follow, including shorthand phrases, e.g. page 3, line 23 “Section 5 concludes”, as well as cross-references in the writing at multiple places, e.g. page 2, line 27 “mentioned above a”. A thorough rewrite would greatly improve readability.

5- In Section 2.2 “Structured Web Information”, the authors present the list of linked data sources. The rationale for selecting certain datasets over others is missing. For example, why Drugbank over DailyMed, when both contain drug data? A full rationale for every selection is required.

6- The authors mention on page 4, line 26 that “We do not, however, use the owl:sameAs property but the oboInOwl:hasDbXref annotation property to interlink two data sources, as proposed by the gene ontology (GO) consortium. This is due to the fact that owl:sameAs increases file sizes since it creates duplicate entries while the oboInOwl:hasDbXref not only avoids duplicate entries but also looks more descriptive to a user browsing the ontologies using Protegè.” This does not seem to be a sufficient and legitimate argument for selecting the linking property, i.e. “oboInOwl:hasDbXref” over “owl:sameAs”. Moreover, were all the links generated using the “oboInOwl:hasDbXref” predicate, or were other properties also considered? A full list of the properties used to create the links should also be presented.

7- When links are created using properties that are available in two different data sources under different names or representations, there is also a chance of wrong links being created. What was the mechanism of link creation? Was it manual, automated, or semi-automated with the help of domain experts, and what are the evaluation criteria for checking whether all created links are correct?

8- Page 5, line 7: “For a more generalised approach we use Information Extraction techniques on online store sitemaps of different retail domains to automatically create a more general product taxonomy”. Which Information Extraction techniques were used, and how were they implemented?

9- It would be good to see the citations, URLs and common abbreviations of the ontologies in Table 1 as well, so that it can serve as a single point of reference for further use.

10- A number of ontologies are used, as mentioned in Section 3.2 and Table 1. The reviewer would like to see an additional column and/or a separate table listing all the properties used to link these ontologies with each other (if that is the case), or at least with NonFoodKG.

11- The list of candidate queries available at https://grlc.io/api/K4R-IAI/NonFoodKG/SPARQLfiles/#/ provides a good understanding of the data generated and stored. Where is the link to the endpoint itself, so that queries other than the provided ones can be run? Moreover, some of the available queries are not working and return errors, e.g. timeouts or wrong syntax. These need to be checked.

12- The biggest limitation of this work is the lack of any evaluation. An evaluation could be made or planned at multiple levels: at the link level (i.e. whether wrong links were generated), at the application level (i.e. analysing the efficiency, performance and accuracy of the different applications), and at the usability level (i.e. approaches like SUS).

13- The links for the technologies used in the development of the different applications are listed as follows, but where can the created applications be accessed or used in the real world?
a. Unity web requests library: https://docs.unity3d.com/Manual/UnityWebRequest.html
b. JSON serialization in Unity: https://docs.unity3d.com/Manual/JSONSerialization.html
c. SWI-Prolog: https://www.swi-prolog.org/
d. HoloLens application: https://github.com/michaelakuempel/HoloPreferenceDemo
e. https://microsoft.github.io/MixedRealityToolkit-Unity/Documentation/Spat...
f. Image targets with Vuforia: https://library.vuforia.com/features/images/image-targets.htm

14- There is a so-called sustainability plan in Section 5.1, which appears to be more of a promise than a plan when the authors state “and we expect to support and expand NonFoodKG actively for the next 3-8 years………”. There should be a concrete plan associated with such applications and research outcomes. This should include the different phases and steps taken to ensure the sustainability of the work.
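
Illustration for point 6: the following is a minimal sketch in plain Python, not the authors' pipeline, of why the file-size argument depends on how the links are processed. It assumes a toy set of triples with hypothetical identifiers (nfkg:Shampoo, nfkg:HairCare, wd:Q_EXAMPLE) and a naive, hypothetical helper materialise_same_as that copies statements across owl:sameAs links, as a reasoner that materialises entailments would. oboInOwl:hasDbXref is treated as a plain annotation that carries no such entailment.

```python
# Toy comparison of owl:sameAs and oboInOwl:hasDbXref links.
# All identifiers are hypothetical and serve only as an illustration.
SAME_AS = "owl:sameAs"
DB_XREF = "oboInOwl:hasDbXref"


def materialise_same_as(triples):
    """Naively copy statements across owl:sameAs links until a fixpoint is reached."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = {(s, o) for s, p, o in closed if p == SAME_AS}
        pairs |= {(o, s) for s, o in pairs}  # owl:sameAs is symmetric
        for a, b in pairs:
            for s, p, o in list(closed):
                if p == SAME_AS:
                    continue  # only copy ordinary statements, not the links themselves
                if s == a and (b, p, o) not in closed:
                    closed.add((b, p, o))
                    changed = True
                if o == a and (s, p, b) not in closed:
                    closed.add((s, p, b))
                    changed = True
    return closed


base = {
    ("nfkg:Shampoo", "rdfs:label", "shampoo"),
    ("nfkg:Shampoo", "rdfs:subClassOf", "nfkg:HairCare"),
    ("wd:Q_EXAMPLE", "rdfs:label", "Shampoo (Wikidata)"),
}
with_same_as = base | {("nfkg:Shampoo", SAME_AS, "wd:Q_EXAMPLE")}
with_xref = base | {("nfkg:Shampoo", DB_XREF, "wd:Q_EXAMPLE")}

print(len(with_same_as), len(materialise_same_as(with_same_as)))
print(len(with_xref), len(materialise_same_as(with_xref)))
```

In this toy case both asserted graphs contain four triples; only the materialised owl:sameAs graph grows, to seven. Whether such duplication actually occurs in NonFoodKG therefore depends on whether a reasoner or export step materialises the entailments, which the paper would need to state.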

Review #3
Anonymous submitted on 06/Oct/2022
Suggestion:
Major Revision
Review Comment:

The article describes a non-food to food product knowledge graph and its use in several consumer scenarios in a retail environment.

The topic and the application scenario are quite interesting and important to demonstrate the use of semantic and linked data technologies. The article presents a good overview of a comprehensive set of data, tools, and technologies that could be brought together to implement several advanced and interesting scenarios, including AR.

Yet, although the article clearly demonstrates a considerable technical effort to implement several interesting scenarios, the presentation and certain parts of the content are quite naive, especially when it comes to semantic technologies. The article does not really provide more insights than what the community already knows, and in several places it is very basic. The article needs to be consolidated considerably, more technical depth is needed regarding the use of semantics, and the findings need to be elaborated in depth.

Concretely:

- The problem specification and the description of the approach are very hard to follow and diluted; they are too high-level and need to focus on the actual problem, the role of the KG, and the tools provided

- The presentation of the article in general needs to be improved; much of the content seems to be scattered all over the article. (Other simple issues as well: paragraphs in the abstract, marking the approach and the problem separately rather than having a text that flows, etc.)

- There are several definitions and descriptions spread all over the article, such as Semantic Web, Linked Data, information extraction, and a simple description of a triple; these need to be explained briefly (maybe in a background section) and at a higher level, considering the venue the article is submitted to

- Datasets, ontologies, and tools developed (or reused) need to be clearly described in a structured way. For example, there are several ontologies mentioned without any information regarding where they came from. Some might be developed by the authors, but then it needs to be explained why existing vocabularies and ontologies are not re-used (partially or as a whole)

- The technical details regarding the use of semantic technologies are very brief, and it is therefore hard to see whether the whole thing goes beyond a toy example

- The application scenario is not convincing and does not demonstrate real adoption; it is rather prototypical. At the least, more details are required regarding the actual deployment (even if it is just experimental)

- No evaluation is provided, which could be in terms of a user study, performance study, etc.

- An essential contribution of such an article could be its lessons learned; however, no substantial insights are provided here to compensate for the lack of an evaluation

- A related work section would be needed to at least cover similar solutions (including those based on semantic technologies and those that are not)

- There are several strange claims throughout the article that are hard to understand; these need to be either explained or ironed out:

"Unfortunately, the Semantic Web lacks a semantically enhanced knowledge graph unifying non-food product information while the existing shopping applications lack a standardized connection to environment information"

"While these applications are impressive and a first step towards increased customer decision support, they often are too customized, meeting only certain consumer profiles"