Review Comment:
Overall evaluation
The topic of the paper is ‘hot’ and its content is definitely worth publishing. However, mainly due to presentation issues, it must be significantly revised. The structure of the paper could be improved, and the language (syntax/grammar) is also a weak point that must be fixed. Regarding the survey methodology followed, there are a few points that need attention (pls see detailed comments). Finally, examples are missing from the paper; a few insightful examples on AI bias and bias in KGs would certainly improve readability and comprehension.
According to SWJ review criteria for survey articles, the following is my view and understanding:
(1) The paper is suitable as an introductory text, targeted at researchers, PhD students, or practitioners, to get started on the covered topic. However, insightful examples are missing, and language problems must be fixed.
(2) The presentation and coverage of the topic is comprehensive and balanced, but it can be significantly improved.
(3) In terms of presentation, the readability and clarity of the paper can be significantly improved.
(4) There is no doubt that the paper is of high importance to the broader Semantic Web community.
I would like to thank the authors for contributing this survey to the SW/AI research community.
Detailed Comments
Title: The use of 'for' in the title is somewhat vague or misleading. IMHO, 'and' would better reflect the interrelation of SW and AI bias.
Abstract: Revision is needed according to detailed comments.
Page 1:
- Pls avoid citing sources in your abstract.
- “to bringing solution…” -> 'to solve the problem of ...', but which problem specifically? Bias or data validity?
- ‘fulfil’ -> 'bridge' might be a better term here
- ‘…, there exists no previous work to bring together bias and semantics’ -> so, what about 'bias in KGs'? is this not related? There are plenty of existing works that are related to this topic. Also, I do recall a paper (Debiasing Knowledge Graphs: Why Female Presidents are not like Female Popes) by Janowicz et al. 2018 about this topic, especially linking bias at schema, data and reasoning level. Page 16 (Bias within semantic resources) is also referring to related works that do 'bring together bias and semantics'. So, I guess this statement should be corrected somehow.
- ‘…types and sources of bias addressed with…’ -> bias itself? You probably mean 'bias assessment' or/and 'bias mitigation' or 'bias interpretation' (to be accurate), or something else, e.g., bias representation?
-‘… improve frequent limitations in AI systems.’ -> such as? pls give an example here
- ‘…We find works…’ -> 'research works', maybe?
Keywords
- ‘conceptual semantics’ -> I would prefer the term 'conceptual model' or 'semantic model' or just 'semantics', since 'conceptual semantics' has been used in the past (mainly) by Ray Jackendoff as a framework for semantic analysis. Anyway, you also use this term (semantics) in several places in the paper.
1. Introduction: Please revise according to the detailed comments. Also, please provide examples of biases where suitable, so that readers can get a first idea of what you are presenting in this paper (what you are trying to tackle). Examples are missing! (Note: A few good examples of KG bias at different levels are presented in Janowicz et al. 2018 (Debiasing KGs...).)
Page1:
Line 34: ’However, other factors coming from the humans’ -> which ones? Pls be specific.
Line 39: ‘to fill in existing gaps of AI systems,’ -> which are? Pls be precise.
Page2:
Line 5: ’conceptualisation that accounts for these dimensions.’ -> which dimensions?
Line 15: ‘for developing solutions to different biases’ -> 'solutions to bias' must be explained further, i.e., are these 'bias assessment solutions', 'bias mitigation solutions', 'bias interpretation solutions', or something else?
Line 25: ‘literature of semantics and bias.’ -> pls use the term ‘AI bias’ for precision
Line 33 and 34: ‘Section 6 is’, ‘Section 7 are’ -> pls rephrase and correct syntax/grammar
2. Background of conceptual semantics: Pls revise according to the specific details. Overall, this section should be enriched/extended with more background knowledge, beyond semantics. One such extension is the 'AI bias' topic. What is AI bias? How is it assessed? How is it mitigated? What is bias interpretation/explanation? What is fair AI?
Page 2:
Line 4: ‘SKOS’ -> As you know, SKOS is the technology to represent taxonomies, but an example of a taxonomy would be useful here also e.g., Yahoo taxonomy, AGROVOC taxonomy,…
Line 13: ‘Ontologies.’ -> Example ontologies are missing. Also, an additional sentence or two about the different types of ontologies would be insightful for readers.
Line 15-17: ‘conceptualisation that is characterised by high semantic expressiveness required for increased complexity.’ -> but what about lightweight ontologies? These are ontologies as well... I suggest just keeping Gruber's definition (or using one of the more recent ones).
Line 22: ‘A knowledge graph (KG) is based on’ -> why 'based on' and not 'is...'? There are many definitions of KGs today to cite here.
Line 32: ‘Linked Data.’ -> This is a weak description, mainly because of the argument I have provided on the use of the terms KB, KG and LD earlier in this section. If you keep it, I do not see how you can avoid mentioning the RDF paradigm. Also, examples must be provided, as done for the other types of 'conceptual semantics'. By the way, it seems that a paragraph on KB is missing (I guess because you understand its similarity to a KG, or because of its low importance in this context).
Line 36: ‘Many data objects’ -> I do not understand the use of this term and how it can 'fit' in the provided definitions you mention (which definitions, by the way?). I believe this paragraph needs elaboration or/and disambiguation.
Line 44,45: ‘ knowledge bases, knowledge graphs, and the linked data.’ -> I understand how and why you provide this distinction; however, pls note that a KG may be considered a KB, and a KG may be considered (a representation of) linked data. In my view, I would just use the term KG. Anyhow, KG is the key SW technology with respect to AI bias (as you also conclude in this paper).
Line 46-48: ‘There exists…[20]’ -> pls rephrase, not correct syntax
3. Survey Methodology: Pls revise according to the detailed comments. Also, pls consider extending the reviewed papers list with papers published in 2021 (first semester) due to the highly dynamic aspect of the topic.
Page 3:
Line 4: ‘this study is bias’ -> pls refer to 'AI bias' not just bias (all sections in the paper) to distinguish from other types of biases
Line 5,6: -> 'investigate' is used twice in the sentence, pls rephrase
Line 6: ‘the utility’ -> pls replace with ‘use’
Line 8: ‘which type of biases ‘ -> 'types of bias'
Line 10: ‘proposals and literature reviews)’ -> but why? Isn't this what you also have prepared to discuss the topic? If your review paper (when published) is excluded from future research, then all this significant knowledge you have provided will be ignored.
Line 11,12: ‘Finally, we explore how bias manifest to identify key challenges in AI’-> pls disambiguate/rephrase
Line 15: ‘Search string and’ -> ‘Search keywords’
Line 21: ‘Google Scholar’ -> What about Semantic Scholar?
Line 34: ‘3.3. Synthesis of the results’ -> Either elaborate/explain more this subsection or remove it (as it is now it does not contribute any important information in this section).
Line 43: ‘between 2010 and 2020.’ -> Since the topic (AI Bias and KGs) has received a lot of attention very recently, there are a few very promising related works already published in 2021 (mostly on arXiv.org as preprints). Pls have a look there also and extend your list of related works.
Line 45: ‘as part of conference proceedings or workshop,’ -> Could you name a representative list? For instance, I trust that at least AAAI, IJCAI, ECAI, KR, ISWC and ESWC events (last 5 years) are included.
4. Dimensions of analysis: Pls revise according to the detailed comments. I would change the title to 'Analysis approach'.
Page 3:
Line 49: ‘Section 4.2 defines different categories of bias according to its type’ -> this is somewhat redundant (categories=types), pls rephrase
Page 4:
Line 1: ‘ 4.2 Bias in AI categorisation’ -> Perhaps this subsection can be moved to Section 2 (Background knowledge), along with the knowledge on semantics.
Line 4: ‘definition as [4] of bias’ -> pls check syntax
Line 6-10: why in italics?
Line 19: ‘the bias types,’ -> ‘types of AI bias’
Line 21: ‘define their nature’ -> bias->its nature; biases-> their nature
Line 26: ‘4.1. Dimensions of conceptual semantic tasks’ -> The title of this subsection is somewhat misleading. Perhaps a better title is "AI bias and semantics".
Line 28: ‘main group of works’ -> 'groups of work'
Line 31: ‘Identifying bias’ -> Is this equal to 'assessing bias'? In any case (yes or no) pls state their relation. Also, replace ‘discover’ with ‘identify’ for consistency.
Line 36: At the end of paragraph, Pls provide an insightful example. In general, examples are missing from the paper. Please consider adding at least a few insightful ones related to the different types of bias.
Line 50: ‘based on a holdout set’ -> ??? pls explain
Page 5:
Line 1 to line 11 (left column): these two paragraphs do not seem relevant to this subsection (2.2.2), pls check.
Line 10 and 18: Table 2 and 3 captions, please correct syntax (papers do not have bias type or origin…)
Line 12: ‘4.2.3. Bias impact’ -> the title refers to bias impact, but the content of this section presents bias types e.g., population bias, behavioral bias, etc.
Also, the use of the word 'as' in several places in this subsection seems problematic in terms of syntax/grammar (and eventually meaning). Could you please check and disambiguate/fix?
Line 16: ‘challenges in some of the papers.’ -> which ones? Pls be specific (citations)
Line 30: ‘Two other categories’ -> which ones?
5. Analysis of results: Pls revise according to detailed comments. Not every syntax/grammar (language) issue is highlighted beyond this point (please ensure you check this and following section for related issues).
Page 5:
Line 42: ‘RS are aimed to discover’ -> syntax, pls rephrase (RS aim to recommend...)
Line 49: ‘specific methodology examples’ -> 'methodological'
Line 50: ‘could help extrapolate them’ -> pls use a synonym
Page 6:
Line 17: ‘ML groups works’ -> pls correct syntax and add references.
Line 19: ‘5.1.1.1. Bias at source (functional bias)’ -> four levels of subsections should be avoided for clarity and presentation reasons. Please consider re-structuring section 5. Perhaps a taxonomy or even a KG that represents the structure of the concepts presented in section 5 would be helpful.
Line 28: ‘NLP is used to comprise’ -> syntax
Line 34: ‘Finally, we refer as intelligence’ -> syntax
Line 45: ‘activity to group two groups’ -> syntax
Line 36: ‘and display of information’ -> ‘… and presentation of information’
Line 39: ‘Then, we introduce different AI system problems and’ -> syntax
Line 44-48: 5.1.1. -> this definition has already been provided (earlier section). Same for other subsections (5.1.2, 5.1.3)
Page 7:
Line 24: ‘user’s preferences to only the items mentioned’ -> syntax
Line 34: ‘word detectors to only the captions’ -> syntax
Page 8:
Line 17-19: syntax
Page 10:
Line 1: ‘5.2. Bias impact and use of semantics’ -> The most important section so far. In general, it is well written, however, a few points must be further elaborated, and arguments must be justified. Pls revise according to the detailed comments.
Line 7: ‘AI systems (Table 4). We only include in Table 4’ -> why did you choose this approach? Pls justify and argue for it. It seems (from the citations) that a number of related works have not been examined.
Line 11: ‘We select the most appropriate papers’ -> what does 'appropriate' mean here? what is the criterion (or criteria) for this selection?
Line 19: ‘and therefore can be more representative’ -> not sure?
Line 28: ‘in [36] is based a’ -> ‘based on’
Line 48: ‘We find’ -> pls rephrase e.g., A number of semantic approaches capturing bias… have been researched (pls correct this in all occurrences in the paper)
Page 11:
Line 1: ‘They proof’ -> ‘They prove’
6. Discussion: This section is the strong point of this paper. However, pls check for language problems (there are quite a few) and fix appropriately.
Page 12:
Line 45: ‘at different stages of the AI pipeline,’ -> perhaps 'AI system/app pipeline'
Line 50: ‘Section 5 to bring to a discussion’ -> syntax
Page 14:
Line 6: ‘We see this in [4] and many other works,’ -> pls be specific (citations)
Line 41: ‘6.2’ title (and 6.3) -> pls change to a non-questioning title
Line 44: ‘To discuss with more’ -> discuss in
Page 16:
Line 31: ‘Bias within semantic resources.’ -> This topic is given little attention (space) in the paper; however, it seems to be (perhaps) the silver bullet in addressing AI bias using semantics, i.e., KGs. If the (external or internal) KGs used are already biased, the rest of the process (identifying, mitigating, assessing bias) in AI models will be affected (it will include the bias already encoded in the KGs), a kind of bias-propagation effect.
Line 38: ‘bias can seek into’ -> ‘bias can be found’ perhaps? I might be wrong here, but in any case, the sentence needs disambiguation
Page 17:
Line 1: ‘However, this survey succeeds…’ -> pls remove ‘However’ and replace ‘succeeds’ with ‘aims to’
Line 4: ‘to help’ -> ‘to assist’
Line 10: ‘sense’ -> ‘meaning’
7. Conclusion
Page 17:
Line 17: ‘conceptual semantics to alleviate bias in AI.’ -> 'semantics' (since it has been used like this in the paper most of the time). Also, replace ‘alleviate’ with 'address'.
Line 30-32: ‘it. Comparing semantic methodologies to other state-of-the-art bias mitigation approaches,’ -> where is this comparison presented in the paper? I might be wrong, but pls check and address appropriately.
Line 37: ‘use in the recent years of semantics,’ -> 'use of semantics in the recent years'
Line 38: ‘semantics are helpful to address’ -> 'are helpful in addressing' or 'help to address'