Review Comment:
Summary
I thank the authors for answering my questions. In this paper, the authors propose using a many-sorted logic to formalize the semantics of qualifiers in Wikidata. In Section 1.1, they exemplify rules that apply to Wikidata statements without considering qualifiers, and then, in Section 1.2, they exemplify how these rules can be modified to also consider the information provided in statement qualifiers. Since Wikidata contains so many qualifiers (more than 9,000 according to the authors), they propose organizing them into five categories, each corresponding to a separate sort, so that qualifiers in the same category can be treated similarly. They describe some inference rules over these five sorts and some operations that can be applied to each sort. For example, the intersection of time ranges can be computed for the statement-context sort.
Originality
It is not easy for me to identify the main contribution of this paper, so I cannot say the paper is original. The general idea of using many-sorted relations to represent the semantics of statements and qualifiers does not seem novel. For instance, tracking provenance through rules is well known, so the provenance sort does not appear to add anything to the paper. Similarly, the intersection used to combine the V sort is not new, since the conjunction of literals in a rule implies the intersection of contexts.
Relevance
It is relevant because reasoning over qualifiers can improve our capacity to make inferences over Wikidata, or to validate the current data.
Quality
After the revision, I still see some major issues:
Issue 1: No translation from qualifiers to sorts
The definitions in Section 3 have improved with respect to the previous submission. It is now clear that a graph G is a set of statements of the form (s, p, v, {q₁:v₁, ..., qₙ:vₙ}), which are interpreted as terms of the form st(s, p, v, t₁, ..., tₖ), where for each i ∈ {1, ..., k}, tᵢ belongs to a sort Sᵢ. However, I do not see the specification of this translation for the concrete case of Wikidata, which includes so many qualifiers. The specification is too abstract if we want to solve concrete problems that require inference. I would have been satisfied even if such a translation had been described for a small number of relevant qualifiers, as sketched below.
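For concreteness, this is only my guess at what such a translation could look like, restricted to the two temporal qualifiers discussed under Issue 2; all function and qualifier names here are hypothetical and not taken from the paper:

    def translate_statement(s, p, v, qualifiers):
        # Map the temporal qualifiers of a Wikidata statement to a term
        # of the validity sort V; all other qualifiers are ignored here.
        start = qualifiers.get("start time", "undefined")
        end = qualifiers.get("end time", "undefined")
        return ("st", s, p, v, ("interval", start, end))

A handful of such clauses, one per qualifier category, would already make the specification concrete enough to check.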
Issue 2: SomeValue
Even if this function from statements to terms were given, and we considered only terms, it is unclear how inference is performed, because the sets defined by the sorts are not fully specified. To see this, let us consider only the sort V, restricted to the temporal qualifiers "start time" and "end time" in Table 1. The sort V then contains only the values defined as follows:
V :=
emptyValidity() |
timeValidity(time) |
setTime(V, time) |
inter(V, V)
where time denotes the ordered set of times extended with additional values that encode NoValue and SomeValue, which can be given in qualifiers, as stated in lines 3-4 and 13-14 on page 9. Since the interpretation function from statements to terms is not given, I wonder how NoValue and SomeValue are translated. I expect that the statements
S₁ := (a, p, b, {start time: 2022, end time: NoValue}),
S₂ := (b, p, c, {start time: 2021, end time: SomeValue}),
are respectively interpreted as the terms
T₁ := (a, p, b, I₁),
T₂ := (b, p, c, I₂),
where
I₁ := interval(2022, NoValue),
I₂ := interval(2021, SomeValue).
By this I mean that (a, p, b) is valid at every point in time since 2022, and (b, p, c) is valid from 2021 until some end point that exists but is unknown. From this semantics for terms T₁ and T₂, we can conclude that intervals I₁ and I₂ are not mutually contained. That is, includes(I₁, I₂) is false because interval I₁ includes all points in time since 2022, whereas interval I₂ does not, since there exists an unknown point in time t such that times after t are not included in I₂.
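To make this reading concrete, here is a minimal sketch of the inclusion check; the encoding is mine, not the authors', with NoValue as an end point read as "no end, i.e., forever" and SomeValue as an unknown but existing finite end point:

    import math

    NOVALUE, SOMEVALUE = "NoValue", "SomeValue"

    def includes(i, j):
        # Is interval i contained in interval j? Intervals are pairs
        # (start, end); starts are numbers, ends are numbers, NOVALUE
        # (holds forever) or SOMEVALUE (unknown finite bound).
        (s1, e1), (s2, e2) = i, j
        if e1 == NOVALUE and e2 == SOMEVALUE:
            # i extends forever, but j ends at some unknown finite
            # time, so i cannot be contained in j.
            return False
        if SOMEVALUE in (e1, e2):
            return None  # in the remaining cases SomeValue leaves the answer open
        end1 = math.inf if e1 == NOVALUE else e1
        end2 = math.inf if e2 == NOVALUE else e2
        return s2 <= s1 and end1 <= end2

    print(includes((2022, NOVALUE), (2021, SOMEVALUE)))  # False, as argued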
Values NoValue and SomeValue do not appear in Table 1, in the rules in Appendix A, or in the code in Appendix B. Instead, the value undefined is used in Appendix A and Appendix B. I can therefore imagine that statements S₁ and S₂ are encoded as the following terms
T₃ := (a, p, b, I₃),
T₄ := (b, p, c, I₄),
where
I₃ := interval(2022, undefined),
I₄ := interval(2021, undefined).
However, in lines 37-39 on page 9, it is said that undefined means the statement is valid everywhere. Under this "everywhere" semantics for undefined, interval I₃ is included in interval I₄.
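Under that reading, the containment check gives the opposite answer; continuing the sketch above (again, the encoding is mine):

    def includes_everywhere(i, j):
        # Same containment check, but "undefined" is read as an
        # unbounded end point ("valid everywhere" from the start on).
        (s1, e1), (s2, e2) = i, j
        end1 = math.inf if e1 == "undefined" else e1
        end2 = math.inf if e2 == "undefined" else e2
        return s2 <= s1 and end1 <= end2

    print(includes_everywhere((2022, "undefined"), (2021, "undefined")))
    # True: [2022, inf) is contained in [2021, inf)

The two readings of a missing end point are thus incompatible, and the paper does not say which one is intended.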
It is therefore not clear that the proposed method supports the SomeValue semantics, as is claimed on page 9. Furthermore, consider the following two time intervals:
I₅ := interval(SomeValue, 2022),
I₆ := interval(2010, SomeValue).
What is the expected intersection of these two time contexts? We cannot know: it may be empty, or it may contain some points in time. Hence, SomeValue introduces non-determinism that is not considered in this manuscript and that has been studied extensively in the literature on relational databases with existential null values.
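To see the non-determinism concretely, instantiate the unknown bounds in two different ways (the encoding is again mine):

    def inter(i, j):
        # Deterministic intersection, defined only for concrete bounds.
        (s1, e1), (s2, e2) = i, j
        s, e = max(s1, s2), min(e1, e2)
        return (s, e) if s <= e else None  # None encodes the empty context

    # Two instantiations of I₅ = interval(SomeValue, 2022)
    # and I₆ = interval(2010, SomeValue):
    print(inter((2000, 2022), (2010, 2030)))  # (2010, 2022): non-empty
    print(inter((2021, 2022), (2010, 2015)))  # None: empty

No single deterministic term of sort V can stand for both outcomes.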
Section 3.2 says that SomeValue and NoValue are translated to terms with predicates sno and ssome. However, no rules are provided for these terms.
In the previous submission, I assumed a simpler semantics for contexts in which restrictions can always be determined. Introducing SomeValue leads to problems that, I think, are not properly considered. I would suggest not introducing SomeValue into the reasoning; otherwise, you should explain how you will handle these issues.
Issue 3: Many-sorted logic
The proposed method uses many-sorted logic to define representations of statements. There are two citations regarding the notion of many-sortedness: [4] (line 8, page 7) and [6] (line 14, page 8).
In [4], sorts determine the ranges of variables. Sorts can be Animal, Wolf, and Fox, where the sorts Wolf and Fox are subsumed by the sort Animal, and the sorts Wolf and Fox are disjoint. Sorts are useful in logic because they reduce the search space during inference. Indeed, if we know that x is a Wolf, y is an Animal, and z is a Fox, then variables x and y can be unified, but variables x and z cannot, as illustrated below. In this manuscript, sorts do not subsume other sorts, so I do not understand how the formalism in this paper relates to the one in [4].
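To illustrate why subsumption matters, here is a toy version of the pruning check (my own illustration, not taken from [4]):

    SUBSUMED_BY = {"Wolf": {"Animal"}, "Fox": {"Animal"}, "Animal": set()}

    def can_unify(sort1, sort2):
        # Two sorted variables can unify iff their sorts overlap,
        # i.e., one subsumes the other in this simple hierarchy.
        return (sort1 == sort2
                or sort2 in SUBSUMED_BY[sort1]
                or sort1 in SUBSUMED_BY[sort2])

    print(can_unify("Wolf", "Animal"))  # True: x and y can be unified
    print(can_unify("Wolf", "Fox"))     # False: this branch is pruned

Without a subsumption relation between the paper's five sorts, this pruning benefit does not apply.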
In [6], the notion of many-sortedness is used in several contexts (e.g., many-sorted sets, many-sorted relations, and many-sorted functions). I would have expected the operations intersection and union to be defined as many-sorted operations over each sort. However, the operation inter only appears in sort V, and the operation union only appears in sorts C and P. Hence, it is not clear to me whether this many-sorted notion is really used.
Hence, I do not see a clear connection between many-sorted logic and the proposal.
Issue 4: Comprehensiveness
The authors provide some rules, but their comprehensiveness is not shown.
From the theoretical perspective, I would expect to read a comparison with existing formalisms (e.g., [13]) regarding expressive power. I would also expect the rules to be defined following some rationale. For example, some rules arise from extending the rules called "ontological" so that they consider qualifiers. These rules follow a rationale guided by the question: how can a known set of rules be extended to consider qualifiers? If this were the guiding question, I would be satisfied with the story, and the rules would be comprehensive because they cover a well-known set of rules. However, if you consider additional rules, then it is not clear why these particular rules are described and other possible rules are not, and the set of rules is no longer comprehensive.
I also miss a semantics for the data from which the rules could be justified. As it stands, the paper presents rules and examples that we can only follow intuitively to check whether the rules are sensible.
From the practical perspective, I would expect to see an evaluation of how many inferences these rules enable, or of the support and confidence of the proposed rules in Wikidata. Such an evaluation is missing from the paper.
I would have been happy with either a theoretical or a practical approach to showing the comprehensiveness of the proposed rules.
Conclusion
Although the topic is relevant, I would not recommend the paper in its current form. The main issue I see is the lack of demonstrated comprehensiveness. This could be addressed either by providing a theoretical evaluation or by providing a practical evaluation that shows how these rules can be used for a concrete task.