BimSPARQL: Domain-specific functional SPARQL extensions for querying RDF building data

Tracking #: 1720-2932

Authors: 
Chi Zhang
Jakob Beetz

Responsible editor: 
Guest Editors ST Built Environment 2017

Submission type: 
Full Paper
Abstract: 
In this paper, we propose to extend SPARQL functions for querying Industry Foundation Classes (IFC) building data. The official IFC documentation and BIM requirement checking use cases are used to drive the development of the proposed functionality. By extending these functions, we aim to 1) simplify writing queries and 2) enhance query abilities to retrieve useful information implied in 3D geometry data according to requirement checking use cases. Extended functions are modelled as RDF vocabularies and classified into groups for further extensions. We combine declarative rules with procedural programming to implement extended functions. Real use cases and building models are used to demonstrate the value of this approach and indicate query performance. Compared with query techniques developed in the conventional Building Information Modeling domain, we show the added value of such an approach by providing an extended application example of querying building and regulatory data, where spatial and logic reasoning can be applied and data from multiple sources are required. Based on the development, we discuss the applicability of the proposed approach, current issues and future challenges.
Tags: 
Reviewed

Decision/Status: 
Minor Revision

Solicited Reviews:
Review #1
Anonymous submitted on 15/Oct/2017
Suggestion:
Accept
Review Comment:

I have reviewed the previous version of this submission and already then I considered it acceptable for publication.

This new version has further improved from the previous one in several areas:
- the title is better
- there is more detailed explanation of
  - how geometrical information is managed,
  - how extensions are implemented in SPIN, and
  - what is the nature of the implementation.

Moreover, there is a new evaluation of the approach with rules related to building code checking, both with respect to performance and rule size.

The paper is accompanied by an open source implementation (in GitHub) and a repository of datasets mentioned in evaluation. I consider this a valuable contribution and overall a very valuable system.

The contributions are original and extremely significant in the area of Linked Building Data. The quality of writing is good. This is excellent work overall.

While reading, I spotted the following typos/issues:
- 1. "Closed World Assumption (OWA) "
- 3.3. "that can relatively easy to reuse"
- 4.1. "anther"
- 4.3. "compare the height of a wall derived from its geometric representation with its height quantity with a tolerance value" (complex sentence, rephrase)
- 8.3 "[?]" (unknown reference)

Review #2
Anonymous submitted on 15/Oct/2017
Suggestion:
Minor Revision
Review Comment:

I thank the authors for vastly improving the paper and addressing most of the comments made by the reviewers. The paper’s contribution is original, and the authors have refined the paper to strengthen or clarify some of the claims. The quality of writing has also been improved. As for the significance of the results, this version of the paper does indicate what the paper tries to improve in the state of the art, though the future work could still be expanded to refer to some of the limitations. Identifying future work could benefit the community and tackle problems that the authors face but do not necessarily want to work on themselves.

Reading through the paper, I have the following (minor) comments.

Abstract. What are “query abilities”? I still have a problem with “real use cases”; wouldn't “realistic scenarios put forward by the community” be better? Alternatively, the authors could state in which sources/communities those scenarios/use cases were identified. In the last sentence, the authors mention the development of something, but it’s not clear what.

Section 1. The authors could maybe add a footnote to “plain SPARQL” indicating that they refer to SPARQL queries that are compliant with the SPARQL 1.1 Recommendation. Figure 1 is not clear; what do the arrows mean? Be consistent referring to floats; the authors used “Fig. 1” in Section 1, and “Figure 2” in the second section. I believe references to figures should use “Fig.”.

Section 2. In the last paragraph, you mention “standard language”. SPIN has never been a standard. Also, with SHACL becoming a W3C Recommendation, the authors may want to say something in Future work about how another prototype implementation could adopt SHACL instead of SPIN (see: http://spinrdf.org/spin-shacl.html).

Section 3.3. Triplestore(s) in one word instead of two. A reference to the Parliament triplestore is missing.

Section 5. In the caption of Listing 10, it should be “TURTLE” instead of “Turtle”.

Section 6. Restate what you mean by “real-world”. At the beginning of this section, be explicit and mention what you are evaluating or validating (the two have different semantics).

In Section 7, the authors could restate that possible extensions are manifold, that an exhaustive list of functions is practically impossible, and that Section 7 therefore aims to demonstrate how one could extend BimSPARQL for custom/specific scenarios. The authors could also indicate here that, while allowing (to some extent portable) extensions for specific cases is already a step forward, the knowledge/rule/ontology engineering process is a non-trivial problem that would require appropriate methods and tools, which are not within the scope of this paper and may be the subject of future work.

Section 8. Something is wrong with the reference in the last paragraph.

Throughout the paper, any mention of “section X” should be replaced by “Section X”.

I would encourage the authors to actually put in the paper some of the clarifications and comments they have provided to the reviewers in their letter.

For instance, the authors acknowledge that the extended rules are only portable in terms of the data model (RDF) and the rule engine (SPIN), and that these have to be installed on each BimSPARQL-enabled endpoint in order to function. The authors also acknowledged the knowledge engineering activities that are required, as I have indicated above, but these could be stated more explicitly in the paper. Because the authors allow one to extend functions using SPIN and “arbitrary programming”, the whole problem of ensuring tractability, correctness, … can be shifted to appropriate methods and tools for those knowledge engineering activities.

The authors used a particular representation for geometries in Section 4.3. The authors, as they pointed out in their letter, should emphasize that the same representation should be used by other implementations in order to facilitate interoperability and to obtain the same results across implementations. While they currently focus on triangulated boundary representations, I can understand that this can change. Yet the argument can still be made that, at the end of the day, a consensus about a representation has to be reached, especially if one wishes to put forward BimSPARQL as a standardized extension of SPARQL.

In Section 4.4.1, the authors can, maybe again, explicitly state that their aim was not to provide full coverage of possible functions, but to provide a basis that one can extend. The authors could also elaborate on whether they looked into other initiatives when choosing their 6 functions (e.g., OGC standards for commonly used functions).

Review #3
Anonymous submitted on 27/Oct/2017
Suggestion:
Minor Revision
Review Comment:

This paper presents a research perspective that aims to ease the querying of IFC files stored in an ifcOWL ontology by means of a set of SPARQL functions. These functions are organized in three main parts: semantic level, spatial level and geometric level. A considerable improvement has been made compared with the previous submission. Overall the article is clear, understandable and contains many listings and examples which illustrate the proposal. Some minor mistakes are still present in this paper. For example, in the introduction the definition of IFC is unclear: data model, metamodel or schema. Please choose only one. Could you please state clearly that you deal only with IFC 2X3? This is not clear either. In the introduction you talk about the Closed World Assumption. The corresponding acronym is CWA and not OWA (O for Open). I don’t really understand the sentence “as many… underlying metamodels”. Could you please clarify it? Also in the introduction, you talk about many buildings stored in the ontology. Do you have one repository for all the corresponding IFC files or one repository for each? I think that the scalability results change greatly depending on this choice. In the background section, I don’t understand the sentence “at the same time…. [52]”. What do you mean? If the IFC file doesn’t contain all the data required for the AEC process, how can problems of redundancy or ambiguity occur with data that are not present? At the end of the paper a reference is missing, represented by [?] (top of page 22, before the conclusion section).
Regarding the core of the proposal, I agree with the shortcuts developed at the semantic level. This idea is not new and has already been presented in other works. I have more of a problem with the spatial functions. Maybe this is because, in my mind, “spatial” refers to geomatics data (and geographical information systems), which are not defined in the IFC standard. In IFC, the origin is (0,0,0) and all newly created objects are referenced relative to these coordinates. Spatial and topological calculations are difficult to do and I don’t understand why you want to do them in an ontology. I really appreciate your vision but I’m not sure that the geometric functions you describe (for example the detection of a possible intersection of two 3D objects) must be performed in an ontology. It’s a philosophical point of view. It is possible to extract the 3D data and the semantics separately from the IFC and combine an ontology for semantic reasoning with a 3D checker for geometric overlap detection. The ontology will help to understand whether this overlap is coherent in the real world or not.
Independently of this point, the paper underlines the complexity of querying the IFC files contained in the ifcOWL ontology. The effort made by the authors to overcome this limitation represents good research work and generates interest and discussion. Therefore, with the minor changes described in the first paragraph, I accept the paper for publication in SWJ.


Comments

p.13 Table 7 -> s/betwteen/between/