OntoBroker - Mature and approved semantic middleware

Paper Title: 
OntoBroker - Mature and approved semantic middleware
Authors: 
Jürgen Angele
Abstract: 
OntoBroker provides a comprehensive, scalable and high-performance Semantic Web middleware. It supports all of the W3C Semantic Web recommendations for ontology languages and query languages. It is an ontology repository that includes a high-performance deductive reasoning engine. Reasoning with rules in particular is a major unique selling point for ontoprise. OntoBroker integrates a connector framework which makes it easy to connect a multitude of data sources such as databases and web services. It thus combines structured and unstructured data in one framework. OntoBroker is easy to extend and to integrate into existing IT landscapes and applications, as it offers a variety of open interfaces. OntoBroker is also closely connected to ontoprise’s ontology modeling environment OntoStudio, the development environment for handling ontologies, mappings to information sources, rules, generating queries, creating business intelligence reports etc. At many customers OntoBroker serves as a common semantic layer which is accessed by various applications and integrates different information sources. OntoBroker is the run-time environment for industrial solutions like SemanticGuide, SemanticXpress, and SemanticIntegrator. As part of those solutions, thousands of installations are now in production use.
Submission type: 
Tool/System Report
Responsible editor: 
Rinke Hoekstra
Decision/Status: 
Accept
Reviews: 

This is a revised manuscript, now accepted for publication, following an "accept with minor revisions." The reviews of previous rounds are below.

Reviews for the resubmission:

Solicited review by Jacopo Urbani

The quality of the paper has improved and most of the concerns were
addressed. Unfortunately, I cannot argue for a complete acceptance
since I couldn't find in the new paper the two figures about the
openrule benchmark on multicores and load performance that are
mentioned in the rebuttal letter.

Even though the text is more detailed on these two issues, the two
mentioned figures are supposed to answer my concerns about the
multi-core performance and loading time, so I would like this issue to
be clarified before proceeding to a further stage.

Some minor comments that appeared in the new version:

- abstract:
"makes it easy do connect" -> to connect

- It would be clearer if there were a reference at page 6 to the
benchmarks in 2.2. Now the reference appears only in section 2.2, where it
is not so useful since the topic there is different; it would have enriched
the explanation if placed in sec. 2.1.

- page 12: reference to dbpedia is missing

- page 14: (missed from the previous revision) a citation or a reference
to the competition and the prize awarded to OntoBroker should be added.

Solicited review by anonymous reviewer:

In my first review I strenuously objected to the highly misleading use of
the so called "ObjectLogic" as a new kind of invention. Unfortunately, in
the revised paper the author persists and piles up even more gibberish on
this matter.

First, ObjectLogic (as well as F-logic2 mentioned in the paper) is nothing
but a commercial gimmick, which has no place in a serious scientific
journal. None of this is original or belongs intellectually to the author
of that paper.

Second, the very term Object Logic (or O-logic) was introduced by Maier
almost 25 years ago and then further developed by Kifer and Wu. Object
Logic is not an extension of F-logic, but a subset of it.

Third, the author now claims that ObjectLogic is an extension of F-logic
with a "lot of modeling features", like property hierarchies, complex logic
formulas, and builtins. In reality, all these issues are completely trivial
and have been discussed in
http://forum.projects.semwebcentral.org/forum-syntax.html
and various other papers, including the original 1995 paper on F-logic.
The mention of builtins as a "language extension" is particularly amusing.
The author does not seem to have the vaguest idea of what constitutes a
credible language extension.

Claiming credit for trivialities and unoriginal ideas can hardly enhance
the reputation of the author.

Finally, I may not have noticed this before, but the title of the paper is
both hilarious and unbecoming, fitting well with the general tendency of
unrestrained self-praise permeating the paper. "Mature and approved
... middleware"??? Approved? By whom?

I do not think that a self-respecting journal can afford publishing a paper
with such stylistic and integrity flaws. The author must either eliminate
the serious issues that I and other reviewers pointed out or face rejection.

Reviews for the original submission:

Solicited review by Jacopo Urbani

The paper presents the tool "OntoBroker" which is a generic middleware
for storage and processing of RDF data. The paper describes the
system and highlights its features and performance.

The paper describes the product (OntoBroker) in a clear and concise
way. The scientific output of the paper is small, but since the paper
is under the Tools track this is not an issue. The features are
presented in detail but the paper requires an overall improvement.

The main problem with the paper is that the performance analysis is
minimal. One of the selling points of this product is that it can be
used in many different scenarios (which are presented at the end of the
paper), but only the tool's performance with the Open Rule Benchmark is
presented. It is not explained how the system performs in all the
other cases, and such an explanation would make the work more
credible, since currently many statements in the paper are not
supported by any numbers (see below for a list of them).

It is not clear whether the system has some limitations (that is one
of the criteria to evaluate this paper). In order to make the paper a
more complete description of the system, I suggest that the author(s)
add a section describing where the system doesn't work well, or works
worse than the others (if there is such a case, of course; in any case,
more explanation on this is needed).

In general I argue that this paper should be accepted, but only under
the condition that the problems described below are fixed.

- The abstract is missing and it should be added in the final version of
the paper.

- The quality of the pictures is low and it is difficult to read them.

- Page 3: there is a question mark that I don't know the meaning of.

- ObjectLogic should be introduced and properly explained since it is a
fundamental part of the system.

- Page 5: topdownto use (should be split)

- Page 5: EDB is used but not introduced.

- Page 6: "In general, it can be said that dynamic filtering beats
magic sets". This is an example of a statement that should be
supported by some numbers. Have you run experiments to conclude
this? Do you refer to existing work in the literature? In both cases,
more explanation is needed.

- Page 9: "This (the architecture) supports multi-core/multi-processor
hardware extremely well". Again, this statement requires more
explanation. What does "extremely well" mean? Some experiments
should be added to justify this sentence.

- Page 10: "The relational layer may currently store up to 500 million
triples in a persistent way". More explanation is needed: what kind
of relational database do you use? Can we load more triples in the
system or is 500 million the maximum that it can handle? How long
does it take to load this amount? (In particular this last question
is very important to determine the quality of the engine).

- Page 11: "In the meantime, the W3C has issued several
recommendations. OntoBroker supports all of those". Please
reformulate this sentence. It is unclear.

- Page 11: "ontoprise" without the first capital letter

- Page 12: "viz. ontologies" please avoid abbreviations unless it is
necessary.

- Page 13: "can be integrated in a very comfortable way". The word
"very" sounds like an attempt at overselling.

- Page 15: "In several production lines, an impressive amount of
specialized machines...". What is impressive: 10, 100, 1000 machines?

- General: please, add more references to RDF, OWL, LOD and the other approaches you used for the comparison.

Solicited review by anonymous reviewer:

This is an overview of the OntoBroker system and its applications.
The paper describes briefly the theoretical underpinnings of the system,
then goes into the performance and application issues.

Apart from the occasional English lapses, this is a well-written overview.
My only concern is with the paper hyping up what they call "ObjectLogic" as
the "best of the breed" as if this is some kind of a nontrivial theoretical
or practical invention. This is bound to mislead an unsuspecting reader, to
which I object. In reality, ObjectLogic is nothing but a combination of
several well-known ideas that are completely orthogonal to each other and
whose desirability or combination in one system was never in doubt. It is
just that different systems stress different combinations. I would
therefore request that the authors tone down the hype a few notches.

The following is a list of minor comments (mostly correction of English lapses).

p. 1 productive --> production
p.2 allows the connection of --> supports connectivity with
In the following chapter ...: Chapter or section?
p.3 integrate into existing (?): a spurious question mark?
and hence positioned on top ---> layered on top
p.4 the following way --> as follows
p.6, right above QSQ: dynamic filtering beats magic sets in the first case.
which case? Unclear.
Sec 2.2, l.1: some successive -> several successive

line below Fig 4: pure data flow --> purely data flow
sec 2.6: turing --> Turing
Sec 3, l.1: live-time? Did you mean run-time? Or real-time?
Sec 6.2: temperatures, pressures --> temperature, pressure
after the starting procedure --> after start.

Comments

This is a tools & systems paper presenting ontoprise's ontobroker, a well-known reasoner for F-Logic with commercial impact.

The paper is very nicely structured, presenting ontobroker's architecture, including a nice mini-tutorial on the different reasoning algorithms of the system. Then follow use case scenarios and brief presentations of commercially deployed applications.

The paper is clearly in scope as a tools & systems paper, and ontobroker is quite clearly software with considerable commercial impact. The paper is therefore excellently suited for publication in the journal.

However, the paper needs some major polishing before it can be published. In particular, there are many unexplained notions, which either need to be explained (perhaps on a general level) or references need be given. Generally, many more references should be given where appropriate, and in particular to support claims. In some places, the paper is overselling, which can easily be corrected (the success of the system speaks for itself anyway). Many of the figures are of low quality, sometimes unreadable. Finally, the paper needs a careful proofreading.

I give some more details below in the hope that they will be helpful in revising the paper.

page 1

please add an abstract

references to RDF, OWL, SPARQL, RIF (RIF is not discussed later in the paper)

ObjectLogic should be explained in more detail somewhere because it is referred to a lot later in the paper

"best of breed" is overselling

page 2

"omnipotent basis for nearly any semantic application" is overselling

the picture prints very poorly

page 3

"normal logic" is not clear. perhaps "normal logic programs"?

"Horn logic with negation" doesn't exist - perhaps Horn logic plus negation. But then you need to explain what kind of negation you mean (even if you mean "normal logic programs" it's not clear, or would at best default to SLDNF negation, which is probably not what you mean)

there's an "(?)"

what's a LOD endpoint? reference to LOD - and it's awkward that it is only explained later (and in an insufficient way, see below)

"fulfills all of the information requirements" - this is overselling?

Figure 2 is not explained and difficult to read

page 4

please explain (or give a pointer to) the syntax used

in the top example, is UNA an issue? why not? If this is done under UNA, is there an issue with OWL 2 RL (which is not under the UNA)?

disadvantages -> singular

page 5

EDB not explained (it's explained later)

last sentence before "Magic Set Evaluation" needs to be rewritten

page 6

results -> result

"beats magic sets in the first case" - I don't understand "in the first case"

what does QSQ stand for? reference?

unifies with the body *literal*

we will consider the first rule body (not clear what this exactly refers to)

page 7

tend to -> tends to

the implementation cannot handle left-recursive rules and cycles ... -> why?

object logic query -> explain, likewise objectlogic

"our query selection" -> unclear

page 8

shows such an operator net -> for which rule set?

explain the operations MV, &, etc.

the figures are unreadable

page 9

drastically reduced -> why?

Sentence line 3 rather unclear

line 4 "Fig 6 shows" -> it doesn't, really. please explain. Figure 6 is quite unclear

SGI -> what's that?

The SGI example -> there should be a bit more info or a proper reference.

each query can in turn be evaluated in several parallel threads -> not clear why/what happens

open rule benchmark: explain, reference

page 10

references to the other reasoners

EDB explained here (too late)

what are triples in this context? You're not talking about RDF

b+ trees -> reference?

will hence increase -> should hence increase (work is not finished, you don't know)

Please address limits of ontobroker, in this case, what brings ontobroker to its current performance limits?

page 11

chapter about information integration (use numerical reference - and it's not a chapter)

SPARQL Query -> SPARQL

"with some minor restrictions" -> which?

"best scalable for large ontologies" -> I don't think that this can easily be substantiated and should be reworded.

under which semantics are you performing OWL 2 RL reasoning? Can you do TBox reasoning? Are you sound and complete with respect to the semantics which OWL 2 RL inherits from OWL 2 DL? If so, it's worth a mention. If not, it should also be mentioned.

Frame Logic: introduce acronym F-Logic properly, and give a reference when it is first introduced

Horn logic using negation -> unclear (see above)

"fastest subset of predicate logic" -> this is certainly not true

"cannot be expressed in OWL" -> this is a wild claim, please reword

what about RIF?

page 12

LOD reference missing

first bullet point is partly a repetition

page 13

the mappings reference seems strange and old. Perhaps the Euzenat/Shvaiko book? OTOH you seem to need "complicated" mappings (not those usually considered in ontology matching). perhaps best to explain in more detail what you mean.

the bullet points are partly salestalk. I don't find it too bad, but perhaps you can adjust this a bit.

[FAM2004] seems to be a strange reference for such a general claim

OntoStudio is mentioned, but not introduced

page 14

picture is of very low quality

section 5 is too brief, please expand

page 15

is there a paper reference for the kuka use case?

page 16

references to ICD, Mesh, NCI

page 17

"winner of the open rule benchmark" -> reword

page 18

The references need formatting and polishing. use papers instead of URIs if possible. add editors and publishers.

The messages of the paper are twofold. The first claimed contribution is a new language called ObjectLogic, the second one the maturity of the implementation OntoBroker. While the latter seems to be well established, the former is false and strongly irritating. To the best of my knowledge the only description of ObjectLogic is a tutorial available on the web from the company's homepage (http://www.ontoprise.de/fileadmin/user_upload/Publications_EN/ObjectLogi..., Nov. 21st 2011). However, this tutorial is nearly identical to a previous tutorial called 'How to write F-Logic – Programs covering OntoBroker Version 5.2, September 2008'. Except for some minor additions (mostly due to adopting the new F-Logic syntax introduced by the F-Logic forum), the content of the ObjectLogic tutorial is a copy, with some rearrangements, of the previous F-Logic tutorial. The author replaced the notion 'F-Logic' wherever it occurs by 'ObjectLogic'. The author must retract the notion 'ObjectLogic' and call his system an implementation of F-Logic, as he did for many years before.
The following has to be corrected as well. The author mentions a language F-Logic2; however, such a language has never been introduced, neither in reference [14] nor in [3]. Moreover, [3] mentions no modeling of OWL features that goes beyond the original F-Logic ones.
To summarize, presenting ObjectLogic as a new kind of language is an attempt to claim authorship for work developed by others. This is in severe conflict with academic ethics and behavior. The paper as it is should not be published.

The latest resubmission addresses the concerns of the reviewers to an extent which is acceptable for publication.

Recent articles [1] mention the lack of user-friendly tool support as one of the reasons for the still slow adoption of Semantic Web Technologies in enterprises. The OntoBroker infrastructure is one of the few tools closing this gap and, in my opinion, this is its main merit, more so than new research.

OntoBroker is indeed an environment where IT experts without deeper academic knowledge of Semantic Web Technologies can build a self-contained and useful application out of the box.

This paper may help such IT-experts (software developers with expertise in object-oriented programming, requirements engineers, data modelling experts, even business persons with some technical affinity) to understand what this type of semantic middleware is good for and how it works.

The reader should have some basic knowledge of ontologies, though.

Minor comments:
(1) Figure 1 (OntoBroker Architecture) seems to be a screenshot from PowerPoint. Some words are underlined in red, as if by the spell-checking feature.
If still possible, that should be corrected.
(2) The implementation of the OWL 2 RL Reasoning (OWL 2 RL/RDF rules) mentioned in 2.6 Language support is described in Deliverable D3.3 of the ONTORULE project [2].

[1] http://www.semantic-web-journal.net/content/ontologies-reasoning-enterpr...
[2] http://ontorule-project.eu/outcomes?func=fileinfo&id=46