PaaSport Semantic Model: An Ontology for a Platform-as-a-Service Semantically Interoperable Marketplace

Tracking #: 1503-2715

Nick Bassiliades
Moisis Symeonidis
Panagiotis Gouvas
Efstratios Kontopoulos
Georgios Meditskos
Ioannis Vlahavas

Responsible editor: 
Lora Aroyo

Submission type: 
Full Paper
PaaS is a Cloud computing service that provides a computing platform to develop, run, and manage applications without the complexity of infrastructure maintenance. SMEs are reluctant to enter the growing PaaS market due to the possibility of being locked in to a certain platform, mostly provided by the market’s giants. The PaaSport Marketplace aims to avoid the provider lock-in problem by allowing Platform provider SMEs to roll out semantically interoperable PaaS offerings and Software SMEs to deploy or migrate their applications on the best-matching offering, through a thin, non-intrusive Cloud broker. In this paper we present the PaaSport semantic model, namely an OWL ontology, extension of the DUL ontology design pattern. The ontology is used for semantically annotating a) PaaS offering capabilities and b) requirements of applications to be deployed. The ontology has been designed to optimally support a semantic matchmaking and ranking algorithm that recommends the best-matching PaaS offering to the application developer. The DUL design pattern offers seamless extensibility, since both PaaS Characteristics and parameters are defined as classes; therefore, extending the ontology with new characteristics and parameters requires the addition of new specialized subclasses of the already existing classes, which is less complicated than adding ontology properties.


Solicited Reviews:
Review #1
By Idafen Santana submitted on 15/Jan/2017
Minor Revision
Review Comment:

The paper introduces the PaaSport semantic models, as the main supporting layer for a matchmaking process designed to select the most suitable appliance over a set of Cloud providers. The main goal of this process, as stated by authors, is to assist the deployment or migration of applications into a new PaaS infrastructure/provider.

The paper covers an interesting topic, which several contributions have addressed in recent years. The authors fairly state the main contribution of their work and compare it to previous efforts in the area, describing how they rely on and extend them where necessary. Moreover, the main contribution of the paper is evaluated correctly, and sufficient information is provided to support the claims made in it.

The authors propose an experimentation process that studies the validity of their contribution from two different points of view. First, they present the results of evaluating the semantic models by means of different tools, and how they have addressed the issues that arose from them. Second, they evaluate the impact of using these models on the selection process, comparing it with another representative model. The experimental results are well explained and justified.

Overall, the paper is well written and can be easily understood, although some parts of the narrative are a little repetitive across the different sections.

Despite this overall impression of the manuscript, some issues should be addressed before it is published:

- The main narrative of the paper revolves around the applicability of the PaaSport solution for SMEs, both as PaaS providers and software companies. While this is a suitable and valid case, I don't see why the authors have restricted the application of their solution to this interest group. The solution is valid for many other use cases, as covered by some of the approaches in the SOA section.

- Even though the authors explain the conversion of units (e.g. GB, MB), and introduce it as one of the parameters in their evaluation, there are other kinds of conversion which are not covered. Not all providers describe their offerings using the same terminology. One of the most representative cases would be Amazon AWS, in which computation capacity is expressed in terms of ECUs, instead of CPU capacity. In order to obtain the equivalent in GHz/MHz, the number of units must be multiplied by the capacity of each unit (which may vary over time). These kinds of conversions require more complex transformations, which can be addressed using more complex queries. The authors should clarify at which point of their process this could be done. This topic has been covered in the past, using SPIN rules, as also introduced in the last section.

- In Table 3, regarding ontology requirements, the authors introduce scalable and efficient reasoning as one of them. It is not clear to me which type of reasoning they are introducing and how it fits into the recommendation process. Even though this is stated in the paper referenced in [5], it is not clear to me whether this is a requirement of the ontology or of the algorithm. Besides providing a description of the DL profile of the models (done later in the paper), the authors should clarify this point in this paper, as it may impact the performance evaluated in Section 6.

- The authors do not explain in detail how the annotation process works. Having a valid annotation of the services and infrastructures used is critical to the process. Even though the GUI is based on the models, thus making the annotations fit the vocabulary, it is not clear how the system ensures their validity and their consistency with the corresponding SLA and QoS. Whether this is an assumption or something out of scope, it should be clarified in the text.

- In Tables 4 and 5, as well as in Section 4, there is no proper justification of why major providers (e.g. Amazon AWS) are left out of the list of studied platforms.

- The following paragraph, justifying the inclusion of OpenShift origins and Cloud Foundry, is not really clear:

A PaaS typically resides on top of IaaS providing
the ability to access remote computing resources. With
IaaS there is the possibility of remotely controlling
machines or virtual machines that can be used as necessary.
Thus, it was decided to study the OpenShift origins
and Cloud Foundry, two of the most popular
PaaS platform systems.

- Even though it is not directly stated in the text, based on Figure 6, it seems that the PaaSport models only cover one level of dependencies. Is this enough for any service/application? Could dependency graphs (instead of the exemplified trees) occur?

- Figure 26 is not easy to follow when describing the algorithm. It describes the main parts of the process, but it is not clear to me how the steps are performed.

- In section 6.2 it is explained that the experimentation configurations are stored in a triple store (GraphDB). Why aren't they stored and retrieved using D2RQ as in the rest of the system?
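
To make the unit-conversion comment above concrete, the following is a minimal sketch of what such a normalization step could look like before offerings are compared with requirements. It is not taken from the paper: the function names, the provider label and, in particular, the ECU-to-GHz factor are hypothetical placeholders, and the binary convention (1 GB = 1024 MB) is an assumption.

```python
from decimal import Decimal

# Factors to a common storage base unit (MB); binary convention assumed.
STORAGE_TO_MB = {
    "MB": Decimal(1),
    "GB": Decimal(1024),
    "TB": Decimal(1024) ** 2,
}

# Factors to a common compute base unit (GHz), keyed by (provider, unit).
# The "example-provider" ECU factor is a made-up placeholder: a real factor
# is provider-specific and may change over time, as the review points out.
COMPUTE_TO_GHZ = {
    ("generic", "MHz"): Decimal("0.001"),
    ("generic", "GHz"): Decimal(1),
    ("example-provider", "ECU"): Decimal("1.0"),
}


def storage_in_mb(value, unit):
    """Normalize a storage amount to MB."""
    return Decimal(str(value)) * STORAGE_TO_MB[unit]


def compute_in_ghz(value, unit, provider="generic"):
    """Normalize a compute capacity to GHz using a provider-specific factor."""
    return Decimal(str(value)) * COMPUTE_TO_GHZ[(provider, unit)]


def satisfies(offered, required):
    """An offering satisfies a requirement if it provides at least as much."""
    return offered >= required


# Usage: a 2 GB offering against a 1500 MB requirement.
print(satisfies(storage_in_mb(2, "GB"), storage_in_mb(1500, "MB")))
```

Keeping the factors in a lookup table keyed by provider is one way to accommodate conversions that change over time; whether this belongs in the queries, in rules, or in a preprocessing step is exactly the point the authors are asked to clarify.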

Some other really minor issues to be corrected as well:

- The footnote referencing the PaaSport models could be moved to an earlier section of the paper, which will be helpful to the reader.

- On page 4, in the sentence: "...architectural framework are presented The taxonomy... " there is a missing period in between.

- On page 12, in the sentence: "..., featuring five (5) distinct...", the "(5)" is not really necessary.

- The quality of Figures 21 and 22 could be improved. They are a little hard to read.

- Table 7 is not really a table. It could be implemented as a listing. The line numbers are really helpful though.

Review #2
By Eva Blomqvist submitted on 16/Jan/2017
Review Comment:

The paper describes the details of an ontology, supporting a marketplace for platform-as-a-service offers. The ontology is used to annotate offers and requirements and then perform matchmaking using a broker service. The paper is well written and easy to read, and it contains a lot of details that give the reader a proper understanding of how things work, especially concerning the modelling choices in the ontology and the technical architecture. This is on one hand the strength of the paper, but on the other hand also a weakness - the paper is very long. Even for a full research paper 43 pages is a lot, and I would also like to argue that this paper does not really fulfil the criteria of being a research paper, but should rather be put into one of the two alternative categories "ontology paper" or "application paper". More specifically, the criteria for a research paper are originality, significance, and quality. Although there is nothing wrong with the quality of writing here, I do not see many original research results, nor do I see much significance in the contribution to the research community - practical significance and contribution, yes, but scientific significance in terms of new important insights or methods, probably not. However, the two other categories of papers mentioned above are restricted to short papers (according to the submission guidelines around 10 pages as I interpret it), whereas the paper in its current form would certainly not fit there either due to its length.
My best overall suggestion would be to split the paper into three: one ontology description paper focusing on the ontology as such, which could then be submitted to that track of the journal, one application paper describing the matchmaking process and the corresponding broker software architecture etc., which could then be submitted to the application description track, and finally, an online technical report, which could look much like the current version of this paper (or with even more details), putting everything together and giving detailed examples of how to use the model and the service.

It is with slight reluctance that I give this recommendation for the paper, because I did in fact enjoy reading it, and I think it shows nicely a practical case of developing an ontology starting from an ontology design pattern (as part of an upper ontology). However, even apart from the formal criteria of the journal the paper does not have a clear target group - who are the readers? For a full documentation of an ontology I would expect to go to an updated webpage, rather than a journal paper (that may be outdated already at the next release of an ontology version, in terms of technical details), hence, the paper is not really interesting as a piece of ontology documentation. There are however interesting details in how the ontology was developed, certain modelling choices, i.e. the consistent use of DnS, requirements and application examples, as well as the nice discussion on scalability, which are really interesting and would fit very well as an ontology description paper, targeting ontology engineers and researchers, or potential users of the ontology itself. Similarly for the case of how the ontology is applied and the architecture of the brokering service. Although I am not an expert in that domain, I would assume that this could make an interesting application paper, but readers of such a paper would most likely not be very interested in the development method and details of the ontology, but rather how to use it and what the service can do. So these are two very different target audiences. Finally, some technical details fit better in a living online document, rather than a paper, since they may be extended or even changed in the future.

Considering my recommendation above, I will not go into great detail in my comments regarding the paper content. Nevertheless, I have a few comments and questions that the authors may consider:
* What is the motivation behind the choice of OWL 2 DL? Why not a simpler OWL 2 profile, with better scalability characteristics? What specifics of OWL 2 DL is it that you cannot do without in this particular application case?
* The requirements presented in the paper are quite high-level, and I think they should be - you do not have space in a paper to go into details on this. However, some of the requirements do not seem very concrete and easy to verify, since they contain terms such as "easily" - how do you measure that? Furthermore, concerning your more detailed requirements, one obvious evaluation of the ontology, in addition to the structural methods presented in the evaluation section, would of course be to verify that it fulfils its requirements. So even if you do not list all the detailed requirements in the paper it would be interesting to hear how many such requirements you had, in what form they were encoded (e.g. competency questions or something else), and how you verified them. Along these lines you could also mention something on the rest of the ontology development process, i.e. did you use a specific methodology, who was involved, were there several iterations etc.?
* Do you estimate that the ontology largely covers this domain as it is now? Or do you see many extensions in the near future? In the actual ontology file there are some comments mentioning points of extension, however, the ontology does not seem to be annotated with any version number/version IRI, nor any other metadata, e.g. what set of requirements it covers or similar. What is your plan for maintenance and extension, and how stable is the current ontology? This is not really clear from the paper, nor the ontology itself.
* From the paper it is not entirely clear what kind of reasoning (i.e. based on OWL semantics) the ontology is used for, if any, or if the "reasoning" is only embedded into SPARQL queries? Closely related to this is also the question on why you need DUL, i.e. just for the generalisation capabilities and interoperability, or also for reasoning, i.e. using inherited axioms from DUL for consistency checking etc.?
* You do mention the reason for using an SQL-DB as storage, but only towards the end of the paper, and not in great detail. This would deserve a more thorough motivation, since the obvious choice would otherwise be a graph store, in which case you avoid all problems with having to modify the schema in the first place etc. Since this motivation comes late, the earlier discussion on the mapping becomes obscure to the reader, when the obvious choice would be to choose another backend.
* For someone who is not really an expert in the application area of this ontology, this paper also raises the question of how the work relates to the field of semantic web services and service matchmaking. Potentially some of the related work listed in the paper stems from this area, but it is not so clear unless you have been following this research area closely. Also the paper does not make a detailed comparison between this ontology and the related Cloud4SOA ontology, but rather just discusses them at a very high level. In an ontology paper, I would expect a more detailed discussion of their respective coverage and reasoning capabilities.
* Finally a note on the mentions of DUL. Whereas in most places DUL is described nicely, and DnS is mentioned as the pattern, there are also some places where DUL is in itself referred to as a pattern, which I do not really agree with. I think this is merely a bit sloppy use of terminology, however, it should be revised if the paper is resubmitted.

Review #3
By Andrea Giovanni Nuzzolese submitted on 25/Jan/2017
Review Comment:

The paper presents an ontology for modelling the knowledge related to Platform as a Service (PaaS) category of cloud computing services. Such an ontology encompasses different knowledge layers involved in the description of PaaS platforms, applications, offering and actors. Additionally, the ontology is designed as part of the PaaSport cloud broker architecture, which focuses on resolving the data and application portability issues that exist in the Cloud PaaS market through a flexible and efficient deployment and migration approach.

The paper tackles an interesting topic with a particular focus on the knowledge modelling, which is relevant to the journal.
Nevertheless, the paper shows a number of weaknesses that, in my opinion, prevent its publication in its current form. Namely, those weaknesses involve the research narrative, the originality of the work presented in the paper, the soundness of the ontology presented and its evaluation.

== Research narrative
The paper needs to be significantly reworked in most of its parts.
In fact, many sections contain repetitive and redundant content that makes the paper harder to read. For example:
* the fact that the semantic models are built on top of DUL is unnecessarily repeated more than once;
* the text in some sections repeats the content expressed by tables (e.g. Table 2, 3 and Sections 3.2-4);
* there is an unnecessary level of details in some parts. E.g. Figure 7 does not introduce relevant details and novelty to the reader if compared with Figure 5. The same holds for the triples provided in RDF/XML as examples if compared to Figure 6;
* many parts can be summarised with much more high-level details and references (if external works exist). This, for example, involves the description of the architecture (presented in [5]) or the part about D2R with associated figure at page 30;
In my opinion all these aspects make the paper unnecessarily long especially if compared with the real contribution of the work, i.e. the semantic models and their evaluation.

== Originality of the contribution
A large part of the work presented in this paper has already been presented in [5] and published in the journal Expert Systems with Applications. The ontology (with the examples and exactly the same SPARQL query pattern) has also been presented in [5]. Hence, the novel contribution of the paper is very limited and mainly involves the evaluation on pitfall detection based on OOPS!

== Soundness of the ontology
In ontology design, requirements are typically recorded as competency questions [i], which guide the modelling phase and serve as a benchmark for assessing how well the resulting ontology covers the domain. No such competency questions are provided and, consequently, it is not clear to what extent the proposed ontology is effective. The authors' effort to provide a section (i.e. Section 3.1) focused on the requirements for the semantic models is much appreciated. However, the rationale linking the requirements to the design of the ontology is not clear.
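
As an illustration of how competency questions can serve as a benchmark, the sketch below encodes one invented question as an executable check over a toy set of triples. Neither the paper nor this review lists concrete competency questions; the question, the data, and all identifiers here are hypothetical and only show the general idea.

```python
# Toy knowledge base as (subject, predicate, object) triples.
# All names and values are hypothetical, not taken from the PaaSport ontology.
TRIPLES = {
    ("offeringA", "type", "PaaSOffering"),
    ("offeringA", "storageMB", 2048),
    ("offeringB", "type", "PaaSOffering"),
    ("offeringB", "storageMB", 512),
}


def objects(subject, predicate):
    """Return all objects asserted for a given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def cq_offerings_with_min_storage(min_mb):
    """CQ: which PaaS offerings provide at least `min_mb` MB of storage?"""
    offerings = sorted(
        s for s, p, o in TRIPLES if p == "type" and o == "PaaSOffering"
    )
    return [
        s for s in offerings
        if any(v >= min_mb for v in objects(s, "storageMB"))
    ]


# Expected answers, written down before modelling, act as the benchmark.
EXPECTED = {
    1024: ["offeringA"],
    256: ["offeringA", "offeringB"],
}


def run_benchmark():
    """Map each question parameter to whether the model answers it correctly."""
    return {
        q: cq_offerings_with_min_storage(q) == answer
        for q, answer in EXPECTED.items()
    }


print(run_benchmark())
```

In practice such questions would more likely be expressed as SPARQL queries against the ontology, with the expected answers recorded alongside them, so that coverage of the requirements can be checked after each revision.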
The most peculiar sub-classes of PaaSParameter are categories rather than classes (the authors themselves refer to those classes as categories of parameters). In my opinion, those classes introduce an unnecessary level of granularity for an ontology that should (i) serve as a reference model; (ii) foster the sharing of knowledge in the PaaS ecosystem; (iii) be easily extensible. I would suggest reworking that part of the ontology by modelling those classes as individuals, reusing a modelling solution such as the classification ontology design pattern [ii].

== Evaluation
The evaluation based on OOPS! detects possible pitfalls in the ontology. The comparison of query response times provides hints on possible performance issues. However, no analysis has been carried out to assess the effectiveness of the ontology (i) in covering the domain efficiently and (ii) in adding value to the PaaSport architecture for knowledge sharing and interaction.

[5] Bassiliades N., Symeonidis M., Meditskos G., Kontopoulos E., Gouvas P., Vlahavas I., A Semantic Recommendation Algorithm for the PaaSport Platform-as-a-Service Marketplace, Expert Systems with Applications, volume 67, pages 203–227. Elsevier, 2017. DOI: 10.1016/j.eswa.2016.09.032

[i] Michael Grüninger and Mark S. Fox. In: Benchmarking – Theory and Practice. Ed. by Asbjørn Rolstadås. IFIP Advances in Information and Communication Technology. DOI: 10.1007/978-0-387-34847-6_3.