An Ontology for Open 311 Data

Tracking #: 915-2126

Authors: 
Mark Fox
Soroosh Nalchigar

Responsible editor: 
Guest Editors Smart Cities 2014

Submission type: 
Ontology Description
Abstract: 
The last decade has seen a rapidly increasing interest in the publishing of city data. Applying data analytics to these data could result in the discovery of city knowledge and insights, and thereafter in data-driven decision making and action. A major challenge in this context is to integrate data coming from different sources for later analyses. This paper proposes a formal foundation ontology, called the Open 311 Ontology, that provides a unified terminology and a reference model for representing the 311 data of cities. It is illustrated that this ontology could be used for reasoning and answering competency questions as well as for mapping and integrating data coming from various sources.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
By Boris Villazon-Terrazas submitted on 09/Jan/2015
Suggestion:
Major Revision
Review Comment:

This manuscript was submitted as 'Ontology Description' and should be reviewed along the following dimensions: (1) Quality and relevance of the described ontology (convincing evidence must be provided). (2) Illustration, clarity and readability of the describing paper, which shall convey to the reader the key aspects of the described ontology.

This paper tries to describe an ontology for representing data about non-emergency municipal services provided to the public. The ultimate goal of 311 systems is to enhance the accessibility of city services. Along these lines, Open311 is a collaborative model and open standard for civic issue tracking. In practice, Open311 publishes a set of free and publicly available APIs that provide access to 311 services.

The authors try to model the information present in 311 datasets coming from four cities: Toronto, New York, San Francisco, and Chicago. Moreover, the authors describe part of the ontology development process by defining a set of competency questions. Next, they describe the most important classes and properties of the ontology. They also present the existing related ontologies for modeling the domain, and present an evaluation based on performing SPARQL queries.

Comments
- the paper does not follow the SWJ paper format
- there is no exhaustive analysis of the degree to which the ontology covers the data sources; this is important in order to validate the quality of the ontology
- is the ontology actually being used and validated in a real-case scenario? My guess is that it is not, so there is no real evaluation.
- there is no detailed description of the ontology: how many classes, properties, etc. does the final ontology have?
- I think that you cannot cite a working paper as a reference (reference 3)
- In section 1 you mention the problem of NER, but you do not discuss it in the following sections. How do you actually tackle it?
- Did you check other ontology development methodologies?
- There is no detailed description of the data sources of Toronto, New York, San Francisco, and Chicago. How big are the data sources? What is the format of the data sources? Which entities (all of them) are included in the data sources? Are all of them expressed in English? Again, I was expecting a more detailed analysis of those data sources. Moreover, I'm also missing a brief description of the Open311 API.
- In section 3.1 you literally write "hypothetical use case scenarios". Why? The ontology is going to be deployed in a real-case scenario, right?
- What are the functional and non-functional requirements of the ontology? I was expecting more competency questions; you only have six.
- The ontology is available at [1], but it is not published following the best practices [2]. Moreover, there is no documentation about it. Why is the ontology not available in Turtle?
- Regarding the ontologies you reused, just by checking the OWL file I can see that you also reused FOAF, among others, but they are not mentioned in the paper. Why?
- At the beginning of section 3.4 you forgot to list the International Contacts Ontology; the same goes for the DBpedia ontology.
- Figure 3 is missing, and you have two Figure 1's
- In section 3.5 you have a few axioms, but I'm not sure if they are used later on. I think you need to include the description of why and when these are needed.
- Minor typo in section 4: "out ontology" -> "our ontology"
- You never say how you populated the ontology from the selected data sources. How many instances did you get?
- Is there any public SPARQL endpoint providing access to the data?
- According to the SPARQL/RDF spec you cannot define a prefix whose name starts with a digit, so the examples that use the prefix 03110 are not going to work, unless I'm blind and the prefix is O311O. Did you actually try the SPARQL queries?
- Regarding section 4.2, I was expecting a more detailed description of the mappings and also of how you transformed the data from the data sources to RDF.
- Finally, as a conclusion of the review, I can say that the ontology is not being used in real use-case scenarios, so there is no real evaluation of the ontology. Therefore I think the paper needs a major revision.

[1] http://ontology.eil.utoronto.ca/open311.owl
[2] http://www.w3.org/TR/swbp-vocab-pub/
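
The SPARQL prefix comment in this review can be checked mechanically. The following is a minimal Python sketch, assuming an ASCII-only approximation of the PN_PREFIX rule in the SPARQL grammar (the full rule also admits many non-ASCII letters). The function name is_valid_prefix is illustrative and does not come from the paper.

```python
import re

# ASCII-only approximation of SPARQL's PN_PREFIX production (an assumption
# made for brevity): the first character is a letter; later characters may be
# letters, digits, '_', '-' or '.'; the last character must not be '.'.
_PN_PREFIX_ASCII = re.compile(r"[A-Za-z]([A-Za-z0-9_.\-]*[A-Za-z0-9_\-])?")


def is_valid_prefix(name: str) -> bool:
    """Return True if name is an acceptable prefix label (without the colon)."""
    # The empty prefix (written ":") is allowed by the grammar.
    return name == "" or _PN_PREFIX_ASCII.fullmatch(name) is not None


print(is_valid_prefix("03110"))  # digit-leading label
print(is_valid_prefix("O311O"))  # letter-leading label
```

Under this approximation the digit-leading label is rejected and the letter-leading one is accepted, which is why the two readings of the prefix in the paper's examples matter.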

Review #2
By Carsten Keßler submitted on 13/Jan/2015
Suggestion:
Major Revision
Review Comment:

Nicely written ontology description paper. Besides a number of small issues listed below, I have two major concerns about this paper: for one, it is very long for an ontology description. These are supposed to be short papers, so this paper would need to be shortened significantly.

The other concern is the relevance of the ontology. While 311 data is clearly a useful aspect of open data, I'm wondering whether any adoption of the ontology can be expected in the foreseeable future, given the limited number of sources that publish such data. The authors should hence provide convincing arguments that their ontology will be of use for others.

Some specific issues:
- This is not a foundation ontology, it is clearly at the application level (abstract).
- On page 2, the authors claim that the ontology covers the “complete” terminology for 311 calls. I don’t think this is possible or even desirable; ideally, the ontology should be extensible (in fact, it is).
- Can you give some examples of the missing values in the NYC dataset (which columns, why are they missing)?
- In section 3.3, some of the data properties seem to me like they should be object properties, specifically AddressType, Borough, LocationType, Neighborhood, and Source. If there is a rationale for keeping them as data properties, the authors should include it in the text.
- The paragraph heading at the bottom of page 7 should say "GeoNames ontology".
- The axiom restricting the number of cross streets to max. 2 may be too restrictive.
- Considering where the data is coming from, splitting up time stamps into day, time, and month according to the time ontology will most likely bloat the dataset. I think a data property with xsd:dateTime values would be more useful here.
- Figures 5 and 6 are not intuitive. It is not clear which fields are the "input fields" from the city datasets, and which parts of the diagram refer to parts of the ontology. I'm not sure how to improve the diagram, but in its current form it is very hard to read. Even a simple table with two columns would be more useful here.
- The paper contains a number of typos that any spell checker should identify.
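
To illustrate the xsd:dateTime suggestion in the list above, here is a minimal Python sketch that renders a source timestamp as a single typed literal in Turtle syntax. The input format and the function name to_xsd_datetime_literal are assumptions for illustration; the actual timestamp formats of the city datasets are not stated in this review, and time zone handling is left out.

```python
from datetime import datetime


def to_xsd_datetime_literal(raw: str, fmt: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Render a source timestamp string as one typed Turtle literal.

    The default format string is a hypothetical example, not the format
    used by any particular city dataset.
    """
    parsed = datetime.strptime(raw, fmt)
    # isoformat() yields the lexical form expected for xsd:dateTime
    # (without a time zone, since the input carries none).
    return f'"{parsed.isoformat()}"^^xsd:dateTime'


print(to_xsd_datetime_literal("2015-01-13 14:45:00"))
```

A single literal of this kind keeps each time stamp to one triple per request, instead of separate nodes for day, time, and month.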

Review #3
By Oscar Corcho submitted on 15/Jan/2015
Suggestion:
Major Revision
Review Comment:

This paper describes the Open 311 ontology that has been developed by the authors to represent data about 311 calls for various cities in North America. As such, the paper is evaluated according to the set of criteria that are proposed in the journal for ontology papers (http://www.semantic-web-journal.net/reviewers#types).

First of all, it should be pointed out that the paper cannot really be categorized as a short paper, as requested for this type of paper. It is 15 pages long, and some of the details provided in the description could probably be summarized to reduce the size of the paper and make the descriptions more brief and pointed, as requested in the aforementioned evaluation criteria.

The design principles behind the development of the ontology are described, although I must admit that I find the number of competency questions a bit too small to introduce all the types of needs and requirements that the ontology developers were considering in the development process. The methodological process is clear and follows a well-known method for ontology development, which has been used for the development of some influential ontologies, and which relies on the use of competency questions for capturing requirements and evaluating the resulting ontology later. The analysis done for existing datasets is adequate, although probably a bit too biased towards North American cases, while 311-like data are also available in cities from other geographical areas, which have some important differences (for instance, in the organisation of cities into neighborhoods, or in the organisation of city councils). In any case, the use cases that are provided also make sense according to what 311 data can be used for, and I have myself experienced, with other cities using this type of data, the need to address similar types of problems.

The ontology is freely available on the Web (although the URI that has been applied does not follow the most recent common practices in the publication of ontologies on the Web, e.g., not using the suffix .owl after the general URI of the ontology). However, there are no real cases identified where data have actually been transformed and published (as Linked Data or in a SPARQL endpoint) according to this ontology. Only a few examples are provided, which are helpful for understanding how data from several cities can be transformed according to the ontology, but it would be good to have these complete datasets available and to provide links to them.

The competency questions are also transformed into SPARQL, as a good practice.

Moving into the analysis of ontology quality, I have several concerns that may need to be addressed:
- One of the first concerns is related to the problem of interoperability between data from different sites. This concern is really important in terms of the types of services and requests that are modeled in the ontology. The authors propose one classification, but every data source that they have analysed has different sets of categories, as the authors have acknowledged. However, they are not clear about how these disparate sets of services provided as strings are mapped to this classification, and why they think that the classification is rich enough.
- From the competency questions it is not clear why the types of services and requests need to be encoded as OWL taxonomies, instead of using, for instance, SKOS concept schemes, which may fit better into a context as open as these sets of services.
- There are many elements in the original data sources that may be converted into URIs, such as the divisions, section units, etc. It is not clear how they could be converted into URIs, making the datasets more linked to other potential datasets from the cities.
- On the related ontologies, some of the design decisions are not clearly expressed. The authors just say that they reuse ontology X and ontology Y, but they do not specify why. For instance, why aren't they using the W3C Organisation ontology for the description of organizations? This ontology is inspired by the one that is used here, and they are both compatible. The same applies to the iContact ontology, where other ontologies could have been used, such as schema.org properties or vCard. And for transportation, Linked GTFS (and schema.org) already provide some concepts that may be used.
- It is not clearly described either why the cardinality of has311Type is one for ServiceRequest. Isn’t it really possible to have several types? Or several handling agencies?
- All properties may also use camelCase notation, although this is a minor comment, obviously.
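
As a rough illustration of the SKOS and URI comments in the list above, the following Python sketch turns a service type string from a source dataset into a skos:Concept description in Turtle syntax. The namespace http://example.org/311/service-type/, the scheme name ex:ServiceTypes, and the helper names slug and concept_turtle are hypothetical and are not taken from the paper or from the ontology.

```python
import re

# Hypothetical namespace for minted concept URIs (an assumption for this sketch).
BASE = "http://example.org/311/service-type/"


def slug(label: str) -> str:
    """Lowercase the label and collapse runs of non-alphanumerics into '-'."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")


def concept_turtle(label: str, scheme: str = "ex:ServiceTypes") -> str:
    """Describe one service type string as a skos:Concept in Turtle syntax."""
    # Escape backslashes and double quotes so the label stays a valid literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"<{BASE}{slug(label)}> a skos:Concept ;\n"
        f'    skos:prefLabel "{escaped}"@en ;\n'
        f"    skos:inScheme {scheme} ."
    )


print(concept_turtle("Pothole Repair"))
```

Concepts minted per source in this way could then be aligned across cities with mapping properties such as skos:exactMatch or skos:closeMatch, without committing to a fixed OWL class hierarchy.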

On the readability side of the paper, we have the following aspects that may need to be considered:
- It may be good to explain briefly in the abstract what 311 is about, since readers from different cultural backgrounds may not understand what the 311 service is.
- In general, there are many typos and grammar errors throughout the paper.
- In the references to open data sites from North American cities, the links only point to the open data sites, not to the specific datasets where 311 data are provided.