Using Syntactic and Semantic Analyses to Improve the Quality of Requirements Documentation

Kunal Verma, Alex Kass, Reymonrod G. Vasquez
We discuss our experiences with deploying a tool called the Requirements Analysis Tool (RAT), which automatically reviews requirements documents for clarity- and content-based issues using a variety of syntactic and semantic techniques. The tool has been deployed at over 150 large software projects. We provide an overview of our syntactic approach, which is based on enforcing restrictions on both sentence structure and vocabulary in a way that is carefully chosen to align with best practices. We discuss how RAT analyzes natural language text to find defects such as terminological inconsistencies and missing contextual information. Structured content from requirements is then represented as a semantic graph and RAT performs semantic analysis to help users perform interaction analysis. We present a number of case studies based on real-world deployments of RAT which demonstrate a number of improvements in the projects' requirements, ranging from clearer sentence structure to more complete requirements.
Submission type: 
Tool/System Report
Responsible editor: 
Giancarlo Guizzardi

Revised manuscript following an accept with minor revisions - now accepted for publication.

Solicited review of the original submission by Ivan Jureta:

The paper is an industry experience report on the use of a software tool called the Requirements Analysis Tool (RAT). The paper says the tool was applied in over 150 software projects.

Overall, the paper is a useful overview of industry experience, and as it delivers what it promises, my recommendation is to accept pending minor adjustments to the Related Work section (see below). I particularly appreciated the observation that requirements, when elicited from users, often leave out crucial contextual information.

There are obvious directions in which this work may evolve, and that would certainly be relevant. One is an elaborate comparison of conclusions from the 150-plus projects with recommendations on ontology and methodology for requirements engineering. Another is the analysis of the scope and depth of the requirements ontology used in the projects. I do not think, however, that these discussions are in the scope of the paper. I therefore do not see the absence of such discussions as limitations of this paper.

What would be appreciated is a slightly different Related Work section. The current one attempts to relate this work to formal methods and research on the design of requirements modeling languages. I believe a Related Work section for this paper instead needs to compare the lessons learned in these projects with lessons learned in past industrial experience reported in the requirements engineering field. Highlighting departures from past experience would be particularly interesting.

The paper is well written.

Solicited review of the original submission by Renata Guizzardi:

This is a very well written paper regarding a system which supports requirements management. Requirements management is an interesting and complex topic, which may very well profit from semantic web technologies and theories. In my view, the proposed system makes a fair use of both technologies and theories, although there is room for improvement, especially regarding the latter.

This review is structured as follows: each * denotes a compliment. For each compliment, I add some EXCEPTIONS where necessary.

Some highlights of the paper are:

* Authors show good coverage of the Requirements Engineering (RE) literature. Not much space is devoted to analyzing the literature, but the citations and the decisions taken make this coverage evident.
EXCEPTIONS here: a) In Introduction, paragraph 2, you cite goal-based analysis. In this context, goal-based analysis is mentioned among techniques that are much less powerful in terms of requirements analysis (not only documentation or elicitation). I would not treat all of these techniques as offering the same kind of support, as the text seems to do. I think goal analysis, for instance, does a bit more than only augmenting the requirements document. It may provide a much more powerful way to link such requirements to the strategic goals of the organization, and it may also visually organize requirements by linking them to goals in the models. You may also want to cite newer references on the techniques. The goal-based analysis reference, for instance, is from 1996! There is much more recent work on this topic (please check the IEEE RE Conference and related journal publications).
b) In section 2, you mention some adjectives/phrases which may lead to imprecise requirements. You call them "problematic phrases". In general, such phrases are very much related to non-functional requirements. Have you thought of this relation? Could you comment on it?
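
To make point b) concrete, here is a minimal sketch of what flagging "problematic phrases" could look like. It is not RAT's implementation: the phrase list, the function name `flag_problematic_phrases`, and the sample requirement are all assumptions made for illustration. Note that the flagged words in the sample describe qualities (speed, usability) rather than functions, which is the link to non-functional requirements raised above.

```python
import re

# Hypothetical phrase list for illustration only; the paper's actual list
# of problematic phrases is not reproduced here.
PROBLEMATIC_PHRASES = ("user-friendly", "fast", "as appropriate", "easy to use")


def flag_problematic_phrases(requirement, phrases=PROBLEMATIC_PHRASES):
    """Return (phrase, start_offset) pairs for each vague phrase found."""
    hits = []
    for phrase in phrases:
        # Whole-word, case-insensitive match of the literal phrase.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, requirement, flags=re.IGNORECASE):
            hits.append((phrase, match.start()))
    # Report flags in the order they appear in the sentence.
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical requirement sentence.
sample = "The system shall provide a fast and user-friendly search page."
flags = flag_problematic_phrases(sample)
```

Each flag carries the character offset of the phrase, so a reviewer could be pointed at the exact wording that needs a measurable criterion.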

* Authors also seem to be knowledgeable about the common practical issues that arise within organizations. Examples of this can be found in section 2: they choose to tackle terminological and content-based problems, both very typical problems in RE; they also highlight what they call the "interaction problem", which is also quite common.

EXCEPTION: when you describe the interaction diagrams, I think you should explicitly state the objectives of the types of diagrams supported by RAT. I say this because there are different types of interaction diagrams with different purposes, for example UML sequence diagrams or agent-based sequence diagrams, which explicitly show message passing between entities and thus include the passive entities in the diagram. In this sense, why doesn't RAT provide this kind of diagram? It has all the necessary information, since it already knows which agent sends what (passive entity) to which other agent. The kind of diagram you show carries much less information; what specific purposes do you envision for it?
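
As an illustration of the point above, the following minimal sketch shows how message-passing edges, including the passive entities, could be derived once each requirement is available in structured form. The `Requirement` record, its field names, and the sample data are assumptions for illustration; they are not RAT's data model.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Requirement(NamedTuple):
    """Hypothetical structured form of one parsed requirement."""
    req_id: str
    agent: str                       # the entity performing the action
    action: str
    passive_entity: str              # what is sent or acted upon
    recipient: Optional[str] = None  # receiving agent, if any


def interaction_messages(requirements):
    """Group agent-to-agent requirements into message-passing edges.

    Returns a dict mapping (sender, receiver) to a list of
    (passive_entity, req_id) pairs, i.e. the labels a sequence-style
    diagram would place on the arrow between the two agents.
    """
    messages = defaultdict(list)
    for req in requirements:
        if req.recipient is not None:
            edge = (req.agent, req.recipient)
            messages[edge].append((req.passive_entity, req.req_id))
    return dict(messages)


# Hypothetical sample requirements.
reqs = [
    Requirement("R1", "Order System", "send", "invoice", "Billing System"),
    Requirement("R2", "Order System", "send", "shipping notice", "Billing System"),
    Requirement("R3", "Billing System", "store", "invoice"),
]
edges = interaction_messages(reqs)
```

Keeping the passive entity and the requirement identifier on each edge is what would let a diagram show what is exchanged, not only which agents interact.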

* The paper provides a thorough explanation of the tackled issues and related functionality in the tool;

* The paper connects well the discussions in the text with illustrative examples and good figures;
EXCEPTION: a) In section 3, after topics 1 and 2, you should also comment on topic 1, which is practically left out of the subsequent paragraph (you focus on examples of topic 2).
b) You do not exemplify how the semantic analysis is applied to look for "dependencies and conflicts between requirements, and finding missing requirements." You could exemplify, for instance, how the ontology assists in the creation of the interaction diagrams. I think you cannot omit this type of example, because the semantic feature is the one that makes the strongest link to the semantic web, and is thus a good motivation to publish the paper in this journal.

* The paper presents seven case studies showing practical results with the use of the system. The case studies are quite large, some of them having more than 1000 requirements.

* The paper has very compelling arguments to motivate the work, especially in the Introduction (paragraph 3 is great);

* I completely share your opinion on: "A practical approach to automating the review of requirements documents therefore requires finding just the right compromise to ensure that requirements can be documented in a way that is convenient for the writer, clear for the reader, and tractable for the automated review system."
EXCEPTION: In this sense, don't you think that RAT still produces very long and perhaps even tiresome documentation? After all, the analyst needs to go through a very long text full of flags here and there. Moreover, in the last paragraph of section 2.2, you mention that the system "allows users to quickly check all the interactions and validate them with the stakeholders". Reading through pages and pages of requirements to spot agent-agent interaction requirements does not seem quick to me at all! In this sense, I do not think you fully supported your claim in paragraph 4 of the Introduction: "to detect content-based issues by automating some manual tasks such as interaction analysis." It would be more accurate to say that interaction analysis is "semi-automated".

* The text is well structured and written in very good English.

Main issue:
You base the semantic analysis on a core requirements ontology, which can be extended by the users. This means that the power and usefulness of the tool are bounded by the quality of this ontology. How can you ensure that the users will extend it in a consistent way?
Reading about the core ontology, I find it rather simplistic. It is an application-based (or system-based) conceptual model rather than a domain ontology. In this sense, I would invite the authors to take a look at the literature on Foundational Ontologies. Foundational Ontologies are theoretical bodies of knowledge which aim at providing real-world semantics to concepts. The most prominent researcher on this topic is Nicola Guarino, from LOA, Italy, but more and more people have recently been joining this field. You may also want to visit the IAOA (International Association for Ontology and its Applications), which gathers researchers and practitioners in the field.

I believe that if you move in this direction, you may be able to augment RAT to do much more with the ontology than only the kind of analysis provided now. I do not expect or think this can be done in the scope of this paper. This is rather something for future releases of the system. But I strongly encourage you to pursue it.

Remaining minor issues:

- You call your entities agents and passive entities. I wonder how much this is influenced by the agent-oriented paradigm. It is in line with the work done in that community, which makes this feature very interesting. But I also recognize that you may simply have been influenced by linguistics. Would you mind commenting on this?

- Have you considered applying the syntactic analyzer to other textual documents? I ask this because I myself participate in a project in which we have a controlled-vocabulary type of document which could profit from this kind of approach. In this respect, how flexible is this component when it comes to adapting it to other types of controlled-text documents? Could you, for instance, add other rules? Could you base the tokenization on other types of glossaries?

"As a result, senior personnel on the project SPENT REVIEWING the requirements for clarity-based issues." Suggestion: spent time reviewing

"We DON'T know whether the manual…" Suggestion: DO NOT