DemaRQ: Demarcator for ReQuirements

Short Description

Requirements demarcation is a simple but important task during the analysis of a textual requirements specification. The task is essentially to determine which statements in the specification represent requirements. Following suitable writing and markup conventions does not guarantee immediate and unequivocal demarcation, since neither the presence nor a fully accurate enforcement of such conventions can be taken for granted. Resorting to after-the-fact reviews for sifting requirements from other material in a requirements specification is both tedious and time-consuming.

Motivated by the need for demarcating requirements in requirements specifications irrespective of domain, terminology or style, we present a novel tool, DemaRQ (Demarcator for ReQuirements), for demarcating requirements in free-form requirements specifications. DemaRQ is based on Machine Learning (ML). The ML classifier in DemaRQ is a Random Forest model with Cost-sensitive Learning. This classifier has been trained over 16161 manually labeled statements from 26 requirements specifications (written in natural language) using different styles and covering diverse domains.
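The description above does not say how cost-sensitive learning is configured in DemaRQ. As a rough illustration of the general idea only, the following Java sketch shows one common formulation: given class probabilities (such as those produced by a Random Forest), predict the class with the lowest expected misclassification cost. The class name, the cost values, and the assumption that missing a requirement costs more than raising a false alarm are all hypothetical and are not taken from the tool.

```java
// Hypothetical sketch of a minimum-expected-cost decision rule.
// This is NOT DemaRQ's code; the cost values below are assumptions.
final class CostSensitiveDecisionSketch {

    static final int REQUIREMENT = 0;
    static final int NON_REQUIREMENT = 1;

    /**
     * Returns the index of the predicted class with the lowest expected cost.
     * cost[actual][predicted] is the penalty for predicting "predicted"
     * when the true class is "actual".
     */
    static int minExpectedCostClass(double[] classProbabilities, double[][] cost) {
        int best = 0;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int predicted = 0; predicted < cost[0].length; predicted++) {
            double expected = 0.0;
            for (int actual = 0; actual < classProbabilities.length; actual++) {
                expected += classProbabilities[actual] * cost[actual][predicted];
            }
            if (expected < bestCost) {
                bestCost = expected;
                best = predicted;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Assumed probabilities for one sentence: 40% REQUIREMENT, 60% NON-REQUIREMENT.
        double[] probabilities = {0.4, 0.6};
        // Assumed costs: a missed requirement costs 5, a false alarm costs 1.
        double[][] cost = {
            {0.0, 5.0},  // actual REQUIREMENT
            {1.0, 0.0}   // actual NON-REQUIREMENT
        };
        int label = minExpectedCostClass(probabilities, cost);
        // Expected costs are 0.6 (predict REQUIREMENT) vs 2.0 (predict NON-REQUIREMENT),
        // so the rule selects REQUIREMENT although its probability is lower.
        System.out.println(label == REQUIREMENT ? "REQUIREMENT" : "NON-REQUIREMENT");
    }
}
```

With symmetric costs the same rule reduces to picking the most probable class, which is why the asymmetry in the cost matrix is what makes the decision cost-sensitive.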

DemaRQ works by first parsing a requirements specification using Natural Language Processing (NLP). Based on the NLP results, the tool then computes a set of features for each sentence in the specification. The features fall under four categories: token-based features capture token-level information, syntactic features are derived from syntax-related information, semantic features describe the semantic categories of the verbs, and frequency-based features characterize sentences based on document-level information. The computed features are aggregated in a feature matrix. DemaRQ then applies its pre-trained model to classify each sentence in the input requirements specification as a REQUIREMENT or a NON-REQUIREMENT.
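To make the notion of a feature matrix concrete, here is a minimal Java sketch that turns a list of sentences into one row of numeric features per sentence. It is a toy illustration, not DemaRQ's implementation: the class name, the regex-based tokenizer (standing in for a real NLP pipeline), the modal-verb list, and the three features chosen are all assumptions. A trained classifier would then consume each row to predict REQUIREMENT or NON-REQUIREMENT.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; the features here are illustrative, not DemaRQ's feature set.
final class FeatureMatrixSketch {

    // Assumed list of modal verbs, used only for this illustration.
    private static final Set<String> MODALS =
            new HashSet<>(Arrays.asList("shall", "must", "will", "should", "may"));

    // Naive tokenizer: lowercases and splits on non-alphanumeric characters.
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        for (String t : sentence.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    static int countModals(List<String> tokens) {
        int count = 0;
        for (String t : tokens) {
            if (MODALS.contains(t)) {
                count++;
            }
        }
        return count;
    }

    /**
     * One row per sentence with three columns:
     *   [0] token count (token-level feature),
     *   [1] number of modal verbs in the sentence,
     *   [2] fraction of sentences in the document containing a modal
     *       (a document-level feature, identical for every row).
     */
    static double[][] buildFeatureMatrix(List<String> sentences) {
        int n = sentences.size();
        double[][] matrix = new double[n][3];
        int sentencesWithModal = 0;
        for (int i = 0; i < n; i++) {
            List<String> tokens = tokenize(sentences.get(i));
            int modals = countModals(tokens);
            matrix[i][0] = tokens.size();
            matrix[i][1] = modals;
            if (modals > 0) {
                sentencesWithModal++;
            }
        }
        double modalRatio = (n == 0) ? 0.0 : (double) sentencesWithModal / n;
        for (int i = 0; i < n; i++) {
            matrix[i][2] = modalRatio;
        }
        return matrix;
    }

    public static void main(String[] args) {
        List<String> document = Arrays.asList(
                "The system shall log every failed login attempt.",
                "This section gives an overview of the logging subsystem.");
        for (double[] row : buildFeatureMatrix(document)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The document-level column needs a full pass over all sentences before any row is complete, which is why the matrix is filled in two loops.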

System Prerequisite

Java SE Runtime Environment Version 7 or higher

DemaRQ installation material

  • The most recent version of DemaRQ (including the source code) is on GitHub at this link and on Zenodo at this link.

  • Selected annotated requirements documents used for training DemaRQ can be found at this link.

How to cite?

Acknowledgement

This project has received funding from QRA Corp, Luxembourg's National Research Fund under the grant BRIDGES18/IS/12632261, and the European Research Council under the European Union's Horizon 2020 research and innovation programme (grant agreement No 694277).

Contact information

  • Sallam Abualhaija(*) [homepage], Chetan Arora(*,^) [homepage], Mehrdad Sabetzadeh(*,+) [homepage], Lionel C. Briand(*,+) [homepage].

  • Affiliation:

    • *: Interdisciplinary Centre for Security, Reliability and Trust.
      University of Luxembourg -- 29, Avenue John Fitzgerald Kennedy, L-1855 Luxembourg

    • ^: School of Information Technology, Deakin University, Geelong, Australia.

    • +: School of Electrical Engineering and Computer Science, University of Ottawa, Canada.

  • Emails: {abualhaija, arora, sabetzadeh, briand}@svv.lu