1MEIC03T03: Speech2Req - FEUP-MEIC-DS-2024-25/ai4sd GitHub Wiki
Our goal with Speech2Req is to help the process of Requirements Elicitation, the practice of researching and discovering the requirements of a system from users, customers, and other stakeholders, with a simple AI tool that can translate non-technical speech into technical requirements that developers can work with.
Vision
Non-technical speech recognition and its processing into technical requirements are Speech2Req's key features. Thus, Speech2Req would be used by an engineering project's Product Owners and Stakeholders to allow for a more streamlined Requirements Elicitation process.
Speech2Req addresses the need for clear communication, ensuring all team members understand project goals. It streamlines the process, reducing errors, misinterpretations, and delays while boosting collaboration between technical and non-technical teams.
Speech2Req differentiates itself in the market by addressing a common pain point: the communication gap between business stakeholders and development teams. This leads to increased customer satisfaction, as clients see faster project timelines and reduced rework due to misinterpretations.
Over time, it is possible to gather valuable data on common stakeholder-to-engineer communication patterns, potentially enhancing Speech2Req's AI/algorithm to better serve future clients, driving long-term innovation and competitiveness.
Research
Listed below are some similar tools and projects that we found during our research. We plan to draw inspiration from them and to address some of their shortcomings.
- IBM Engineering Requirements Management DOORS Next
This is a widely used tool that helps capture, trace, and manage requirements throughout the project lifecycle, using AI to analyze requirements, ensure consistency, and help prevent conflicts. It facilitates collaboration among stakeholders and helps with traceability and impact analysis. However, the tool's setup is very complex. Speech2Req, in contrast, aims for a simpler approach to the same process.
- Jama Connect
A modern requirements management tool that leverages AI for better traceability and requirements analysis. It helps teams capture, manage, and validate requirements, and its strong traceability across the development lifecycle makes it a good choice for complex projects. However, like DOORS Next, it has a steep learning curve and is better suited to larger-scale projects, whereas Speech2Req will be usable for smaller projects too.
- Requirements Assistant by RequirementsAI
An AI-based tool that helps automate requirements analysis. It uses NLP to extract and structure requirements from documents and stakeholder communications. Its main strengths are the automation of requirements gathering from text and its support for document processing. On the other hand, NLP models might struggle with complex or ambiguous language. While Speech2Req can't simply ignore the limitations of NLP, we have different strategies in mind to work around them. For instance, the model, upon detecting ambiguous language, could prompt the stakeholder to confirm their intent.
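The confirmation strategy mentioned above can be illustrated with a minimal sketch. The term list and the function name `findClarifications` are hypothetical and exist only for this example; in practice the detection would more likely be delegated to the language model than to a fixed word list.

```javascript
// Hypothetical list of vague terms, used only for illustration.
const VAGUE_TERMS = ["fast", "easy", "user-friendly", "secure", "soon", "some"];

// Returns one confirmation question for each vague term found in the statement.
function findClarifications(statement) {
  // Split the statement into lowercase words (hyphenated words stay whole).
  const words = statement.toLowerCase().match(/[a-z-]+/g) ?? [];
  return VAGUE_TERMS
    .filter((term) => words.includes(term))
    .map((term) => `You said "${term}". Can you give a measurable target for that?`);
}

const questions = findClarifications("The app should be fast and easy to use.");
console.log(questions);
```

A statement with no vague terms yields an empty list, in which case the requirement could be passed on without a follow-up question.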
Domain Analysis
The backend is a Node.js application using the Express.js framework, focusing on two main domains: managing chat interactions and integrating with the Google Generative AI API (referred to as Gemini in the code). The code follows a simplified MVC (Model-View-Controller) pattern, with controllers handling route logic and services encapsulating business logic.
Key domains:
- Chat: Covers managing chat interactions, such as sending and retrieving messages and listing histories. This logic is centralized in chatService.js, with dedicated routes in chatRoutes.js and a specific controller (chatController.js).
- Gemini: Handles communication with the generative AI API to process prompts and return responses. This is encapsulated in geminiService.js and accessed through geminiController.js.
Routes are configured in app.js, which also includes middleware for general setup. The modular design ensures scalability, and the use of services abstracts external API and database (Firebase) details.
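The controller/service split described above might look roughly like the following sketch. It is not the project's actual code: `createChatService`, `createChatController`, and `createMemoryStore` are hypothetical names, and an in-memory store stands in for Firebase so the example runs without external dependencies.

```javascript
// Service: business logic only. The store is injected, so database details
// (Firebase in the real project) stay behind a small interface.
function createChatService(store) {
  return {
    async sendMessage(chatId, text) {
      if (typeof text !== "string" || text.trim() === "") {
        throw new Error("Message text must not be empty");
      }
      const message = { role: "user", text: text.trim() };
      await store.append(chatId, message);
      return message;
    },
    async getMessages(chatId) {
      return store.list(chatId);
    },
  };
}

// Controller: translates an Express-style (req, res) pair into service calls.
function createChatController(chatService) {
  return {
    async postMessage(req, res) {
      try {
        const message = await chatService.sendMessage(req.params.chatId, req.body.text);
        res.status(201).json(message);
      } catch (err) {
        res.status(400).json({ error: err.message });
      }
    },
  };
}

// In-memory stand-in for the database, intended for local experiments only.
function createMemoryStore() {
  const chats = new Map();
  return {
    async append(chatId, message) {
      if (!chats.has(chatId)) chats.set(chatId, []);
      chats.get(chatId).push(message);
    },
    async list(chatId) {
      return chats.get(chatId) ?? [];
    },
  };
}
```

Because the controller only depends on the service, and the service only on the store interface, each layer can be exercised in isolation, for example by passing plain objects in place of `req` and `res`.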
Class Diagram:
Architecture and design
Describe the architecture and design of the tool. Use component/deployment diagrams. If needed, resort to package diagrams to organize them into more manageable parts.
Be clear about what is the current architecture/design and what is the one you envision in the future, in case they are different. Identify main risks and justify the most important choices to show the soundness of the architecture and design that you have implemented or plan to implement.
----- missing diagrams, do this tomorrow -----
Technologies
Backend
- Express: Used to create a RESTful API due to its simplicity and middleware support.
- Axios: Chosen for easy HTTP requests, keeping API calls modular and readable.
- CORS: Enables Cross-Origin Resource Sharing, necessary for external API access.
- Google Generative AI: Integrated for advanced AI features, enhancing user experience.
Client Restrictions
The client imposed no restrictions on technology choices; all selections were made to best suit the project needs.
Sprint 0 Implementation and Impact
In Sprint 0, a basic API prototype was built with core routes and middleware. This initial setup established a scalable backend structure and validated Google AI integration, giving a clear path for future development.
Sprint 1 Implementation and Impact
In Sprint 1, we improved the basic API prototype from Sprint 0. This included UI improvements, such as a better user experience through better error handling, as well as general AI improvements.
Sprint 2 Implementation and Impact
In Sprint 2, we began migrating our local system to the AI4SD superheroes interface, with the most notable developments being related to the app's Dockerization (in order to be able to run it remotely on the server). Error handling was also improved, along with other
Backlog after sprint 2:
Sprint 3 Implementation and Impact
In Sprint 3, our main focus was to complete the integration with the superhero interface. Our application now runs in the cloud, with new code updates pushed directly to the server and new images built remotely. Our team also did a lot of work on the global project, such as important optimizations to the pipeline's build process and the Docker secrets implementation. In this sprint we also did further work on the frontend and integrated Firebase into our backend, allowing for persistent user chats. The chats themselves were also improved and can now use context from previous messages. Finally, we implemented our key functionality, speech-to-text conversion, allowing for a hands-free experience.
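One simple way to give the model context from previous messages is to prepend the most recent turns to the new prompt. The sketch below assumes a message shape of `{ role, text }` and a hypothetical helper `buildPromptWithContext`; the project may instead rely on the chat-history support of the AI client library.

```javascript
// Builds a single prompt string from the most recent turns plus the new message.
// maxTurns bounds how much history is sent, which keeps the prompt size predictable.
function buildPromptWithContext(history, newMessage, maxTurns = 10) {
  // slice(-0) would return the whole array, so a zero limit is handled explicitly.
  const recent = maxTurns > 0 ? history.slice(-maxTurns) : [];
  const lines = recent.map((m) => `${m.role}: ${m.text}`);
  lines.push(`user: ${newMessage}`);
  return lines.join("\n");
}

const history = [
  { role: "user", text: "I want to build a game" },
  { role: "model", text: "Is it a mobile or desktop game?" },
];
console.log(buildPromptWithContext(history, "Mobile"));
```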
Development guide
Explain what a new developer to the project should know in order to develop the system, including how to build, run and test it in a development environment.
Document any APIs, formats and protocols needed for development (but don't forget that public APIs should also be accessible from the "How to use" above).
Describe coding conventions and other guidelines adopted by the team(s).
Security concerns
The project faces potential security vulnerabilities such as injection attacks, cross-site scripting (XSS), and cross-site request forgery (CSRF). To mitigate these risks, we employed input validation, escaping techniques, and content security policies. Authentication and authorization flaws are addressed through role-based access control (RBAC) and secure token management. Sensitive data exposure is minimized using encryption and environment variables, while dependency vulnerabilities are managed through regular updates and audits. Docker configurations follow best practices like using non-root users and minimal base images. Rate limiting and payload size checks prevent denial-of-service attacks, ensuring a robust, secure application infrastructure.
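As an illustration of the rate limiting and payload size checks mentioned above, here is a minimal fixed-window sketch. The names `createRateLimiter` and `isPayloadTooLarge` and all limits are assumptions made for this example; the project may well use an existing middleware instead.

```javascript
// Fixed-window rate limiter: at most `limit` requests per client per window.
// Assumes limit >= 1. The clock is injectable so the behaviour can be tested.
function createRateLimiter({ limit, windowMs, now = Date.now }) {
  const windows = new Map(); // clientId -> { start, count }
  return function allow(clientId) {
    const t = now();
    const current = windows.get(clientId);
    // Start a fresh window for new clients or once the old window has expired.
    if (!current || t - current.start >= windowMs) {
      windows.set(clientId, { start: t, count: 1 });
      return true;
    }
    if (current.count < limit) {
      current.count += 1;
      return true;
    }
    return false;
  };
}

// Rejects request bodies whose JSON encoding exceeds maxBytes.
function isPayloadTooLarge(body, maxBytes) {
  return Buffer.byteLength(JSON.stringify(body), "utf8") > maxBytes;
}
```

In an Express application, checks like these would typically run as middleware before the route handlers, replying with status 429 or 413 when a request is rejected.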
How to use
Any user with access to the superhero frontend can use our application. It's as simple as selecting Speech2Req from the sidebar and sending a prompt, either via text or speech. This prompt can even be something as simple as "I want to build a game". The system will then reply with relevant questions, such as "Is it a mobile or desktop game?", "Will it include 2D or 3D graphics?", and so on. This is an intentionally extreme example, used to demonstrate how our system can help with the elicitation step of requirements engineering. In a real context, the user would give a more specific prompt, which would make the process less cumbersome and more efficient. The user can then choose to export the generated requirements in different formats. Conversations with the chatbot are also stored in a database, meaning logged-in users can continue their work across multiple sessions.
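The export step could be organized as a small registry of formatters, as in the sketch below. The requirement shape `{ id, title, description }`, the two formats shown, and the function `exportRequirements` are all assumptions for illustration; the formats the tool actually offers are not listed here.

```javascript
// Hypothetical exporters, keyed by format name.
const exporters = new Map([
  ["json", (reqs) => JSON.stringify(reqs, null, 2)],
  ["markdown", (reqs) => reqs.map((r) => `- ${r.id} ${r.title}: ${r.description}`).join("\n")],
]);

// Converts a list of requirements to the requested format, or throws if unknown.
function exportRequirements(reqs, format) {
  const exporter = exporters.get(format);
  if (!exporter) throw new Error(`Unsupported format: ${format}`);
  return exporter(reqs);
}

const sample = [{ id: "R1", title: "Platform", description: "The game runs on mobile devices." }];
console.log(exportRequirements(sample, "markdown"));
```

Adding a new format then only requires registering another formatter, without touching the calling code.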
How to contribute
Explain what a new developer should know in order to develop the tool, including how to build, run and test it in a development environment.
Defer technical details to the technical documentation below, which should include information and decisions on architectural, design and technical aspects of the tool.
Contributions
Link to the factsheets of each team and of each team-member. For example: