1MEIC05T1: WardenAI - FEUP-MEIC-DS-2024-25/ai4sd GitHub Wiki
# WardenAI
Using WardenAI, you can check for vulnerabilities in your code.
## Vision
In an era of rapidly evolving cyber-threats, secure software development is more critical than ever. Vulnerabilities in code can lead to serious consequences, from data breaches and financial losses to reputational damage. For developers, ensuring code security can be complex, requiring specialized knowledge and extensive effort. WardenAI is a VS Code extension powered by artificial intelligence that detects security vulnerabilities in a programmer's code, helping developers identify and mitigate them.
The aim of the tool is to point out vulnerabilities introduced by the user's code. Our approach centers on usability and accessibility for developers of all skill levels. The extension provides alerts and remediation guidance tailored to the detected issues, offering specific insights that help developers resolve vulnerabilities.
In essence, this VS Code extension aims to help developers address security risks early in the software development process. Since the team's aim is for the tool to provide real-time feedback on potential vulnerabilities, it serves as a proactive assistant for writing secure code, supporting developers in creating resilient software with security built in from the start.
## Research
Survey of similar projects and analysis of their pros and cons when compared to the product to be developed.
## Domain Analysis
Include a high-level class diagram with key domain concepts. Complement this diagram with other high-level diagrams as appropriate (activity, sequence, etc.).
## Architecture and design
The prototype can generate a vulnerability report for a single file containing code in any language.
To use the prototype, the user uploads the file via the command line and then chooses whether to run the analysis offline, which is more secure and free but slower, or online, which is faster. Afterwards, the user receives a report in Markdown.
The system runs inside a Docker container, where:
- the Prompt Engine receives the file, asks the user to choose online or offline analysis, and sends the code to the appropriate subsystem;
- for offline analysis, Ollama (a program that makes it easier to run large language models locally) runs the llama3.1 model; for online analysis, the Prompt Engine communicates with the Gemini API.
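The dispatch step described above can be sketched as follows. This is an illustrative sketch only: the function names, the prompt wording, and the constants are assumptions, not WardenAI's actual code.

```python
# Hypothetical sketch of the Prompt Engine's dispatch step.
# The real module and function names in WardenAI may differ.

OFFLINE_MODEL = "llama3.1"  # model pulled into the Docker image for Ollama
ONLINE_MODEL = "gemini"     # served through the Gemini API

def build_prompt(source_code: str) -> str:
    """Wrap the user's file in a vulnerability-analysis instruction
    (the wording here is an assumption)."""
    return (
        "Analyse the following code for security vulnerabilities and "
        "report them as markdown:\n\n" + source_code
    )

def choose_backend(mode: str) -> str:
    """Route the request to the offline (Ollama) or online (Gemini) subsystem."""
    if mode == "offline":
        return OFFLINE_MODEL
    if mode == "online":
        return ONLINE_MODEL
    raise ValueError(f"unknown analysis mode: {mode!r}")
```

Keeping the routing decision in one small function makes it straightforward to add further backends later without touching the rest of the engine.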
This can be viewed in the following diagrams:
- Component Diagram
- Deployment Diagram
- Package Diagram
- Activity Diagram
- Use Case Diagram
The most difficult decision the team made was which AI systems/models to use. The prototype currently uses two different models for the same task, but they bring different advantages and target different types of users.
The main disadvantages of the current system are:
- the only way to interact with it is through the command line;
- it does not use the GPU to run the LLM locally, which makes the offline analysis very slow.
The system also needs easier integration with external systems. To address this, an API will be created, so the future architecture must reflect that.
## Technologies
### Prototype (Sprint 0)
The prototype released as a result of Sprint 0 was developed using the following technologies:
Technology | Choice | Description |
---|---|---|
Python | Own | Programming language: all team members are extremely familiar with the language |
Docker | Client | Given that WardenAI is to be integrated with a large set of other AI4SD tools, the Client expressed interest in the tool eventually being made available as a Docker service |
Ollama | Own | LLM runtime that can be executed locally on any machine, serving as a starting point for the use of AI in the tool and enabling the tool's Offline analysis mode (further described in this document) |
Gemini | Own | Google's chatbot offers a free tier that fulfills the product's needs, enabling the tool's Online analysis mode (further described in this document) |
The prototype released as a result of Sprint 0 offers the user the ability to check the code of one file via a terminal session (the prototype does not analyze the code in real-time, i.e., as the user writes it using VS Code). The user can choose between two modes of analysis:
- Offline, supported by Ollama: using a pre-downloaded (when building the Docker image) Ollama model, a less thorough analysis can identify the most common and simple security issues in the provided code.
- Online, supported by Gemini: relying on a connection with Google's Gemini chatbot, a more complete and thorough analysis is possible with the prototype.
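As an illustration, the offline mode could reach the local Ollama server through its REST API (`POST /api/generate` on port 11434 is Ollama's documented endpoint). The prompt wording and function names below are assumptions; WardenAI's actual code may differ.

```python
import json
import urllib.request

# Ollama's documented local REST endpoint for one-shot generation
OLLAMA_URL = "http://localhost:11434/api/generate"

def ollama_payload(source_code: str) -> bytes:
    """JSON body asking llama3.1 for a markdown vulnerability report.

    The prompt wording is an assumption, not WardenAI's actual prompt."""
    return json.dumps({
        "model": "llama3.1",
        "prompt": ("List the security vulnerabilities in this code "
                   "as markdown:\n\n" + source_code),
        "stream": False,  # ask for the full response in one message
    }).encode()

def analyze_offline(source_code: str) -> str:
    """Send the code to a locally running Ollama server and return its answer."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=ollama_payload(source_code),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because everything runs on localhost, the code under analysis never leaves the machine, which is what makes this mode the more private of the two.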
It is worth highlighting that the prototype is containerized, which already contributes to the aforementioned requirement that the tool be made available as a Docker image/service, establishing a relevant base for further development.
## Development guide
Explain what a developer new to the project should know in order to develop the system, including how to build, run and test it in a development environment.
Document any APIs, formats and protocols needed for development (but don't forget that public APIs should also be accessible from the "How to use" above).
Describe coding conventions and other guidelines adopted by the team(s).
## Security concerns
Identify potential classes of security vulnerabilities and explain what the team has done to mitigate them.
## Quality assurance
Describe which tools are used for quality assurance and link to relevant resources. Namely, provide access to reports for coverage and mutation analysis, static analysis, and other tools that may be used for QA.
## How to use
You can access the web app interface via http://104.155.4.93/assistants/wardenAI
You can also try the assistant's API via its OpenAPI specification tool on https://superhero-05-01-150699885662.europe-west1.run.app/docs
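A request against the API could look like the sketch below. The `/analyze` path and the JSON field names are hypothetical; check the OpenAPI page at `/docs` above for the real endpoint names and schemas.

```python
import json
import urllib.request

BASE_URL = "https://superhero-05-01-150699885662.europe-west1.run.app"

def build_request(code: str, mode: str = "online") -> urllib.request.Request:
    """Build a POST request for the assistant's API.

    "/analyze" and the {"code", "mode"} payload shape are assumptions --
    consult the OpenAPI specification for the actual contract."""
    payload = json.dumps({"code": code, "mode": mode}).encode()
    return urllib.request.Request(
        BASE_URL + "/analyze",  # hypothetical endpoint name
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    req = build_request("print('hello')")
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode())  # the markdown vulnerability report
```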
## How to contribute
Explain what a new developer should know in order to develop the tool, including how to build, run and test it in a development environment.
Defer technical details to the technical documentation below, which should include information and decisions on architectural, design and technical aspects of the tool.
António Rego, Daniel Bernardo, Pedro Beirão, Pedro Lima and Pedro Januário 2024