# 1MEIC07T5: TwisterAI (FEUP-MEIC-DS-2024-25/ai4sd)
The goal of our product is to generate and improve mutation tests based on a set of unit tests provided by the user, selecting the most relevant mutations according to the context the user supplies.
## Vision

Our vision is to create an AI chatbot that transforms mutation testing. The chatbot will suggest improvements, address weaknesses in the provided test set, and enhance the robustness and reliability of software.

Our product receives two prompts from the user: the context of the already developed code, including what needs improvement in its tests, and the existing set of unit tests. The chatbot then analyzes these inputs, identifies potential limitations, and suggests refinements. It also generates additional test cases that address uncovered scenarios, improving coverage.

As an AI assistant, the goal is to generate an extensive set of mutation tests and then select the most pertinent ones according to the user's input.
## Research

There are several existing tools for mutation testing; some of the most notable are:

- **PIT**: one of the most widely used mutation testing tools for Java.
  - Pros: uses selective mutation to reduce test-run time, integrates well with popular Java build tools, and has strong community support.
  - Cons: limited to Java, and configuring PIT for large or complex projects can take time.
- **Stryker**: a popular cross-platform mutation testing framework that supports JavaScript, TypeScript, C#, and Scala.
  - Pros: multi-language support, ease of use, fast performance, and integration with several popular frameworks.
  - Cons: configuration can be complex, especially when using multiple languages.
- **mutmut**: a mutation testing tool for Python that focuses on being simple and fast.
  - Pros: good pytest integration and terminal-based reporting.
  - Cons: limited features compared to other tools, and a smaller user base.
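To make the idea concrete, here is a minimal sketch of the kind of mutation these tools perform, arithmetic operator replacement, written in Python with the standard `ast` module. This is an illustration only, not code taken from any of the tools above:

```python
import ast

class SwapAddSub(ast.NodeTransformer):
    """Replace every '+' with '-': a classic arithmetic-operator-
    replacement (AOR) mutation, as performed by tools like PIT or MutPy."""
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node

source = "def total(a, b):\n    return a + b\n"
mutant_tree = SwapAddSub().visit(ast.parse(source))
mutant_src = ast.unparse(ast.fix_missing_locations(mutant_tree))
print(mutant_src)  # the mutant now computes a - b
```

A test suite "kills" this mutant only if it asserts on an input where `a + b` and `a - b` differ; mutants that survive reveal exactly the kind of weakness in a test set that our assistant is meant to surface.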
## Domain Analysis

### Class Diagram

### Sequence Diagram
## Architecture and design

This tool is built around three components: the WebApp, which is the front end; the mutation testing tool, which can target Python, Java, or JavaScript; and Gemini, the AI used to choose the most pertinent mutation tests.
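The flow between these three components can be sketched as follows. All function names and return values here are illustrative placeholders, not the actual codebase:

```python
# Hypothetical sketch of the three-component flow described above.

def run_mutation_tool(code: str, tests: str) -> list[str]:
    """Stand-in for the language-specific mutation testing tool
    (e.g. MutPy for Python). Returns placeholder mutant labels."""
    return [f"mutant_{i}" for i in range(3)]

def rank_with_gemini(mutants: list[str], context: str) -> list[str]:
    """Stand-in for the Gemini call that selects the most pertinent
    mutants given the user's context prompt. Keeps the top two here."""
    return mutants[:2]

def handle_request(context: str, code: str, tests: str) -> list[str]:
    """What the WebApp back end would do for one user request:
    generate mutants, then let the AI pick the most relevant ones."""
    mutants = run_mutation_tool(code, tests)
    return rank_with_gemini(mutants, context)

print(handle_request("billing module", "def f(): ...", "def test_f(): ..."))
```

The design choice this sketch reflects is the separation of concerns described above: mutant generation is deterministic and language-specific, while relevance ranking is delegated to the external AI service.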
## Risks and choices

- Risks: using external tools like Gemini poses a risk, since we are relying on the reliability and uptime of other services.
- Key choices: we chose Python and JavaScript as the first languages to support, since they are widely used and seemed straightforward enough to build a prototype and prove our concept.
## Technologies

Since the client didn't impose any restrictions, we explored existing mutation testing tools for Python and JavaScript, choosing the most well-known and widely used ones so we could have as much documentation and community support as possible. Later, we also implemented this tool for Java.

To run the mutation tests in Python, we use MutPy, since it is one of the most widely used mutation testing tools for the language.
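For reference, MutPy is driven through its `mut.py` command-line entry point; the module names below (`calculator`, `test_calculator`) are illustrative, not our actual modules:

```python
import shutil
import subprocess

# MutPy's documented CLI: --target names the module under test,
# --unit-test names its unit-test module. Install with `pip install mutpy`.
cmd = [
    "mut.py",
    "--target", "calculator",          # module under test (illustrative)
    "--unit-test", "test_calculator",  # its unit tests (illustrative)
]

if shutil.which("mut.py"):  # only invoke MutPy if it is installed
    subprocess.run(cmd)     # prints the mutation score and a killed/survived report
else:
    print("MutPy not installed; the command would be:", " ".join(cmd))
```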
We use Node.js as our JavaScript runtime.

Our implementations for Java and JavaScript don't rely on any external libraries.

To ensure a good understanding of our product, we designed mockups in Figma, a complete and easy-to-use tool that also lets us build clickable prototypes to navigate between pages.
## Prototype from Sprint 0

Sprint 0 served as a proof-of-concept phase, in which we were able to generate mutation tests for Python and JavaScript. We also created mockups for the frontend, which let us think about how users would interact with the product and how everything would fit together.

The following image is one of our mockups:
## Development guide

There are some dependencies you should be aware of in order to develop our project:

- MutPy and Node.js

Full instructions on how to run the project are available in the READMEs in the repository.

We follow some standard coding conventions:

- camelCase for variables and methods
- PascalCase for classes
## Security concerns

Some users may be working on confidential code, so it's important to ensure that code submitted to our tool isn't accessible to anyone else.
## Quality assurance

The development team manually tests the product to find and fix errors. Additionally, every pull request is reviewed by someone other than the author, to ensure the code meets our quality standards.
## How to use

Our assistant is available in the shared repository, where you can find our tool under TwisterAI. Alternatively, clone the T07_G05 repository and follow the instructions in its README. You can also run and test each individual component by opening its directory and following the instructions in that component's README.

You can see a short demo of what our product does here: demo
## How to contribute

First, you should be familiar with testing in the language you want to contribute to. Then, clone the repository, create a branch with clearly described commits, and open a pull request. For any further information, feel free to contact our Scrum Master, Ema Martins.
## Sprint 2 Retrospective

- What went well:
  - Quality work: all the features implemented were high-quality and fully functional, meeting the team's standards and expectations.
  - Appropriate pace: the team maintained a steady pace throughout the sprint, completing issues in a timely manner.
- What went wrong:
  - Pull request issues: PRs were not reviewed quickly, causing delays in development.
  - Communication problems: communication between teams was slow and unclear, so the project's direction was never obvious, which made development ineffective and cumbersome.
- Main problems to address:
  - Communication: communication needs to be open and clear in the future, to make sure we all understand our roles and what we want to do as a group.
  - PR reviews: we should review pull requests faster, so we don't create bottlenecks in development.
## Sprint 3 Retrospective

- What went well:
  - Quality work: all the features implemented were high-quality and fully functional, meeting the team's standards and expectations.
  - Good adaptability: the team had to work with many new parts of the infrastructure, but we were able to implement what we needed.
  - PR reviews: the team was faster and more efficient at reviewing pull requests, without compromising the quality of the work.
- What went wrong:
  - Communication problems: again, communication between teams was slow and unclear, so the project's direction was never obvious, which made development ineffective and cumbersome.
  - Inter-team coordination: it was very hard to coordinate everything that needed to be done with the other teams, to the point that the teams seemed to give up on coordinating and simply worked independently.
  - Lack of compromise: when integrating this work with other teams, everyone depends on each other; a lack of compromise, slow development, and a lack of accountability from the other teams slowed everyone down, leaving us with a lot of extra work at the end of the sprint because we couldn't implement what we needed earlier.