1MEIC06T1: ArchiDetect - FEUP-MEIC-DS-2024-25/ai4sd GitHub Wiki
Overview
The primary goal of this product is to develop an analysis tool that assesses the probability of specific architectural patterns existing within a software repository by examining various data points from GitHub. The end product will support decision-making by analyzing GitHub data, flagging possible patterns, and generating reports that highlight findings for developers, architects, and project managers. Below, you can find an index to quickly browse this page.
- Vision
- Research
- Domain Analysis
- Architecture and design
- Technologies
- Development guide
- Security concerns
- Quality assurance
- How to use
- Process of Development
- Happiness Metrics
- How to contribute
- Contributions
Vision
ArchiDetect empowers developers to swiftly uncover and analyze architectural patterns within their GitHub repositories without leaving their Visual Studio workflow. By integrating with Visual Studio, we’re transforming architecture analysis from a complex, time-consuming process into a routine part of the developer’s day-to-day.
With this tool, developers and stakeholders gain instant visibility into architectural trends and potential design challenges, allowing teams to align on a clear, data-backed understanding of a project’s needs. This means smarter, faster decisions that prevent issues before they escalate and help ensure every project is built on a solid architectural foundation.
For our team, this product represents a game-changing solution in a space currently underserved by existing tools. By automating architecture pattern detection and providing actionable insights, we’re helping development teams everywhere build stronger, more sustainable software—driving quality and innovation forward with every release.
Research
We found some projects that share similar purposes with ArchiDetect, namely:
CodeMaat is an open-source command-line tool that uses version control logs to produce data on coupling, complexity and module ownership.
- Pros: Analyses author contributions, modules that change together (logical coupling), change rate (churn), and other aspects of development, offering helpful insight into code quality issues.
- Cons: Doesn’t detect architectural patterns specifically.
CodeCharta is another open-source tool, focused on converting software metrics into interactive maps that visualize the codebase in a 3D cityscape format.
- Pros: Helps with code maintainability and highlights large or complex areas that may be candidates for refactoring.
- Cons: Doesn’t detect architectural patterns specifically.
SonarQube flags code issues such as bugs, vulnerabilities, and code smells, and it is widely used for continuous code quality and security analysis.
- Pros: Supports architectural rule definitions and dependencies, helping to maintain structural quality.
- Cons: Focuses only on static analysis of code and rule-based architectural checks, rather than identifying architectural patterns through repository activity and story points like ArchiDetect proposes to do.
Domain Analysis
Physical Diagram
After Sprint 0, Nexus was introduced as a component of the AI4SD architecture to serve as a source of repository data. The application therefore retrieves repository data from Nexus instead of relying on the GitHub API:
Sequence Diagram
Architecture and design
ArchiDetect, as described above, is a simple tool that follows a modular design, made up of the following components:
- The frontend, where the input data is entered and the output data is displayed.
- The backend, responsible for gathering data on the user's repository, communicating with Nexus, and building the prompts to send to the LLM.
- Nexus, which scrapes data from the user's repository.
- The LLMs, which recognize architectural patterns from the information the backend provides about the user's repository.
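As a rough illustration of the backend's prompt-building role, here is a minimal sketch. The `RepositoryData` shape, the `build_prompt` function, and the prompt wording are all assumptions made for this example; they do not reflect the actual Nexus payload or the prompts ArchiDetect sends.

```python
from dataclasses import dataclass, field


@dataclass
class RepositoryData:
    """Hypothetical shape of the repository data the backend receives from Nexus."""
    name: str
    file_paths: list[str] = field(default_factory=list)
    commit_messages: list[str] = field(default_factory=list)


def build_prompt(repo: RepositoryData, max_items: int = 50) -> str:
    """Assemble a plain-text prompt asking an LLM which patterns are likely present."""
    # Cap each list so the prompt stays within a reasonable size.
    files = "\n".join(f"- {path}" for path in repo.file_paths[:max_items])
    commits = "\n".join(f"- {msg}" for msg in repo.commit_messages[:max_items])
    return (
        f"Repository: {repo.name}\n\n"
        f"File tree:\n{files}\n\n"
        f"Recent commit messages:\n{commits}\n\n"
        "List the architectural patterns that are likely present in this "
        "repository and give a probability between 0 and 1 for each."
    )
```

Keeping prompt construction in a single pure function like this would make it easy to unit test without calling Nexus or an LLM.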
In the future, the app is expected to use several LLMs when recognizing architectural patterns, so that it can cross-reference their results and reach more robust conclusions.
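One possible way to cross-reference results from several LLMs is sketched below. It assumes each model returns a mapping from pattern name to probability; the `cross_reference` function, the `min_votes` threshold, and the averaging rule are illustrative choices, not the team's decided approach.

```python
from collections import defaultdict


def cross_reference(results: list[dict[str, float]], min_votes: int = 2) -> dict[str, float]:
    """Keep patterns reported by at least `min_votes` models and average their probabilities."""
    scores: dict[str, list[float]] = defaultdict(list)
    for model_result in results:
        for pattern, probability in model_result.items():
            scores[pattern].append(probability)
    return {
        pattern: sum(values) / len(values)
        for pattern, values in scores.items()
        if len(values) >= min_votes
    }
```

A pattern reported by a single model is dropped under this rule, which trades recall for fewer spurious detections.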
Moreover, users will be able to select which data is fed to the LLM, allowing them to filter out "noisy" data (for example, bad commit messages) when they recognize it as a problem in their repository.
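A minimal sketch of such a filter for commit messages follows. The heuristics here (a minimum length and a small set of generic messages) are assumptions chosen for illustration; what counts as a "bad" commit message in ArchiDetect is not specified on this page.

```python
# Hypothetical examples of generic, low-information commit messages.
GENERIC_MESSAGES = {"wip", "fix", "update", "changes", "minor changes", "small fixes"}


def is_noisy(message: str, min_chars: int = 10) -> bool:
    """Treat very short or generic commit messages as noise."""
    text = message.strip().lower()
    return len(text) < min_chars or text in GENERIC_MESSAGES


def filter_commit_messages(messages: list[str]) -> list[str]:
    """Return only the messages likely to carry architectural signal."""
    return [message for message in messages if not is_noisy(message)]
```

For example, `filter_commit_messages(["wip", "Extract payment gateway behind an adapter interface"])` keeps only the second message.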
Technologies
Identify the main technologies, languages and frameworks used. Clearly identify which ones were restrictions imposed by the client and which were your own choices. Justify your choices and explain in your own words the motivation for the restrictions of your client.
Explain the prototype or base implementation that you have implemented in Sprint 0, and how that has informed the rest of the development.
Development guide
Explain what a new developer to the project should know in order to develop the system, including how to build, run and test it in a development environment.
Document any APIs, formats and protocols needed for development (but don't forget that public APIs should also be accessible from the "How to use" section).
Describe coding conventions and other guidelines adopted by the team(s).
Security concerns
Identify potential security vulnerabilities classes and explain what the team has done to mitigate them.
Quality assurance
Describe which tools are used for quality assurance and link to relevant resources. Namely, provide access to reports for coverage and mutation analysis, static analysis, and other tools that may be used for QA.
How to use
Explain how to use your tool from a user standpoint. This can include short videos, screenshots, or API documentation, depending on what makes sense for your particular software and target users. If needed, link to external resources or additional markdown files with further details (please add them to this wiki).
Process of Development
Sprint 0
Sprint 1 Retrospective
At the end of Sprint 1, the team's overall opinion is that we worked well. However, some aspects should be improved to make our development process more agile: clearly defining the tasks that are not easily captured by user stories and assigning them to team members, and starting work on the sprint backlog sooner so that the work done can be reviewed more thoroughly.
At this point, there are still some doubts regarding the integration of the work developed by the teams with the AI4SD tool.
We decided to implement changes regarding the first aspect, so that Sprint 2 can begin with increased productivity among the team.
Boards at the end of Sprint 1
Sprint 2 Retrospective
In this Sprint, the team feels the planning was too optimistic and, with more work piling up from other courses, time management was a little chaotic.
For Sprint 3, we are holding team meetings more often, so that there is always a near-term objective to work towards.
Boards at the end of Sprint 2
Sprint 3
This is what our team's workload looks like at the beginning of the last Sprint of this project. We noticed that the Product Backlog had an item that was already in progress (Issue #11), so we moved it into the "In Progress" board.
Boards at the beginning of Sprint 3
Happiness Metrics
Here we have a table representing the level of happiness among the team members. The vertical axis shows the member doing the evaluation, and the horizontal axis shows the member being evaluated.
We use 🤠 as the best possible evaluation!
How to contribute
Explain what a new developer should know in order to develop the tool, including how to build, run and test it in a development environment.
Defer technical details to the technical documentation below, which should include information and decisions on architectural, design and technical aspects of the tool.
Contributions
Link to the factsheets of each team and of each team-member:
- Team 1