1MEIC05T3: LavraAI - FEUP-MEIC-DS-2024-25/ai4sd GitHub Wiki

LavraAI, your companion on code visualization.

With LavraAI, you can visualize the flowchart of your application with the click of a button.

Vision

Users like to abstract theories and knowledge into visual representation, so why not save "cpu resources" in understanding the code and finding problems or things that are complicating the flow of the work? This product envisions a code flow that is easy to understand and easy to pick up for users that are new to its development or those who want a fluid experience.

Our product LavraAI is a VSCode extension embedded with AI that will surely save all the user's stress in understanding what other developers did, unlike the traditional boring IDE experience where they have to adventure by themselves in this jungle of function calls and branches. In just a few clicks of the mouse, they can see clearly what is going on in the big picture with a sequence diagram. And if they wish to check for the details they can expand each of the functions into flow diagrams of the control flow context.

Research

There is a VSCode extension that currently makes it possible to render a PlantUML in the VSCode IDE. However, that extension does not give the user the ability to translate a piece of code into the UML representation, which is the main selling point of our product.

Survey of similar projects and analysis of their pros and cons when compared to the product to be developed.

Domain Analysis

Code Selection: The user selects a block of code via the frontend and chooses the type of diagram he wishes to view.
Code Translation: The backend sends a request to the Gemini LLM so that it translates the code into PlantUML language.
Diagram Generation: The backend generates the diagram and makes it appear on the frontend.

High-Level Class Diagram

high level class diagram

Activity Diagram

activity diagram

Include high-level class diagram with key domain concepts. Complement this diagram with other high-level diagrams has appropriate (activity, sequence, etc.).

Architecture and design

Component Diagram

component diagram

Frontend

User: This is the component on which the user of the program can interact with. It includes the code of the program and all the interactions he can do with our system, such as selecting the code, right clicking and previewing various diagram types.
LavraAIExtension: This is the component on which the LavraAI extension is located. It makes a request to the gemini API, using the code that was selected and an engineered prompt as input, it writes the output onto a text file and calls the PlantUML extension to display the diagram.
PlantUMLExtension: This is the component of the PlantUMLExtension which handles the mapping of the language of PlantUML into a visible diagram.

Backend

GeminiAPI: This is the component of the GeminiAPI, that receives an input of code and provides an output of a translated version of that code into PlantUML code.

Future Architecture

A future architecture example could attempt to remove certain components, or parts of components that are not required or that are inconvenient for the User. Components such as PlantUMLExtension may be removed in a later future if found a more convenient alternative for the user that doesn't require having this extension downloaded as well. In terms of parts of components that could be removed or improved, there is a case to improve the usability of the extension, such as being able to reload the diagram without creating a new one, generating a diagram as you type, querying the diagram in order for it to improve the current diagram. All of these measures are certainly things to consider in the near future.

Main Risks and Important Choices

Main Risks: The reliance on Gemini tokens and the PlantUML extension can be risky due to the fact that we don't have much control over these components. Another problem might be the scalability of the extension, due to the limited gemini tokens that are currently available and lack of better concurrency of requests by different users to the Gemini api.
Important Choices: The choice of providing different types of diagram for the user to pick is a very important feature that improves the variety and helpfulness of the extension. Also choosing for usability features, such as being able to query the diagram can be very important for the user to understand the coding problem in a more natural form.

Describe the architecture and design of the tool. Use component/deployment diagrams. If needed, resort to package diagrams to organize them into more manageable parts.

Be clear about what is the current architecture/design and what is the one you envision in the future, in case they are different. Identify main risks and justify the most important choices to show the soundness of the architecture and design that you have implemented or plan to implement.

Technologies

Our Choices

Typescript: For the creation of the extension, the use of typescript is mostly standardized. The other option we considered was javascript, but eventually settled on typescript due to it being generally less prone to errors.
PlantUML: To create the diagrams, we fine tuned the Gemini to be able to translate into PlantUML language, due to the fact that the PlantUML extension requires this.

Client's Choices

Gemini: The use of the Gemini API as the LLM for the translation of code into PlantUML was the client's suggestion, due to the fact that each member of the group has access to a certain amount of gemini tokens, making it financially viable to maintain the extension until the end of the project's date.

Prototype

For Sprint 0, developments have been made in both the frontend and the backend. In the front end the user can already interact with the extension and send a block of selected code, as well as choosing the type of diagram that he wishes to visualize. LavraAI extension also alreeady is able to display diagrams calling the PlantUML extension. However the PlantUML extension requires PlantUML code in order to generate the diagrams, so in the backend we were able to fine tune a Gemini LLM so that in the frontend we connect and request it to translate the user selected code into PlantUML language.

These implementations have informed us of the challenges that are yet to come. How do we increase the usability of the user? How do we maintain the performance and consistency of results of the GeminiAPI? How do we increase the features of the extension in a way that makes sense? How do we deal with errors? How do we deal with repository commits of the diagrams? All of these questions have surged after the development of Sprint 0 and hopefully will be resolved in future iterations.

Development guide

Prerequisites

Npm: Requires installation of npm, since the project uses typescript it naturally requires npm. First the download of the npm packages are required by use of the installer found in the website. To get the latest version run:

npm install -g npm

Typescript: Typescript can be downloaded globally using the following command:

npm install -g typescript

The project also requires download of other extensions from visual studio code in order for it to run. The extensions to download can all be found in the README.md of the project as well as the necessary steps to take afterwards.

Coding Conventions

For coding conventions, an approach of pair or sometimes triple programming was made, so that everyone is familiarized with the same coding convention. The scrum master reviews the code of each commit and makes sure that the conventions are the same throughout. Same name conventions, function names, error handling methods,...

Explain what a new developer to the project should know in order to develop the system, including how to build, run and test it in a development environment.

Document any APIs, formats and protocols needed for development (but don't forget that public APIs should also be accessible from the "How to use" above).

Describe coding conventions and other guidelines adopted by the team(s).

Security concerns

Throughout the development of the extension during the Sprint, it was found a notable security issue. The use of the Key for the Gemini API endpoint is situated in the frontend extension code, This may not be the best approach since the client can now have access to the key and use it himself, and possibly even sabotaging the Gemini API that we are using. Further measures to make this key to become unattainable by the user is yet to be implemented, however prototyping of what the architecture of the program should be in order to hide this key has started and is currently in progress.

Another problem might be of a user trying to confuse the system by making the code that he selected be a AI command or instruction that makes the Gemini LLM respond in a negative way, potentially dangerous. Since the Gemini is tasked to translate the code, it shouldn't take the user input as an instruction, but as a block of code, however it could be somehow possible and if so a big security problem could arise if no action is taken.

Quality assurance

For Quality Assurance, for now it has been thought of creating Unit Tests to verify certain functionalities of the extension. More particularly, testing of individual sectors of code inside of the extension frontend, and also a verification of PlantUML code results provided by the GeminiAPI.

How to use

Upon selecting a portion of code, the user can right-click to check additional actions provided by the IDE, Visual Studio Code in this case. If the user has the LavraAI extension installed, one of the actions provided will be "Preview Control Flow". The shortcut for this action is Ctrl + Alt + P.

After this, the user will be prompted to choose between either an Activity Diagram or a Sequence Diagram

Sprint Retrospective

Sprint 1

Work done

During sprint 1, development started by building features that allow the user to start having some control over the output through a refresh command and a very simple chat. Front and Backend were separated, since the GEMINI request is no longer being sent by the frontend, but now there's a Docker layer between the user and the request, furthering our model into what the final merged product will look like.

What we did well

Sprint 1 was good, the dev team managed to complete every item on the sprint backlog before the deadline, with all items tested and reviewed to make sure everything worked fine. The team also communicated openly and without problems, which led to an easier development without constraints or waiting time.

What didn't go well

The team underestimated themselves and the work to be done, with most of the sprint backlog items done way before the end of the sprint. The team was also very "closed off" from the other teams.

What do we want to do differently?

Next sprint the team will look into increasing the load of work, but still try not to overestimate. Next iteration we must also seek to communicate more often with the other teams in order to develop a more cohesive product.

Sprint 2

Work done

During sprint 2, most of the focus went towards improving the chat experience, which now keeps track of the chat context with the user, allowing for much better responses. Furthermore, we also integrated our backend with the main repository (AI4SD) and started working on a common extension front-end.

The dev team also focused some time on polishing the existing code by fixing some coding no-no's. For example, our code no longer has hard-coded strings - every string is in a JSON file which is then used to load each string.

Before this sprint, the extension only accepted one client at a time. Now it has multiple-client support with automatic inactive client disconnections.

Seeing as development of our agent was going smoothly and with little time to implement other features on the product backlog, the dev team also worked together with other teams to improve on the idea of the VSCode extensions merger, which consists of a single big extension containing most of, if not all, of the VSCode AI agents developed by the other teams.

What we did well

Much like the previous sprint, sprint 2 was good. Not only was every item in the sprint backlog completed, but work was also put into the development of the AI4SD architecture as a whole. Our team continued to communicate with ease with each other but also opened the communication with other teams, leading to a more cohesive product within AI4SD.

What didn't go well

Unlike in Sprint 1, we slightly overestimated the workload this time. While we managed to complete everything on schedule, the team felt the time was insufficient. Additionally, a significant amount of time was spent waiting, both for the pipeline execution to finish and for feedback on pipeline errors (or the absence of them).

What do we want to do differently?

In the next iteration, we must reduce the workload. As for the main AI4SD repository, we should have taken the initiative earlier regarding general aspects of AI4SD. In the upcoming sprint, we aim to address this by starting sooner.

Sprint 3

Work Done

We tried to incorporate secrets within our backend, but were forced to backtrack and place the key hard-coded. We also developed a new modular system for adding new extensions to the common AI4SD X-Men.

What we did well

We managed to integrate our system within AI4SD's X-Men and have a complete, working extension.

What didn't go well

Some issues out of our control arived when trying to add secrets, so we wasted time. We also didn't finish as soon as we wanted to, which left things in the end to be slightly rushed. We also developed the new modular system for the extension of AI4SD X-Men, which couldn't be merged as early as we would've wanted since we had to handle concurrent changes from other groups from time to time.

Contributions

Team:
- João Lourenço (PO)
- Tiago Cruz (SM)
- HaoChang Fu
- João Cardoso
- Tomás Xavier
- Afonso Osório