# Configuration
This section contains detailed explanations of the different configuration choices of the project. It aims to provide a clear understanding of the project structure and how to configure it for your needs. It is divided into several sections, each focusing on a specific aspect of the project.
## :clipboard: Table of Contents
- :clipboard: Table of Contents
- :floppy_disk: Caching system
- :wrench: Management of the cached files
- :world_map: Route system
- :gear: Services
- :robot: Agents System
- :rocket: Deployment
## :floppy_disk: Caching system
For our project, we have decided to create a custom caching system that stores all of its content inside a Google Cloud bucket.
This caching system stores the content of Notion pages and databases, because the Notion API is very slow, making it unusable for near-real-time conversation. It also stores the content of the PDF, TXT, and MD files uploaded by the user. All of this content is kept in a Google Cloud bucket and is used to generate the indexes and the metadata of the files.
### :memo: Notion caching system
The Notion caching system is relatively simple. First, we get the content of the Notion page or database using the Notion API. Then, we store that content in the Google Cloud bucket as a `.pkl` file (a Python pickle file). Finally, we generate the indexes and the metadata of the file: the indexes are generated using the LlamaIndex library, and the metadata is extracted from the Notion page or database by a custom function that uses a custom prompt with the OpenAI GPT-4o-mini model.
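As a minimal sketch of the idea (not the project's actual `cache.py`; the bucket name, blob paths, and text extraction are assumptions for illustration):

```python
# Sketch: fetch a Notion page once, pickle it to the bucket, then index it.
import pickle

from google.cloud import storage
from llama_index.core import Document, VectorStoreIndex
from notion_client import Client

notion = Client(auth="NOTION_TOKEN")           # assumed credentials
bucket = storage.Client().bucket("my-bucket")  # assumed bucket name

def cache_notion_page(page_id: str) -> VectorStoreIndex:
    # 1. Pull the page blocks from the (slow) Notion API once.
    blocks = notion.blocks.children.list(block_id=page_id)
    # 2. Persist the raw content as a pickle so later requests skip the API.
    bucket.blob(f"cache/{page_id}/data.pkl").upload_from_string(
        pickle.dumps(blocks)
    )
    # 3. Build a LlamaIndex index over the text so queries stay fast.
    text = str(blocks)  # real code would extract text per block type
    return VectorStoreIndex.from_documents([Document(text=text)])
```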
For a detailed example of how to connect to the Notion API and get the content of a page or database, you can check the examples page:
For the actual implementation of the Notion caching system, you can check the file `cache.py` located in the folder `services/notion`. It is available in the link below:
### :file_folder: File caching system
The file caching system is similar to the Notion caching system. The only difference is that large PDF files are split into batches, which keeps us from holding too much data in memory at once (see the sketch after the file list below). The rest works like the Notion caching system.
The actual implementation of the file caching system is divided into 2 files inside the folder `services/documents`:

- `pdf.py` handles the caching of PDF files, which are parsed using the LlamaIndex library.
- `text.py` handles the caching of text files.
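The batching idea can be sketched as follows; the batch size and blob layout are assumptions, not the project's actual values:

```python
# Sketch: parse a large PDF with LlamaIndex and cache it in page batches.
import pickle

from google.cloud import storage
from llama_index.core import SimpleDirectoryReader

BATCH_SIZE = 50  # pages per batch (assumed)

def cache_pdf_in_batches(path: str, bucket: storage.Bucket) -> None:
    # The default PDF reader yields roughly one Document per page.
    docs = SimpleDirectoryReader(input_files=[path]).load_data()
    for i in range(0, len(docs), BATCH_SIZE):
        batch = docs[i : i + BATCH_SIZE]
        blob = bucket.blob(f"cache/{path}/data_batch_{i // BATCH_SIZE}.pkl")
        # Only one batch is held in memory at a time.
        blob.upload_from_string(pickle.dumps([d.text for d in batch]))
```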
### :cloud: Cache of the list of files on the Google Cloud bucket
We have also set up a second caching system for the list of files retrieved from the Google Cloud bucket. We had to do this because, once the bucket held many files, the API call to list them all took too long. We therefore cache the list of files in a JSON file stored in the same bucket. This cache has a lifetime of 60 seconds, as illustrated below.
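A minimal sketch of such a 60-second listing cache, assuming the cache blob name and using the blob's own update time as the clock:

```python
# Sketch: cache the bucket file listing in a JSON blob with a 60 s TTL.
import json
from datetime import datetime, timedelta, timezone

from google.cloud import storage

CACHE_BLOB = "cache/_file_list.json"  # assumed blob name
TTL = timedelta(seconds=60)

def list_files(bucket: storage.Bucket) -> list[str]:
    blob = bucket.blob(CACHE_BLOB)
    if blob.exists():
        blob.reload()  # populate blob.updated with the last write time
        if datetime.now(timezone.utc) - blob.updated < TTL:
            return json.loads(blob.download_as_text())
    # Cache missing or stale: do the expensive full listing once.
    names = [b.name for b in bucket.list_blobs()]
    blob.upload_from_string(json.dumps(names))
    return names
```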
## :wrench: Management of the cached files
All the cached files are stored in a Google Cloud bucket, inside a `cache` folder located at the root of the bucket. For each file we cache, we create at least 3 files (an illustrative layout is shown below):

- the data file: this file contains the content of the cached file.
- the metadata file: this file contains the metadata of the file, i.e. information about the file itself and its place in the virtual folder structure.
- the index file: this file contains the indexes of the file, so we do not have to regenerate them each time we want to query the file. It is generated using the LlamaIndex library.

> [!NOTE]
> For bigger PDF files, we split the raw data into batches to avoid holding too much data in memory; the metadata and index files work the same as for any other file.
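As an illustration, the layout for one cached document might look like this (the exact file names in the project may differ):

```
cache/
└── my_document/
    ├── data.pkl          # raw content (split into data_batch_N.pkl for large PDFs)
    ├── metadata.json     # file info and virtual folder path
    └── index.pkl         # persisted LlamaIndex index
```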
### :page_facing_up: Basic file management
Our file management system is relatively simple. We have a folder called `cache` that contains all the cached files, and each file is stored in a subfolder named after the file. Most queries made to the bucket go through a `document_service` that handles the query to the bucket and the caching of the files.
### :file_folder: Virtual folder structure
To manage the files more efficiently, we have created a virtual folder structure. This gives us a better organization of the files while keeping them all in the same physical folder. It simplifies deleting and updating files, as well as retrieving their indexes and metadata. The sketch below illustrates the idea.
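A hypothetical metadata entry shows how a flat bucket layout can still carry a folder hierarchy; the exact schema in the project may differ:

```python
# The object stays at cache/report.pdf/... in the bucket; only this
# metadata field decides where it appears in the UI's folder tree.
metadata = {
    "file_name": "report.pdf",
    "file_type": "pdf",
    "virtual_path": "projects/2024/reports",  # assumed field name
}

# Moving a file between virtual folders is then a one-field metadata
# update instead of copying objects around the bucket.
metadata["virtual_path"] = "archive/2024"
```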
## :world_map: Route system
To learn more about the route system, you can check the following link:
## :gear: Services
In our project, we use services to handle the different parts of the application logic. This separates the logic from the routes, gives the code a better organization, and lets us reuse it in different parts of the project. A hypothetical illustration of this split is shown below.
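In this sketch, the route and service function names are placeholders rather than the repository's real ones:

```python
# Sketch: a thin Flask route that delegates all logic to a service.
from flask import Blueprint, jsonify

from services import document_service  # assumed service module

files_bp = Blueprint("files", __name__)

@files_bp.route("/files/<name>")
def get_file(name: str):
    # The route only parses the request and shapes the response;
    # all bucket and cache logic lives in the service layer.
    content = document_service.get_cached_file(name)  # assumed helper
    return jsonify({"name": name, "content": content})
```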
## :robot: Agents System
In our project, we have implemented an agent workflow system to respond to complex user queries. This system lets the agents communicate with each other to build the answer for the user.
The workflow is composed of 3 agents and 4 steps (a minimal sketch follows the list):
- Setup: this step sets up the agents and the variables used in the rest of the workflow.
- Query: this step uses the query agent to retrieve the relevant information from the data source, using the tools available to it. The result is then passed to the review step.
- Review: this step uses the review agent and its tools to check whether the information is correct. Depending on the outcome, it either passes the answer to the write step or returns it directly to the user.
  > [!NOTE]
  > The review step can call the write step at most 2 times. This avoids piling up write steps and keeps the workflow free of infinite loops.
- Write: this step uses the write agent to generate the final answer for the user. The write agent follows its prompt to produce a new answer, which is sent back to the review step.
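Since the project builds on LlamaIndex, a loop like this could be expressed with its workflow API. The sketch below is illustrative only: the event classes, the agent objects (`query_agent`, `review_agent`, `write_agent`), and the shape of the review verdict are assumptions, not the project's actual code.

```python
# Sketch: setup -> query -> review -> write loop with a capped retry count.
from llama_index.core.workflow import (
    Context, Event, StartEvent, StopEvent, Workflow, step,
)

class QueryEvent(Event):
    query: str

class ReviewRequest(Event):
    answer: str

class WriteRequest(Event):
    feedback: str

class AgentFlow(Workflow):
    @step
    async def setup(self, ctx: Context, ev: StartEvent) -> QueryEvent:
        # Initialize shared state, e.g. the write-step counter.
        await ctx.set("write_calls", 0)
        return QueryEvent(query=ev.query)

    @step
    async def query(self, ev: QueryEvent) -> ReviewRequest:
        answer = await query_agent.run(ev.query)  # assumed agent object
        return ReviewRequest(answer=str(answer))

    @step
    async def review(
        self, ctx: Context, ev: ReviewRequest
    ) -> WriteRequest | StopEvent:
        verdict = await review_agent.run(ev.answer)  # assumed: .ok / .feedback
        calls = await ctx.get("write_calls")
        if verdict.ok or calls >= 2:  # cap write calls to avoid infinite loops
            return StopEvent(result=ev.answer)
        await ctx.set("write_calls", calls + 1)
        return WriteRequest(feedback=verdict.feedback)

    @step
    async def write(self, ev: WriteRequest) -> ReviewRequest:
        answer = await write_agent.run(ev.feedback)  # assumed agent object
        return ReviewRequest(answer=str(answer))

# Usage: result = await AgentFlow().run(query="...")
```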
Now that we have explained the workflow system, let's look at the different agents it uses.
### :mag_right: Querying agent
Our query agent is a simple agent with 4 tools for querying the data source, which it uses in the following order (a sketch of their wiring follows the list):
- analyze query: the agent first checks whether the query can be answered directly with the data already available in the conversation history. If it can, the agent returns the answer to the review agent; otherwise, it uses the remaining tools to query the data source.
- metadata search: the agent filters the indexes by metadata to keep only the most relevant ones for the query. Discarding irrelevant indexes improves performance and avoids querying too many of them.
- context search: the agent filters the content of the selected indexes to extract the passages most relevant to the query.
- web search: the agent searches the web with Tavily to retrieve relevant content that is not available in the local data.
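One hypothetical way to wire these up is as LlamaIndex `FunctionTool`s; the function names, signatures, and placeholder bodies below are illustrative, not the project's real implementations:

```python
# Sketch: the query agent's four tools as LlamaIndex FunctionTools.
from llama_index.core.tools import FunctionTool

def analyze_query(query: str) -> str:
    """Decide whether the chat history already answers the query."""
    ...

def metadata_search(query: str) -> list[str]:
    """Filter cached indexes by their metadata and return relevant ids."""
    ...

def context_search(query: str, index_ids: list[str]) -> str:
    """Query the selected indexes and return the matching passages."""
    ...

def web_search(query: str) -> str:
    """Search the web through Tavily when local data is not enough."""
    ...

# FunctionTool infers each tool's name and description from the docstring.
query_tools = [
    FunctionTool.from_defaults(fn=f)
    for f in (analyze_query, metadata_search, context_search, web_search)
]
```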
### :pencil2: Write agent
Our write agent is a simple agent with no tools, but it follows a custom prompt to generate the answer. It uses the information provided by the query agent and the review agent to produce a comprehensive final answer for the user.
### :eyes: Review agent
Our review agent is a simple agent with 3 tools for checking whether the generated answer is correct, which it uses in the following order (a sketch of the last check follows the list):
- context checker: the agent checks that the generated answer matches the context of the question, i.e. that it is actually relevant to what was asked.
- instruction checker: the agent checks that the generated answer follows the instructions in the question. For example, if the question asks for code, it checks that the answer contains code; if the user asks for an answer in a specific language, it checks that the answer is written in that language.
- word counter: the agent checks that the length of the generated answer matches the user request. For example, if the user asks for a text of 100 words, it checks that the answer is about 100 words long.
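The word-counter check, for instance, can be as simple as the sketch below; the tolerance value is an assumption, not the project's actual setting:

```python
# Sketch: check that an answer's length matches the requested word count.
def check_word_count(answer: str, requested: int, tolerance: float = 0.1) -> bool:
    """Return True if the answer is within +/-10% of the requested length."""
    count = len(answer.split())
    return abs(count - requested) <= requested * tolerance
```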
## :rocket: Deployment
The project is deployed using Google Cloud Run, which gives us a simple and easy way to run the project in the cloud. The deployment uses a Docker container built from a Dockerfile.
### :cloud: Google Cloud bucket setup and configuration

On the Google Cloud side, we have set up a Google Cloud bucket to store the files uploaded by the user. The files are stored in a folder called `cache`.
### :robot: Automatic deployment
To simplify the deployment process, we have set up a continuous deployment system using the built-in tooling of Google Cloud.
To do this, we created a build trigger connected to the main branch of the project; it starts each time a change is pushed to that branch.
### :whale: Docker configuration
For the deployment of this project, we have decided to use a Docker container, which gives us an easy way to deploy the project to a Google Cloud Run service.
To make the Docker build easy, we have created a `Dockerfile` at the root of the project. This file replicates the steps explained in the `README.md` file to create the Docker image (an illustrative sketch follows the list). This includes:
- installing Python and the requirements,
- installing and updating the npm packages and dependencies,
- generating the static CSS file for Tailwind CSS,
- checking the Google Cloud bucket storage,
- and starting the app using gunicorn.
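As an illustration only, such a Dockerfile could look roughly like the sketch below; the Python and Node versions, file paths, and the `app:app` module name are assumptions, and the bucket-storage check is omitted. The real file in the repository root is authoritative.

```dockerfile
# Illustrative sketch; not the project's actual Dockerfile.
FROM python:3.12-slim

WORKDIR /app

# Install the Python requirements.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Install and update the npm packages and dependencies.
RUN apt-get update && apt-get install -y --no-install-recommends npm \
    && rm -rf /var/lib/apt/lists/*
COPY package.json package-lock.json ./
RUN npm install

# Generate the static Tailwind CSS file.
COPY . .
RUN npx tailwindcss -i ./static/css/input.css -o ./static/css/output.css --minify

# Start the app with gunicorn; Cloud Run injects the PORT variable.
CMD exec gunicorn --bind :$PORT --workers 2 app:app
```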
gunicorn is a Python WSGI HTTP server for UNIX. It uses a pre-fork worker model: it forks multiple worker processes to handle requests, which gives better performance and scalability than the built-in Flask development server.
To have a detailed view of the Dockerfile, you can follow the link below: