Costs - davidmarsoni/llog GitHub Wiki
Cost is an important factor when using LLMs and cloud services. This section explains the cost of each service used in the project and how to reduce it.
:mag: Tavily
Tavily is a web search engine. Our project uses it to give our agents a web search tool.
The Tavily API is free for the first 1,000 requests per month. After that, each request costs 0.008 dollars. As our project is not used by real users, we don't expect to exceed the free tier.
For more information about Tavily API pricing, see the official documentation here.
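As a rough illustration of the pricing described above, the following sketch estimates a monthly Tavily bill from a request count. The function and constant names are ours, and the figures are the ones quoted in this section, so check them against the current pricing page before relying on the result.

```python
# Figures quoted in this section; they may change over time.
FREE_REQUESTS_PER_MONTH = 1000
PRICE_PER_EXTRA_REQUEST = 0.008  # dollars


def tavily_monthly_cost(requests: int) -> float:
    """Estimate the monthly Tavily cost in dollars for a number of requests."""
    # Only requests beyond the free tier are billed.
    billable = max(0, requests - FREE_REQUESTS_PER_MONTH)
    return billable * PRICE_PER_EXTRA_REQUEST


if __name__ == "__main__":
    for count in (800, 1500):
        print(count, "requests ->", round(tavily_monthly_cost(count), 2), "dollars")
```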
:robot: OpenAI API
The OpenAI API is mostly a paid service. The current free tier is so limited that it cannot support a real project.
The cost of the OpenAI API varies widely depending on the model, the number of requests, and the number of tokens consumed. For our project, we use the GPT-4o-mini model. At the time of writing, it is OpenAI's most cost-effective model, and it suits our project well.
This model currently costs around 0.4 dollars per 1 million tokens, averaged over input and output tokens, which are priced separately. This is very low compared to OpenAI's other models.
For example, 100 requests of 10,000 tokens each (input and output combined) add up to 1 million tokens and cost around 0.4 dollars.
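The arithmetic behind this example can be sketched as follows. This is a simplified estimate that applies the single rate of about 0.4 dollars per million tokens quoted above to input and output alike; the function name and the default rate are our own assumptions, not part of the OpenAI API.

```python
def chat_cost(requests: int, tokens_per_request: int,
              price_per_million: float = 0.4) -> float:
    """Estimate the cost in dollars of a batch of chat requests.

    price_per_million is the approximate rate quoted in the text
    (an assumption; real billing prices input and output tokens separately).
    """
    total_tokens = requests * tokens_per_request
    return total_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # 100 requests of 10,000 tokens each = 1 million tokens in total.
    print(chat_cost(100, 10_000))
```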
The screenshot below, taken on 12 July 2024, shows the OpenAI API pricing page for this model.
To see the current pricing of the OpenAI API, you can check the official documentation here.
With LlamaIndex, we also use the OpenAI API to generate the indexes. They are generated with another OpenAI model, text-embedding-ada-002. This model is cheap and well suited to generating indexes: it costs 0.1 dollars per 1 million tokens, which is very low compared to OpenAI's other models.
To put this in perspective, 1 million tokens represent roughly 750,000 words, or about 1,000 to 2,500 pages of text, so generating the indexes costs very little.
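To estimate the indexing cost of a document from its word count, the conversion above can be sketched like this. The words-per-token ratio and the price are the approximate figures quoted in this section; the names are hypothetical, and real token counts depend on the text and the tokenizer.

```python
# Approximate figures from this section (assumptions, not exact values).
WORDS_PER_TOKEN = 0.75             # 750,000 words per 1 million tokens
EMBEDDING_PRICE_PER_MILLION = 0.1  # dollars per 1 million tokens


def embedding_cost_for_words(word_count: int) -> float:
    """Estimate the embedding cost in dollars for a text of word_count words."""
    estimated_tokens = word_count / WORDS_PER_TOKEN
    return estimated_tokens / 1_000_000 * EMBEDDING_PRICE_PER_MILLION


if __name__ == "__main__":
    # Roughly 1 million tokens, the example used in the text.
    print(round(embedding_cost_for_words(750_000), 4))
```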
The screenshot below, taken on 12 July 2024, shows the OpenAI API pricing page for this model.
To see the current pricing of the OpenAI API, you can check the official documentation here.
:cloud: Google Cloud Storage
Google Cloud Storage, one of Google's file storage solutions, is a paid service. Its cost depends on the amount of data stored and the number of requests made to the service.
For our project, we use a Google Cloud Storage bucket to store the files and indexes. These files are small: for example, the indexes, raw data, and metadata for a 600-page PDF file take up around 1 MB.
Moreover, Google Cloud Storage is very cheap for small amounts of data. For our use case, standard storage in the europe-west6 (Zurich) region currently costs around 0.025 dollars per GB per month, which is negligible for our usage.
For current detailed pricing of Google Cloud Storage, see the official pricing page here.
:rocket: Google Cloud Run
Google Cloud Run is a serverless container service that runs your Docker containers in the cloud. Its cost depends on several factors, such as the number of requests, the amount of CPU and memory used, and the duration of the requests. A free tier covers the first 2 million requests, the first 180,000 vCPU-seconds, and the first 360,000 GB-seconds of memory each month. To use this free tier, you need to deploy your app in the us-central1 region; the free tier isn't available in the other regions.
For our project, as we don't have any real usage of the app by end users, the free tier is enough to cover the cost of the service.
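To check whether a given workload stays within the free tier described above, a rough estimate can be sketched as follows. It assumes that requests are handled one at a time and that CPU and memory are billed only while a request is being processed, which is a simplification. The class and function names are hypothetical, and the limits are the ones quoted in this section.

```python
from dataclasses import dataclass

# Monthly free-tier limits quoted in this section.
FREE_REQUESTS = 2_000_000
FREE_VCPU_SECONDS = 180_000
FREE_GB_SECONDS = 360_000


@dataclass
class MonthlyUsage:
    requests: int               # number of requests per month
    avg_request_seconds: float  # average duration of one request
    vcpus: float                # vCPUs allocated to the container
    memory_gb: float            # memory allocated to the container, in GB


def free_tier_fractions(usage: MonthlyUsage) -> dict:
    """Return the fraction of each free-tier limit consumed (1.0 = fully used)."""
    # Simplifying assumption: no concurrency, billing only during requests.
    busy_seconds = usage.requests * usage.avg_request_seconds
    return {
        "requests": usage.requests / FREE_REQUESTS,
        "vcpu_seconds": busy_seconds * usage.vcpus / FREE_VCPU_SECONDS,
        "gb_seconds": busy_seconds * usage.memory_gb / FREE_GB_SECONDS,
    }


def fits_free_tier(usage: MonthlyUsage) -> bool:
    """True if every metered resource stays within its free-tier limit."""
    return all(f <= 1.0 for f in free_tier_fractions(usage).values())


if __name__ == "__main__":
    light = MonthlyUsage(requests=10_000, avg_request_seconds=2.0,
                         vcpus=1, memory_gb=0.5)
    print(free_tier_fractions(light), fits_free_tier(light))
```

With these assumptions, the vCPU-seconds limit is usually reached long before the request limit, because LLM-backed requests tend to take several seconds each.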
For detailed pricing of Google Cloud Run, you can check the official pricing page here.
If you want an estimate of the cost of Google Cloud Run, you can use the Google Cloud Pricing Calculator. Here, you will have to enter the following parameters:
:package: Artifact Registry
Google Cloud Artifact Registry is a service that stores the built Docker images. Like Google Cloud Storage, it is billed mainly on the amount of data stored. As the project builds are larger than the files stored in the Google Cloud Storage bucket, this service costs a bit more than Google Cloud Storage. Also, we were not aware at first that Artifact Registry keeps Docker images indefinitely unless you delete them, so we set up rules to delete builds after 24 hours. This is a good practice to avoid accumulating data in Artifact Registry and incurring excessive costs.
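A deletion rule of this kind can be expressed as a cleanup policy file. The sketch below builds such a file in Python; it is not necessarily the exact rule used in this project. The field names follow our understanding of the Artifact Registry cleanup policy JSON format and should be verified against the official documentation, and the policy name and output file name are hypothetical.

```python
import json


def build_cleanup_policy(max_age: str = "1d") -> list:
    """Build a cleanup policy that deletes images older than max_age.

    Assumption: the policy file is a JSON list of rules, each with a name,
    an action, and a condition; "1d" stands for the 24 hours used in the text.
    """
    return [
        {
            "name": "delete-old-builds",  # hypothetical policy name
            "action": {"type": "Delete"},
            "condition": {
                "tagState": "any",        # apply to tagged and untagged images
                "olderThan": max_age,
            },
        }
    ]


if __name__ == "__main__":
    # Write the policy to a file that can then be applied to the repository.
    with open("cleanup-policy.json", "w", encoding="utf-8") as handle:
        json.dump(build_cleanup_policy(), handle, indent=2)
    print(json.dumps(build_cleanup_policy(), indent=2))
```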
:memo: Notion API
The Notion API is free to use. However, it has significant limitations. The main one is the amount of text that can be retrieved per second.
For example, to get the content of a page of, say, 5,000 characters, the API returns only 1,000 characters at a time, and you have to wait a few seconds before getting the next 1,000 characters. This is a big limitation when a page contains a lot of text.
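To get a feel for the delay this causes, the example above can be turned into a small estimate. This sketch only reproduces the arithmetic of the example: the chunk size comes from the text, the 3-second wait is an arbitrary stand-in for "a few seconds", and neither value is an official Notion limit. The function name is hypothetical.

```python
import math


def estimate_fetch_seconds(page_chars: int, chunk_chars: int = 1000,
                           wait_seconds: float = 3.0) -> float:
    """Estimate the waiting time needed to fetch a page chunk by chunk.

    chunk_chars and wait_seconds are assumptions taken from the example
    in the text, not documented Notion API limits.
    """
    chunks = math.ceil(page_chars / chunk_chars)
    # One wait between each pair of consecutive chunks, none after the last.
    return max(0, chunks - 1) * wait_seconds


if __name__ == "__main__":
    # A 5,000-character page is fetched in 5 chunks, with 4 waits in between.
    print(estimate_fetch_seconds(5000))
```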
For a detailed view of the limitations of the Notion API, see the official documentation here.
:moneybag: Overall cost
The overall cost of the project is manageable. The two most expensive services are the OpenAI API and Artifact Registry. The other services are very cheap, and their free tiers are enough for the project. Overall, the project costs less than 1 dollar per month if we don't deploy the app. If we deploy it, the cost is around 5 to 10 dollars per month, depending on usage.