ai ollama

Ollama is a streamlined tool for running open-source LLMs locally, such as Mistral and Llama 2. It bundles model weights, configuration, and data into a single package defined by a Modelfile.

Ollama supports a variety of LLMs, including LLaMA 2, uncensored LLaMA variants, CodeLLaMA, Falcon, Mistral, Vicuna, WizardCoder, and uncensored Wizard models.

Ollama also supports creating and using custom models. You describe a model in a Modelfile and build it with the ollama create command, which reads the Modelfile, creates the model's layers, writes the weights, and reports success when it finishes.

One of Ollama's notable features is support for importing models in the GGUF and GGML file formats through the Modelfile. If a model is not in the Ollama library, you can create it locally, iterate on it, and upload it to the Ollama library to share with others when you are ready.
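A minimal Modelfile sketch for importing a local GGUF file. The filename, the temperature value, and the system prompt below are illustrative assumptions, not part of any particular model:

# Modelfile (illustrative): import a local GGUF file into Ollama
# The filename below is an assumption; point FROM at your own .gguf file
FROM ./mistral-7b-instruct.Q4_K_M.gguf
# Optional: tweak a sampling parameter and set a system prompt
PARAMETER temperature 0.7
SYSTEM You are a concise technical assistant.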

Running Models Using Ollama

ollama run codellama  # pulls the model automatically if it is not installed yet

ollama pull llama2
ollama pull llama2-uncensored
ollama pull llama2:13b
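
Once a model has been pulled, you can check what is installed and run a one-off prompt straight from the command line (the prompt text here is just an example):

ollama list                                        # show locally installed models
ollama run llama2 "Explain GGUF in one sentence."  # answer a single prompt, then exit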

Ollama Web UI

git clone https://github.com/ollama-webui/ollama-webui.git

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v ollama-webui:/app/backend/data --name ollama-webui --restart always ghcr.io/ollama-webui/ollama-webui:main
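
Once the container is up, the web UI is available at http://localhost:3000 (host port 3000 is mapped to 8080 in the container), and the --add-host flag lets it reach the Ollama server running on the host via host.docker.internal.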

Lite version

git clone https://github.com/ollama-webui/ollama-webui-lite.git
cd ollama-webui-lite
pnpm i && pnpm run dev
visit http://localhost:3000/

Models

Ollama stores models in the ~/.ollama/models directory on your local machine. This directory contains every model you have downloaded or created; the model data itself lives in a subdirectory named blobs.

When you download a model with the ollama pull command, its manifest is stored under ~/.ollama/models/manifests/registry.ollama.ai/library/<model>/latest. If you specify a particular tag during the pull operation, the manifest is stored under ~/.ollama/models/manifests/registry.ollama.ai/library/<model>/<tag> instead.
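
For example, after pulling llama2 and llama2:13b, the layout looks roughly like this (the output shown is illustrative):

ls ~/.ollama/models
# blobs  manifests

ls ~/.ollama/models/manifests/registry.ollama.ai/library/llama2/
# 13b  latest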

If you want to import a custom model, create a Modelfile whose FROM instruction points at the local file path of the model you want to import. After building the model with the ollama create command, you can run it with the ollama run command.
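
A sketch of that workflow, assuming the Modelfile shown earlier sits in the current directory and using the arbitrary model name mymodel:

ollama create mymodel -f ./Modelfile   # build the model from the Modelfile
ollama run mymodel                     # start an interactive session with it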

Using Ollama with Python

You can also use Ollama with Python. LiteLLM is a Python library that provides a unified interface to interact with various LLMs, including those run by Ollama.

To use Ollama with LiteLLM, first make sure the Ollama server is running (it listens on http://localhost:11434 by default). Then call the litellm.completion function to send requests to it. Here's an example of how to do this:

from litellm import completion

response = completion(
    model="ollama/llama2",              # the "ollama/" prefix tells LiteLLM to use the Ollama backend
    messages=[{"content": "respond in 20 words. who are you?", "role": "user"}],
    api_base="http://localhost:11434"   # address of the local Ollama server
)

print(response)
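
LiteLLM can also stream tokens from Ollama as they are generated. A small sketch, assuming the same model and server as above:

from litellm import completion

# Request a streaming response instead of waiting for the full completion
response = completion(
    model="ollama/llama2",
    messages=[{"content": "respond in 20 words. who are you?", "role": "user"}],
    api_base="http://localhost:11434",
    stream=True,
)

for chunk in response:
    # Chunks follow the OpenAI-style delta format; content may be None for some chunks
    print(chunk.choices[0].delta.content or "", end="", flush=True)
print()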