PrivateGPT on Photon OS on WSL2

Provisioning an offline PrivateGPT as a RAG-powered application has become easy nowadays. The following recipe uses the langchain-python-rag-privategpt example from Ollama, see https://github.com/ollama/ollama/tree/main/examples/langchain-python-rag-privategpt.

First, install Photon OS on WSL2, see https://github.com/dcasota/photonos-scripts/wiki/Photon-OS-on-WSL2.

Log in to Photon OS on WSL2. On Windows, run the following in PowerShell:

# change here
$ROOTLESS_USER="dcaso"
$distroname="Ph5"
wsl -d $distroname -u $ROOTLESS_USER -e /bin/bash
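
Once the shell opens, a quick sanity check confirms that you are on Photon OS as the intended rootless user (not part of the original recipe, just a verification sketch):

whoami
grep PRETTY_NAME /etc/os-release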

Configure langchain-python-rag-privategpt.

# (Re-)Install Ollama
sudo tdnf install -y python3-pip python3-devel git
cd $HOME
sudo rm -rf ollama
sudo rm -rf .ollama
export RELEASE=0.1.38
# Pin the installer to this release: sed rewrites the download URL to contain a literal
# $RELEASE, which the exported RELEASE variable expands when sh runs the script.
curl -fsSL https://ollama.com/install.sh | sed "s#https://ollama.com/download#https://github.com/ollama/ollama/releases/download/v\$RELEASE#" | sh
# Get the Ollama source examples
git clone -b v$RELEASE https://github.com/ollama/ollama.git
cd ollama/examples/langchain-python-rag-privategpt

# Create and use the venv without sudo; a root-owned venv plus sudo pip would
# install the packages system-wide instead of into the venv.
python3 -m venv .venv
source .venv/bin/activate
pip3 install --upgrade pip
export PATH=$PATH:/home/$ROOTLESS_USER/.local/bin
pip3 install -r requirements.txt
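
A quick import test helps verify that the dependencies landed in the virtual environment (a minimal sketch; the two packages are taken from requirements.txt):

python3 -c "import langchain, chromadb; print('dependencies OK')"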

# In Ollama version 0.1.32, there was a limitation due to an outdated subcomponent version.
# This was reported in https://github.com/ollama/ollama/issues/2572.
# With the workaround below, processing hundreds of PDF documents works flawlessly.
# sudo sed -i "s/chromadb = \"^0.3.26\"/chromadb = \"^0.4.7\"/" ./pyproject.toml
# pip3 install pyproject.toml
# pip3 install chroma-hnswlib
# pip3 install pybind
# pip3 uninstall langchain-community -y
# pip3 uninstall -r requirements.txt -y
# pip3 install tqdm
# pip3 install langsmith
# pip3 install huggingface-hub
# pip3 install langchain
# pip3 install gpt4all
# pip3 install chromadb
# pip3 install llama-cpp-python
# pip3 install urllib3
# pip3 install PyMuPDF
# pip3 install unstructured
# pip3 install extract-msg
# pip3 install tabulate
# pip3 install pandoc
# pip3 install pypandoc
# pip3 install sentence_transformers

# In Ollama 0.1.38, there is still an issue.
# As a workaround, updating all outdated Python packages seems to help.
# pip --disable-pip-version-check list --outdated --format=json | python -c "import json, sys; print('\n'.join([x['name'] for x in json.load(sys.stdin)]))" | sudo xargs -n1 pip install -U
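
The same bulk upgrade can be written as a more readable loop (a sketch equivalent to the one-liner above; run it inside the activated venv):

# upgrade every outdated package reported by pip, one at a time
for pkg in $(pip --disable-pip-version-check list --outdated --format=json \
      | python -c "import json,sys; print('\n'.join(x['name'] for x in json.load(sys.stdin)))"); do
    pip install -U "$pkg"
done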

# Create the `source_documents` directory and store all your PDF documents in it
mkdir -p $HOME/ollama/examples/langchain-python-rag-privategpt/source_documents

# Download and store PDF documents, for example:
curl https://www.ensi.ch/de/wp-content/uploads/sites/2/2024/05/ENSI_Erfahrungs_und_Forschungsbericht_2023.pdf -o $HOME/ollama/examples/langchain-python-rag-privategpt/source_documents/ENSI_Erfahrungs_und_Forschungsbericht_2023.pdf

# Start the Ollama server in the background
ollama serve &
# Pull llama2-uncensored
ollama pull llama2-uncensored
# Run ingest.py
python ./ingest.py

That's all. Now start `python privateGPT.py`.
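
The example scripts read their settings from environment variables. Assuming your checked-out privateGPT.py follows the upstream example and reads MODEL (defaulting to llama2-uncensored), a different chat model can be used like this (a sketch; verify the variable name in your copy):

ollama pull mistral
MODEL=mistral python privateGPT.py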


As a first mass test, I copied all 1300+ PDF documents from www.ensi.ch into source_documents. In short:

  • On Windows: Download and configure HTTrack Website Copier to mirror the website content.
  • On WSL2:
    cd $HOME/ollama/examples/langchain-python-rag-privategpt
    find /mnt/c/'My Web Sites/ENSI' -name '*.pdf' >./allpdfs.txt
    
    # Write import.sh; the quoted heredoc delimiter keeps $ and $( ) literal
    cat <<'EOFInstall' | sudo tee ./import.sh
    #!/bin/bash
    # Increment the trailing number of a marker string, e.g. "1_" -> "1_1" -> "1_2"
    increment_version() {
        local version=$1
        local num_part=${version##*[!0-9]}
        local str_part=${version%%$num_part}
        local next_num=$((num_part + 1))
        echo "${str_part}${next_num}"
    }
    
    # Copy each listed PDF into ./source_documents; on a filename collision,
    # prefix the copy with an incrementing marker instead of overwriting.
    new_ending="1_"
    while IFS= read -r line; do
       copyfile=$line
       fname=$(basename "$copyfile")
       if [ -f "./source_documents/$fname" ]; then
          cp "$copyfile" "./source_documents/$new_ending$fname"
          new_ending=$(increment_version $new_ending)
       else
          cp "$copyfile" "./source_documents/$fname"
       fi
    done <./allpdfs.txt
    EOFInstall
    
    sudo chmod a+x ./import.sh
    sudo ./import.sh
    
    echo "data integrity check - files in ./source_documents:"
    ls ./source_documents -1q | wc -l
    echo "data integrity check - lines with filenames in ./allpdfs.txt:"
    wc -l ./allpdfs.txt
    # Both counts should match: every listed file gets copied (collisions are renamed, not skipped).
    
    # If PDF filenames contain non-UTF-8 characters, convert them first with convmv:
    # curl -J -L -O https://j3e.de/linux/convmv/convmv-2.05.tar.gz
    # sudo tar -xzvf convmv-2.05.tar.gz
    # sudo ./convmv-2.05/convmv -f iso-8859-15 -t utf-8 -r --notest "/mnt/c/My Web Sites/ENSI"
    
    python ./ingest.py
    
    And start `python ./privateGPT.py` again.

Here are a few impressions of question prompts.

Multi-language support

Currently, ingest.py and privateGPT.py use all-MiniLM-L6-v2 as the embeddings model. According to https://ollama.com/blog/embedding-models, Ollama supports three embedding models: mxbai-embed-large, nomic-embed-text and all-minilm. This area is still very much a developer playground. For example, mxbai-embed-large is not available under the Hugging Face `sentence-transformers` path that ingest.py expects.
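
For reference, an Ollama-served embedding model can be exercised directly via the REST API described in the blog post above (assumes the ollama server from earlier is still running on the default port 11434):

ollama pull nomic-embed-text
curl http://localhost:11434/api/embeddings -d '{"model": "nomic-embed-text", "prompt": "The sky is blue because of Rayleigh scattering"}'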

According to this entry, multi-language support for embedding models in Ollama isn't ready yet.

The recipe from https://github.com/ollama/ollama/issues/2965 seems to work.

In the langchain-python-rag-privategpt directory, run the following commands.

sudo tdnf install -y git-lfs git
git lfs install

git clone -b b2536 https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
pip3 install -r requirements.txt

# paraphrase-multilingual-MiniLM-L12-v2
export Model=paraphrase-multilingual-MiniLM-L12-v2
export HuggingFacePath=https://huggingface.co/sentence-transformers
git clone $HuggingFacePath/$Model
python3 convert-hf-to-gguf.py ./$Model --outfile ./models/$Model.gguf --outtype f32
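
If the conversion succeeded, the GGUF file should now exist in the models directory (quick check):

ls -lh ./models/$Model.gguf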

# Qwen1.5-7B-Chat
# export Model=Qwen1.5-7B-Chat
# export HuggingFacePath=https://huggingface.co/Qwen
# git clone $HuggingFacePath/$Model
# python3 convert-hf-to-gguf.py ./$Model --outfile ./models/$Model.gguf --outtype f16

cd models
cat <<EOF | sudo tee ./Modelfile
FROM ./paraphrase-multilingual-MiniLM-L12-v2.gguf
EOF
ollama create paraphrase-multilingual-MiniLM-L12-v2 -f ./Modelfile
cd ../..
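
A short check that Ollama registered the new model (sketch):

ollama list | grep paraphrase-multilingual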

Now edit ingest.py.

embeddings_model_name = os.environ.get('EMBEDDINGS_MODEL_NAME', 'paraphrase-multilingual-MiniLM-L12-v2')

And edit privateGPT.py.

embeddings_model_name = os.environ.get("EMBEDDINGS_MODEL_NAME", "paraphrase-multilingual-MiniLM-L12-v2")

Run `python ingest.py`, and then `python privateGPT.py`.

Remark: To avoid a few deprecation messages, run `pip3 install langchain_community` and modify privateGPT.py with

#!/usr/bin/env python3
from langchain.chains import RetrievalQA
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain_community.vectorstores import Chroma
from langchain_community.llms import Ollama

Troubleshooting

AttributeError: module 'torch.library' has no attribute 'register_fake'

With Ollama 0.3.2 there is a compatibility issue with torch. A workaround is to upgrade torch to 2.4 (latest):

pip3 install --upgrade torch
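
To verify which torch version is active inside the venv (sketch):

python3 -c "import torch; print(torch.__version__)"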