Then I run the model with the following options. Let me know if there are better choices.
```sh
./main \
  --model ./models/ggml-vicuna-13b-4bit.bin \
  --color \
  --threads 8 \
  --batch_size 256 \
  --n_predict -1 \
  --top_k 12 \
  --top_p 1 \
  --temp 0.36 \
  --repeat_penalty 1.05 \
  --ctx_size 5120 \
  --instruct \
  --reverse-prompt '### Human:' \
  --file prompts/vicuna.txt
```

And a variant with 4 threads and a 2048-token context:

```sh
./main \
  --model ./models/ggml-vicuna-13b-4bit.bin \
  --color \
  --threads 4 \
  --batch_size 256 \
  --n_predict -1 \
  --top_k 12 \
  --top_p 1 \
  --temp 0.36 \
  --repeat_penalty 1.05 \
  --ctx_size 2048 \
  --instruct \
  --reverse-prompt '### Human:' \
  --file prompts/vicuna.txt
```
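The two invocations above differ only in `--threads` and `--ctx_size`. A common starting point for `--threads` is the machine's core count; a minimal sketch for deriving it, assuming GNU coreutils' `nproc` is available (this helper is not part of the original notes):

```shell
# Derive a --threads value from the number of CPUs visible to the process.
# nproc is an assumption here; on macOS, `sysctl -n hw.ncpu` is the equivalent.
THREADS="$(nproc)"
echo "Suggested flag: --threads ${THREADS}"
```

On machines with hyper-threading, using the physical core count (often half of `nproc`) can perform better for llama.cpp-style CPU inference, so treat the value as a starting point rather than a rule.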
And here is my prompt file:
```
A chat between a curious human and an artificial intelligence assistant.
The assistant gives helpful, detailed, and polite answers to the human's questions.
```
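To make the run reproducible, the prompt file referenced by `--file prompts/vicuna.txt` can be created like this (a minimal sketch; the contents are exactly the two lines above):

```shell
# Create the prompt file that the --file flag points at.
mkdir -p prompts
cat > prompts/vicuna.txt <<'EOF'
A chat between a curious human and an artificial intelligence assistant.
The assistant gives helpful, detailed, and polite answers to the human's questions.
EOF
```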
Project files: [eachadea/ggml-vicuna-13b-4bit](https://huggingface.co/eachadea/ggml-vicuna-13b-4bit)
Also, for the web UI:
https://learnubuntu.com/install-conda/
https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g
~/miniconda3/condabin/conda
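The links above suggest a Python-based web UI run from a conda environment (the notes only give the conda install guide and the GPTQ model page, so the exact UI is an assumption). A typical first step with the `conda` binary at that path would look like this sketch; the environment name `textgen` and the Python version are hypothetical:

```shell
# Hypothetical setup sketch -- env name and Python version are assumptions,
# not taken from the original notes.
~/miniconda3/condabin/conda create -n textgen python=3.10 -y
# `conda activate` requires the shell hook from `conda init`;
# until then, environments can also be used via `conda run -n textgen ...`.
```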