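Preface

Anyone learning about large models wants to deploy one in their own environment and experiment with it. The recent release of DeepSeek prompted me to try exactly that.

Preparation

This note uses one Azure virtual machine (16 vCPUs, 64 GB memory) with a Python environment. The model is DeepSeek-R1-Distill-Qwen-1.5B.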

VLLM CPU : https://docs.vllm.ai/en/latest/getting_started/installation/cpu/index.html
VLLM Project : https://github.com/vllm-project/vllm
DeepSeek-R1 : https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Hands-on

Step 1: Create a Linux virtual machine in Azure

Use the Ubuntu Server 24.04 LTS - Gen2 image and choose the size Standard D16ads v5 (16 vCPUs, 64 GiB memory).

Step 2: Log in to the Linux VM and install the Python environment

sudo apt update

apt list --upgradable

sudo apt install python3
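
The tracebacks later in this note show paths under ~/.venv with python3.12, so the packages were installed into a virtual environment. Before installing anything it may be worth confirming which interpreter is active. The snippet below is only a sketch: the minimum version in MIN_VERSION is a placeholder I picked for illustration, not a documented vLLM requirement.

```python
import sys

# Assumed floor for illustration only; check the vLLM docs for the real requirement.
MIN_VERSION = (3, 9)


def describe_python(min_version=MIN_VERSION):
    """Return a short report on the interpreter running this script."""
    return {
        "version": ".".join(str(part) for part in sys.version_info[:3]),
        "meets_minimum": sys.version_info[:2] >= min_version,
        # Inside a venv, sys.prefix points at the venv and differs from base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    print(describe_python())
```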

Step 3: Install the compilers for the vLLM CPU build and download vLLM (https://github.com/vllm-project/vllm)

sudo apt-get update  -y

sudo apt-get install -y gcc-12 g++-12 libnuma-dev

sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 10 --slave /usr/bin/g++ g++ /usr/bin/g++-12

git clone https://github.com/vllm-project/vllm.git

cd vllm

pip install -r requirements-cpu.txt

pip install -e .

Error message:

~$ git clone https://github.com/vllm-project/vllm.git

Cloning into 'vllm'...

fatal: unable to access 'https://github.com/vllm-project/vllm.git/': Failed to connect to github.com port 443 after 133171 ms: Couldn't connect to server

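The plan was to git clone vLLM and build it from source, but GitHub could not be reached from the VM. I therefore prepared a vLLM ZIP file in advance, uploaded it to a Storage Account, then downloaded it on the Linux VM with wget and unzipped it. This roundabout approach got the vLLM source onto the machine.

Once vLLM is downloaded, enter the vLLM directory and install the dependencies listed in requirements-cpu.txt.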
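
The download-and-unzip workaround can also be scripted. The following is a minimal sketch using only the Python standard library; the function name fetch_and_unzip, the local file name vllm.zip, and the URL in the usage comment are placeholders of mine, and the URL must be replaced with wherever the ZIP was actually uploaded.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path


def fetch_and_unzip(url, dest_dir):
    """Download a ZIP archive from `url` and extract it under `dest_dir`."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / "vllm.zip"

    # Stream the response to disk instead of holding it all in memory.
    with urllib.request.urlopen(url) as response, open(archive, "wb") as out:
        shutil.copyfileobj(response, out)

    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        # Collect the top-level entries so the caller knows where to cd.
        top_level = sorted({name.split("/")[0] for name in zf.namelist()})
    return [dest / name for name in top_level]


# Hypothetical usage; the URL is a placeholder, not a real location:
# fetch_and_unzip("https://<your-storage-url>/vllm.zip", "src")
```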

pip install --upgrade pip

pip install "cmake>=3.26" wheel packaging ninja "setuptools-scm>=8" numpy

pip install -v -r requirements-cpu.txt --extra-index-url https://download.pytorch.org/whl/cpu
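
Since the build that follows takes a long time, a quick check that the toolchain is actually on PATH can save a failed run. This is a rough sketch: the tool names come from the apt and pip commands in this step, and the helper name missing_tools is my own.

```python
import shutil

# Tool names taken from the apt/pip commands in this step.
REQUIRED_TOOLS = ("gcc-12", "g++-12", "cmake", "ninja")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("missing:", ", ".join(missing) if missing else "none")
```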

Step 4: After the dependencies are installed, install the vLLM CPU build with the command below

VLLM_TARGET_DEVICE=cpu python setup.py install

Note: the setup.py file has a bug here and must be modified before installing. Edit the get_vllm_version() function in setup.py so that it begins as follows:

def get_vllm_version() -> str:
    try:
        version = get_version(
            write_to="vllm/_version.py",  # TODO: move this to pyproject.toml
        )
    except LookupError:
        version = "0.0.0"
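
My assumption about why this patch is needed: the source came from a ZIP file rather than a git clone, so there is no git metadata for setuptools-scm to read, and its version lookup raises LookupError. The patch catches that and falls back to a placeholder version. The same pattern in isolation, with a stand-in function instead of the real get_version (both helper names below are hypothetical):

```python
def version_or_default(get_version, default="0.0.0"):
    """Call `get_version`; fall back to `default` if it raises LookupError."""
    try:
        return get_version()
    except LookupError:
        return default


def broken_lookup():
    # Stand-in for a version lookup that fails, e.g. in a tree without git metadata.
    raise LookupError("no version metadata found")


print(version_or_default(broken_lookup))    # 0.0.0
print(version_or_default(lambda: "1.2.3"))  # 1.2.3
```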

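PS: Running "VLLM_TARGET_DEVICE=cpu python setup.py install" takes quite a long time.

Step 5: Once the installation finishes, the CPU build of vLLM is installed

Because this VM is in the China region, Hugging Face can be reached through a mirror endpoint. Add the following environment variable on the Linux VM to use the Hugging Face mirror.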

export HF_ENDPOINT=https://hf-mirror.com
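
If the model is loaded from a Python script rather than the shell, the same variable can be set in code. A small sketch follows; as far as I know the Hugging Face libraries read HF_ENDPOINT when they are imported, so it should be set before those imports. The helper name use_hf_mirror is my own.

```python
import os

MIRROR = "https://hf-mirror.com"  # the mirror endpoint used in this note


def use_hf_mirror(endpoint=MIRROR):
    """Set HF_ENDPOINT unless the environment already defines it."""
    return os.environ.setdefault("HF_ENDPOINT", endpoint)


active = use_hf_mirror()
print("HF_ENDPOINT =", active)
# Import huggingface_hub / transformers / vllm only after this point.
```

Using setdefault means a value already exported in the shell wins over the one hard-coded here.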

Step 6: Load the DeepSeek 1.5B model

vllm serve "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

Error 1: operator torchvision::nms does not exist


File "/home/lbadmin/.venv/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 1805, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lbadmin/.venv/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 1819, in _get_module
    raise RuntimeError(

RuntimeError: Failed to import transformers.processing_utils because of the following error (look up to see its traceback):

operator torchvision::nms does not exist

Reinstalling torchvision resolved the problem above.

pip install --force-reinstall torchvision --extra-index-url https://download.pytorch.org/whl/torchvision/
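
In my experience this kind of missing-operator error usually points to torch and torchvision builds that do not match, so it can help to print what is installed before and after the reinstall. The sketch below reads package metadata only, which means it works even when importing the package itself fails. The helper name installed_version is my own.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("torch", "torchvision", "vllm"):
        print(f"{name:12} {installed_version(name)}")
```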

Error 2: RuntimeError: Failed to infer device type

File "/home/lbadmin/.venv/lib/python3.12/site-packages/vllm/engine/arg_utils.py", line 1074, in create_engine_config
    device_config = DeviceConfig(device=self.device)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lbadmin/.venv/lib/python3.12/site-packages/vllm/config.py", line 1626, in __init__
    raise RuntimeError("Failed to infer device type")

RuntimeError: Failed to infer device type

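The operation failed! Judging from material I found online, the cause appears to be that the selected model does not support running on this CPU.

Experiment failed!

If loading the model in Step 6 had succeeded, the model could have been tested with the request below: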


curl -X POST "http://<public ip>:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
        "messages": [
            { "role": "user", "content": "What is the capital of France?" }
        ]
    }'
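
The same request could be sent from Python. Because the server never started in this experiment, the sketch below has not been run against a live endpoint. It assumes the OpenAI-style response layout (choices[0].message.content), and the helper names build_chat_request and ask are my own.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt):
    """Build a POST request for the /v1/chat/completions endpoint."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url, model, prompt, timeout=60):
    """Send the request and return the text of the first reply."""
    request = build_chat_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    # Assumes the OpenAI-style layout: choices[0].message.content
    return body["choices"][0]["message"]["content"]


# Hypothetical usage, with <public ip> replaced by the VM's address:
# print(ask("http://<public ip>:8000",
#           "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
#           "What is the capital of France?"))
```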

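I am writing this down as a note to myself. In 2025, keep learning about large models!

When facing problems in a complex environment, the way to investigate is this: let muddy water be still and it slowly clears; let stillness be stirred and life slowly arises. In the cloud, it is exactly so!

[Azure Environment] Tutorial: Deploying the DeepSeek R1 Model (1.5B Parameters) on an Azure Virtual Machine [Failed] - LuBu0505/My-Code GitHub Wiki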
