Notice: In order to resolve issues efficiently, please follow the template.
(Note: to help resolve your problem more efficiently, please follow the template when asking and fill in the details.)
Before asking
- Search existing issues: https://github.com/modelscope/FunASR/issues
- Search the docs: https://modelscope.github.io/FunASR/
- Check the README quick start and deployment section.
Question
Following https://github.com/modelscope/FunASR/blob/main/docs/vllm_guide_zh_v2.md, after installation I ran python demo_vllm.py --input audio.wav --hotwords 张三 北京 and it fails with an error about the CUDA version.
Installation commands:
pip install torch torchaudio
pip install "funasr>=1.3.0"
pip install "vllm>=0.12.0"
pip install safetensors tiktoken websockets regex fastapi uvicorn python-multipart
pip install -e .
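One caveat when running version-pinned installs like these in a POSIX shell: an unquoted > is parsed as output redirection, so a specifier such as funasr>=1.3.0 needs quoting to reach pip intact. The sketch below uses Python's standard shlex module only to approximate how a shell splits the line; the helper name shell_tokens is made up for illustration, and a real shell does more than tokenize.

```python
import shlex


def shell_tokens(command: str) -> list:
    """Split a command line roughly the way a POSIX shell would.

    punctuation_chars=True makes shlex return operators such as '>' as
    separate tokens instead of gluing them to the neighbouring word.
    """
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    return list(lexer)


if __name__ == "__main__":
    # Unquoted: '>' becomes its own token, i.e. a redirection to a file
    # named '=1.3.0', and pip only sees the bare package name.
    print(shell_tokens("pip install funasr>=1.3.0"))
    # ['pip', 'install', 'funasr', '>', '=1.3.0']

    # Quoted: the whole requirement specifier stays one argument.
    print(shell_tokens('pip install "funasr>=1.3.0"'))
    # ['pip', 'install', 'funasr>=1.3.0']
```

If the unquoted form was used, pip would have installed the latest release without applying the version constraint, and a file named =1.3.0 may have been left in the working directory.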
CUDA:
(funasr_vllm) feixin@feixin:/data/deploy/FunASR-new/FunASR/examples/industrial_data_pretraining/fun_asr_nano$ nvidia-smi
Thu Jun 11 18:48:11 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.10              Driver Version: 570.86.10      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti     Off |   00000000:04:00.0 Off |                  N/A |
| 33%   52C    P0             72W /  165W |    7879MiB /  16380MiB |     44%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A          100505      C   python                                  118MiB |
|    0   N/A  N/A         1592328      C   python                                 3178MiB |
|    0   N/A  N/A         1592744      C   VLLM::EngineCore                       4438MiB |
|    0   N/A  N/A         2259378      C   python                                  118MiB |
+-----------------------------------------------------------------------------------------+
Code or command
(funasr_vllm) feixin@feixin:/data/deploy/FunASR-new/FunASR/examples/industrial_data_pretraining/fun_asr_nano$ python demo_vllm.py --input audio.wav --hotwords 张三 北京
============================================================
Fun-ASR-Nano vLLM Inference
============================================================
Model: FunAudioLLM/Fun-ASR-Nano-2512
Tensor Parallel: 1 GPU(s)
Dtype: bf16
Language: 中文
Hotwords: ['张三', '北京']
Downloading Model from https://www.modelscope.cn to directory: /home/feixin/.cache/modelscope/hub/models/FunAudioLLM/Fun-ASR-Nano-2512
Traceback (most recent call last):
  File "/data/deploy/FunASR-new/FunASR/examples/industrial_data_pretraining/fun_asr_nano/demo_vllm.py", line 171, in <module>
    main()
  File "/data/deploy/FunASR-new/FunASR/examples/industrial_data_pretraining/fun_asr_nano/demo_vllm.py", line 63, in main
    engine = FunASRNanoVLLM.from_pretrained(
  File "/data/deploy/FunASR-new/FunASR/funasr/models/fun_asr_nano/inference_vllm.py", line 710, in from_pretrained
    return cls(
  File "/data/deploy/FunASR-new/FunASR/funasr/models/fun_asr_nano/inference_vllm.py", line 184, in __init__
    self._load_audio_components(model_dir, **kwargs)
  File "/data/deploy/FunASR-new/FunASR/funasr/models/fun_asr_nano/inference_vllm.py", line 355, in _load_audio_components
    self.audio_encoder = self.audio_encoder.to(self.device, dtype=torch.float32)
  File "/opt/miniconda/envs/funasr_vllm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1384, in to
    return self._apply(convert)
  File "/opt/miniconda/envs/funasr_vllm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 934, in _apply
    module._apply(fn)
  File "/opt/miniconda/envs/funasr_vllm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 934, in _apply
    module._apply(fn)
  File "/opt/miniconda/envs/funasr_vllm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 934, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "/opt/miniconda/envs/funasr_vllm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 965, in _apply
    param_applied = fn(param)
  File "/opt/miniconda/envs/funasr_vllm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1370, in convert
    return t.to(
  File "/opt/miniconda/envs/funasr_vllm/lib/python3.10/site-packages/torch/cuda/__init__.py", line 478, in _lazy_init
    torch._C._cuda_init()
RuntimeError: The NVIDIA driver on your system is too old (found version 12080). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.
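The "found version 12080" in this error is CUDA's integer version encoding (1000 * major + 10 * minor), so it decodes to CUDA 12.8, the same value nvidia-smi reports. The message therefore suggests that the installed PyTorch wheel was built against a newer CUDA release than the driver supports. The sketch below decodes the number and compares it with torch.version.cuda; the helper names are made up for illustration, the "13.0" build string is a placeholder rather than a confirmed value for this environment, and the major-version comparison is a simplified rule of thumb, not the full CUDA compatibility policy.

```python
from typing import Tuple


def decode_cuda_version(encoded: int) -> Tuple[int, int]:
    """Decode CUDA's integer encoding, e.g. 12080 -> (12, 8)."""
    return encoded // 1000, (encoded % 1000) // 10


def parse_version(text: str) -> Tuple[int, int]:
    """Parse a 'major.minor' string such as torch.version.cuda."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def driver_can_run(build_cuda: str, driver_encoded: int) -> bool:
    """Simplified check (assumption): the driver's CUDA major version
    must be at least the major version the wheel was built with."""
    driver_major, _ = decode_cuda_version(driver_encoded)
    build_major, _ = parse_version(build_cuda)
    return driver_major >= build_major


if __name__ == "__main__":
    print(decode_cuda_version(12080))     # (12, 8)
    print(driver_can_run("13.0", 12080))  # False ("13.0" is a placeholder)
    print(driver_can_run("12.8", 12080))  # True

    try:
        import torch  # reading torch.version.cuda does not initialize CUDA
        build = torch.version.cuda  # None for CPU-only builds
        print("PyTorch built with CUDA:", build)
        if build is not None:
            print("Driver 12080 sufficient:", driver_can_run(build, 12080))
    except ImportError:
        print("torch is not installed in this environment")
```

If the build's CUDA major version turns out to be newer than 12, the two remedies named in the error message apply: update the GPU driver, or install a PyTorch build compiled for CUDA 12.x.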
What have you tried?
Environment
- OS: Ubuntu 24.04
- Python version: 3.10
- FunASR version: 1.3.9
- ModelScope version: 1.37.1
- PyTorch / torchaudio version: 2.11.0
- Install method (pip, source, Docker):
- Device (cuda, cpu, mps): cuda
- GPU model: NVIDIA GeForce RTX 4060 Ti
- CUDA/cuDNN version: 12.8
- Docker image tag, if used:
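The version fields above can be collected programmatically to avoid transcription slips. The following is a minimal sketch using only the standard library; the function names and the package list are choices made for this example, and the list simply mirrors the packages named in this report.

```python
import platform
from importlib import metadata

# Distribution names as published on PyPI; adjust as needed.
PACKAGES = ("funasr", "modelscope", "torch", "torchaudio", "vllm")


def installed_version(name: str) -> str:
    """Return the installed version of a distribution, if any."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(packages=PACKAGES) -> dict:
    """Gather OS, Python, and package versions into one mapping."""
    report = {
        "os": platform.platform(),
        "python": platform.python_version(),
    }
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    # Print in the same "- key: value" style as the template.
    for key, value in environment_report().items():
        print(f"- {key}: {value}")
```

Running it inside the funasr_vllm environment prints the versions that Python actually resolves, which is useful when several environments exist on one machine.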