vLLM Benchmarking: Practical Steps for benchmark_serving

Notes on Benchmarking vLLM Serving Performance

1. Test Environment Setup

Install the dependencies, preferably inside a virtual environment:

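# Pin a Python version so the environment gets its own interpreter and pip
# (3.10 is an assumed choice; use any version your vLLM release supports)
conda create -n vllm_bench python=3.10 -y
conda activate vllm_bench
pip install vllm openai pandas datasets  # pip install anything else that turns out to be missing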
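
Before moving on, it can help to confirm that the packages actually landed in the active environment. The following is a minimal sketch, assuming the import names match the pip names listed above; `missing_packages` and `report` are hypothetical helper names, not part of vLLM.

```python
import importlib.util

# Import names of the packages installed above (assumed to match the pip names).
REQUIRED = ["vllm", "openai", "pandas", "datasets"]


def missing_packages(names):
    """Return the entries of `names` that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def report(names=REQUIRED):
    """Summarize which of the required packages are missing, if any."""
    missing = missing_packages(names)
    if missing:
        return "missing: " + ", ".join(missing)
    return "all required packages found"
```

Running `print(report())` inside the activated `vllm_bench` environment lists anything that still needs a `pip install`.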

2. Model Service Deployment

2.1 Start the API Server

vllm serve \
    /model_path/ \
    --max-model-len 80000 \
    --gpu-memory-utilization 0.4 \
    --swap-space 512 \
    --device auto \
    --no-enable-prefix-caching
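
Loading the model can take a while, so a benchmark script should not start sending requests until the server is up. The following is a minimal sketch of a readiness poll, assuming the server listens on `localhost:8000` and exposes a `GET /health` route; `http_ok`, `wait_until_ready`, and `main` are hypothetical helper names, and the timeout values are arbitrary.

```python
import time
import urllib.request

HEALTH_URL = "http://localhost:8000/health"  # assumption: default host and port


def http_ok(url, timeout_s=5.0):
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error statuses.
        return False


def wait_until_ready(check, timeout_s=600.0, interval_s=2.0):
    """Call `check()` repeatedly until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def main():
    ready = wait_until_ready(lambda: http_ok(HEALTH_URL))
    print("server ready" if ready else "server did not become ready in time")
```

Calling `main()` from a second terminal blocks until the health route answers, which gives a simple gate before the benchmark run. The poll takes the check as a callable so that it can be exercised without a live server.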

Log output once initialization completes:

INFO 07-25 11:16:59 [serving_chat.py:125] Using default chat sampling params from model: {'temperature': 0.6, 'top_p': 0.9}
INFO 07-25 11:16:59 [serving_completion.py:72] Using default completion sampling params from model: {'temperature': 0.6, 'top_p': 0.9}
INFO 07-25 11:16:59 [api_server.py:1457] Starting vLLM API server 0 on http://0.0.0.0:8000
INFO 07-25 11:16:59 [launcher.py:29] Available routes are:
INFO 07-25 11:16:59 [launcher.py:37] Route: /openapi.json, Methods: GET, HEAD
INFO 07-25 11:16:59 [launcher.py:37] Route: /docs, Methods: GET, HEAD
INFO 07-25 11:16:59 [launcher.py:37] Route: /docs/oauth2-redirect, Methods: GET, HEAD
INFO 07-25 11:16:59 [launcher.py:37] Route: /redoc, Methods: GET, HEAD
INFO 07-25 11:16:59 [launcher.py:37] Route: /health, Methods: GET
INFO 07-25 11:16:59 [launcher.py:37] Route: /load, Methods: GET
INFO 07-25 11:16:59 [launcher.py:37] Route: /ping, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /ping, Methods: GET
INFO 07-25 11:16:59 [launcher.py:37] Route: /tokenize, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /detokenize, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /v1/models, Methods: GET
INFO 07-25 11:16:59 [launcher.py:37] Route: /version, Methods: GET
INFO 07-25 11:16:59 [launcher.py:37] Route: /v1/chat/completions, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /v1/completions, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /v1/embeddings, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /pooling, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /classify, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /score, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /v1/score, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /v1/audio/transcriptions, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /v1/audio/translations, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /rerank, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /v1/rerank, Methods: POST
INFO 07-25 11:16:59 [launcher.py:37] Route: /v2/rerank, Methods: POST
INFO 07-25 11:16:59 [launcher
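
With the routes above registered, a quick smoke test can confirm that the server answers a real request before any benchmark is run. The following is a minimal sketch using only the standard library, assuming the server is reachable at `localhost:8000` and follows the OpenAI-compatible request and response shapes for `/v1/models` and `/v1/completions`; `build_completion_request`, `first_model_id`, and `smoke_test` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: same host, port taken from the startup log


def build_completion_request(model, prompt, max_tokens=16):
    """Encode the JSON body for a POST to /v1/completions."""
    body = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return json.dumps(body).encode("utf-8")


def first_model_id(base_url=BASE_URL):
    """Ask /v1/models which model is being served and return its id."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=10) as resp:
        return json.load(resp)["data"][0]["id"]


def smoke_test(base_url=BASE_URL, prompt="Hello"):
    """Send one short completion request and return the generated text."""
    model = first_model_id(base_url)
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=build_completion_request(model, prompt),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)["choices"][0]["text"]
```

Calling `print(smoke_test())` once the server is up should print a few generated tokens. Reading the model id from `/v1/models` avoids hard-coding the name under which the model path was registered.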