MirrorYuChen
Published on 2025-03-25 / 20 Visits

Study notes on deploying large models with Xinference

Xinference study notes

1. Environment setup

  • (1) Pull the Docker image
>> sudo docker pull xprobe/xinference
  • (2) Start a container from the image, mapping the port, mounting the cache directories, and setting the model source to ModelScope
>> sudo docker run -itd --ipc=host --name mirror_dev                                  \
-v /data/project/18_nfs/jingyu/workspace/.xinferece:/root/.xinference                 \
-v /data/project/18_nfs/jingyu/workspace/.cache/huggingface:/root/.cache/huggingface  \
-v /data/project/18_nfs/jingyu/workspace/.cache/modelscope:/root/.cache/modelscope    \
-e XINFERENCE_MODEL_SRC=modelscope                                                    \
-p 10086:9997 --gpus all                                                              \
xprobe/xinference:latest                                                        
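Before going further, it can help to confirm which host port Docker actually published for the container. The sketch below assumes the container name mirror_dev and container port 9997 from the command above; the helper names published_address and port_of are made up for illustration.

```shell
# Container name and container-side port, taken from the docker run command above.
CONTAINER=mirror_dev
CONTAINER_PORT=9997

# Print the host address Docker publishes for the container port,
# for example "0.0.0.0:10086". Needs the container to be running.
published_address() {
  sudo docker port "$CONTAINER" "$CONTAINER_PORT"
}

# Pure string helper: keep only the port from an address like "0.0.0.0:10086".
port_of() {
  printf '%s\n' "${1##*:}"
}

# Example with the mapping used in this post (-p 10086:9997):
port_of "0.0.0.0:10086"

# Usage once the container is up (run manually):
#   port_of "$(published_address | head -n 1)"
```

With the -p 10086:9997 mapping, the server that listens on 9997 inside the container should be reachable on port 10086 of the Docker host.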

2. Running Xinference

  • (1) Enter the container
>> sudo docker exec -it mirror_dev bash
  • (2) Run Xinference
>> xinference-local --host 0.0.0.0 --port 9997
INFO 03-25 01:33:11 __init__.py:190] Automatically detected platform cuda.
2025-03-25 01:33:13,466 xinference.core.supervisor 563 INFO     Xinference supervisor 0.0.0.0:38870 started
2025-03-25 01:33:13,566 xinference.core.worker 563 INFO     Starting metrics export server at 0.0.0.0:None
2025-03-25 01:33:13,572 xinference.core.worker 563 INFO     Checking metrics export server...
2025-03-25 01:33:16,571 xinference.core.worker 563 INFO     Metrics server is started at: http://0.0.0.0:40945
2025-03-25 01:33:16,573 xinference.core.worker 563 INFO     Purge cache directory: /root/.xinference/cache
2025-03-25 01:33:16,579 xinference.core.worker 563 INFO     Connected to supervisor as a fresh worker
2025-03-25 01:33:16,629 xinference.core.worker 563 INFO     Xinference worker 0.0.0.0:38870 started
2025-03-25 01:33:20,634 xinference.api.restful_api 459 INFO     Starting Xinference at endpoint: http://0.0.0.0:9997
2025-03-25 01:33:20,830 uvicorn.error 459 INFO     Uvicorn running on http://0.0.0.0:9997 (Press CTRL+C to quit)
  • (3) Note: this Docker environment may have some problems. When I used it as-is, none of the models deployed successfully, so I reinstalled Xinference:
>> pip install "xinference[all]"

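After restarting xinference-local, the line to look for in the startup output is the one announcing the REST endpoint. As a rough sketch, the helper below (endpoint_from_log is a name invented here, not part of Xinference) pulls that URL out of log text on stdin, assuming the log format matches the output shown above.

```shell
# Read xinference-local startup output on stdin and print the first REST
# endpoint announced by a "Starting Xinference at endpoint: ..." line.
endpoint_from_log() {
  sed -n 's|.*Starting Xinference at endpoint: \(http://[^ ]*\).*|\1|p' | head -n 1
}

# Example using the log line from this post:
printf '%s\n' '2025-03-25 01:33:20,634 xinference.api.restful_api 459 INFO     Starting Xinference at endpoint: http://0.0.0.0:9997' \
  | endpoint_from_log
```

If nothing is printed, the REST server line never appeared in the log, which usually means startup did not finish.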
3. Accessing Xinference

3.1 Directly through the web UI

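  • (1) UI address: http://127.0.0.1:9998/ui (use the port your server is actually reachable on; the commands above start it on 9997 inside the container and publish it as 10086 on the Docker host)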

xinference_ui.png

Note: selecting sglang as the Model Engine may trigger a bug: #3020

xinference_model_config.png

  • (2) API docs address: http://127.0.0.1:9998/docs
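The same addresses can be checked from a shell before opening a browser. This is a minimal sketch: BASE_URL, url_for, and http_status are names chosen for illustration, and the port is an assumption that must be changed to whatever your server is reachable on (9997 inside the container, 10086 on the Docker host with the mapping above).

```shell
# ASSUMPTION: base URL of the Xinference REST server; adjust host and port.
BASE_URL="http://127.0.0.1:9997"

# Build a full URL for a path such as /ui, /docs, or /v1/models.
url_for() {
  printf '%s%s\n' "${BASE_URL%/}" "$1"
}

# Print the HTTP status code returned for a path (needs curl and a running server).
http_status() {
  curl -s -o /dev/null -w '%{http_code}\n' "$(url_for "$1")"
}

# Example: the URLs this post opens in the browser.
url_for /ui
url_for /docs

# Usage against a running server (run manually):
#   http_status /docs
#   curl -s "$(url_for /v1/models)"   # lists the models currently running
```

A connection error from curl at this point suggests the port or host is wrong rather than a problem with a particular model.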


