Xinference Study Notes
1. Environment setup
- (1) Pull the Docker image
>> sudo docker pull xprobe/xinference
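- (2) Start a container from the image, publishing container port 9997 on host port 10086 and mounting the cache directories (this command does not set a log level)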
>> sudo docker run -itd --ipc=host --name mirror_dev \
-v /data/project/18_nfs/jingyu/workspace/.xinferece:/root/.xinference \
-v /data/project/18_nfs/jingyu/workspace/.cache/huggingface:/root/.cache/huggingface \
-v /data/project/18_nfs/jingyu/workspace/.cache/modelscope:/root/.cache/modelscope \
-e XINFERENCE_MODEL_SRC=modelscope \
-p 10086:9997 --gpus all \
xprobe/xinference:latest
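The `-p 10086:9997` flag publishes container port 9997 on host port 10086. A minimal sketch, assuming it is run on the Docker host, for checking whether anything is accepting TCP connections on the published port. The helper name `port_is_open` is hypothetical and not part of Xinference. It is expected to report `False` until Xinference is actually listening inside the container (see section 2).

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 10086 is the host side of the `-p 10086:9997` mapping used above.
    print(port_is_open("127.0.0.1", 10086))
```

This only confirms that the port accepts connections, not that the service behind it is healthy.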
2. Running Xinference
- (1) Enter the container
>> sudo docker exec -it mirror_dev bash
- (2) Start Xinference
>> xinference-local --host 0.0.0.0 --port 9997
INFO 03-25 01:33:11 __init__.py:190] Automatically detected platform cuda.
2025-03-25 01:33:13,466 xinference.core.supervisor 563 INFO Xinference supervisor 0.0.0.0:38870 started
2025-03-25 01:33:13,566 xinference.core.worker 563 INFO Starting metrics export server at 0.0.0.0:None
2025-03-25 01:33:13,572 xinference.core.worker 563 INFO Checking metrics export server...
2025-03-25 01:33:16,571 xinference.core.worker 563 INFO Metrics server is started at: http://0.0.0.0:40945
2025-03-25 01:33:16,573 xinference.core.worker 563 INFO Purge cache directory: /root/.xinference/cache
2025-03-25 01:33:16,579 xinference.core.worker 563 INFO Connected to supervisor as a fresh worker
2025-03-25 01:33:16,629 xinference.core.worker 563 INFO Xinference worker 0.0.0.0:38870 started
2025-03-25 01:33:20,634 xinference.api.restful_api 459 INFO Starting Xinference at endpoint: http://0.0.0.0:9997
2025-03-25 01:33:20,830 uvicorn.error 459 INFO Uvicorn running on http://0.0.0.0:9997 (Press CTRL+C to quit)
- (3) Note: this Docker environment may have problems. When I used it as-is, no model deployed successfully, so I reinstalled Xinference:
>> pip install "xinference[all]"
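After reinstalling, it can help to confirm which version of the package the current Python environment actually sees. A minimal sketch using only the standard library. The helper name `installed_version` is hypothetical, and the distribution name `xinference` is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("xinference")
    if version is None:
        print("xinference is not installed in this environment")
    else:
        print(f"xinference {version}")
```

Run it with the same interpreter that `pip` installed into, otherwise the result may describe a different environment.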
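3. Accessing Xinference
3.1 Directly through the web UI
- (1) UI address: http://127.0.0.1:9998/ui (replace 9998 with the port your instance actually listens on; the commands above start the service on 9997 and publish it on host port 10086)
Note: choosing sglang as the Model Engine may trigger a bug: #3020
- (2) API documentation address: http://127.0.0.1:9998/docs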
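Besides the web UI, the same endpoint can be queried programmatically. A minimal sketch using only the standard library, assuming the server exposes an OpenAI-style `GET /v1/models` route. That path is an assumption here and should be confirmed on the `/docs` page. The helper names `api_url` and `list_models` are hypothetical, and the base URL simply reuses the address given above.

```python
import json
import urllib.request


def api_url(base_url: str, path: str) -> str:
    """Join a base URL and a path with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def list_models(base_url: str, timeout: float = 5.0) -> object:
    """Fetch and decode the JSON returned by the assumed /v1/models route."""
    # Assumption: the route exists and returns JSON; check the /docs page first.
    with urllib.request.urlopen(api_url(base_url, "/v1/models"), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Base URL taken from the notes; adjust the port to match your setup.
    base = "http://127.0.0.1:9998"
    print(json.dumps(list_models(base), indent=2))
```

If the request fails with a connection error, the port in the base URL most likely does not match the one the service was started on or published to.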
4. References
- [1] xinference