Deploying with NVIDIA Triton

The Triton Inference Server hosts a tutorial demonstrating how to quickly deploy a simple facebook/opt-125m model using vLLM. For more details, see Deploying a vLLM model in Triton.
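As a rough illustration of what such a deployment involves, the sketch below builds an engine configuration for facebook/opt-125m and an HTTP request for Triton's generate endpoint. Everything beyond the model name is an assumption rather than something this page states: the server address `localhost:8000`, the model repository name `vllm_model`, the `gpu_memory_utilization` value, and the `text_input` / `text_output` field names should all be checked against the Triton tutorial before use. The helper function names are hypothetical.

```python
import json
import urllib.request

# Assumptions, not taken from this page: verify them against the Triton tutorial.
TRITON_URL = "http://localhost:8000"  # assumed Triton HTTP address
MODEL_NAME = "vllm_model"             # assumed name of the model repository entry


def build_engine_config(model="facebook/opt-125m", gpu_memory_utilization=0.5):
    """Return a dict that could be saved as the vLLM engine config (model.json)."""
    return {"model": model, "gpu_memory_utilization": gpu_memory_utilization}


def build_generate_request(prompt, temperature=0.0, stream=False):
    """Build (but do not send) a POST request for the assumed generate endpoint."""
    payload = {
        "text_input": prompt,
        "parameters": {"stream": stream, "temperature": temperature},
    }
    return urllib.request.Request(
        url=f"{TRITON_URL}/v2/models/{MODEL_NAME}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt):
    """Send the request; this needs a running Triton server with the model loaded."""
    request = build_generate_request(prompt)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["text_output"]
```

With a server running, `generate("What is Triton Inference Server?")` would return the generated text. The request builder is kept separate from the network call so the payload can be inspected without a live server.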