vLLM CLI

RAPR CLI connector · AI, ML & Model Ops

Serve, chat, complete, batch, and benchmark high-throughput LLM inference with vLLM.

By vLLM Project · v1.0.0 · Package license: MIT · Free package; provider account or API usage may be required.

vLLM CLI package details

vllm, llm-serving, inference, openai-compatible, gpu

RAPR CLI connector scope

This listing is RAPR-authored connector guidance for a command-line tool that the user installs locally. The package contains CLI usage guidance, install commands, and agent instructions; it does not bundle the upstream CLI binary. The upstream CLI remains governed by its own license and terms, and users can also go directly to the public upstream source linked on this page.

How to get started

Install RAPR AI

Download and install RAPR AI on your computer

Find in Marketplace

Open RAPR AI, go to Packages, and browse the marketplace

Install from Marketplace

Click Install. RAPR sets up the wrapper package, connector guidance, or skill instructions for this listing.

Authenticate

Set the HF_TOKEN environment variable to a Hugging Face access token when you need gated Hugging Face models.
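
As a minimal sketch of the authentication step, the Python helper below checks whether HF_TOKEN is present before a gated model download is attempted. The function name hf_token_is_set is hypothetical and is not part of vLLM or RAPR; the only assumption is that the token is supplied through the HF_TOKEN environment variable, as this listing describes.

```python
import os


def hf_token_is_set(env=None):
    """Return True if HF_TOKEN holds a non-empty value (hypothetical helper)."""
    # Default to the current process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    return bool(env.get("HF_TOKEN", "").strip())
```

Calling hf_token_is_set() with no arguments inspects the current process environment. Models that are not gated do not need the token.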

vLLM CLI

Official source:

Install with pip install vllm. Verify with vllm --help.
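
The verification step can also be scripted. The sketch below is an illustration rather than part of the package: it runs the same vllm --help check from Python and reports whether the command is on PATH and exits cleanly. The helper name cli_is_available and the 120-second timeout are assumptions chosen for this example.

```python
import shutil
import subprocess


def cli_is_available(command="vllm", timeout=120):
    """Return True if `<command> --help` runs and exits with status 0."""
    if shutil.which(command) is None:
        return False  # not installed, or not on PATH
    try:
        result = subprocess.run(
            [command, "--help"],
            capture_output=True,  # keep the help text out of the console
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

If cli_is_available() returns False after pip install vllm, a likely cause is that the Python environment used for the install is not the one on PATH.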