Ollama + LiteLLM
Overview
Skyvern can use local models via Ollama or any OpenAI-compatible endpoint (e.g., LiteLLM). Two paths:
- (A) Direct Ollama — use the Ollama API (/v1/chat/completions)
- (B) OpenAI-compatible (LiteLLM) — Skyvern connects to a proxy that exposes an OpenAI-style API
A) Direct Ollama
1) Start Ollama locally
Install Ollama and run a model (example: llama3.1):
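A minimal sketch of the commands (the model tag llama3.1 is just the example above; adjust to whatever you have pulled):

```
# Pull the model once, then run it (this also starts the local server if needed)
ollama pull llama3.1
ollama run llama3.1

# Alternatively, keep the server running explicitly in its own terminal
ollama serve
```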
The API is usually at http://localhost:11434.
2) Configure Skyvern (ENV)
Add to your .env:
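A minimal sketch, using the Ollama variables listed under Internal References below (values are illustrative):

```
# .env — enable the Ollama provider
ENABLE_OLLAMA=true
OLLAMA_SERVER_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1
```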
Note: Ollama may not support max_completion_tokens — Skyvern handles this internally.
B) OpenAI-compatible via LiteLLM
1) Run LiteLLM as a proxy
Minimal example (see LiteLLM docs for more options):
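For example, pointing LiteLLM at the local Ollama model from section A (the port and model name are assumptions; adjust to your setup):

```
# Install the proxy extras and start LiteLLM in proxy mode
pip install 'litellm[proxy]'

# Expose the Ollama model behind an OpenAI-style API (listens on port 4000 here)
litellm --model ollama/llama3.1 --port 4000
```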
2) Configure Skyvern (ENV)
Add to your .env:
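A minimal sketch, assuming the LiteLLM proxy above is listening on localhost:4000 (variable names are from Internal References below; the key value depends on your proxy's auth settings):

```
# .env — enable the OpenAI-compatible provider
ENABLE_OPENAI_COMPATIBLE=true
OPENAI_COMPATIBLE_MODEL_NAME=ollama/llama3.1
OPENAI_COMPATIBLE_API_KEY=sk-anything          # must match whatever the proxy accepts
OPENAI_COMPATIBLE_API_BASE=http://localhost:4000/v1
# Optional, depending on your model and proxy:
# OPENAI_COMPATIBLE_SUPPORTS_VISION=false
```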
Start Skyvern (local)
After setting the environment variables in .env, start the backend + UI:
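The exact commands depend on how you installed Skyvern; with the pip-installed CLI it is roughly the following (an assumption — check the main quickstart if your version differs):

```
# Backend API
skyvern run server

# UI (separate terminal)
skyvern run ui
```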
Open the UI and pick the model (or keep the default); if only Ollama/LiteLLM is enabled, Skyvern will use it.
Verify your setup
Before starting Skyvern, quickly verify that your LLM endpoint is reachable.
Ollama
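For example, list the locally available models (11434 is the default port mentioned above):

```
# Should return a JSON list of models that includes yours (e.g., llama3.1)
curl http://localhost:11434/api/tags
```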
LiteLLM (OpenAI-compatible)
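Against the proxy, query the OpenAI-style model list (the port and key are the assumed values from the .env sketch above):

```
# Should return your model under "data"
curl http://localhost:4000/v1/models \
  -H "Authorization: Bearer sk-anything"
```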
If your model doesn’t appear, re-check the proxy flags and your .env values (OPENAI_COMPATIBLE_API_BASE, OPENAI_COMPATIBLE_MODEL_NAME, etc.).
Troubleshooting
- Model not responding / timeout: ensure ollama serve is running and OLLAMA_MODEL exists (ollama list).
- LiteLLM 401: set OPENAI_COMPATIBLE_API_KEY to a value accepted by the proxy, or disable auth on the proxy.
- CORS / wrong base URL: confirm OPENAI_COMPATIBLE_API_BASE and that it ends with /v1.
- Model not visible: ensure ENABLE_OLLAMA=true or ENABLE_OPENAI_COMPATIBLE=true in .env, then restart services.
Internal References
- Ollama vars: ENABLE_OLLAMA, OLLAMA_SERVER_URL, OLLAMA_MODEL
- OpenAI-compatible vars: ENABLE_OPENAI_COMPATIBLE, OPENAI_COMPATIBLE_MODEL_NAME, OPENAI_COMPATIBLE_API_KEY, OPENAI_COMPATIBLE_API_BASE, OPENAI_COMPATIBLE_API_VERSION, OPENAI_COMPATIBLE_SUPPORTS_VISION, OPENAI_COMPATIBLE_REASONING_EFFORT