Ollama + LiteLLM

Overview

Skyvern can use local models served by Ollama, or any OpenAI-compatible endpoint such as a LiteLLM proxy. There are two setup paths:

  • (A) Direct Ollama: Skyvern talks to the Ollama API directly (/v1/chat/completions)
  • (B) OpenAI-compatible (LiteLLM): Skyvern connects to a proxy that exposes an OpenAI-style API

A) Direct Ollama

1) Start Ollama locally

Install Ollama and run a model (example: llama3.1):

$# Start the Ollama server (skip if it already runs as a background service)
>ollama serve
># In another terminal, pull the model
>ollama pull llama3.1

By default, the Ollama API listens at http://localhost:11434.
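
Since Ollama also exposes an OpenAI-compatible endpoint, a quick way to confirm the model responds is to send a chat completion directly (a sketch; adjust the model name to whatever ollama list shows):

$# Minimal chat request against Ollama's OpenAI-compatible endpoint
>curl -s http://localhost:11434/v1/chat/completions \
>  -H "Content-Type: application/json" \
>  -d '{"model": "llama3.1", "messages": [{"role": "user", "content": "Say hello"}]}' | jq .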

2) Configure Skyvern (ENV)

Add to your .env:

$# Enable Ollama integration
>ENABLE_OLLAMA=true
>
># Ollama server URL
>OLLAMA_SERVER_URL=http://localhost:11434
>
># Model name in Ollama (check with `ollama list`)
>OLLAMA_MODEL=llama3.1

Note: Ollama may not support max_completion_tokens — Skyvern handles this internally.
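
A quick end-to-end check that the model named in OLLAMA_MODEL actually generates output (not just that it is installed), assuming llama3.1:

$# One-off prompt against the local model
>ollama run llama3.1 "Reply with OK"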


B) OpenAI-compatible via LiteLLM

1) Run LiteLLM as a proxy

Minimal example (see LiteLLM docs for more options):

$# Map an Ollama model behind an OpenAI-compatible endpoint
>litellm --model ollama/llama3.1 --host 0.0.0.0 --port 4000
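
If you prefer a config file (for example, to pin the upstream Ollama URL explicitly), the LiteLLM proxy also accepts a YAML config. A minimal sketch, assuming the proxy extras are installed (pip install 'litellm[proxy]') and Ollama runs on its default port:

$# Write a minimal LiteLLM config and start the proxy from it
>cat > litellm_config.yaml <<'EOF'
>model_list:
>  - model_name: llama3.1
>    litellm_params:
>      model: ollama/llama3.1
>      api_base: http://localhost:11434
>EOF
>litellm --config litellm_config.yaml --host 0.0.0.0 --port 4000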

2) Configure Skyvern (ENV)

Add to your .env:

$# Enable OpenAI-compatible provider
>ENABLE_OPENAI_COMPATIBLE=true
>
># The "model" exposed by the proxy (any identifier your proxy accepts)
>OPENAI_COMPATIBLE_MODEL_NAME=llama3.1
>
># API key required by the proxy (use any placeholder if proxy doesn't enforce)
>OPENAI_COMPATIBLE_API_KEY=sk-test
>
># Base URL for the LiteLLM proxy (OpenAI-compatible)
>OPENAI_COMPATIBLE_API_BASE=http://localhost:4000/v1
>
># (optional)
># OPENAI_COMPATIBLE_API_VERSION=2024-06-01
># OPENAI_COMPATIBLE_SUPPORTS_VISION=false
># OPENAI_COMPATIBLE_REASONING_EFFORT=low|medium|high

Start Skyvern (local)

After setting the environment variables in .env, start backend + UI:

$# with Docker (recommended)
>docker compose up -d
>
># or locally (dev):
># poetry install
># ./run_skyvern.sh
># ./run_ui.sh
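
A quick way to confirm the services came up before opening the UI (Docker path):

$# List running containers and their status
>docker compose ps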

Open the UI and pick the model (or keep the default); if only the Ollama or OpenAI-compatible provider is enabled, Skyvern will use it.


Verify your setup

Before starting Skyvern, quickly verify that your LLM endpoint is reachable.

Ollama

$# Should return the list of local models
>curl -s http://localhost:11434/api/tags | jq .

LiteLLM (OpenAI-compatible)

$# Should return available models exposed by your proxy
>curl -s http://localhost:4000/v1/models -H "Authorization: Bearer sk-test" | jq .
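
To go one step further, send a test completion through the proxy using the same model name and API key you put in .env (values here match the examples above):

$# End-to-end test through the proxy
>curl -s http://localhost:4000/v1/chat/completions \
>  -H "Authorization: Bearer sk-test" \
>  -H "Content-Type: application/json" \
>  -d '{"model": "llama3.1", "messages": [{"role": "user", "content": "Say hello"}]}' | jq .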

If your model doesn’t appear, re-check the proxy flags and your .env values (OPENAI_COMPATIBLE_API_BASE, OPENAI_COMPATIBLE_MODEL_NAME, etc.).


Troubleshooting

  • Model not responding / timeout: ensure ollama serve is running and OLLAMA_MODEL exists (ollama list).
  • LiteLLM 401: set OPENAI_COMPATIBLE_API_KEY to a value accepted by the proxy or disable auth on the proxy.
  • CORS / wrong base URL: confirm OPENAI_COMPATIBLE_API_BASE and that it ends with /v1.
  • Model not visible: ensure ENABLE_OLLAMA=true or ENABLE_OPENAI_COMPATIBLE=true in .env, then restart services.
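
If the cause still isn't obvious, the backend logs usually show the exact error returned by Ollama or the proxy. With the Docker setup:

$# Tail recent logs from the running services and watch for LLM errors
>docker compose logs -f --tail=100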

Internal References

  • Ollama vars: ENABLE_OLLAMA, OLLAMA_SERVER_URL, OLLAMA_MODEL
  • OpenAI-compatible vars: ENABLE_OPENAI_COMPATIBLE, OPENAI_COMPATIBLE_MODEL_NAME, OPENAI_COMPATIBLE_API_KEY, OPENAI_COMPATIBLE_API_BASE, OPENAI_COMPATIBLE_API_VERSION, OPENAI_COMPATIBLE_SUPPORTS_VISION, OPENAI_COMPATIBLE_REASONING_EFFORT