Skyvern – We raised $2.7M to fix browser automation (open source)

We just raised a $2.7M seed round to fix one of the most boring but expensive problems in business: manual browser work.

Every company has "that" workflow. Maybe it’s a team logging into vendor portals to download invoices. Maybe it’s navigating a 10-page government form that hasn't been updated since 2005.

We built Skyvern to automate this. Think of it as AI with hands.

The Problem: The "Maintenance Tax" of Automation

For the last decade, you had two bad options for these workflows:

  1. Human SOPs: You write a Standard Operating Procedure and hire someone to click buttons. It’s slow, expensive, and humans hate doing it.
  2. Brittle Scripts: You write a Selenium/Playwright script. But the moment a frontend dev changes a class from .submit-btn to .btn-primary-v2, or a text box moves 10 pixels, the script crashes.
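To make the brittleness concrete, here is a small illustrative sketch (not Skyvern code). It uses only Python's standard-library HTML parser, and the two page snippets and helper names are made up for the example. A lookup keyed on the class name stops matching after the rename, even though a user would see the same "Submit" button.

```python
from html.parser import HTMLParser


class ButtonFinder(HTMLParser):
    """Collects (class attribute, visible text) for every <button>."""

    def __init__(self):
        super().__init__()
        self.buttons = []
        self._in_button = False
        self._class = ""
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._in_button = True
            self._class = dict(attrs).get("class") or ""
            self._text = []

    def handle_data(self, data):
        if self._in_button:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "button" and self._in_button:
            self.buttons.append((self._class, "".join(self._text).strip()))
            self._in_button = False


def find_buttons(html):
    parser = ButtonFinder()
    parser.feed(html)
    parser.close()
    return parser.buttons


def find_by_class(html, cls):
    # What a selector-based script effectively does.
    return [b for b in find_buttons(html) if cls in b[0].split()]


def find_by_text(html, text):
    # Closer to what a person does: look for the label.
    return [b for b in find_buttons(html) if b[1].lower() == text.lower()]


# Hypothetical before/after markup for the same visible button.
OLD_PAGE = '<form><button class="submit-btn">Submit</button></form>'
NEW_PAGE = '<form><button class="btn-primary-v2">Submit</button></form>'

print(find_by_class(OLD_PAGE, "submit-btn"))  # matches one button
print(find_by_class(NEW_PAGE, "submit-btn"))  # matches nothing after the rename
print(find_by_text(NEW_PAGE, "Submit"))       # the label still matches
```

Matching on the label is only a stand-in for "what the user sees"; it is still DOM-based and would break in its own ways.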

We built Skyvern to stop playing whack-a-mole with the DOM.

How Skyvern Works (Visual Reasoning)

Most agents fail because they try to parse code (which is full of obfuscation and dynamic rendering). We switched to Visual Reasoning. Skyvern doesn't look for #checkout-button. It takes a screenshot, uses a Vision-LLM to find the thing that looks like a checkout button, and clicks it. If the underlying code changes but the UI stays the same, Skyvern keeps working.
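The screenshot-locate-click step described above can be sketched roughly as follows. This is an assumption-heavy illustration, not Skyvern's implementation: `click_by_description`, `Point`, and the three callables are hypothetical names, and the vision model is replaced by a fake so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Point:
    x: int
    y: int


def click_by_description(
    take_screenshot: Callable[[], bytes],
    locate: Callable[[bytes, str], Optional[Point]],
    click: Callable[[int, int], None],
    description: str,
) -> bool:
    """Screenshot the page, ask the model where the target is, click there."""
    screenshot = take_screenshot()
    target = locate(screenshot, description)  # hypothetical vision-LLM call
    if target is None:
        return False  # nothing on screen matched the description
    click(target.x, target.y)
    return True


# Fakes so the sketch runs without a browser or a model.
clicks = []


def fake_screenshot() -> bytes:
    return b"fake-png-bytes"


def fake_locate(screenshot: bytes, description: str) -> Optional[Point]:
    # Pretend the model only recognizes a checkout button.
    return Point(640, 480) if "checkout" in description.lower() else None


def fake_click(x: int, y: int) -> None:
    clicks.append((x, y))


found = click_by_description(
    fake_screenshot, fake_locate, fake_click, "the checkout button"
)
missing = click_by_description(
    fake_screenshot, fake_locate, fake_click, "the unsubscribe link"
)
print(found, missing, clicks)
```

With Playwright's Python API, `page.screenshot()` and `page.mouse.click(x, y)` could fill the first and third slots. The `locate` function is the part that needs a real vision model and is left as an assumption here.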

The Architecture: Planner vs. Actor vs. Validator

A simple "See -> Click" loop isn't enough for complex tasks. LLMs hallucinate or get stuck. Skyvern 2.0 splits the brain:

  • Planner: Holds the high-level goal ("Download March invoices").
  • Actor: Executes the immediate step ("Enter date range").
  • Validator: This is the critical part. It looks at the screen after the action to see if it actually worked. Did a popup block the click? Did the URL change as expected? If the action didn't succeed, it tells the Planner to retry.
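As a rough sketch of how the three roles above could fit together (an illustration under assumptions, not Skyvern 2.0's code): the plan is a list of steps, the actor is a function that attempts one step, and the validator is a function that checks whether the step took effect. `run_task`, `FakeBrowser`, and the retry limit are hypothetical.

```python
from typing import Callable, List, Tuple


def run_task(
    plan: List[str],
    act: Callable[[str], None],
    validate: Callable[[str], bool],
    max_retries: int = 2,
) -> List[Tuple[str, int]]:
    """Run each planned step, re-trying when the validator rejects it.

    Returns (step, attempts_needed) pairs for the completed plan.
    """
    log = []
    for step in plan:
        for attempt in range(1, max_retries + 2):
            act(step)
            if validate(step):
                log.append((step, attempt))
                break
        else:
            # Every attempt failed validation; give up on the task.
            raise RuntimeError(f"step failed after retries: {step}")
    return log


class FakeBrowser:
    """Stand-in for a real page: a popup swallows the first action."""

    def __init__(self):
        self.popup_open = True
        self.done = set()

    def perform(self, step: str) -> None:
        if self.popup_open:
            self.popup_open = False  # the click only dismissed the popup
            return
        self.done.add(step)

    def worked(self, step: str) -> bool:
        return step in self.done


browser = FakeBrowser()
result = run_task(
    ["Enter date range", "Click download"], browser.perform, browser.worked
)
print(result)  # [('Enter date range', 2), ('Click download', 1)]
```

Without the validation step, the first action would be counted as done even though the popup absorbed it, and every later step would run against the wrong page state.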

Why we raised the money: The "Compile-to-Code" Engine

Real talk: Running an LLM for every single interaction is too slow and too expensive for high-volume scraping.

We are using this capital to build "Route Memorization." You let the AI figure out the path once (the expensive part). We then "compile" that successful path into a fast, cheap Playwright script. If the site changes and the script breaks, the AI wakes up, "heals" the path, and re-compiles the script.
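The discover-once, replay-cheaply, heal-on-failure cycle can be sketched like this. It is a simplified illustration with hypothetical names (`RouteCache`, `RouteBroken`, `FakeSite`): the stored "route" is a plain list of steps rather than a generated Playwright script, and the expensive AI exploration is replaced by a fake.

```python
from typing import Callable, Dict


class RouteBroken(Exception):
    """Raised when a stored route no longer matches the site."""


class RouteCache:
    def __init__(self, discover: Callable[[str], dict],
                 replay: Callable[[dict], str]):
        self.discover = discover  # expensive: the AI works out the path
        self.replay = replay      # cheap: re-runs a stored path
        self.routes: Dict[str, dict] = {}
        self.discoveries = 0

    def run(self, task: str) -> str:
        route = self.routes.get(task)
        if route is not None:
            try:
                return self.replay(route)  # fast path, no model call
            except RouteBroken:
                pass  # site changed: fall through and re-discover
        route = self.discover(task)
        self.discoveries += 1
        self.routes[task] = route  # "re-compile" by storing the new path
        return self.replay(route)


class FakeSite:
    """Stand-in for a website whose layout can change."""

    def __init__(self):
        self.layout = "v1"

    def discover(self, task: str) -> dict:
        steps = ["open portal", "enter date range", "download"]
        return {"layout": self.layout, "steps": steps}

    def replay(self, route: dict) -> str:
        if route["layout"] != self.layout:
            raise RouteBroken("stored route is out of date")
        return f"ran {len(route['steps'])} steps on {self.layout}"


site = FakeSite()
cache = RouteCache(site.discover, site.replay)

first = cache.run("Download March invoices")   # discovers the route
second = cache.run("Download March invoices")  # replays from the cache
site.layout = "v2"                             # the site is redesigned
third = cache.run("Download March invoices")   # replay fails, route is healed
print(first, second, third, cache.discoveries)
```

The design point is that the model is only invoked on the first run and after a breakage, so the steady-state cost is that of the replayed script.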

Benchmarks & Links

We got tired of "vibes-based" evaluation, so we built Web Bench (5,750 tasks across the top 1,000 websites), which focuses specifically on complex Write actions (forms, inputs).

We’re open source. If you’re tired of fixing broken cron jobs or writing SOPs that nobody follows, give us a spin. Try it yourself here: app.skyvern.com

– The Skyvern Team