Your agents, your infrastructure: the case for self-hosted runtimes

Managed agent platforms are a fine deal until the day they aren't. Here are the five reasons teams self-host their agent runtime, and the progressive infrastructure model that makes it actually doable.

Appstrate
sovereignty · self-hosting · infrastructure

🌍 When AWS went down, your AI went down

The outage window in February 2026 was seven hours. Bedrock was out for four of them. For the teams that had built their production agents on AgentCore, there was no fallback: the runtime was the provider. The prompts were in the provider's format. The state machine ran on the provider's servers. The logs lived in the provider's region.

For that half-day, a lot of companies learned what they had actually built on top of.

This isn't a pitch against managed runtimes. Bedrock AgentCore, Claude Managed Agents, and Microsoft Agent Framework are well-designed products. They're the right choice for a lot of teams. But they make a specific trade: you get a turnkey platform, and in exchange you give up control of the infrastructure layer.

This post is about the other path (running the runtime yourself) and why more teams are choosing it than the 2025 consensus expected.

💡 The five reasons teams self-host

"Privacy" shows up in every self-hosting pitch. It's the least interesting of the real reasons. Here's the list as it actually plays out on the ground.

1. Data residency isn't optional

If you're EU-regulated, HIPAA, GLBA, a defense contractor, or operating in a country with a data localization law, where the agent runs is a compliance question with a legal answer. You can't fix "the agent container ran in us-east-1" with a stronger NDA.

Managed runtimes in 2026 have some regional flexibility, but the region that has your data, the region that has your LLM, the region that has the orchestrator, and the region that has the logs are not the same dropdown. Self-hosting is the only way to guarantee all four.

2. BYOM (bring your own model) isn't a feature, it's an exit plan

Every managed runtime has a preferred model. Bedrock pushes you toward Anthropic and Nova. Claude Managed is Claude. Azure is OpenAI-plus-open-weights. Switching models on those platforms is a supported operation, until you hit the one that isn't, or the one where the pricing flipped last quarter.

The thing that's worse than cloud lock-in is model lock-in. Cloud providers have interchangeable primitives: compute is compute. Models are not. A prompt that works on Claude 4.x fails on GPT-5 in subtle ways. A switch costs you eval budget, not just engineer time.

A self-hosted runtime makes BYOM a config change:

# Tuesday: a Claude agent
MODEL=anthropic:claude-sonnet-4-6

# Wednesday, after the pricing announcement: a Gemini agent
MODEL=google:gemini-2.5-pro

# The weekend, for a sensitive workload: a local agent
MODEL=ollama:llama-4-70b

No repository migration. No SDK swap. No contract renegotiation.
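Under the hood, a dispatch like this needs only a few lines. Here's a minimal sketch in shell, assuming the `provider:model` convention from the example above; the endpoint URLs are illustrative defaults, not Appstrate's actual routing table:

```shell
# Hypothetical dispatcher for MODEL=provider:model strings.
# Endpoints here are illustrative assumptions, not a real routing table.
MODEL="${MODEL:-anthropic:claude-sonnet-4-6}"
provider="${MODEL%%:*}"   # text before the first colon
model="${MODEL#*:}"       # text after the first colon

case "$provider" in
  anthropic) endpoint="https://api.anthropic.com" ;;
  google)    endpoint="https://generativelanguage.googleapis.com" ;;
  ollama)    endpoint="http://localhost:11434" ;;
  *)         echo "unknown provider: $provider" >&2; exit 1 ;;
esac

echo "routing $model via $provider -> $endpoint"
```

Because the provider is just a prefix, swapping Claude for Gemini or a local Llama is a one-variable change; nothing about the agent's definition has to move.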

3. The cost math stops working at scale

Managed agent pricing in 2026 compounds three margins: the LLM vendor's margin on the model, the runtime vendor's margin on the orchestration, and the cloud vendor's margin on the compute underneath. For a team running ten thousand agent calls a day, the arithmetic is tolerable. For a team running a million, the spread between managed and self-hosted on a dedicated node becomes a line item that justifies an engineer.

This isn't true at every scale. If your workload is bursty and small, managed is cheaper. If it's sustained and large, at some point it flips. The question is whether you can see that point coming.
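You can sketch that flip point on the back of an envelope. Every number below is an illustrative assumption, not a quote from any vendor's pricing page:

```shell
# Back-of-envelope break-even between per-call managed pricing and a
# flat-rate self-hosted node. All figures are made-up assumptions.
managed_per_call=0.002      # $/call, all three margins stacked
node_per_month=600          # $/month for a dedicated node (compute only)
calls_per_day=1000000

managed_monthly=$(awk "BEGIN { print $managed_per_call * $calls_per_day * 30 }")
break_even=$(awk "BEGIN { print $node_per_month / ($managed_per_call * 30) }")

echo "managed at 1M calls/day: \$${managed_monthly}/month"
echo "flat node:               \$${node_per_month}/month + model spend"
echo "flip point:              ~${break_even} calls/day"
```

At these made-up numbers the flip lands around ten thousand sustained calls a day. Your own threshold moves with both prices, which is exactly why it's worth computing rather than assuming.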

4. Embedding agents in your own product

This one is specific but growing fast. You're building a SaaS. You want to ship agents to your users: branded, inside your app, with your auth, your billing, your support path.

A managed runtime gives your users a console that isn't yours, auth that isn't yours, a pricing page that isn't yours. You become a reseller. For a lot of products, that's fine. For the ones that sell on the experience, it isn't.

A self-hosted runtime with a headless API (run by you, invisible to your users) is what makes it possible to sell "an AI agent that does X" without being visibly built on someone else's platform.

5. Air-gap, sovereign cloud, and "the security team said no"

There's a non-trivial pool of use cases where no external call leaves the perimeter. Ever. Defense, critical infrastructure, classified research, regulated finance back-offices. These teams aren't going to solve their problem with a managed runtime, full stop.

Self-hosted isn't a nice-to-have here; it's the only checkbox that produces a working system.

🧱 Progressive infrastructure: the four tiers

The honest objection to self-hosting used to be "I don't want to run Postgres, Redis, S3, Docker, a queue, and a secret store just to try it." It was a reasonable objection. The answer, in 2026, is that none of that is required up front.

The pattern Appstrate uses (and that we'd argue every open-source runtime should adopt) is progressive infrastructure. Every external dependency is optional, with a built-in fallback:

| Component  | Full infra          | Fallback when absent            | Tier |
|------------|---------------------|---------------------------------|------|
| PostgreSQL | Managed or self-run | PGlite (embedded WASM Postgres) | 1+   |
| Redis      | Managed or self-run | In-memory adapters              | 2+   |
| S3 / MinIO | Any S3-compatible   | Filesystem storage              | 3    |
| Docker     | Daemon on the host  | Bun subprocesses, no isolation  | 3    |

At Tier 0, all you need is Bun and a terminal. The runtime comes up, stores everything in ./data/, and lets you run agents. That's the "can I try this in 30 seconds" tier.

At Tier 3, you're running the full stack (Postgres, Redis, S3, Docker) on your own servers, in your own region, behind your own firewall. Same binary. Same API. Same agents.

The point is that the move from Tier 0 to Tier 3 is a config change, not a rewrite. You do not pick your tier when you start. You pick it when the workload tells you.
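In practice, "the workload tells you" can be as simple as which environment variables exist at boot. Here's a hedged sketch of that fallback decision, reusing the variable names from the scaling section below; the selection logic is an illustration, not Appstrate's source:

```shell
# Progressive-infrastructure selection: each backend falls back when its
# env var is absent. Illustrative logic only.
db_backend=$(     [ -n "${DATABASE_URL:-}" ] && echo "postgres" || echo "pglite" )
queue_backend=$(  [ -n "${REDIS_URL:-}"    ] && echo "redis"    || echo "in-memory" )
storage_backend=$([ -n "${S3_BUCKET:-}"    ] && echo "s3"       || echo "filesystem" )

echo "db=$db_backend queue=$queue_backend storage=$storage_backend"
```

Tier 0 is the state where all three fall back; Tier 3 is the state where none do. Nothing else distinguishes the tiers.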

🪐 What the one-line install actually does

This is what running Appstrate on a fresh VM looks like:

curl -fsSL https://get.appstrate.dev | bash

What happens under the hood:

  1. Detects the host (Linux / macOS) and installs Bun and Docker if either is missing.
  2. Pulls the platform image, the sidecar image, and the Pi agent image from GHCR.
  3. Generates the signing secrets (BETTER_AUTH_SECRET, CONNECTION_ENCRYPTION_KEY).
  4. Starts the platform on :3000 with filesystem storage, in-memory queues, and embedded Postgres.
  5. Prints the URL and the first-admin signup link.

Zero infra on your side. You bring the host. The host can be your laptop, a $5 VPS, a bare-metal node in a data center, or a K8s cluster; the one-liner adapts. You can inspect the script before running it (as any sane ops-minded reader should), and you can run the underlying Docker Compose directly if you want to skip the wrapper.
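If piping a URL into bash makes you itch (it should), the same install works in two inspectable steps; the only thing assumed here is the URL from above:

```shell
# Download first, read it, then run it -- same result as the one-liner.
curl -fsSL https://get.appstrate.dev -o install.sh
less install.sh        # or: shellcheck install.sh
bash install.sh
```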

When you're ready to scale:

# Tier 1+: point DATABASE_URL at your Postgres; PGlite is retired, data migrates
DATABASE_URL=postgres://your-host:5432/appstrate

# Tier 2+: point REDIS_URL at your Redis; in-memory is retired, queues migrate
REDIS_URL=redis://your-host:6379

# Tier 3: point S3_BUCKET at your S3; filesystem is retired, storage migrates
S3_BUCKET=your-bucket

No rewrite. No different binary. No "this only works on our cloud".

🔐 The sovereign default

Self-hosting isn't a feature you add to a managed runtime by unchecking a box. It's a design constraint that has to be there from day one, and it forces specific architectural choices:

  • No hard dependency on any external service. Not for scheduling, not for cache, not for storage, not for secrets.
  • Static binaries that run in air-gapped environments. No "phone home" telemetry on the default path. No license server.
  • A data model that survives an export. Every piece of state (agents, runs, logs, memory) is in tables you own, in a format you can read, with migrations you can read.
  • A license that lets you fork. Apache 2.0 in our case, the same license much of your existing infra already runs under.

The teams who need self-hosting are the same teams who've been burned by vendor rug-pulls, license changes, acquisitions that killed the product, and "free tier" removals that ate a quarter. They're not looking for a trial. They're looking for a control plane they can own.

Appstrate docs