When to use an agent runtime platform
Six use cases where an agent runtime platform earns its keep in production, from adaptive automations to multi-tenant SaaS.
About agent runtime platforms
An agent runtime platform is the infrastructure that runs agent runtimes in production: multi-tenant, sandboxed, behind an API. It's the fifth layer of a stack that goes 1. LLM → 2. Harness → 3. Agent → 4. Runtime → 5. Agent runtime platform.
For a detailed walkthrough of those five layers, see What's an agent runtime platform, exactly?.
1. Deploying automations that adapt and self-correct
A lot of teams run their automations on Zapier or n8n. Those tools know how to call an LLM, and their "AI Agent" nodes can even retry an action with a different strategy inside a single run. But the workflow itself stays a frozen graph: the agent doesn't rewrite its own flow and doesn't learn anything from one run to the next. On n8n, by default, a single tool error crashes the whole workflow instead of looping back to the agent so it can decide what to do (you can wire continueOnFail and error branches, but it isn't free). That's tactical adaptation, not strategic.
By contrast, an agent runs on a feedback loop: it picks a tool, watches what comes back, adjusts, repeats.
Zapier or n8n with an LLM node already covers a lot (classification, one-shot RAG, extraction). But the moment the call sequence isn't known up front and the agent has to decide each next step, the workflow hits its ceiling.
| Zapier/n8n + LLM node | Agent runtime |
|---|---|
| Triage a ticket against a KB | Reconcile Stripe invoices ↔ accounting |
| Route an alert by log content | Diagnose an outage by polling 4 systems in a loop |
| Classify a procurement request | Negotiate a multi-constraint schedule with replanning |
| Extract fields from a PDF invoice | Investigate a security incident through successive pivots |
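The feedback loop described above (pick a tool, watch what comes back, adjust, repeat) can be sketched in a few lines. This is a minimal illustration, not an Appstrate API: the tool names and the `decide` stub standing in for the LLM are invented for the example.

```typescript
// Minimal agent loop sketch: pick a tool, observe the result, adjust, repeat.
type Observation = { tool: string; output: string };

type Decision =
  | { kind: "call"; tool: string; input: string }
  | { kind: "done"; answer: string };

// Stand-in for the LLM: falls back to a narrower tool when the first one errors.
function decide(history: Observation[]): Decision {
  const last = history[history.length - 1];
  if (!last) return { kind: "call", tool: "stripe.listInvoices", input: "2024-06" };
  if (last.output.startsWith("error")) {
    return { kind: "call", tool: "stripe.listInvoicesByCustomer", input: "2024-06" };
  }
  return { kind: "done", answer: last.output };
}

function runAgent(
  tools: Record<string, (input: string) => string>,
  maxSteps = 10,
): string {
  const history: Observation[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const d = decide(history);
    if (d.kind === "done") return d.answer; // the agent, not the graph, decides to stop
    const tool = tools[d.tool];
    // A tool error is fed back into the loop instead of crashing the run.
    const output = tool ? tool(d.input) : "error: unknown tool";
    history.push({ tool: d.tool, output });
  }
  return "error: step budget exhausted";
}
```

The contrast with a frozen workflow graph is the error path: here a failed call becomes an observation the next decision can react to, rather than a terminal state.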
What's interesting: an agent runtime stays light on infrastructure. A prompt, one or two providers (Gmail, Slack, Notion, Jira), a cron schedule, a sandboxed environment where it runs on its own, operated by a single end-user on their laptop or a VPS (often the admin who wired it all up).
2. Driving agent runtimes through an API
A typical agent runtime is easy to drive from a dev tool (Claude Code, Antigravity, Codex) or a personal agent (OpenClaw, ZeroClaw). But it stays single-user: one person, one machine. To plug that agent into a wider system, you need an agent runtime platform that exposes an API.
The API acts as a standard socket: any system (an internal service, a managed cron, n8n, Zapier) can plug in to trigger the agent, watch it execute, and pick up the result asynchronously through a webhook.
In practice, that's a POST /runs endpoint that launches a run, a GET /runs/:id that fetches its status, an SSE stream to follow progress live, and an HMAC-signed webhook that notifies the calling system at the end. An n8n node calling the agent stays a perfectly valid pattern: keep the deterministic workflow for orchestration, and delegate the loop-heavy part to the agent.
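The HMAC-signed webhook at the end of that chain can be verified on the receiving side roughly like this. The signature scheme (SHA-256, hex digest over the raw body) is a common convention and an assumption here, not Appstrate's documented format:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// What the platform side does: sign the webhook body with the shared secret.
function signBody(secret: string, body: string): string {
  return createHmac("sha256", secret).update(body).digest("hex");
}

// What the calling system does: recompute and compare in constant time,
// so a tampered body or wrong secret is rejected without a timing leak.
function verifyWebhook(secret: string, body: string, signature: string): boolean {
  const expected = signBody(secret, body);
  if (expected.length !== signature.length) return false;
  return timingSafeEqual(Buffer.from(expected), Buffer.from(signature));
}
```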
3. Deploying secure agent runtimes
The moment an agent runs in production, three risks show up:
- The agent that's confidently wrong (the most frequent). Replit's AI dropped a production database during a code freeze (1,200 executives, 1,190 companies wiped), then fabricated 4,000 fake users to cover its tracks. No malice, just misplaced certainty.
- Prompt injection. A booby-trapped email in a support inbox can talk the agent into leaking a secret or running a destructive command.
- Manipulated credentials. An agent calling Stripe needs a key, and if that key ends up in the LLM's context, it can be exfiltrated through a single prompt.
An agent runtime platform answers those three risks with two complementary mechanisms.
The per-run sandbox acts like a closed aquarium: every run lives in its own isolated environment. The agent can't reach the prod database, other runs' filesystems, or the host machine's env vars, unless the platform explicitly allows it. A prompt injection that pushes the agent to try DROP TABLE users runs into the sandbox wall.
The level of isolation is picked based on criticality. Several tiers dominate the market, from lightest to most isolated; each step up adds security, but also RAM or syscall overhead.
On the Appstrate side, isolation follows a 4-tier progressive model (orthogonal to the market taxonomy above): from Tier 0 (Bun subprocess on the host, no isolation, for dev) to Tier 3 (RUN_ADAPTER=docker, every run in an isolated container with a sidecar, the production default). MicroVM isolation (Firecracker) is on the roadmap for the most sensitive workloads.
The credential broker (sometimes called a sidecar) acts like a bank teller: the key lives in a vault, the agent asks "call Stripe" without knowing it, the teller injects the key at the last moment on the network side. The LLM never sees the secret value.
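The bank-teller mechanic can be made concrete with a short sketch. The vault structure and function names are invented for illustration; the point is only the shape: the agent's request names a provider, and the secret is appended on the network side, after the LLM has finished producing the request.

```typescript
// The agent emits a request that names a provider, never a secret value.
type AgentRequest = { provider: string; url: string; headers: Record<string, string> };

// The vault is readable only by the broker process, never by the agent sandbox.
const vault: Record<string, string> = { stripe: "sk_live_example_do_not_ship" };

// Broker/sidecar step: inject the real key just before the request leaves
// for the provider. The LLM's context never contains the secret.
function brokerInject(req: AgentRequest): AgentRequest {
  const secret = vault[req.provider];
  if (!secret) throw new Error(`no credential for provider ${req.provider}`);
  return { ...req, headers: { ...req.headers, Authorization: `Bearer ${secret}` } };
}
```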
Anthropic Managed Agents puts it explicitly: "credentials never live in the sandbox where Claude's generated code runs." AWS Bedrock AgentCore Identity converges: tokens live in a customer-key-encrypted vault, and agents only get scoped, time-bound access tokens. Both frame it the same way: isolate the LLM from the secret by construction.
All of this is logged in a per-tenant audit trail: who launched what, when, on what data. Security can replay an incident without digging through scattered logs.
4. Collaborating as a team with agent runtimes
The support team rolled out a triage agent that works well. The HR team wants an interview-summary agent, finance wants a Stripe reconciliation agent, the platform team wants a PR reviewer shared between the devs. The question becomes: who's allowed to run what, with which team's credentials, on which systems?
With an agent runtime platform, you can structure several teams inside the same infrastructure (organizations, workspaces, members, roles).
Inside an organization, each team gets its own isolated workspace (an application on Appstrate) with its agents, its runs, its logs, its roles, and its scoped credentials (the support Gmail isn't the HR Gmail). That's internal multi-tenant: one runtime, several teams, no cross-mixing. Without that layer, agents run under a single account, credentials sit around in cleartext, and permissions are hand-rolled every time a new team shows up.
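The "support Gmail isn't the HR Gmail" property comes from scoping credentials by workspace rather than by org. A minimal sketch, with an invented key shape (the real storage model is Appstrate's, not shown here):

```typescript
// Credentials keyed by (organization, workspace, provider): the support
// team's Gmail token is a different entry from HR's, by construction.
type CredKey = { org: string; workspace: string; provider: string };

const creds = new Map<string, string>();
const keyOf = (k: CredKey) => `${k.org}/${k.workspace}/${k.provider}`;

function storeCredential(k: CredKey, token: string): void {
  creds.set(keyOf(k), token);
}

// A lookup only ever resolves within its own workspace scope; there is no
// cross-workspace query for "all Gmail tokens in the org".
function getCredential(k: CredKey): string | undefined {
  return creds.get(keyOf(k));
}
```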
To make sharing agents between teams even easier, we also designed AFPS, an open format that doubles as a versioned package registry: @acme/[email protected], latest and beta dist-tags, yanking a bad version. One team publishes, the others consume.
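Resolution against that registry could look like the following sketch. The article only states that AFPS has versioned packages, latest/beta dist-tags, and yanking; the index shape and the exact yank semantics (a yanked version is refused via a tag but still resolvable when pinned exactly, as in Cargo) are assumptions.

```typescript
type PackageIndex = {
  versions: Record<string, { yanked?: boolean }>;
  distTags: Record<string, string>; // e.g. { latest: "1.4.0", beta: "1.5.0" }
};

// Resolve a spec that is either a dist-tag ("latest", "beta") or an exact
// version. A yanked version is refused when reached through a tag, but an
// exact pin keeps working so existing deployments don't break.
function resolve(index: PackageIndex, spec: string): string {
  const viaTag = index.distTags[spec] !== undefined;
  const version = index.distTags[spec] ?? spec;
  const entry = index.versions[version];
  if (!entry) throw new Error(`unknown version ${version}`);
  if (entry.yanked && viaTag) {
    throw new Error(`version ${version} is yanked; repoint the ${spec} tag`);
  }
  return version;
}
```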
The organization gets a single point of governance over all its agents, scoped access per team, and the audit trail security asks for.
5. Embedding an agent in your SaaS product
When you want to embed an agentic feature inside your own SaaS product, the agent stops serving the internal team and starts serving the product's customers. A few examples:
- A B2B support SaaS gives every customer a "draft a first response" agent that talks to the customer's Gmail.
- An HR tool offers every manager a "generate the interview report" agent with access to the company's Google Drive.
- A vertical accounting SaaS triggers, per end-user, a reconciliation agent on the customer's Stripe.
- A grant-writing platform offers every researcher an agent that surfaces calls for proposals and drafts a first version, with access to their Google Drive.
The Stripe parallel sums up the need: a User is on the team, a Customer is in the market, two lifecycles, two sets of permissions. The vendor's credentials and the end customer's credentials must never cross, otherwise the in-house implementation breaks fast (sessions colliding, tokens reused, crons without context). All avoidable with a runtime that's multi-tenant by design.
The mechanics come in two steps.
1. OAuth setup (once per end-user).
- Each end customer is represented as an end-user (`eu_xxx`).
- The SaaS kicks off the OAuth flow through the connections API with the `Appstrate-User: eu_xxx` header.
- The end-user sees the standard consent page (Google, Slack, Stripe…) and accepts.
- Tokens land encrypted in the Appstrate vault. The SaaS never handles them.
2. Use at run time (every execution).
- The SaaS backend calls the Appstrate API with a scoped API key and the `Appstrate-User: eu_xxx` header.
- The sidecar pulls that end-user's credentials from the vault and injects them on the network side.
- The LLM never sees the secret.
- HMAC-signed webhooks push the results back into the product.
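The run-time call in step 2 amounts to one impersonated request. A sketch of what the SaaS backend assembles: the `Appstrate-User` header and the runs endpoint follow the patterns described above, while the base URL and body shape are placeholders, not documented values.

```typescript
// Build the request that triggers a run on behalf of one end customer.
function buildRunRequest(apiKey: string, endUserId: string, agent: string) {
  return {
    method: "POST" as const,
    url: "https://api.appstrate.example/runs", // placeholder base URL
    headers: {
      Authorization: `Bearer ${apiKey}`,       // the vendor's scoped API key
      "Appstrate-User": endUserId,             // scopes the run to this end customer's credentials
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ agent }),           // illustrative body shape
  };
}
```

The vendor key authenticates the SaaS; the header selects whose vault entries the sidecar is allowed to inject, so the two sets of credentials never cross.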
6. Deploying sovereign agent runtimes
A legal or contractual frame says everything stays in-house: data, model, logs, compute. The managed runtime is ruled out before the technical discussion, for legal reasons.
A few typical profiles:
- a regulated EU bank,
- a healthcare operator under HIPAA,
- a defense subcontractor,
- a company bound by FR data localization.
The right image: the agent in a sealed envelope. Everything it needs sits inside (compute, model, providers, vault), nothing crosses the border without explicit authorization.
What it looks like in practice:
- an agent reading a Postgres database behind the VPN, with no row leaving the datacenter,
- an air-gapped agent at a defense customer, with no internet access, talking to a local LLM served in-house,
- an agent constrained to a region by contract (EU, Canada, Switzerland) with compute, storage, LLM and logs co-located.
The primitives that show up at this stage:
- Full self-host: the envelope's border lines up with the datacenter's.
- Outbound proxies: police LLM calls (who calls which model, from where, to where).
- Custom providers: internal APIs without public OAuth, customer-specific.
- AFPS as audit artifact: the manifest's `authorizedUris` field dictates what the agent is allowed to call, and the sidecar refuses anything not listed. Credentials are AES-encrypted per application.
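The enforcement side of `authorizedUris` can be sketched as a simple outbound check. The matching rule used here (same origin plus path prefix) is an assumption for illustration; the real sidecar's matching semantics may differ.

```typescript
// Refuse any outbound URL whose origin and path prefix are not covered
// by an entry in the manifest's authorizedUris list.
function isAuthorized(authorizedUris: string[], target: string): boolean {
  const t = new URL(target);
  return authorizedUris.some((allowed) => {
    const a = new URL(allowed);
    // Origin must match exactly (scheme + host + port), so a lookalike
    // domain can't pass; the path must sit under the allowed prefix.
    return t.origin === a.origin && t.pathname.startsWith(a.pathname);
  });
}
```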
Self-hosting remains the only way to guarantee the four dimensions in play for an agent: data, LLM, orchestrator, logs. This case is the operational embodiment: you choose the runtime because it can stay put.
Six cases, one map
| Use case | Trigger | Consumer | Primitives added |
|---|---|---|---|
| Adaptive automation | Cron, webhook, email | Ops/support team | Agent, providers, schedule |
| API-driven control | Integration into an existing stack | Internal service, cron, n8n, Zapier | + HTTP API, SSE, webhooks |
| Secure agents | Production with sensitive credentials or injection risk | Any production agent | + sandbox, credential broker, audit trail |
| Per-team agents | A 2nd team asks for its own agent | Internal members per team | + orgs, members, roles, scoped credentials, AFPS |
| Agents inside a SaaS | End-user of the product | External customers of the SaaS | + application, end-user, impersonation |
| Sovereign agents | Legal or contractual frame | Regulated, air-gap | + self-host, proxies, custom providers |
An organization can be hit by one of those cases, several in parallel, or all six. The primitives stack when you combine cases.
Which case for which situation
To deploy automations that adapt and self-correct: a prompt, a provider, a schedule. A day to ship a first working agent, a month to stabilize it.
To drive agent runtimes through an API: a POST /runs endpoint, an SSE stream to follow progress, an HMAC-signed webhook for the end. The agent becomes a primitive consumable from any system.
To deploy secure agent runtimes: per-run sandbox, network-side credential broker, per-tenant audit trail. The LLM never sees the secret values.
To collaborate as a team with agent runtimes: orgs, members, roles, scoped credentials. AFPS comes in to share and version agents across teams.
To embed an agent in your SaaS product: application, end-user, impersonation. Either the platform handles it, or you build it.
To deploy sovereign agent runtimes: full self-host, outbound proxies, custom providers. Appstrate is Apache 2.0, self-hostable through docker compose up, and the self-hosting docs cover from local Tier 0 to air-gap deployment.
Three next steps:
- Read the concepts and multi-tenancy pages for the full technical map.
- Install locally: `curl get.appstrate.dev | bash`.
- Open an issue or come to Discord if none of the six cases above covers the real need.