Benchmarks

Production sandbox benchmark

100 sandbox lifecycles, measured through the production API path.

These numbers include API authentication, admission checks, scheduling, VM readiness, a command probe inside the sandbox, and cleanup. They are not kernel-only boot timings. The production lane uses `POST /api/v1/sandboxes/run`, which combines create, server-side readiness, and first exec into one durable production request.

Verified 2026-06-09 UTC: 100 / 100 successful create -> ready -> exec -> destroy lifecycles.

Metric	Value	Note
Success	`100%`	0 failures
Command-ready p50	`512ms`	production run path
Command-ready p95	`0.992s`	10-way burst
Command-ready p99	`1.300s`	all warm path
Total run	`24s`	100 samples

Latest verified run

Parameter	Value
Samples	`100`
Client concurrency	`10`
Batch pace	`0.5s`
Template	`miosa-sandbox`
Size	`xs`
Mode	`run` (`POST /api/v1/sandboxes/run`)
Regions requested	`us-mia`, `us-east`, `us-west`
Successful lifecycles	`100 / 100`
Admission rejections	`0`
Total elapsed	`24s`
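As a rough illustration of the fused run call measured above, the sketch below only builds the HTTP request for `POST /api/v1/sandboxes/run`. The route, template, and size come from this page; the JSON body field names, the `command` field, and the Bearer authorization scheme are assumptions for illustration, not a documented contract.

```typescript
// Sketch only: the route is taken from this page, but the body fields and
// the Bearer auth scheme are assumptions, not a documented MIOSA contract.
interface RunParams {
  template: string;
  size: string;
  command: string;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildRunRequest(apiUrl: string, apiKey: string, params: RunParams): HttpRequest {
  return {
    // Strip trailing slashes so the joined URL has exactly one separator.
    url: `${apiUrl.replace(/\/+$/, "")}/api/v1/sandboxes/run`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`, // assumed auth scheme
      "Content-Type": "application/json",
    },
    body: JSON.stringify(params),
  };
}

// Placeholder key; the template and size mirror the benchmark parameters.
const runRequest = buildRunRequest("https://api.miosa.ai/", "msk_example", {
  template: "miosa-sandbox",
  size: "xs",
  command: "echo ok",
});
console.log(runRequest.method, runRequest.url);
```

The resulting object can be handed to any HTTP client. Keeping request construction separate from the network call makes the one-request shape easy to inspect without touching a live workspace.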

Object storage smoke benchmark

MIOSA also verifies the tenant object-storage API through the same public API surface customers use:

POST   /api/v1/storage/buckets
PUT    /api/v1/storage/buckets/:id/objects/:key
GET    /api/v1/storage/buckets/:id/objects/:key
DELETE /api/v1/storage/buckets/:id/objects/:key

The June 9, 2026 smoke test used a scoped API key with `storage:read` and `storage:write`, created a temporary private bucket, uploaded one object, downloaded it back, verified the SHA-256 digest, then deleted the object and bucket.

Object size	Samples	Result	Upload	Download	Throughput	Composite
`1 KB`	`1`	`1 / 1`	`212ms`	`178ms`	`0.05 Mbps`	`94.39`
`1 MB`	`1`	`1 / 1`	`1.155s`	`506ms`	`16.59 Mbps`	`92.61`
`5 MB`	`10`	`10 / 10`	`4.975s`	`682ms`	`61.48 Mbps`	`87.01`
`8 MB`	`1`	`0 / 1`	failed	failed	failed	`0.00`
`10 MB`	`1`	`0 / 1`	failed	failed	failed	`0.00`
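The table does not say how the Throughput column is derived. The sketch below assumes it is download-side throughput with 1 MB counted as 1,048,576 bytes; under that assumption the 1 MB and 5 MB rows land within a few hundredths of a Mbps of the published values. Treat the formula as a plausible reading, not the benchmark's confirmed method.

```typescript
// Assumption: "Throughput" is object size divided by download time,
// with 1 MB = 1,048,576 bytes and 1 Mbps = 1,000,000 bits per second.
const BYTES_PER_MB = 1024 * 1024;

function throughputMbps(sizeBytes: number, seconds: number): number {
  return (sizeBytes * 8) / seconds / 1e6;
}

// Download times copied from the passing rows of the table above.
const passingRows = [
  { label: "1 MB", bytes: 1 * BYTES_PER_MB, downloadSeconds: 0.506 },
  { label: "5 MB", bytes: 5 * BYTES_PER_MB, downloadSeconds: 0.682 },
];

for (const row of passingRows) {
  const mbps = throughputMbps(row.bytes, row.downloadSeconds);
  console.log(`${row.label}: ${mbps.toFixed(2)} Mbps`);
}
```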

Storage comparison against the ComputeSDK reference

The ComputeSDK storage benchmark shown in the reference screenshots uses 10 MB files and 100 iterations. That exact test is not passing on MIOSA yet because the current direct raw-upload path fails at 8-10 MB. The table below compares the reference values with MIOSA’s latest passing live storage run so the gap is visible instead of hidden.

Provider	Benchmark shape	Success	Upload median	Download median	Throughput median	Composite
Tigris	`10 MB`, `100` iterations	`100%`	`319ms`	`277ms`	`303 Mbps`	`95.4`
Cloudflare R2	`10 MB`, `100` iterations	`100%`	`628ms`	`276ms`	`303 Mbps`	`94.8`
MIOSA Storage API	`5 MB`, `10` iterations	`100%`	`4.975s`	`682ms`	`61.48 Mbps`	`87.01`
MIOSA Storage API	`10 MB`, `1` probe	`0%`	failed	failed	failed	`0.00`

Comments:

MIOSA is not leaderboard-comparable to Tigris or Cloudflare R2 until the 10 MB direct upload path passes consistently.
The current gap is upload-side. MIOSA’s 5 MB download median is usable but still slower than the ComputeSDK storage leaders; upload median is much slower.
The next backend fix is to move object upload off the default request-body read path and onto the streaming or presigned upload path, then rerun the same 10 MB, 100 iteration benchmark.
Until that fix ships, customer docs should treat direct API uploads as a small-object path and recommend presigned/direct storage uploads for larger artifacts.
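The small-object guidance above could be expressed client-side as a size check. This is a minimal sketch: the 5 MB cutoff is an assumption taken from the largest object size that passed in the smoke table, and the `"direct"` and `"presigned"` labels are hypothetical names, not SDK values.

```typescript
type UploadPath = "direct" | "presigned";

// Assumption: 5 MB is the largest size that passed; 8 MB and 10 MB failed.
const DIRECT_UPLOAD_MAX_BYTES = 5 * 1024 * 1024;

// Pick the direct API path for small objects and the presigned path otherwise.
function chooseUploadPath(sizeBytes: number): UploadPath {
  return sizeBytes <= DIRECT_UPLOAD_MAX_BYTES ? "direct" : "presigned";
}

console.log(chooseUploadPath(1024));             // small object
console.log(chooseUploadPath(10 * 1024 * 1024)); // larger artifact
```

A hard cutoff keeps the behavior predictable; the threshold would need revisiting once the streaming upload fix ships.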

MIOSA latency split

Production run path:

Phase	p50	p95	p99	Min	Max
Command-ready TTI	`512ms`	`0.992s`	`1.300s`	`295ms`	`1.331s`
VM boot slice	`95ms`	`253ms`	`253ms`	`30ms`	`312ms`

Previous standard public path, kept as a baseline:

Phase	p50	p95	p99	Min	Max
Create request	`287ms`	`348ms`	`353ms`	`218ms`	`356ms`
Ready/running	`589ms`	`1.005s`	`1.246s`	`477ms`	`1.246s`
Command-ready TTI	`947ms`	`1.333s`	`1.348s`	`762ms`	`1.610s`
VM boot slice	`101ms`	`253ms`	`524ms`	`31ms`	`524ms`
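The page does not state how the benchmark script derives p50, p95, and p99 from raw samples. A common choice is the nearest-rank method, sketched below as an assumption; the sample array is synthetic and is not benchmark data.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  // Integer arithmetic first, then divide, to avoid floating-point drift.
  const rawRank = Math.ceil((p * sorted.length) / 100);
  const rank = Math.min(sorted.length, Math.max(1, rawRank));
  return sorted[rank - 1];
}

// Synthetic latencies of 1ms..100ms, only to show the mechanics.
const syntheticMs = Array.from({ length: 100 }, (_, i) => i + 1);
console.log(percentile(syntheticMs, 50), percentile(syntheticMs, 95), percentile(syntheticMs, 99));
```

With exactly 100 samples, nearest-rank p99 is the 99th-smallest value, so a single slow sample can only move the maximum, not p99.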

What is making the number slower?

The VM path is fast. In the production run, all 100 / 100 samples used the warm path and the reported boot slice had a 95ms median. The standard path was slower because it used three public round trips: create, poll readiness, then exec.

Component	Median	What it includes
Fused command-ready path	`512ms`	Auth, workspace admission, scheduling, server-side wait, first command, response
Previous standard path	`947ms`	Create response, external readiness polling, separate exec request
Round-trip removed by fusion	`~434ms`	Public polling plus second public exec call
Reported warm boot	`95ms`	The actual VM boot slice reported by the fleet

So the first optimization target was not raw boot. It was create/status/exec round trips. The run endpoint removes that waste while keeping the same durable sandbox lifecycle underneath.
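The arithmetic behind that conclusion can be checked from the published medians. Subtracting the rounded p50 values gives 435ms, one millisecond off the table's `~434ms`, which presumably reflects unrounded samples. The variable names are illustrative.

```typescript
// Published medians from the tables above, in milliseconds.
const fusedP50Ms = 512;
const standardP50Ms = 947;
const warmBootP50Ms = 95;

// Time removed by fusing create, wait, and exec into one request.
const savedMs = standardP50Ms - fusedP50Ms;

// Share of the fused command-ready median spent in the reported VM boot slice.
const bootShare = warmBootP50Ms / fusedP50Ms;

console.log(`saved: ${savedMs}ms, boot share: ${(bootShare * 100).toFixed(1)}%`);
```

Boot accounts for under a fifth of the fused median, which is why the round trips were the first target rather than the VM path.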

Optimization path

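The benchmark exposes three separate public lanes: VM-ready, API-ready, and command-ready, with command-ready reported for both the legacy and the fused path. MIOSA should publish all of them, because each answers a different buyer question.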

Lane	Current p50	What it proves	Immediate target
VM-ready	`95ms`	warm sandbox runtime is assigned and ready	`<100ms` sustained
Standard API-ready	`589ms`	public API create has produced a running sandbox	`<400ms`
Standard command-ready	`947ms`	first command succeeds through the legacy multi-request baseline	`<800ms`
Fused command-ready	`512ms`	first command succeeds through one durable request	`<500ms`

The engineering cuts are concrete:

Cut	Expected impact	Why it works
`create_and_wait` API/SDK path	`50-150ms`	removes external GET polling and returns only when the server has committed running state
`create_and_exec` benchmark/API path	`434ms measured`	removes external polling and the separate public exec POST
Admission-path caching	`100-180ms`	avoids repeat template, policy, plan, credit-balance, and scheduler reads for hot API keys
Region-local control plane / benchmark routing	`200-350ms` in far regions	avoids us-west control-plane round trips when the VM already boots locally
Warm-pool guardrails	tail reduction	lifts the warm hit rate from `99/100` to a sustained `100/100`, which removes the cold `5.322s` outlier
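The admission-path caching cut in the table above can be pictured as a short-TTL cache keyed by API key and workspace. This is a minimal sketch under stated assumptions: the `AdmissionResult` shape, the key format, and the 5-second TTL are all hypothetical, and the clock is injected so expiry can be demonstrated without waiting.

```typescript
// Hypothetical shape of a cached admission decision.
interface AdmissionResult {
  allowed: boolean;
}

class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Returns undefined on a miss or once the entry has expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}

// Assumed key format: one entry per API key and workspace pair.
const admissionKey = (apiKeyId: string, workspaceId: string) => `${apiKeyId}:${workspaceId}`;

let fakeClockMs = 0;
const admissionCache = new TtlCache<AdmissionResult>(5000, () => fakeClockMs);
admissionCache.set(admissionKey("key_1", "ws_1"), { allowed: true });
const cachedHit = admissionCache.get(admissionKey("key_1", "ws_1"));
fakeClockMs = 5000; // advance the fake clock to the TTL boundary
const cachedMiss = admissionCache.get(admissionKey("key_1", "ws_1"));
console.log(cachedHit, cachedMiss);
```

A short TTL bounds how long a revoked key or exhausted credit balance could still be admitted, which is the trade-off any real value would have to weigh.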

Fast without fragile

The fastest version of MIOSA should not skip billing, policy, cleanup, or durable placement. It should move those checks to the right boundary and avoid repeating them on every hot sandbox.

Layer	Keep durable	Make faster
Admission	API key, workspace, plan, policy, credits, idempotency all remain enforced	cache the effective policy/template/credit admission result for short TTLs per API key and workspace
Placement	persist sandbox row and `node_id` before exposing a route	allocate from an in-memory reservation ledger first, then write-through to Postgres/outbox
Boot	warm-pool claim remains the default, cold boot remains fallback	keep per-region warm pools ready for burst traffic
Readiness	only mark running after route registration and command health pass	push readiness over PubSub/SSE instead of external 50ms GET polling
First command	command still runs with auth and timeout	fuse `create -> wait -> exec` inside the selected host so the first command is not a second public API round trip
Cleanup	destroy, release reservation, stop billing, and revoke routes stay mandatory	make cleanup idempotent and janitor-backed so failed requests do not hold resources
Scale	each host keeps local runtime truth; control plane keeps durable truth	route creates to region-local controllers and replicate fleet state asynchronously
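The Cleanup row asks for idempotent cleanup. One way to picture that is a function where each step checks state before acting, so a retry or janitor pass after a partial failure does no extra work. The `SandboxState` fields below are hypothetical and stand in for whatever the control plane actually tracks.

```typescript
// Hypothetical per-sandbox state; real field names are not given on this page.
interface SandboxState {
  running: boolean;
  reservationHeld: boolean;
  billingActive: boolean;
  routeRegistered: boolean;
}

// Runs each cleanup step only if it is still needed and reports what it did.
function cleanup(state: SandboxState): string[] {
  const performed: string[] = [];
  if (state.running) {
    state.running = false;
    performed.push("destroy");
  }
  if (state.reservationHeld) {
    state.reservationHeld = false;
    performed.push("release reservation");
  }
  if (state.billingActive) {
    state.billingActive = false;
    performed.push("stop billing");
  }
  if (state.routeRegistered) {
    state.routeRegistered = false;
    performed.push("revoke route");
  }
  return performed;
}

const sandboxState: SandboxState = {
  running: true,
  reservationHeld: true,
  billingActive: true,
  routeRegistered: true,
};
const firstPass = cleanup(sandboxState);
const secondPass = cleanup(sandboxState); // janitor retry: nothing left to do
console.log(firstPass, secondPass);
```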

Target shape: the public API has the run endpoint today. The SDK wrapper should expose the same fast lane next:

await miosa.sandboxes.createAndWait({ template: "miosa-sandbox", region: "auto" })
await miosa.sandboxes.run({ template: "miosa-sandbox", command: "echo ok" })

createAndWait publishes API-ready latency. run publishes command-ready latency. Both use the same durable sandbox lifecycle underneath; the difference is that the server owns the wait loop and can execute the first command node-local.

Region split

Region	Samples	Success	TTI p50	TTI p95	TTI p99
`us-east`	`33`	`100%`	`328ms`	`644ms`	`684ms`
`us-mia`	`34`	`100%`	`506ms`	`815ms`	`859ms`
`us-west`	`33`	`100%`	`934ms`	`1.257s`	`1.331s`

Command-ready leaderboard

The external provider values below are from the supplied ComputeSDK-style benchmark view. MIOSA is inserted by measured p50 command-ready TTI from the production run path, not isolated as a vanity row. Lower is better.

Provider	Time	Measurement
Declaw	`0.49s`	p50 TTI · score 94.9
MIOSA	`0.512s`	p50 production run path
Northflank	`0.54s`	p50 TTI · score 94.4
Daytona	`0.58s`	p50 TTI · score 74.3
E2B	`0.64s`	p50 TTI · score 92.7
Modal	`0.67s`	p50 TTI · score 92.8
Vercel	`0.72s`	p50 TTI · score 90.7
Archil	`0.75s`	p50 TTI · score 91.8
Runloop	`0.81s`	p50 TTI · score 84.6
Cloudflare	`1.84s`	p50 TTI · score 78.3
Blaxel	`1.87s`	p50 TTI · score 80.1
CodeSandbox	`7.32s`	p50 TTI · score 16.4
Tensorlake	`15.22s`	p50 TTI · score 0.0
Upstash	`17.01s`	p50 TTI · score 0.0
Orgo	`0.27s`	warm-pool boot · 2.09s desktop-ready (measured)
AgentComputer	n/a	not yet timed
ascii Box	n/a	not yet timed

Benchmark placements

The benchmark screenshots expose separate tabs for median, P95, P99, and composite score. MIOSA’s raw latency placement is measured. The composite score below is labeled as an estimate because the external benchmark app does not publish its exact scoring formula; the estimate is anchored against the supplied provider score table and should be treated as directional until MIOSA is added to their official dataset.


MIOSA rank: #2 · 0.512s p50 production run path · 100/100 success

Rank	Provider	Median TTI	Success
1	Declaw	`0.49s`	`100/100`
2	MIOSA	`0.512s`	`100/100`
3	Northflank	`0.54s`	`100/100`
4	Daytona	`0.58s`	`100/100`
5	E2B	`0.64s`	`100/100`
6	Modal	`0.67s`	`100/100`
7	Vercel	`0.72s`	`100/100`
8	Archil	`0.75s`	`100/100`
9	Runloop	`0.81s`	`100/100`
10	Cloudflare	`1.84s`	`100/100`
11	Blaxel	`1.87s`	`100/100`
12	CodeSandbox	`7.32s`	`100/100`
13	Tensorlake	`15.22s`	`100/100`
14	Upstash	`17.01s`	`100/100`

MIOSA rank: #6 · 0.992s p95 production run path · tail is the next optimization target

Rank	Provider	P95 TTI	Success
1	Declaw	`0.54s`	`100/100`
2	Northflank	`0.59s`	`100/100`
3	Modal	`0.78s`	`100/100`
4	E2B	`0.83s`	`100/100`
5	Archil	`0.90s`	`100/100`
6	MIOSA	`0.992s`	`100/100`
7	Vercel	`1.20s`	`100/100`
8	Blaxel	`2.07s`	`100/100`
9	Cloudflare	`2.62s`	`100/100`
10	Runloop	`2.64s`	`100/100`
11	Daytona	`5.52s`	`100/100`
12	CodeSandbox	`9.90s`	`100/100`
13	Tensorlake	`15.76s`	`100/100`
14	Upstash	`23.71s`	`100/100`

MIOSA rank: #6 · 1.300s p99 production run path · west-region tail is visible here

Rank	Provider	P99 TTI	Success
1	Declaw	`0.54s`	`100/100`
2	Northflank	`0.61s`	`100/100`
3	Modal	`0.79s`	`100/100`
4	Archil	`0.94s`	`100/100`
5	E2B	`0.94s`	`100/100`
6	MIOSA	`1.300s`	`100/100`
7	Vercel	`1.35s`	`100/100`
8	Blaxel	`2.35s`	`100/100`
9	Runloop	`2.64s`	`100/100`
10	Cloudflare	`2.72s`	`100/100`
11	Daytona	`5.58s`	`100/100`
12	CodeSandbox	`10.54s`	`100/100`
13	Tensorlake	`15.81s`	`100/100`
14	Upstash	`23.98s`	`100/100`

MIOSA estimated score: ~92.4 · directional rank: #5, between E2B and Archil

Provider	Composite score
Declaw	`94.9`
Northflank	`94.4`
Modal	`92.8`
E2B	`92.7`
MIOSA	`~92.4` (estimated)
Archil	`91.8`
Vercel	`90.7`
Runloop	`84.6`
Blaxel	`80.1`
Cloudflare	`78.3`
Daytona	`74.3`
CodeSandbox	`16.4`
Tensorlake	`0.0`
Upstash	`0.0`

Composite score is estimated because the external benchmark does not publish the exact formula. MIOSA's measured inputs are 0.512s median, 0.992s p95, 1.300s p99, and 100% success.

Detailed metrics

Provider	Score	Median TTI	P95 TTI	P99 TTI	Success
Declaw	`94.9`	`0.49s`	`0.54s`	`0.54s`	`100%`
MIOSA production path	`~92.4 est.`	`0.512s`	`0.992s`	`1.300s`	`100%`
Northflank	`94.4`	`0.54s`	`0.59s`	`0.61s`	`100%`
Daytona	`74.3`	`0.58s`	`5.52s`	`5.58s`	`100%`
E2B	`92.7`	`0.64s`	`0.83s`	`0.94s`	`100%`
Modal	`92.8`	`0.67s`	`0.78s`	`0.79s`	`100%`
Vercel	`90.7`	`0.72s`	`1.20s`	`1.35s`	`100%`
Archil	`91.8`	`0.75s`	`0.90s`	`0.94s`	`100%`
Runloop	`84.6`	`0.81s`	`2.64s`	`2.64s`	`100%`
Legacy create / poll / exec baseline	n/a	`0.947s`	`1.333s`	`1.348s`	`100%`
Cloudflare	`78.3`	`1.84s`	`2.62s`	`2.72s`	`100%`
Blaxel	`80.1`	`1.87s`	`2.07s`	`2.35s`	`100%`
CodeSandbox	`16.4`	`7.32s`	`9.90s`	`10.54s`	`100%`
Tensorlake	`0.0`	`15.22s`	`15.76s`	`15.81s`	`100%`
Upstash	`0.0`	`17.01s`	`23.71s`	`23.98s`	`100%`
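The placements quoted on this page can be recomputed from the table above. The sketch below ranks a provider by counting how many rows are strictly better on a given metric; it excludes the legacy baseline row and uses MIOSA's estimated score of 92.4, so the composite rank stays directional. The `Row` type and `rankOf` helper are illustrative names.

```typescript
interface Row {
  provider: string;
  score: number;
  median: number; // seconds
  p95: number;    // seconds
  p99: number;    // seconds
}

// Values copied from the detailed metrics table; MIOSA's score is an estimate.
const metricsRows: Row[] = [
  { provider: "Declaw", score: 94.9, median: 0.49, p95: 0.54, p99: 0.54 },
  { provider: "MIOSA", score: 92.4, median: 0.512, p95: 0.992, p99: 1.3 },
  { provider: "Northflank", score: 94.4, median: 0.54, p95: 0.59, p99: 0.61 },
  { provider: "Daytona", score: 74.3, median: 0.58, p95: 5.52, p99: 5.58 },
  { provider: "E2B", score: 92.7, median: 0.64, p95: 0.83, p99: 0.94 },
  { provider: "Modal", score: 92.8, median: 0.67, p95: 0.78, p99: 0.79 },
  { provider: "Vercel", score: 90.7, median: 0.72, p95: 1.2, p99: 1.35 },
  { provider: "Archil", score: 91.8, median: 0.75, p95: 0.9, p99: 0.94 },
  { provider: "Runloop", score: 84.6, median: 0.81, p95: 2.64, p99: 2.64 },
  { provider: "Cloudflare", score: 78.3, median: 1.84, p95: 2.62, p99: 2.72 },
  { provider: "Blaxel", score: 80.1, median: 1.87, p95: 2.07, p99: 2.35 },
  { provider: "CodeSandbox", score: 16.4, median: 7.32, p95: 9.9, p99: 10.54 },
  { provider: "Tensorlake", score: 0.0, median: 15.22, p95: 15.76, p99: 15.81 },
  { provider: "Upstash", score: 0.0, median: 17.01, p95: 23.71, p99: 23.98 },
];

// Rank = 1 + number of providers strictly better on the chosen metric.
function rankOf(provider: string, metric: (r: Row) => number, lowerIsBetter: boolean): number {
  const target = metricsRows.find((r) => r.provider === provider);
  if (target === undefined) throw new Error(`unknown provider: ${provider}`);
  const mine = metric(target);
  const better = metricsRows.filter((r) =>
    lowerIsBetter ? metric(r) < mine : metric(r) > mine
  ).length;
  return better + 1;
}

console.log("median rank:", rankOf("MIOSA", (r) => r.median, true));
console.log("p95 rank:", rankOf("MIOSA", (r) => r.p95, true));
console.log("p99 rank:", rankOf("MIOSA", (r) => r.p99, true));
console.log("score rank:", rankOf("MIOSA", (r) => r.score, false));
```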

Capability matrix

Speed is only one axis. MIOSA’s product surface is broader than “spawn a headless sandbox and exec a command.”

Axis	Result	Note
Fast sandbox lane	`#2`	median TTI placement
Tail latency	`#6`	P95/P99 placement
Platform breadth	`11/13`	capability rows covered
GPU options	`H100`	available by plan

Runtime lane

Headless sandbox, files, previews, snapshots, and first command execution.

MIOSA competes directly here.

Platform lane

Desktop VM, deploy/release plane, managed data, white-label embedding, BYOC.

This is the differentiation layer.

Enterprise lane

Compliance posture, GPU/H100 options, and mature enterprise procurement story.

This is where incumbents still have air.

Full provider matrix. YES = supported, PARTIAL = partial support, NO = not supported.

Capability	MIOSA	Declaw	Northflank	Modal	E2B	Vercel	Daytona	CodeSandbox	Upstash
Headless sandbox create/exec	YES	YES	YES	YES	YES	YES	YES	YES	YES
Filesystem API	YES	YES	YES	PARTIAL	YES	YES	YES	YES	YES
Port preview URLs	YES	YES	YES	PARTIAL	YES	YES	YES	YES	PARTIAL
Snapshot/fork/resume	YES	YES	PARTIAL	PARTIAL	YES	YES	YES	YES	PARTIAL
Full desktop/browser VM	YES	NO	NO	NO	PARTIAL	NO	PARTIAL	PARTIAL	NO
Managed deploy/release plane	YES	NO	YES	PARTIAL	NO	YES	NO	NO	NO
Managed Postgres/Redis/storage	YES	NO	YES	NO	NO	PARTIAL	NO	NO	Redis/Vector
White-label tenant embedding	YES	NO	PARTIAL	NO	NO	NO	NO	NO	NO
BYOC / customer-owned fleet	YES	NO	NO	NO	NO	NO	PARTIAL	NO	NO
MCP/agent tool surface	YES	NO	NO	NO	PARTIAL	NO	PARTIAL	NO	PARTIAL
Multi-language SDKs	5	TS/Py	TS	Py	TS/Py	TS/Py	TS/Py/Go/Java/Ruby	TS	TS
GPU story	H100	NO	NO	YES	NO	NO	NO	NO	NO
Compliance public posture	Compliant	Limited	Enterprise	Enterprise	SOC2	SOC2	Enterprise	Enterprise	Enterprise

What this means

If a buyer only cares about raw median headless sandbox TTI, the category is tight.
If a buyer needs desktops, browser automation, white-label embedding, deploys, data, and BYOC in the same platform, MIOSA is no longer competing on a one-column sandbox table.
If a buyer needs GPU today, MIOSA can support GPU/H100 options while keeping the same platform surface.

Provider coverage

This page tracks the providers shown in the benchmark screenshots plus the providers exposed by ComputeSDK’s current provider list. Some vendors are broader platforms, some are narrow sandbox APIs, and some expose sandboxes as one feature in a larger developer-cloud product.

Provider	Category	Strongest public angle	MIOSA comparison note
MIOSA	Sandbox + desktop + deploy + data platform	Full lifecycle platform for agents and white-label SaaS	Broader platform surface than a headless sandbox-only provider
Declaw	Security-oriented sandbox	Fast TTI plus policy/security positioning	Strong security story; no public desktop/deploy/data plane equivalent
Northflank	Developer cloud with sandboxes	Persistent app/runtime platform plus sandbox execution	Strong deploy platform; less agent/desktop-specific
Modal	Serverless compute/GPU	GPU and Python-function workflow	Strong GPU story; sandbox is not a white-label desktop platform
E2B	AI code execution sandbox	Mature AI-agent sandbox API	Strong headless agent sandbox; limited platform breadth
Archil	Sandbox/storage-oriented provider	Fast benchmark row and storage-first positioning	Less public breadth than MIOSA’s computers/deploy/data surface
Vercel	Frontend platform plus sandbox	Distribution, OIDC, polished DX	Strong existing-account funnel; sandbox is headless and region-limited in public docs
Runloop	Devbox/sandbox provider	Long-lived devboxes and snapshots	Strong devbox framing; no comparable white-label/data plane
Blaxel	Agent platform and sandbox	Agent hosting, batch jobs, sandbox console	Strong agent platform/compliance posture; narrower managed data/deploy surface
Cloudflare	Edge platform sandbox	Edge distribution and developer ecosystem	Strong edge ecosystem; sandbox is one product within Cloudflare
Daytona	OSS/open sandbox platform	Open-source breadth and fast code-to-exec positioning	Strong OSS story; MIOSA adds managed data, deploys, desktops, white-label
CodeSandbox	Cloud dev environment	Browser IDE, previews, devbox UX	Strong interactive IDE; weaker benchmark row in supplied data
Tensorlake	AI-native sandbox	AI/RL tooling and sandbox filesystem benchmarks	Strong AI-lab framing; no public desktop/deploy/data platform equivalent
Upstash	Serverless data plus Box	Redis/Vector/QStash adjacency and built-in agent tooling	Strong data brand; Box is newer and JS/TS-centered
HopX	Cloud sandbox API	Multi-language code execution and desktop automation docs	Lower benchmark visibility; overlaps sandbox APIs more than platform plane
Namespace	Build/devbox platform	Builders, devboxes, macOS/CI style workloads	Strong CI/build niche; different buyer motion from MIOSA agent platform

Benchmark notes

The published MIOSA result is the clean 100/100 production lifecycle run after deploying POST /api/v1/sandboxes/run on 2026-06-09 UTC. The older standard path is also shown so the optimization is auditable instead of hidden. Setup failures from under-scoped keys or insufficient plan concurrency are not counted as fleet performance.

How to reproduce

Use a workspace and API key with enough sandbox concurrency for the test shape:

export MIOSA_API_KEY="msk_..."
export API_URL="https://api.miosa.ai"
export BENCH_WORKSPACE_ID="your-workspace-uuid"

./scripts/bench-continuous.sh \
  --samples 100 \
  --concurrent 10 \
  --pace 0.5 \
  --template miosa-sandbox \
  --size xs \
  --mode run \
  --output bench-results/MIOSA-100-sandbox.tsv \
  --html bench-results/MIOSA-100-sandbox.html

Use --mode standard to reproduce the legacy create / poll / exec baseline. Use --mode run to reproduce the production command-ready lane. The benchmark deletes successful samples unless --keep is passed.

Sources and current research

ComputeSDK introduction for the provider set and abstraction shape.
Daytona docs for Daytona’s current sandbox positioning and SDK breadth.
Tensorlake homepage for Tensorlake’s published sandbox/filesystem benchmark positioning.
HopX docs for HopX sandbox/code-execution capabilities.
Blaxel docs for Blaxel sandbox/agent platform positioning.
Northflank sandboxes docs for Northflank sandbox behavior.
Internal competitive notes under docs/audits/providers/ for the first-pass capability matrix.