Add a dedicated pyro host surface for supported chat hosts so Claude Code, Codex, and OpenCode users can connect or repair the canonical MCP setup without hand-writing raw commands or config edits.
Implement the shared host helper layer and wire it through the CLI with connect, print-config, doctor, and repair, all generated from the same canonical pyro mcp serve command shape and project-source flags. Update the docs, public contract, examples, changelog, and roadmap so the helper flow becomes the primary onramp while raw host-specific commands remain as reference material.
Harden the verification path that this milestone exposed: temp git repos in tests now disable commit signing, socket-based port tests skip cleanly when the sandbox forbids those primitives, and make test still uses multiple cores by default but caps xdist workers to a stable value so make check stays fast and deterministic here.
Validation:
- uv lock
- UV_OFFLINE=1 UV_CACHE_DIR=.uv-cache make check
- UV_OFFLINE=1 UV_CACHE_DIR=.uv-cache make dist-check
Make repo-root chat startup native by letting MCP servers carry a default project source for workspace creation. When a chat host starts from a Git checkout, workspace_create can now omit seed_path and inherit the server startup source; explicit --project-path and clean-clone --repo-url/--repo-ref paths are supported as fallbacks.
Add project startup resolution and materialization, surface origin_kind/origin_ref in workspace_seed, update chat-host docs and the repro/fix smoke to use project-aware workspace creation, and switch dist-check to uv run pyro so verification stays stable after uv reinstalls.
Validated with uv lock, focused startup/server/CLI pytest coverage, UV_CACHE_DIR=.uv-cache make check, UV_CACHE_DIR=.uv-cache make dist-check, and real guest-backed smokes for both explicit project_path and bare repo-root auto-detection.
Extend the chat ergonomics roadmap now that the core workspace and MCP path are in place for the narrowed chat-host persona.
Add the next planned phase around project-aware chat startup, host bootstrap and repair, reviewable agent output, opinionated use-case modes, and faster daily loops so the roadmap keeps pushing toward a repo-aware daily tool instead of a generic VM or SDK story.
Document the constraints explicitly: optimize the MCP/chat-host path first, keep disk tools secondary, and take advantage of the current no-users-yet window to make breaking product-shaping changes when needed.
Update the README walkthrough asset to match the current workspace-first flow instead of the older JSON-plus-parsing demo.
The new tape now shows the recommended 4.x handoff path: workspace create with --id-only, model-native file read and patch apply, snapshot creation, drift and full reset, service start, host export, and cleanup.
Re-render the README GIF from that tape so the embedded recording demonstrates the current product story directly.
Validation: vhs validate docs/assets/workspace-first-run.tape; scripts/render_tape.sh docs/assets/workspace-first-run.tape (rendered outside the sandbox because vhs crashes on local-port allocation inside the sandbox).
Flip bare pyro mcp serve, create_server(), and Pyro.create_server() to default to workspace-core in 4.0.0 while keeping workspace-full as the explicit advanced opt-in surface.
Rewrite the MCP-facing docs and host-specific examples around the bare default command, update package and catalog compatibility to 4.x, and move the public-contract wording from 3.x compatibility guidance to the new stable default.
Adjust the server, API, and contract tests so bare server creation now asserts the workspace-core tool set, while explicit workspace-full coverage continues to prove shells, services, snapshots, and disk tools remain available.
Validation: uv lock; .venv/bin/pytest --no-cov tests/test_cli.py tests/test_api.py tests/test_server.py tests/test_public_contract.py; UV_CACHE_DIR=.uv-cache make check; UV_CACHE_DIR=.uv-cache make dist-check; real guest-backed smoke for bare Pyro.create_server() plus explicit profile="workspace-full".
Ship first-class MCP setup examples for Claude Code, Codex, and OpenCode so new users can copy one exact command or config instead of translating the generic MCP template by hand.
Reposition the docs to surface those host-specific examples before the generic config fallback, keep workspace-core as the recommended profile everywhere user-facing, and retain Claude Desktop/Cursor as secondary fallback examples.
Bump the package and catalog to 3.11.0, mark the roadmap milestone done, and add docs-alignment coverage that pins the new examples to the canonical workspace-core server command and the expected OpenCode config shape.
Validation:
- uv lock
- ./.venv/bin/pytest --no-cov tests/test_cli.py
- UV_CACHE_DIR=.uv-cache make check
- UV_CACHE_DIR=.uv-cache make dist-check
The 3.10.0 milestone was about making the advertised smoke pack trustworthy enough to act like a real release gate. The main drift was in the repro-plus-fix scenario: the recipe docs were SDK-first, but the smoke still shelled out to CLI patch apply and asserted a human summary string.\n\nSwitch the smoke runner to use the structured SDK patch flow directly, remove the harness-only CLI dependency, and tighten the fake smoke tests so they prove the same structured path the docs recommend. This keeps smoke failures tied to real user-facing regressions instead of human-output formatting drift.\n\nPromote make smoke-use-cases as the trustworthy guest-backed verification path in the top-level docs, bump the release surface to 3.10.0, and mark the roadmap milestone done.\n\nValidation:\n- uv lock\n- UV_CACHE_DIR=.uv-cache uv run pytest --no-cov tests/test_workspace_use_case_smokes.py\n- UV_CACHE_DIR=.uv-cache make check\n- UV_CACHE_DIR=.uv-cache make dist-check\n- USE_CASE_ENVIRONMENT=debian:12 UV_CACHE_DIR=.uv-cache make smoke-use-cases
make test was dominated by teardown-heavy workspace integration tests, not by coverage overhead. Service shutdown was treating zombie processes as live, which forced repeated timeout waits, and one shell test was leaving killpg monkeypatched during cleanup, which made shell close paths burn the full wait budget.\n\nTreat Linux zombie pids as stopped in the workspace manager so service teardown completes promptly. Restore the real killpg implementation before shell test cleanup so the shell close path no longer pays the artificial timeout. Also isolate sys.argv in the runtime-network-check main() test so parallel pytest flags do not leak into argparse-based tests.\n\nAdd pytest-xdist to the dev environment and run make test with pytest -n auto by default so available cores are used automatically during local iteration.\n\nValidation:\n- uv lock\n- targeted hot-spot pytest rerun after the fix dropped the worst tests from roughly 10-21s each to sub-second timings\n- UV_CACHE_DIR=.uv-cache make check\n- UV_CACHE_DIR=.uv-cache make dist-check
Capture the next UX pass after the workspace-core readiness review so the roadmap reflects the remaining friction a new chat-host user still feels.
Add milestones for trustworthy use-case smoke coverage, host-specific Claude/Codex/OpenCode MCP onramps, and the planned 4.0 default flip to workspace-core so the bare server entrypoint finally matches the recommended path.
This is a docs-only roadmap update based on the live use-case review and integration validation, with the full advanced surface kept as an explicit opt-in rather than the default.
Make the human workspace read commands easier to use in chat transcripts and shell pipelines by adding CLI-only --content-only to workspace file read and workspace disk read.
Keep JSON, SDK, and MCP behavior unchanged while fixing the default human rendering so content without a trailing newline is cleanly separated from the summary footer.
Update the 3.9.0 docs and roadmap status, and add CLI regression coverage plus a real guest-backed smoke for live and stopped-disk reads.
Make the recommended MCP profile visible from the first help and docs pass without changing 3.x behavior.
Rework help, top-level docs, public-contract wording, and shipped MCP/OpenAI examples so is the recommended first profile while stays the compatibility default for full-surface hosts.
Bump the package and catalog to 3.8.0, mark the roadmap milestone done, and add regression coverage for the new MCP help and docs alignment. Validation included uv lock, targeted profile/help tests, make check, make dist-check, and a real guest-backed server smoke.
Remove the remaining shell glue from the canonical CLI workspace flows so users can hand off IDs and host-authored text files directly.
Add --id-only on workspace create and shell open, plus --text-file and --patch-file for workspace file write and patch apply, while keeping the underlying SDK, MCP, and backend behavior unchanged.
Update the top walkthroughs, contract docs, roadmap status, and use-case smoke runner to use the new shortcuts, and verify the milestone with uv lock, make check, make dist-check, focused CLI tests, and a real guest-backed smoke for create, file write, patch apply, and shell open/read.
Extend the chat-ergonomics roadmap with the remaining UX work highlighted by the readiness review.
Document a second pass focused on removing shell glue from canonical CLI handoff flows, making the recommended chat-host profile more obvious without changing 3.x compatibility defaults, and polishing human-mode content reads for cleaner transcripts and copy-paste behavior.
Keep these milestones explicitly workspace-first and scoped to product UX, with CLI-only shortcuts allowed where the SDK and MCP surfaces already provide the structured behavior natively.
Turn the stable workspace surface into five documented, runnable stories with a shared guest-backed smoke runner, new docs/use-cases recipes, and Make targets for cold-start validation, repro/fix loops, parallel workspaces, untrusted inspection, and review/eval workflows.
Bump the package and catalog surface to 3.6.0, update the main docs to point users from the stable workspace walkthrough into the recipe index and smoke packs, and mark the 3.6.0 roadmap milestone done.
Fix a regression uncovered by the real parallel-workspaces smoke: workspace_file_read must not bump last_activity_at. Verified with uv lock, UV_CACHE_DIR=.uv-cache make check, UV_CACHE_DIR=.uv-cache make dist-check, and USE_CASE_ENVIRONMENT=debian:12 UV_CACHE_DIR=.uv-cache make smoke-use-cases.
Make workspace shell reads usable as direct chat-model input without changing the PTY or cursor model. This adds optional plain rendering and idle-window batching across CLI, SDK, and MCP while keeping raw reads backward-compatible.
Implement the rendering and wait-for-idle logic in the manager layer so the existing guest/backend shell transport stays unchanged. The new helper strips ANSI and other terminal control noise, handles carriage-return overwrite and backspace, and preserves raw cursor semantics even when plain output is requested.
Refresh the stable shell docs/examples to recommend --plain --wait-for-idle-ms 300, mark the 3.5.0 roadmap milestone done, and bump the package/catalog version to 3.5.0.
Validation: uv lock; UV_CACHE_DIR=.uv-cache make check; UV_CACHE_DIR=.uv-cache make dist-check; real guest-backed Firecracker smoke covering shell open/write/read with ANSI plus delayed output.
Expose stable MCP/server tool profiles so chat hosts can start narrow and widen only when needed. This adds vm-run, workspace-core, and workspace-full across the CLI serve path, Pyro.create_server(), and the package-level create_server() factory while keeping workspace-full as the default.
Register profile-specific tool sets from one shared contract mapping, and narrow the workspace-core schemas so secrets, network policy, shells, services, snapshots, and disk tools do not leak into the default persistent chat profile. The full surface remains available unchanged under workspace-full.
Refresh the public docs and examples around the profile progression, add a canonical OpenAI Responses workspace-core example, mark the 3.4.0 roadmap milestone done, and verify with uv lock, UV_CACHE_DIR=.uv-cache make check, UV_CACHE_DIR=.uv-cache make dist-check, and a real guest-backed workspace-core smoke for create, file write, exec, diff, export, reset, and delete.
Make concurrent workspaces easier to rediscover and resume without relying on opaque IDs alone.
Add optional workspace names, key/value labels, workspace list, and workspace update across the CLI, Python SDK, and MCP surface, and persist last_activity_at so list ordering reflects real mutating activity.
Update the stable contract, install/first-run docs, roadmap, and Python workspace example to teach the new discovery flow, and validate it with focused manager/CLI/API/server coverage plus uv lock, make check, make dist-check, and a real multi-workspace smoke for create, list, update, exec, reorder, and delete.
Remove shell-escaped file mutation from the stable workspace flow by adding explicit file and patch tools across the CLI, SDK, and MCP surfaces.
This adds workspace file list/read/write plus unified text patch application, backed by new guest and manager file primitives that stay scoped to started workspaces and /workspace only. Patch application is preflighted on the host, file writes stay text-only and bounded, and the existing diff/export/reset semantics remain intact.
The milestone also updates the 3.2.0 roadmap, public contract, docs, examples, and versioning, and includes focused coverage for the new helper module and dispatch paths.
Validation:
- uv lock
- UV_CACHE_DIR=.uv-cache make check
- UV_CACHE_DIR=.uv-cache make dist-check
- real guest-backed smoke for workspace file read, patch apply, exec, export, and delete
Document the post-3.1 milestones needed to make the stable workspace product feel natural in chat-driven LLM interfaces.
Add a follow-on roadmap for model-native file ops, workspace naming and discovery, tool profiles, shell output cleanup, and use-case recipes with smoke coverage. Link it from the README, vision doc, and completed workspace GA roadmap so the next phase is explicit.
Keep the sequence anchored to the workspace-first vision and continue to treat disk tools as secondary rather than the main chat-facing surface.
Finish the 3.1.0 secondary disk-tools milestone so stable workspaces can be
stopped, inspected offline, exported as raw ext4 images, and started again
without changing the primary workspace-first interaction model.
Add workspace stop/start plus workspace disk export/list/read across the CLI,
SDK, and MCP, backed by a new offline debugfs inspection helper and guest-only
validation. Scrub runtime-only guest state before disk inspection/export, and
fix the real guest reliability gaps by flushing the filesystem on stop and
removing stale Firecracker socket files before restart.
Update the docs, examples, changelog, and roadmap to mark 3.1.0 done, and
cover the new lifecycle/disk paths with API, CLI, manager, contract, and
package-surface tests.
Validation: uv lock; UV_CACHE_DIR=.uv-cache make check; UV_CACHE_DIR=.uv-cache
make dist-check; real guest-backed smoke for create, shell/service activity,
stop, workspace disk list/read/export, start, exec, and delete.
Freeze the current workspace-first surface as the stable 3.0 contract and reposition the
landing docs, CLI help, and public contract around the stable workspace path after the
one-shot proof.
Bump the package and catalog compatibility to 3.0.0, add a dedicated workspace walkthrough
tape/GIF, and mark the 3.0.0 roadmap milestone done while keeping runtime capability
unchanged in this release.
Validation: uv lock; UV_CACHE_DIR=.uv-cache make check; UV_CACHE_DIR=.uv-cache make dist-check;
UV_CACHE_DIR=.uv-cache uv build; UV_CACHE_DIR=.uv-cache uvx --from twine twine check dist/*;
built-wheel CLI smoke for pyro --help and pyro workspace --help; vhs validate plus rendered
workspace-first-run.gif outside the sandbox because vhs crashes when sandboxed.
Replace the workspace-level boolean network toggle with explicit network policies and attach localhost TCP publication to workspace services.
Persist network_policy in workspace records, validate --publish requests, and run host-side proxy helpers that follow the service lifecycle so published ports are cleaned up on failure, stop, reset, and delete.
Update the CLI, SDK, MCP contract, docs, roadmap, and examples for the new policy model, add coverage for the proxy and manager edge cases, and validate with uv lock, UV_CACHE_DIR=.uv-cache make check, UV_CACHE_DIR=.uv-cache make dist-check, and a real guest-backed published-port probe smoke.
Add explicit workspace secrets across the CLI, SDK, and MCP, with create-time secret definitions and per-call secret-to-env mapping for exec, shell open, and service start. Persist only safe secret metadata in workspace records, materialize secret files under /run/pyro-secrets, and redact secret values from exec output, shell reads, service logs, and surfaced errors.
Fix the remaining real-guest shell gap by shipping bundled guest init alongside the guest agent and patching both into guest-backed workspace rootfs images before boot. The new init mounts devpts so PTY shells work on Firecracker guests, while reset continues to recreate the sandbox and re-materialize secrets from stored task-local secret material.
Validation: uv lock; UV_CACHE_DIR=.uv-cache make check; UV_CACHE_DIR=.uv-cache make dist-check; and a real guest-backed Firecracker smoke covering workspace create with secrets, secret-backed exec, shell, service, reset, and delete.
Implement the 2.8.0 workspace milestone with named snapshots and full-sandbox reset across the CLI, Python SDK, and MCP server.
Persist the immutable baseline plus named snapshot archives under each workspace, add workspace reset metadata, and make reset recreate the sandbox while clearing command history, shells, and services without changing the workspace identity or diff baseline.
Refresh the 2.8.0 docs, roadmap, and Python example around reset-over-repair, then validate with uv lock, UV_CACHE_DIR=.uv-cache make check, UV_CACHE_DIR=.uv-cache make dist-check, and a real guest-backed create/snapshot/reset/diff smoke test outside the sandbox.
Make persistent workspaces capable of running long-lived background processes instead of forcing everything through one-shot exec calls.
Add workspace service start/list/status/logs/stop across the CLI, Python SDK, and MCP server, with multiple named services per workspace, typed readiness probes (file, tcp, http, and command), and aggregate service counts on workspace status. Keep service state and logs outside /workspace so diff and export semantics stay workspace-scoped, and extend the guest agent plus backends to persist service records and logs across separate calls.
Update the 2.7.0 docs, examples, changelog, and roadmap milestone to reflect the shipped surface.
Validation: uv lock; UV_CACHE_DIR=.uv-cache make check; UV_CACHE_DIR=.uv-cache make dist-check; real guest-backed Firecracker smoke for workspace create, two service starts, list/status/logs, diff unaffected, stop, and delete.
Complete the 2.6.0 workspace milestone by adding explicit host-out export and immutable-baseline diff across the CLI, Python SDK, and MCP server.
Capture a baseline archive at workspace creation, export live /workspace paths through the guest agent, and compute structured whole-workspace diffs on the host without affecting command logs or shell state. The docs, roadmap, bundled guest agent, and workspace example now reflect the new create -> sync -> diff -> export workflow.
Validation: uv lock, UV_CACHE_DIR=.uv-cache make check, UV_CACHE_DIR=.uv-cache make dist-check, and a real guest-backed Firecracker smoke covering workspace create, sync push, diff, export, and delete.
Let agents inhabit a workspace across separate calls instead of only submitting one-shot execs.
Add workspace shell open/read/write/signal/close across the CLI, Python SDK, and MCP server, with persisted shell records, a local PTY-backed mock implementation, and guest-agent support for real Firecracker workspaces.
Mark the 2.5.0 roadmap milestone done, refresh docs/examples and the release metadata, and verify with uv lock, UV_CACHE_DIR=.uv-cache make check, and UV_CACHE_DIR=.uv-cache make dist-check.
Rewrite the user-facing persistent sandbox story around pyro workspace ..., including the install guide, first-run transcript, integrations notes, and public contract reference.
Rename the Python example to examples/python_workspace.py and update the docs to use the new workspace create, sync, exec, status, logs, and delete flows with seed_path/workspace_id terminology.
Mark the 2.4.0 workspace-contract pivot as done in the roadmap now that the shipped CLI, SDK, MCP, docs, and tests all use the workspace-first surface.
Replace the public persistent-sandbox contract with workspace-first naming across CLI, SDK, MCP, payloads, and on-disk state.
Rename the task surface to workspace equivalents, switch create-time seeding to `seed_path`, and store records under `workspaces/<workspace_id>/workspace.json` without carrying legacy task aliases or migrating old local task state.
Keep `pyro run` and `vm_*` unchanged. Validation covered `uv lock`, focused public-contract/API/CLI/manager tests, `UV_CACHE_DIR=.uv-cache make check`, and `UV_CACHE_DIR=.uv-cache make dist-check`.
Break the updated workspace vision into a checked-in roadmap from 2.4.0 through 3.1.0 so later implementation can be driven milestone by milestone.
Link the roadmap from the vision doc and keep each release slice scoped to one product capability, from the workspace contract pivot through shells, export/diff, services, snapshots, secrets, networking, and GA promotion.
This is a docs-only planning scaffold; runtime behavior stays unchanged in this commit.
Clarify that pyro should evolve into a disposable workspace for agents instead of drifting into a secure CI or task-runner identity.
Add a dedicated vision doc that captures the product thesis, anti-goals, naming guidance, and the future interaction model around workspaces, shells, services, snapshots, and reset. Link that doc from the README landing path and persistent task section so the distinction is visible to new users.
Keep the proposed workspace and shell primitives explicitly illustrative so the vision sharpens direction without silently changing the current public contract.
Tasks could start from host content in 2.2.0, but there was still no post-create path to update a live workspace from the host. This change adds the next host-to-task step so repeated fix or review loops do not require recreating the task for every local change.
Add task sync push across the CLI, Python SDK, and MCP server, reusing the existing safe archive import path from seeded task creation instead of introducing a second transfer stack. The implementation keeps sync separate from workspace_seed metadata, validates destinations under /workspace, and documents the current non-atomic recovery path as delete-and-recreate.
Validation:
- uv lock
- UV_CACHE_DIR=.uv-cache uv run pytest --no-cov tests/test_cli.py tests/test_vm_manager.py tests/test_api.py tests/test_server.py tests/test_public_contract.py
- UV_CACHE_DIR=.uv-cache make check
- UV_CACHE_DIR=.uv-cache make dist-check
- real guest-backed smoke: task create --source-path, task sync push, task exec to verify both files, task delete
Current persistent tasks started with an empty workspace, which blocked the first useful host-to-task workflow in the task roadmap. This change lets task creation start from a host directory or tar archive without changing the one-shot VM surfaces.
Expose source_path on task create across the CLI, SDK, and MCP, add safe archive upload and extraction support for guest and host-compat backends, persist workspace_seed metadata, and patch the per-task rootfs with the bundled guest agent before boot so seeded guest tasks work without republishing environments. Also switch post--- command reconstruction to shlex.join() so documented sh -lc task examples preserve argument boundaries.
Validation:
- uv lock
- UV_CACHE_DIR=.uv-cache uv run pytest --no-cov tests/test_vm_guest.py tests/test_vm_manager.py tests/test_cli.py tests/test_api.py tests/test_server.py tests/test_public_contract.py
- UV_CACHE_DIR=.uv-cache make check
- UV_CACHE_DIR=.uv-cache make dist-check
- real guest-backed smoke: task create --source-path, task exec -- cat note.txt, task delete
Start the first workspace milestone toward the task-oriented product without changing the existing one-shot vm_run/pyro run contract.
Add a disk-backed task registry in the manager, auto-started task workspaces rooted at /workspace, repeated non-cleaning exec, and persisted command journals exposed through task create/exec/status/logs/delete across the CLI, Python SDK, and MCP server.
Update the public contract, docs, examples, and version/catalog metadata for 2.1.0, and cover the new surface with manager, CLI, SDK, and MCP tests. Validation: UV_CACHE_DIR=.uv-cache make check and UV_CACHE_DIR=.uv-cache make dist-check.
Fix the default one-shot install path so empty bundled profile directories no longer win over OCI-backed environment pulls or leave broken cached symlinks behind.
Treat cached installs as valid only when the manifest and boot artifacts are all present, repair invalid installs on the next pull, and add human-mode phase markers for env pull and run without changing JSON output.
Align the Python lifecycle example and public docs with the current exec_vm/vm_exec auto-clean semantics, and validate the slice with focused pytest coverage, make check, make dist-check, and a real default-path pull/inspect/run smoke.