LLM Chat Ergonomics Roadmap
This roadmap picks up after the completed workspace GA plan and focuses on one goal:
make the core agent-workspace use cases feel trivial from a chat-driven LLM interface.
Current baseline is 4.5.0:
- `pyro mcp serve` is now the default product entrypoint
- `workspace-core` is now the default MCP profile
- one-shot `pyro run` still exists as the terminal companion path
- workspaces already support seeding, sync push, exec, export, diff, snapshots, reset, services, PTY shells, secrets, network policy, and published ports
- host-specific onramps exist for Claude Code, Codex, and OpenCode
- the five documented use cases are now recipe-backed and smoke-tested
- stopped-workspace disk tools now exist, but remain explicitly secondary
What "Trivial In Chat" Means
The roadmap is done only when a chat-driven LLM can cover the main use cases without awkward shell choreography or hidden host-side glue:
- cold-start repo validation
- repro plus fix loops
- parallel isolated workspaces for multiple issues or PRs
- unsafe or untrusted code inspection
- review and evaluation workflows
More concretely, the model should not need to:
- patch files through shell-escaped `printf` or heredoc tricks
- rely on opaque workspace IDs without a discovery surface
- consume raw terminal control sequences as normal shell output
- choose from an unnecessarily large tool surface when a smaller profile would work
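To make the "raw terminal control sequences" point concrete, here is a minimal sketch of the kind of plain-text rendering a chat-friendly shell read implies. It is an illustration only, not the project's implementation: the `render_plain` name and the regular expression are assumptions, and the carriage-return handling is deliberately crude.

```python
import re

# Matches CSI sequences (colors, cursor movement) and OSC sequences
# (for example window-title updates) that interactive programs emit.
_ANSI_RE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"              # CSI: ESC [ params intermediates final
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"   # OSC: ESC ] ... terminated by BEL or ST
)


def render_plain(raw: str) -> str:
    """Return terminal output with control sequences removed (hypothetical helper)."""
    text = _ANSI_RE.sub("", raw)
    text = text.replace("\r\n", "\n")
    # Crude carriage-return handling: keep only the text after the last "\r"
    # on each line, which approximates progress bars that redraw in place.
    lines = [line.rsplit("\r", 1)[-1] for line in text.split("\n")]
    return "\n".join(lines)
```

With this sketch, colored test output and redrawn progress lines collapse to the text a model actually needs, instead of escape bytes mixed into the transcript.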
The next gaps for the narrowed persona are now about real-project credibility:
- current-checkout startup is still brittle for messy local repos with unreadable, generated, or permission-sensitive files
- the guest-backed smoke pack is strong, but it still proves shaped scenarios better than arbitrary local-repo readiness
- the chat-host path still does not let users choose the sandbox environment as a first-class part of host connection and server startup
- the product should not claim full whole-project development readiness until it qualifies a real-project loop beyond fixture-shaped use cases
Locked Decisions
- keep the workspace product identity central; do not drift toward CI, queue, or runner abstractions
- keep disk tools secondary and do not make them the main chat-facing surface
- prefer narrow tool profiles and structured outputs over more raw shell calls
- optimize the MCP/chat-host path first and keep the CLI companion path good enough to validate and debug it
- lower-level SDK and repo substrate work can continue, but they should not drive milestone scope or naming
- CLI-only ergonomics are allowed when the SDK and MCP surfaces already have the structured behavior natively
- prioritize repo-aware startup, trust, and daily-loop speed before adding more low-level workspace surface area
- for repo-root auto-detection and `--project-path` inside a Git checkout, the default project source should become Git-tracked files only
- `--repo-url` remains the clean-clone path when users do not want to trust the local checkout as the startup source
- environment selection must become first-class in the chat-host path before the product claims whole-project development readiness
- real-project readiness must be proven with guest-backed qualification smokes that cover ignored, generated, and unreadable-file cases
- breaking changes are acceptable while there are still no users and the chat-host product is still being shaped
- every milestone below must also update docs, help text, runnable examples, and at least one real smoke scenario
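The "Git-tracked files only" decision above can be sketched with the standard `git ls-files -z` command, which lists tracked paths separated by NUL bytes and so never touches ignored, generated, or untracked files. This is a rough illustration under assumptions, not the planned implementation: `parse_ls_files` and `tracked_files` are hypothetical helper names.

```python
import subprocess
from pathlib import Path


def parse_ls_files(raw: bytes) -> list[str]:
    """Split NUL-separated `git ls-files -z` output into relative paths."""
    return [
        entry.decode("utf-8", errors="surrogateescape")
        for entry in raw.split(b"\0")
        if entry  # the output ends with a trailing NUL, so skip empty pieces
    ]


def tracked_files(repo_root: Path) -> list[str]:
    """List the Git-tracked files of a checkout (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", str(repo_root), "ls-files", "-z"],
        check=True,
        capture_output=True,
    )
    return parse_ls_files(result.stdout)
```

Using the NUL-separated form avoids Git's path quoting, so file names containing spaces or newlines survive intact. A seeding step built on a list like this would only ever read files Git already tracks.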
Milestones
- `3.2.0` Model-Native Workspace File Ops - Done
- `3.3.0` Workspace Naming And Discovery - Done
- `3.4.0` Tool Profiles And Canonical Chat Flows - Done
- `3.5.0` Chat-Friendly Shell Output - Done
- `3.6.0` Use-Case Recipes And Smoke Packs - Done
- `3.7.0` Handoff Shortcuts And File Input Sources - Done
- `3.8.0` Chat-Host Onramp And Recommended Defaults - Done
- `3.9.0` Content-Only Reads And Human Output Polish - Done
- `3.10.0` Use-Case Smoke Trust And Recipe Fidelity - Done
- `3.11.0` Host-Specific MCP Onramps - Done
- `4.0.0` Workspace-Core Default Profile - Done
- `4.1.0` Project-Aware Chat Startup - Done
- `4.2.0` Host Bootstrap And Repair - Done
- `4.3.0` Reviewable Agent Output - Done
- `4.4.0` Opinionated Use-Case Modes - Done
- `4.5.0` Faster Daily Loops - Done
- `4.6.0` Git-Tracked Project Sources - Planned
- `4.7.0` Project Source Diagnostics And Recovery - Planned
- `4.8.0` First-Class Chat Environment Selection - Planned
- `4.9.0` Real-Repo Qualification Smokes - Planned
- `5.0.0` Whole-Project Sandbox Development - Planned
Completed so far:
- `3.2.0` added model-native `workspace file *` and `workspace patch apply` so chat-driven agents can inspect and edit `/workspace` without shell-escaped file mutation flows.
- `3.3.0` added workspace names, key/value labels, `workspace list`, `workspace update`, and `last_activity_at` tracking so humans and chat-driven agents can rediscover and resume the right workspace without external notes.
- `3.4.0` added stable MCP/server tool profiles with `vm-run`, `workspace-core`, and `workspace-full`, plus canonical profile-based OpenAI and MCP examples so chat hosts can start narrow and widen only when needed.
- `3.5.0` added chat-friendly shell reads with plain-text rendering and idle batching so PTY sessions are readable enough to feed directly back into a chat model.
- `3.6.0` added recipe docs and real guest-backed smoke packs for the five core workspace use cases, so the stable product is now demonstrated as repeatable end-to-end stories instead of only isolated feature surfaces.
- `3.7.0` removed the remaining shell glue from canonical CLI workspace flows with `--id-only`, `--text-file`, and `--patch-file`, so the shortest handoff path no longer depends on `python -c` extraction or `$(cat ...)` expansion.
- `3.8.0` made `workspace-core` the obvious first MCP/chat-host profile from the first help and docs pass while keeping `workspace-full` as the 3.x compatibility default.
- `3.9.0` added content-only workspace file and disk reads plus cleaner default human-mode transcript separation for files that do not end with a trailing newline.
- `3.10.0` aligned the five guest-backed use-case smokes with their recipe docs and promoted `make smoke-use-cases` as the trustworthy verification path for the advertised workspace flows.
- `3.11.0` added exact host-specific MCP onramps for Claude Code, Codex, and OpenCode so new chat-host users can copy one known-good setup example instead of translating the generic MCP config manually.
- `4.0.0` flipped the default MCP/server profile to `workspace-core`, so the bare entrypoint now matches the recommended narrow chat-host profile across CLI, SDK, and package-level factories.
- `4.1.0` made repo-root startup native for chat hosts, so bare `pyro mcp serve` can auto-detect the current Git checkout and let the first `workspace_create` omit `seed_path`, with explicit `--project-path` and `--repo-url` fallbacks when cwd is not the source of truth.
- `4.2.0` added first-class host bootstrap and repair helpers so Claude Code, Codex, and OpenCode users can connect or repair the supported chat-host path without manually composing raw MCP commands or config edits.
- `4.3.0` added a concise workspace review surface so users can inspect what the agent changed and ran since the last reset without reconstructing the session from several lower-level views by hand.
- `4.4.0` added named use-case modes so chat hosts can start from `repro-fix`, `inspect`, `cold-start`, or `review-eval` instead of choosing from the full generic workspace surface first.
- `4.5.0` added `pyro prepare`, daily-loop readiness in `pyro doctor`, and a real `make smoke-daily-loop` verification path so the local machine warmup story is explicit before the chat host connects.
Planned next:
- `4.6.0` Git-Tracked Project Sources
- `4.7.0` Project Source Diagnostics And Recovery
- `4.8.0` First-Class Chat Environment Selection
- `4.9.0` Real-Repo Qualification Smokes
- `5.0.0` Whole-Project Sandbox Development
Expected Outcome
After this roadmap, the product should still look like an agent workspace, not like a CI runner with more isolation.
The intended model-facing shape is:
- one-shot work starts with `vm_run`
- persistent work moves to a small workspace-first contract
- file edits are structured and model-native
- workspace discovery is human and model-friendly
- shells are readable in chat
- CLI handoff paths do not depend on ad hoc shell parsing
- the recommended chat-host profile is obvious from the first MCP example
- the documented smoke pack is trustworthy enough to use as a release gate
- major chat hosts have copy-pasteable MCP setup examples instead of only a generic config template
- human-mode content reads are copy-paste safe
- the default bare MCP server entrypoint matches the recommended narrow profile
- the five core use cases are documented and smoke-tested end to end
- starting from the current repo feels native from the first chat-host setup
- supported hosts can be connected or repaired without manual config spelunking
- users can review one concise summary of what the agent changed and ran
- the main workflows feel like named modes instead of one giant reference
- reset and retry loops are fast enough to encourage daily use
- repo-root startup is robust even when the local checkout contains ignored, generated, or unreadable files
- chat-host users can choose the sandbox environment as part of the normal connect/start path
- the product has guest-backed qualification for real local repos, not only shaped fixture scenarios
- it becomes credible to tell a user they can develop a real project inside sandboxes, not just evaluate or patch one