Capture the next UX pass after the workspace-core readiness review so the roadmap reflects the remaining friction a new chat-host user still feels. Add milestones for trustworthy use-case smoke coverage, host-specific Claude/Codex/OpenCode MCP onramps, and the planned 4.0 default flip to workspace-core so the bare server entrypoint finally matches the recommended path. This is a docs-only roadmap update based on the live use-case review and integration validation, with the full advanced surface kept as an explicit opt-in rather than the default.
LLM Chat Ergonomics Roadmap
This roadmap picks up after the completed workspace GA plan and focuses on one goal:
make the core agent-workspace use cases feel trivial from a chat-driven LLM interface.
Current baseline is 3.9.0:
- the stable workspace contract exists across CLI, SDK, and MCP
- one-shot `pyro run` still exists as the narrow entrypoint
- workspaces already support seeding, sync push, exec, export, diff, snapshots, reset, services, PTY shells, secrets, network policy, and published ports
- stopped-workspace disk tools now exist, but remain explicitly secondary
What "Trivial In Chat" Means
The roadmap is done only when a chat-driven LLM can cover the main use cases without awkward shell choreography or hidden host-side glue:
- cold-start repo validation
- repro plus fix loops
- parallel isolated workspaces for multiple issues or PRs
- unsafe or untrusted code inspection
- review and evaluation workflows
More concretely, the model should not need to:
- patch files through shell-escaped `printf` or heredoc tricks
- rely on opaque workspace IDs without a discovery surface
- consume raw terminal control sequences as normal shell output
- choose from an unnecessarily large tool surface when a smaller profile would work
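To make the first of those bullets concrete, here is a minimal Python sketch of the difference between the fragile shell-escaped path and a structured file tool call. The `workspace_file_write` tool name and argument shape are illustrative assumptions, not the product's actual schema:

```python
import json
import shlex

content = 'echo "hi $USER"\nline two\n'  # payload with quotes, $vars, newlines

# Fragile path: the model shell-escapes the payload and hopes the guest
# shell reconstructs it byte-for-byte.
shell_cmd = "printf %s " + shlex.quote(content) + " > /workspace/notes.txt"

# Structured path: a model-native file tool receives the content verbatim
# as a JSON argument, with no shell quoting layer in between.
# NOTE: tool name and fields are hypothetical, for illustration only.
tool_call = {
    "tool": "workspace_file_write",
    "arguments": {"path": "/workspace/notes.txt", "content": content},
}

# The structured payload survives serialization unchanged.
assert json.loads(json.dumps(tool_call))["arguments"]["content"] == content
```

The structured path is what removes an entire class of quoting bugs from chat transcripts: the model never has to reason about how the guest shell will re-parse its own output.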
The remaining UX friction for a technically strong new user is now narrower:
- the recommended chat-host onramp is now explicit, but human-mode file reads still need final transcript polish for copy-paste and chat logs
- the five use-case smokes now exist, but the advertised smoke pack is only as trustworthy as its weakest scenario and its fidelity to the exact documented recipes
- generic MCP guidance is strong, but Codex and OpenCode still ask the user to translate the generic config into host-specific setup steps
- `workspace-core` is clearly the recommended profile, but `pyro mcp serve` and `create_server()` still default to `workspace-full` for 3.x compatibility
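The host-specific translation called out above mostly amounts to wrapping the same stdio server command in each host's config format. As a sketch, a Claude Desktop-style entry might look like the following; the `mcpServers` key is the host's convention, but the `--profile` flag and exact invocation are assumptions pending the known-good examples planned for 3.11.0:

```json
{
  "mcpServers": {
    "pyro": {
      "command": "pyro",
      "args": ["mcp", "serve", "--profile", "workspace-core"]
    }
  }
}
```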
Locked Decisions
- keep the workspace product identity central; do not drift toward CI, queue, or runner abstractions
- keep disk tools secondary and do not make them the main chat-facing surface
- prefer narrow tool profiles and structured outputs over more raw shell calls
- capability milestones should update CLI, SDK, and MCP together
- CLI-only ergonomics are allowed when the SDK and MCP surfaces already have the structured behavior natively
- every milestone below must also update docs, help text, runnable examples, and at least one real smoke scenario
Milestones
- 3.2.0 Model-Native Workspace File Ops - Done
- 3.3.0 Workspace Naming And Discovery - Done
- 3.4.0 Tool Profiles And Canonical Chat Flows - Done
- 3.5.0 Chat-Friendly Shell Output - Done
- 3.6.0 Use-Case Recipes And Smoke Packs - Done
- 3.7.0 Handoff Shortcuts And File Input Sources - Done
- 3.8.0 Chat-Host Onramp And Recommended Defaults - Done
- 3.9.0 Content-Only Reads And Human Output Polish - Done
- 3.10.0 Use-Case Smoke Trust And Recipe Fidelity
- 3.11.0 Host-Specific MCP Onramps
- 4.0.0 Workspace-Core Default Profile
Completed so far:
- 3.2.0 added model-native `workspace file *` and `workspace patch apply` so chat-driven agents can inspect and edit `/workspace` without shell-escaped file mutation flows.
- 3.3.0 added workspace names, key/value labels, `workspace list`, `workspace update`, and `last_activity_at` tracking so humans and chat-driven agents can rediscover and resume the right workspace without external notes.
- 3.4.0 added stable MCP/server tool profiles with `vm-run`, `workspace-core`, and `workspace-full`, plus canonical profile-based OpenAI and MCP examples so chat hosts can start narrow and widen only when needed.
- 3.5.0 added chat-friendly shell reads with plain-text rendering and idle batching so PTY sessions are readable enough to feed directly back into a chat model.
- 3.6.0 added recipe docs and real guest-backed smoke packs for the five core workspace use cases, so the stable product is now demonstrated as repeatable end-to-end stories instead of only isolated feature surfaces.
- 3.7.0 removed the remaining shell glue from canonical CLI workspace flows with `--id-only`, `--text-file`, and `--patch-file`, so the shortest handoff path no longer depends on `python -c` extraction or `$(cat ...)` expansion.
- 3.8.0 made `workspace-core` the obvious first MCP/chat-host profile from the first help and docs pass, while keeping `workspace-full` as the 3.x compatibility default.
- 3.9.0 added content-only workspace file and disk reads plus cleaner default human-mode transcript separation for files that do not end with a trailing newline.
Planned next:
- 3.10.0 makes the use-case recipe set fully trustworthy by requiring `make smoke-use-cases` to pass cleanly, aligning recipe docs with what the smoke harness actually proves, and removing brittle assertions against human-mode output when structured results are already available.
- 3.11.0 adds exact host-specific onramps for Claude, Codex, and OpenCode so a new chat-host user can copy one known-good config or command instead of translating the generic MCP example by hand.
- 4.0.0 flips the default MCP profile from `workspace-full` to `workspace-core` so the no-flag server entrypoint finally matches the recommended docs path, while keeping explicit opt-in access to the full advanced surface.
Expected Outcome
After this roadmap, the product should still look like an agent workspace, not like a CI runner with more isolation.
The intended model-facing shape is:
- one-shot work starts with `vm_run`
- persistent work moves to a small workspace-first contract
- file edits are structured and model-native
- workspace discovery is human and model-friendly
- shells are readable in chat
- CLI handoff paths do not depend on ad hoc shell parsing
- the recommended chat-host profile is obvious from the first MCP example
- the documented smoke pack is trustworthy enough to use as a release gate
- major chat hosts have copy-pasteable MCP setup examples instead of only a generic config template
- human-mode content reads are copy-paste safe
- the default bare MCP server entrypoint matches the recommended narrow profile
- the five core use cases are documented and smoke-tested end to end