pyro-mcp/docs/roadmap/llm-chat-ergonomics.md
2026-03-14 11:18:48 -03:00

LLM Chat Ergonomics Roadmap

This roadmap picks up after the completed workspace GA plan and focuses on one goal:

make the core agent-workspace use cases feel trivial from a chat-driven LLM interface.

Current baseline is 4.5.0:

  • pyro mcp serve is now the default product entrypoint
  • workspace-core is now the default MCP profile
  • one-shot pyro run still exists as the terminal companion path
  • workspaces already support seeding, sync push, exec, export, diff, snapshots, reset, services, PTY shells, secrets, network policy, and published ports
  • host-specific onramps exist for Claude Code, Codex, and OpenCode
  • the five documented use cases are now recipe-backed and smoke-tested
  • stopped-workspace disk tools now exist, but remain explicitly secondary

What "Trivial In Chat" Means

The roadmap is done only when a chat-driven LLM can cover the main use cases without awkward shell choreography or hidden host-side glue:

  • cold-start repo validation
  • repro plus fix loops
  • parallel isolated workspaces for multiple issues or PRs
  • unsafe or untrusted code inspection
  • review and evaluation workflows

More concretely, the model should not need to:

  • patch files through shell-escaped printf or heredoc tricks
  • rely on opaque workspace IDs without a discovery surface
  • consume raw terminal control sequences as normal shell output
  • choose from an unnecessarily large tool surface when a smaller profile would work
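As an illustration of the terminal-output point above, here is a minimal sketch of what rendering PTY output as plain text could involve. It is not pyro's implementation: the helper name `render_plain_text` and the regular expression are assumptions made for this example, and the pattern covers only common CSI and OSC sequences.

```python
import re

# Matches CSI sequences (colors, cursor movement) and OSC sequences
# (for example window-title updates). Illustrative only, not exhaustive.
_ANSI_RE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"              # CSI: parameters, intermediates, final byte
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"   # OSC: terminated by BEL or ST
)


def render_plain_text(raw: str) -> str:
    """Hypothetical helper: strip terminal control sequences from PTY output."""
    text = _ANSI_RE.sub("", raw)
    text = text.replace("\r\n", "\n")
    lines = []
    for line in text.split("\n"):
        # Simplification: treat a bare carriage return as "redraw the line"
        # and keep only the last segment (typical for progress bars).
        lines.append(line.rsplit("\r", 1)[-1])
    return "\n".join(lines)
```

With this sketch, a colored line such as `"\x1b[31mFAIL\x1b[0m test_a\r\n"` comes back as `"FAIL test_a\n"`, which is the kind of text a chat model can consume directly.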

The next gaps for the narrowed persona are now about real-project credibility:

  • current-checkout startup is still brittle for messy local repos with unreadable, generated, or permission-sensitive files
  • the guest-backed smoke pack is strong, but it still proves shaped scenarios better than arbitrary local-repo readiness
  • the chat-host path still does not let users choose the sandbox environment as a first-class part of host connection and server startup
  • the product should not claim full whole-project development readiness until it qualifies a real-project loop beyond fixture-shaped use cases
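The first gap above can be made concrete with a small preflight check. The sketch below is an assumption about how such a check might look, not a description of existing pyro behavior; the function name `classify_seed_paths` and the category strings are hypothetical.

```python
import os
from pathlib import Path


def classify_seed_paths(root: Path, candidates: list[str]) -> dict[str, str]:
    """Hypothetical preflight: report candidate paths that cannot be seeded.

    Returns a mapping of relative path to a reason string. Paths that look
    fine are omitted from the result.
    """
    problems: dict[str, str] = {}
    for rel in candidates:
        full = os.path.join(root, rel)
        if not os.path.exists(full):
            # Also covers broken symlinks and paths under unsearchable directories.
            problems[rel] = "missing"
        elif not os.path.isfile(full):
            problems[rel] = "not-a-regular-file"
        elif not os.access(full, os.R_OK):
            problems[rel] = "unreadable"
    return problems
```

Reporting a reason per path, rather than failing on the first error, would let a chat host explain what was skipped and why.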

Locked Decisions

  • keep the workspace product identity central; do not drift toward CI, queue, or runner abstractions
  • keep disk tools secondary and do not make them the main chat-facing surface
  • prefer narrow tool profiles and structured outputs over more raw shell calls
  • optimize the MCP/chat-host path first and keep the CLI companion path good enough to validate and debug it
  • lower-level SDK and repo substrate work can continue, but it should not drive milestone scope or naming
  • CLI-only ergonomics are allowed when the SDK and MCP surfaces already have the structured behavior natively
  • prioritize repo-aware startup, trust, and daily-loop speed before adding more low-level workspace surface area
  • for repo-root auto-detection and --project-path inside a Git checkout, the default project source should become Git-tracked files only
  • --repo-url remains the clean-clone path when users do not want to trust the local checkout as the startup source
  • environment selection must become first-class in the chat-host path before the product claims whole-project development readiness
  • real-project readiness must be proven with guest-backed qualification smokes that cover ignored, generated, and unreadable-file cases
  • breaking changes are acceptable while there are still no users and the chat-host product is still being shaped
  • every milestone below must also update docs, help text, runnable examples, and at least one real smoke scenario
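The Git-tracked-files decision above could be prototyped with the standard `git ls-files` command. This is a minimal sketch under that assumption; the function name `git_tracked_files` is hypothetical, and the roadmap does not specify how pyro will actually enumerate tracked files.

```python
import subprocess
from pathlib import Path


def git_tracked_files(repo_root: Path) -> list[str]:
    """List paths tracked by Git, relative to repo_root (illustrative helper)."""
    result = subprocess.run(
        ["git", "-C", str(repo_root), "ls-files", "-z"],
        check=True,
        capture_output=True,
    )
    # -z separates entries with NUL bytes so unusual filenames survive intact.
    return [
        entry.decode("utf-8", "surrogateescape")
        for entry in result.stdout.split(b"\0")
        if entry
    ]
```

Because untracked and ignored files never appear in this listing, generated artifacts and permission-sensitive local files would stay out of the startup source by default.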

Milestones

  1. 3.2.0 Model-Native Workspace File Ops - Done
  2. 3.3.0 Workspace Naming And Discovery - Done
  3. 3.4.0 Tool Profiles And Canonical Chat Flows - Done
  4. 3.5.0 Chat-Friendly Shell Output - Done
  5. 3.6.0 Use-Case Recipes And Smoke Packs - Done
  6. 3.7.0 Handoff Shortcuts And File Input Sources - Done
  7. 3.8.0 Chat-Host Onramp And Recommended Defaults - Done
  8. 3.9.0 Content-Only Reads And Human Output Polish - Done
  9. 3.10.0 Use-Case Smoke Trust And Recipe Fidelity - Done
  10. 3.11.0 Host-Specific MCP Onramps - Done
  11. 4.0.0 Workspace-Core Default Profile - Done
  12. 4.1.0 Project-Aware Chat Startup - Done
  13. 4.2.0 Host Bootstrap And Repair - Done
  14. 4.3.0 Reviewable Agent Output - Done
  15. 4.4.0 Opinionated Use-Case Modes - Done
  16. 4.5.0 Faster Daily Loops - Done
  17. 4.6.0 Git-Tracked Project Sources - Planned
  18. 4.7.0 Project Source Diagnostics And Recovery - Planned
  19. 4.8.0 First-Class Chat Environment Selection - Planned
  20. 4.9.0 Real-Repo Qualification Smokes - Planned
  21. 5.0.0 Whole-Project Sandbox Development - Planned

Completed so far:

  • 3.2.0 added model-native workspace file * and workspace patch apply so chat-driven agents can inspect and edit /workspace without shell-escaped file mutation flows.
  • 3.3.0 added workspace names, key/value labels, workspace list, workspace update, and last_activity_at tracking so humans and chat-driven agents can rediscover and resume the right workspace without external notes.
  • 3.4.0 added stable MCP/server tool profiles with vm-run, workspace-core, and workspace-full, plus canonical profile-based OpenAI and MCP examples so chat hosts can start narrow and widen only when needed.
  • 3.5.0 added chat-friendly shell reads with plain-text rendering and idle batching so PTY sessions are readable enough to feed directly back into a chat model.
  • 3.6.0 added recipe docs and real guest-backed smoke packs for the five core workspace use cases so the stable product is now demonstrated as repeatable end-to-end stories instead of only isolated feature surfaces.
  • 3.7.0 removed the remaining shell glue from canonical CLI workspace flows with --id-only, --text-file, and --patch-file, so the shortest handoff path no longer depends on python -c extraction or $(cat ...) expansion.
  • 3.8.0 made workspace-core the obvious first MCP/chat-host profile from the first help and docs pass while keeping workspace-full as the 3.x compatibility default.
  • 3.9.0 added content-only workspace file and disk reads plus cleaner default human-mode transcript separation for files that do not end with a trailing newline.
  • 3.10.0 aligned the five guest-backed use-case smokes with their recipe docs and promoted make smoke-use-cases as the trustworthy verification path for the advertised workspace flows.
  • 3.11.0 added exact host-specific MCP onramps for Claude Code, Codex, and OpenCode so new chat-host users can copy one known-good setup example instead of translating the generic MCP config manually.
  • 4.0.0 flipped the default MCP/server profile to workspace-core, so the bare entrypoint now matches the recommended narrow chat-host profile across CLI, SDK, and package-level factories.
  • 4.1.0 made repo-root startup native for chat hosts, so bare pyro mcp serve can auto-detect the current Git checkout and let the first workspace_create omit seed_path, with explicit --project-path and --repo-url fallbacks when cwd is not the source of truth.
  • 4.2.0 added first-class host bootstrap and repair helpers so Claude Code, Codex, and OpenCode users can connect or repair the supported chat-host path without manually composing raw MCP commands or config edits.
  • 4.3.0 added a concise workspace review surface so users can inspect what the agent changed and ran since the last reset without reconstructing the session from several lower-level views by hand.
  • 4.4.0 added named use-case modes so chat hosts can start from repro-fix, inspect, cold-start, or review-eval instead of choosing from the full generic workspace surface first.
  • 4.5.0 added pyro prepare, daily-loop readiness in pyro doctor, and a real make smoke-daily-loop verification path so the local machine warmup story is explicit before the chat host connects.
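To illustrate the discovery idea from 3.3.0, the sketch below shows how names, key/value labels, and `last_activity_at` could combine to pick the workspace to resume. The `WorkspaceSummary` record, its `workspace_id` field, the timestamp representation, and `find_workspaces` are all assumptions for this example and do not reflect pyro's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class WorkspaceSummary:
    """Hypothetical record shape; field names are assumptions."""
    workspace_id: str
    name: str
    labels: dict[str, str] = field(default_factory=dict)
    last_activity_at: float = 0.0  # assumed here to be a Unix timestamp


def find_workspaces(workspaces, *, name=None, labels=None):
    """Filter by exact name and label matches, most recently active first."""
    wanted = labels or {}
    matches = [
        ws
        for ws in workspaces
        if (name is None or ws.name == name)
        and all(ws.labels.get(key) == value for key, value in wanted.items())
    ]
    return sorted(matches, key=lambda ws: ws.last_activity_at, reverse=True)
```

Sorting by recency means the first match is usually the workspace a user or model wants to resume, so an opaque ID never has to be remembered outside the tool.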

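Planned next:

  • 4.6.0 Git-Tracked Project Sources
  • 4.7.0 Project Source Diagnostics And Recovery
  • 4.8.0 First-Class Chat Environment Selection
  • 4.9.0 Real-Repo Qualification Smokes
  • 5.0.0 Whole-Project Sandbox Development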

Expected Outcome

After this roadmap, the product should still look like an agent workspace, not like a CI runner with more isolation.

The intended model-facing shape is:

  • one-shot work starts with vm_run
  • persistent work moves to a small workspace-first contract
  • file edits are structured and model-native
  • workspace discovery is human- and model-friendly
  • shells are readable in chat
  • CLI handoff paths do not depend on ad hoc shell parsing
  • the recommended chat-host profile is obvious from the first MCP example
  • the documented smoke pack is trustworthy enough to use as a release gate
  • major chat hosts have copy-pasteable MCP setup examples instead of only a generic config template
  • human-mode content reads are copy-paste safe
  • the default bare MCP server entrypoint matches the recommended narrow profile
  • the five core use cases are documented and smoke-tested end to end
  • starting from the current repo feels native from the first chat-host setup
  • supported hosts can be connected or repaired without manual config spelunking
  • users can review one concise summary of what the agent changed and ran
  • the main workflows feel like named modes instead of one giant reference
  • reset and retry loops are fast enough to encourage daily use
  • repo-root startup is robust even when the local checkout contains ignored, generated, or unreadable files
  • chat-host users can choose the sandbox environment as part of the normal connect/start path
  • the product has guest-backed qualification for real local repos, not only shaped fixture scenarios
  • it becomes credible to tell a user they can develop a real project inside sandboxes, not just evaluate or patch one