Turn the stable workspace surface into five documented, runnable stories with a shared guest-backed smoke runner, new docs/use-cases recipes, and Make targets for cold-start validation, repro/fix loops, parallel workspaces, untrusted inspection, and review/eval workflows. Bump the package and catalog surface to 3.6.0, update the main docs to point users from the stable workspace walkthrough into the recipe index and smoke packs, and mark the 3.6.0 roadmap milestone done. Fix a regression uncovered by the real parallel-workspaces smoke: workspace_file_read must not bump last_activity_at. Verified with uv lock, UV_CACHE_DIR=.uv-cache make check, UV_CACHE_DIR=.uv-cache make dist-check, and USE_CASE_ENVIRONMENT=debian:12 UV_CACHE_DIR=.uv-cache make smoke-use-cases.
# LLM Chat Ergonomics Roadmap

This roadmap picks up after the completed workspace GA plan and focuses on one
goal:

make the core agent-workspace use cases feel trivial from a chat-driven LLM
interface.

Current baseline is `3.6.0`:

- the stable workspace contract exists across CLI, SDK, and MCP
- one-shot `pyro run` still exists as the narrow entrypoint
- workspaces already support seeding, sync push, exec, export, diff, snapshots,
  reset, services, PTY shells, secrets, network policy, and published ports
- stopped-workspace disk tools now exist, but remain explicitly secondary
## What "Trivial In Chat" Means

The roadmap is done only when a chat-driven LLM can cover the main use cases
without awkward shell choreography or hidden host-side glue:

- cold-start repo validation
- repro plus fix loops
- parallel isolated workspaces for multiple issues or PRs
- unsafe or untrusted code inspection
- review and evaluation workflows

More concretely, the model should not need to:

- patch files through shell-escaped `printf` or heredoc tricks
- rely on opaque workspace IDs without a discovery surface
- consume raw terminal control sequences as normal shell output
- choose from an unnecessarily large tool surface when a smaller profile would
  work
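To make the terminal-control-sequence point concrete, here is a minimal Python sketch of what turning raw PTY output into chat-readable text can involve. It is an illustration only, not the project's implementation: the function name `to_plain_text` is hypothetical, and treating a carriage return as "keep only the text after the last `\r` on each line" is a deliberate simplification of real terminal behavior.

```python
import re

# Matches common terminal escape sequences:
# - CSI sequences such as colors and cursor movement (ESC [ ... final byte)
# - OSC sequences such as window titles (ESC ] ... terminated by BEL or ESC \)
# - remaining two-character escapes (ESC followed by one byte in @.._, except "[")
_ANSI_RE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"
    r"|\x1b[@-Z\\-_]"
)


def to_plain_text(raw: str) -> str:
    """Hypothetical helper: strip escape sequences and collapse \\r overwrites."""
    text = _ANSI_RE.sub("", raw)
    text = text.replace("\r\n", "\n")
    # Simplification: a bare carriage return restarts the line, so keep only
    # what was written after the last one (typical for progress indicators).
    lines = [line.rsplit("\r", 1)[-1] for line in text.split("\n")]
    return "\n".join(lines)


if __name__ == "__main__":
    raw = "\x1b[32mPASSED\x1b[0m tests\r\n\x1b]0;title\x07progress 10%\rprogress 100%\n"
    print(to_plain_text(raw))
```

A chat model reading the cleaned text sees two short lines instead of color codes, a window-title sequence, and an overwritten progress line.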
## Locked Decisions

- keep the workspace product identity central; do not drift toward CI, queue,
  or runner abstractions
- keep disk tools secondary and do not make them the main chat-facing surface
- prefer narrow tool profiles and structured outputs over more raw shell calls
- every milestone below must update CLI, SDK, and MCP together
- every milestone below must also update docs, help text, runnable examples,
  and at least one real smoke scenario
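The preference for narrow tool profiles can be sketched as a "start narrow, widen only when needed" lookup. The profile names `vm-run`, `workspace-core`, and `workspace-full` come from this roadmap; everything else below is an assumption made for illustration, including the tool names placed in each profile, the idea that each profile is a superset of the previous one, and the helper `narrowest_profile`.

```python
# Profile names are taken from the roadmap. The tool sets are hypothetical
# placeholders, NOT the real catalog, and the strict nesting is an assumption.
PROFILES = {
    "vm-run": frozenset({"vm_run"}),
    "workspace-core": frozenset(
        {"vm_run", "workspace_create", "workspace_exec", "workspace_file_read"}
    ),
    "workspace-full": frozenset(
        {
            "vm_run",
            "workspace_create",
            "workspace_exec",
            "workspace_file_read",
            "workspace_snapshot",
            "workspace_service_start",
        }
    ),
}

# Ordered from narrowest to widest.
WIDENING_ORDER = ["vm-run", "workspace-core", "workspace-full"]


def narrowest_profile(needed_tools):
    """Return the first (narrowest) profile that exposes every needed tool."""
    needed = set(needed_tools)
    for name in WIDENING_ORDER:
        if needed <= PROFILES[name]:
            return name
    raise ValueError(f"no profile covers: {sorted(needed)}")


if __name__ == "__main__":
    print(narrowest_profile(["vm_run"]))
    print(narrowest_profile(["workspace_exec", "workspace_file_read"]))
```

The point of the sketch is the selection rule, not the tool lists: a chat host exposes the smallest profile that covers the task and moves to a wider one only when a needed tool is missing.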
## Milestones

1. [`3.2.0` Model-Native Workspace File Ops](llm-chat-ergonomics/3.2.0-model-native-workspace-file-ops.md) - Done
2. [`3.3.0` Workspace Naming And Discovery](llm-chat-ergonomics/3.3.0-workspace-naming-and-discovery.md) - Done
3. [`3.4.0` Tool Profiles And Canonical Chat Flows](llm-chat-ergonomics/3.4.0-tool-profiles-and-canonical-chat-flows.md) - Done
4. [`3.5.0` Chat-Friendly Shell Output](llm-chat-ergonomics/3.5.0-chat-friendly-shell-output.md) - Done
5. [`3.6.0` Use-Case Recipes And Smoke Packs](llm-chat-ergonomics/3.6.0-use-case-recipes-and-smoke-packs.md) - Done

Completed so far:

- `3.2.0` added model-native `workspace file *` and `workspace patch apply` so chat-driven agents
  can inspect and edit `/workspace` without shell-escaped file mutation flows.
- `3.3.0` added workspace names, key/value labels, `workspace list`, `workspace update`, and
  `last_activity_at` tracking so humans and chat-driven agents can rediscover and resume the right
  workspace without external notes.
- `3.4.0` added stable MCP/server tool profiles with `vm-run`, `workspace-core`, and
  `workspace-full`, plus canonical profile-based OpenAI and MCP examples so chat hosts can start
  narrow and widen only when needed.
- `3.5.0` added chat-friendly shell reads with plain-text rendering and idle batching so PTY
  sessions are readable enough to feed directly back into a chat model.
- `3.6.0` added recipe docs and real guest-backed smoke packs for the five core workspace use
  cases so the stable product is now demonstrated as repeatable end-to-end stories instead of
  only isolated feature surfaces.
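As an illustration of the `3.3.0` discovery idea, the following sketch shows how names, key/value labels, and `last_activity_at` could let an agent find the right workspace again without external notes. The `WorkspaceRecord` type, its field layout, and `find_workspaces` are hypothetical and do not describe the actual SDK or the output shape of `workspace list`.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class WorkspaceRecord:
    """Hypothetical record shape; only the field concepts come from the roadmap."""

    workspace_id: str
    name: str
    last_activity_at: datetime
    labels: dict[str, str] = field(default_factory=dict)


def find_workspaces(records, **wanted_labels):
    """Return records carrying all wanted labels, most recently active first."""
    matches = [
        record
        for record in records
        if all(record.labels.get(key) == value for key, value in wanted_labels.items())
    ]
    return sorted(matches, key=lambda record: record.last_activity_at, reverse=True)


if __name__ == "__main__":
    records = [
        WorkspaceRecord("ws-1", "fix-login", datetime(2025, 1, 1, tzinfo=timezone.utc), {"issue": "101"}),
        WorkspaceRecord("ws-2", "fix-login-retry", datetime(2025, 1, 3, tzinfo=timezone.utc), {"issue": "101"}),
        WorkspaceRecord("ws-3", "review-pr", datetime(2025, 1, 2, tzinfo=timezone.utc), {"pr": "7"}),
    ]
    for record in find_workspaces(records, issue="101"):
        print(record.name, record.last_activity_at.isoformat())
```

Sorting by `last_activity_at` is only useful if the timestamp reflects real work, which is one reason a pure read should not update it.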
## Expected Outcome

After this roadmap, the product should still look like an agent workspace, not
like a CI runner with more isolation.

The intended model-facing shape is:

- one-shot work starts with `vm_run`
- persistent work moves to a small workspace-first contract
- file edits are structured and model-native
- workspace discovery is human and model-friendly
- shells are readable in chat
- the five core use cases are documented and smoke-tested end to end