From dbb71a31746e39185df7069364bcebc36bb50021 Mon Sep 17 00:00:00 2001
From: Thales Maciel
Date: Thu, 12 Mar 2026 21:06:14 -0300
Subject: [PATCH] Add chat-first workspace roadmap

Document the post-3.1 milestones needed to make the stable workspace
product feel natural in chat-driven LLM interfaces.

Add a follow-on roadmap for model-native file ops, workspace naming and
discovery, tool profiles, shell output cleanup, and use-case recipes
with smoke coverage. Link it from the README, vision doc, and completed
workspace GA roadmap so the next phase is explicit.

Keep the sequence anchored to the workspace-first vision and continue
to treat disk tools as secondary rather than the main chat-facing
surface.
---
 README.md                                     |  3 +-
 docs/roadmap/llm-chat-ergonomics.md           | 66 +++++++++++++++++++
 .../3.2.0-model-native-workspace-file-ops.md  | 59 +++++++++++++++++
 .../3.3.0-workspace-naming-and-discovery.md   | 52 +++++++++++++++
 ...-tool-profiles-and-canonical-chat-flows.md | 51 ++++++++++++++
 .../3.5.0-chat-friendly-shell-output.md       | 46 +++++++++++++
 .../3.6.0-use-case-recipes-and-smoke-packs.md | 42 ++++++++++++
 docs/roadmap/task-workspace-ga.md             |  3 +-
 docs/vision.md                                |  8 ++-
 9 files changed, 326 insertions(+), 4 deletions(-)
 create mode 100644 docs/roadmap/llm-chat-ergonomics.md
 create mode 100644 docs/roadmap/llm-chat-ergonomics/3.2.0-model-native-workspace-file-ops.md
 create mode 100644 docs/roadmap/llm-chat-ergonomics/3.3.0-workspace-naming-and-discovery.md
 create mode 100644 docs/roadmap/llm-chat-ergonomics/3.4.0-tool-profiles-and-canonical-chat-flows.md
 create mode 100644 docs/roadmap/llm-chat-ergonomics/3.5.0-chat-friendly-shell-output.md
 create mode 100644 docs/roadmap/llm-chat-ergonomics/3.6.0-use-case-recipes-and-smoke-packs.md

diff --git a/README.md b/README.md
index 674dedd..9171bd6 100644
--- a/README.md
+++ b/README.md
@@ -16,7 +16,8 @@ It exposes the same runtime in three public forms:
 
 - Install: [docs/install.md](docs/install.md)
 - Vision: [docs/vision.md](docs/vision.md)
-- Workspace roadmap: [docs/roadmap/task-workspace-ga.md](docs/roadmap/task-workspace-ga.md)
+- Workspace GA roadmap: [docs/roadmap/task-workspace-ga.md](docs/roadmap/task-workspace-ga.md)
+- LLM chat roadmap: [docs/roadmap/llm-chat-ergonomics.md](docs/roadmap/llm-chat-ergonomics.md)
 - First run transcript: [docs/first-run.md](docs/first-run.md)
 - Stable workspace walkthrough GIF: [docs/assets/workspace-first-run.gif](docs/assets/workspace-first-run.gif)
 - Terminal walkthrough GIF: [docs/assets/first-run.gif](docs/assets/first-run.gif)
diff --git a/docs/roadmap/llm-chat-ergonomics.md b/docs/roadmap/llm-chat-ergonomics.md
new file mode 100644
index 0000000..07549f7
--- /dev/null
+++ b/docs/roadmap/llm-chat-ergonomics.md
@@ -0,0 +1,66 @@
+# LLM Chat Ergonomics Roadmap
+
+This roadmap picks up after the completed workspace GA plan and focuses on one
+goal:
+
+make the core agent-workspace use cases feel trivial from a chat-driven LLM
+interface.
+
+Current baseline is `3.1.0`:
+
+- the stable workspace contract exists across CLI, SDK, and MCP
+- one-shot `pyro run` still exists as the narrow entrypoint
+- workspaces already support seeding, sync push, exec, export, diff, snapshots,
+  reset, services, PTY shells, secrets, network policy, and published ports
+- stopped-workspace disk tools now exist, but remain explicitly secondary
+
+## What "Trivial In Chat" Means
+
+The roadmap is done only when a chat-driven LLM can cover the main use cases
+without awkward shell choreography or hidden host-side glue:
+
+- cold-start repo validation
+- repro plus fix loops
+- parallel isolated workspaces for multiple issues or PRs
+- unsafe or untrusted code inspection
+- review and evaluation workflows
+
+More concretely, the model should not need to:
+
+- patch files through shell-escaped `printf` or heredoc tricks
+- rely on opaque workspace IDs without a discovery surface
+- consume raw terminal control sequences as normal shell output
+- choose from an unnecessarily large tool surface when a smaller profile would
+  work
+
+## Locked Decisions
+
+- keep the workspace product identity central; do not drift toward CI, queue,
+  or runner abstractions
+- keep disk tools secondary and do not make them the main chat-facing surface
+- prefer narrow tool profiles and structured outputs over more raw shell calls
+- every milestone below must update CLI, SDK, and MCP together
+- every milestone below must also update docs, help text, runnable examples,
+  and at least one real smoke scenario
+
+## Milestones
+
+1. [`3.2.0` Model-Native Workspace File Ops](llm-chat-ergonomics/3.2.0-model-native-workspace-file-ops.md)
+2. [`3.3.0` Workspace Naming And Discovery](llm-chat-ergonomics/3.3.0-workspace-naming-and-discovery.md)
+3. [`3.4.0` Tool Profiles And Canonical Chat Flows](llm-chat-ergonomics/3.4.0-tool-profiles-and-canonical-chat-flows.md)
+4. [`3.5.0` Chat-Friendly Shell Output](llm-chat-ergonomics/3.5.0-chat-friendly-shell-output.md)
+5. [`3.6.0` Use-Case Recipes And Smoke Packs](llm-chat-ergonomics/3.6.0-use-case-recipes-and-smoke-packs.md)
+
+## Expected Outcome
+
+After this roadmap, the product should still look like an agent workspace, not
+like a CI runner with more isolation.
+
+The intended model-facing shape is:
+
+- one-shot work starts with `vm_run`
+- persistent work moves to a small workspace-first contract
+- file edits are structured and model-native
+- workspace discovery is human and model-friendly
+- shells are readable in chat
+- the five core use cases are documented and smoke-tested end to end
diff --git a/docs/roadmap/llm-chat-ergonomics/3.2.0-model-native-workspace-file-ops.md b/docs/roadmap/llm-chat-ergonomics/3.2.0-model-native-workspace-file-ops.md
new file mode 100644
index 0000000..056528b
--- /dev/null
+++ b/docs/roadmap/llm-chat-ergonomics/3.2.0-model-native-workspace-file-ops.md
@@ -0,0 +1,59 @@
+# `3.2.0` Model-Native Workspace File Ops
+
+Status: Planned
+
+## Goal
+
+Remove shell quoting and hidden host-temp-file choreography from normal
+chat-driven workspace editing loops.
+
+## Public API Changes
+
+Planned additions:
+
+- `pyro workspace file list WORKSPACE_ID [PATH] [--recursive]`
+- `pyro workspace file read WORKSPACE_ID PATH [--max-bytes N]`
+- `pyro workspace file write WORKSPACE_ID PATH --text TEXT`
+- `pyro workspace patch apply WORKSPACE_ID --patch TEXT`
+- matching Python SDK methods:
+  - `list_workspace_files`
+  - `read_workspace_file`
+  - `write_workspace_file`
+  - `apply_workspace_patch`
+- matching MCP tools:
+  - `workspace_file_list`
+  - `workspace_file_read`
+  - `workspace_file_write`
+  - `workspace_patch_apply`
+
+## Implementation Boundaries
+
+- scope all operations strictly under `/workspace`
+- keep these tools text-first and bounded in size
+- make patch application explicit and deterministic
+- keep `workspace export` as the host-out path for copying results back
+- keep shell and exec available for process-oriented work, not as the only way
+  to mutate files
+
+## Non-Goals
+
+- no arbitrary host filesystem access
+- no generic SFTP or file-manager product identity
+- no replacement of shell or exec for process lifecycle work
+- no hidden auto-merge behavior for conflicting patches
+
+## Acceptance Scenarios
+
+- an agent reads a file, applies a patch, reruns tests, and exports the result
+  without shell-escaped editing tricks
+- an agent inspects a repo tree and targeted files inside one workspace without
+  relying on host-side temp paths
+- a repro-plus-fix loop is practical from MCP alone, not only from a custom
+  host wrapper
+
+## Required Repo Updates
+
+- public contract updates across CLI, SDK, and MCP
+- docs and examples that show model-native file editing instead of shell-heavy
+  file writes
+- at least one real smoke scenario centered on a repro-plus-fix loop
diff --git a/docs/roadmap/llm-chat-ergonomics/3.3.0-workspace-naming-and-discovery.md b/docs/roadmap/llm-chat-ergonomics/3.3.0-workspace-naming-and-discovery.md
new file mode 100644
index 0000000..355e413
--- /dev/null
+++ b/docs/roadmap/llm-chat-ergonomics/3.3.0-workspace-naming-and-discovery.md
@@ -0,0 +1,52 @@
+# `3.3.0` Workspace Naming And Discovery
+
+Status: Planned
+
+## Goal
+
+Make multiple concurrent workspaces manageable from chat without forcing the
+user or model to juggle opaque IDs.
+
+## Public API Changes
+
+Planned additions:
+
+- `pyro workspace create ... --name NAME`
+- `pyro workspace create ... --label KEY=VALUE`
+- `pyro workspace list`
+- `pyro workspace update WORKSPACE_ID [--name NAME] [--label KEY=VALUE] [--clear-label KEY]`
+- matching Python SDK methods:
+  - `list_workspaces`
+  - `update_workspace`
+- matching MCP tools:
+  - `workspace_list`
+  - `workspace_update`
+
+## Implementation Boundaries
+
+- keep workspace IDs as the stable machine identifier
+- treat names and labels as operator-friendly metadata and discovery aids
+- surface last activity, expiry, service counts, and summary metadata in
+  `workspace list`
+- make name and label metadata visible in create, status, and list responses
+
+## Non-Goals
+
+- no scheduler or queue abstractions
+- no project-wide branch manager
+- no hidden background cleanup policy beyond the existing TTL model
+
+## Acceptance Scenarios
+
+- a user can keep separate workspaces for two issues or PRs and discover them
+  again without external notes
+- a chat agent can list active workspaces, choose the right one, and continue
+  work after a later prompt
+- review and evaluation flows can tag or name workspaces by repo, bug, or task
+  intent
+
+## Required Repo Updates
+
+- README and install docs that show parallel named workspaces
+- examples that demonstrate issue-oriented workspace naming
+- smoke coverage for at least one multi-workspace flow
diff --git a/docs/roadmap/llm-chat-ergonomics/3.4.0-tool-profiles-and-canonical-chat-flows.md b/docs/roadmap/llm-chat-ergonomics/3.4.0-tool-profiles-and-canonical-chat-flows.md
new file mode 100644
index 0000000..50a3a9c
--- /dev/null
+++ b/docs/roadmap/llm-chat-ergonomics/3.4.0-tool-profiles-and-canonical-chat-flows.md
@@ -0,0 +1,51 @@
+# `3.4.0` Tool Profiles And Canonical Chat Flows
+
+Status: Planned
+
+## Goal
+
+Make the model-facing surface intentionally small for chat hosts, while keeping
+the full workspace product available when needed.
+
+## Public API Changes
+
+Planned additions:
+
+- `pyro mcp serve --profile {vm-run,workspace-core,workspace-full}`
+- matching Python SDK and server factory configuration for the same profiles
+- one canonical OpenAI Responses example that uses the workspace-core profile
+- one canonical MCP/chat example that uses the same profile progression
+
+Representative profile intent:
+
+- `vm-run`: one-shot only
+- `workspace-core`: create, status, exec, file ops, diff, reset, export, delete
+- `workspace-full`: shells, services, snapshots, secrets, network policy, and
+  the rest of the stable workspace surface
+
+## Implementation Boundaries
+
+- keep the current full surface available for advanced users
+- add profiles as an exposure control, not as a second product line
+- make profile behavior explicit in docs and help text
+- keep profile names stable once shipped
+
+## Non-Goals
+
+- no framework-specific wrappers inside the core package
+- no server-side planner that chooses tools on the model's behalf
+- no hidden feature gating by provider or client
+
+## Acceptance Scenarios
+
+- a chat host can expose only `vm_run` for one-shot work
+- a chat host can promote the same agent to `workspace-core` without suddenly
+  dumping the full advanced surface on the model
+- a new integrator can copy one example and understand the intended progression
+  from one-shot to stable workspace
+
+## Required Repo Updates
+
+- integration docs that explain when to use each profile
+- canonical chat examples for both provider tool calling and MCP
+- smoke coverage for at least one profile-limited chat loop
diff --git a/docs/roadmap/llm-chat-ergonomics/3.5.0-chat-friendly-shell-output.md b/docs/roadmap/llm-chat-ergonomics/3.5.0-chat-friendly-shell-output.md
new file mode 100644
index 0000000..747454d
--- /dev/null
+++ b/docs/roadmap/llm-chat-ergonomics/3.5.0-chat-friendly-shell-output.md
@@ -0,0 +1,46 @@
+# `3.5.0` Chat-Friendly Shell Output
+
+Status: Planned
+
+## Goal
+
+Keep persistent PTY shells powerful, but make their output clean enough to feed
+directly back into a chat model.
+
+## Public API Changes
+
+Planned additions:
+
+- `pyro workspace shell read ... --plain`
+- `pyro workspace shell read ... --wait-for-idle-ms N`
+- matching Python SDK parameters:
+  - `plain=True`
+  - `wait_for_idle_ms=...`
+- matching MCP request fields on `shell_read`
+
+## Implementation Boundaries
+
+- keep raw PTY reads available for advanced clients
+- plain mode should strip terminal control sequences and normalize line endings
+- idle waiting should batch the next useful chunk of output without turning the
+  shell into a separate job scheduler
+- keep cursor-based reads so polling clients stay deterministic
+
+## Non-Goals
+
+- no replacement of the PTY shell with a fake line-based shell
+- no automatic command synthesis inside shell reads
+- no shell-only workflow that replaces `workspace exec`, services, or file ops
+
+## Acceptance Scenarios
+
+- a chat agent can open a shell, write a command, and read back plain text
+  output without ANSI noise
+- long-running interactive setup or debugging flows are readable in chat
+- shell output is useful as model input without extra client-side cleanup
+
+## Required Repo Updates
+
+- help text that makes raw versus plain shell reads explicit
+- examples that show a clean interactive shell loop
+- smoke coverage for at least one shell-driven debugging scenario
diff --git a/docs/roadmap/llm-chat-ergonomics/3.6.0-use-case-recipes-and-smoke-packs.md b/docs/roadmap/llm-chat-ergonomics/3.6.0-use-case-recipes-and-smoke-packs.md
new file mode 100644
index 0000000..a174528
--- /dev/null
+++ b/docs/roadmap/llm-chat-ergonomics/3.6.0-use-case-recipes-and-smoke-packs.md
@@ -0,0 +1,42 @@
+# `3.6.0` Use-Case Recipes And Smoke Packs
+
+Status: Planned
+
+## Goal
+
+Turn the five target workflows into first-class documented stories and runnable
+verification paths.
+
+## Public API Changes
+
+No new core API is required in this milestone.
+
+The main deliverable is packaging the now-mature workspace surface into clear
+recipes, examples, and smoke scenarios that prove the intended user experience.
+
+## Implementation Boundaries
+
+- build on the existing stable workspace contract and the earlier chat-first
+  milestones
+- keep the focus on user-facing flows, not internal test harness complexity
+- treat the recipes as product documentation, not private maintainer notes
+
+## Non-Goals
+
+- no new CI or scheduler abstractions
+- no speculative cloud orchestration work
+- no broad expansion of disk tooling as the main story
+
+## Acceptance Scenarios
+
+- cold-start repo validation has a documented and smoke-tested flow
+- repro-plus-fix loops have a documented and smoke-tested flow
+- parallel isolated workspaces have a documented and smoke-tested flow
+- unsafe or untrusted code inspection has a documented and smoke-tested flow
+- review and evaluation workflows have a documented and smoke-tested flow
+
+## Required Repo Updates
+
+- a dedicated doc or section for each target use case
+- at least one canonical example per use case in CLI, SDK, or MCP form
+- smoke scenarios that prove each flow on a real Firecracker-backed path
diff --git a/docs/roadmap/task-workspace-ga.md b/docs/roadmap/task-workspace-ga.md
index b5482b4..3324e22 100644
--- a/docs/roadmap/task-workspace-ga.md
+++ b/docs/roadmap/task-workspace-ga.md
@@ -46,4 +46,5 @@ The planned workspace roadmap is complete.
 - `3.1.0` added secondary stopped-workspace disk export and offline inspection helpers without changing the stable workspace-first core contract.
-- Future work, if any, is now outside the planned vision milestones tracked in this roadmap.
+- The next follow-on milestones now live in [llm-chat-ergonomics.md](llm-chat-ergonomics.md) and
+  focus on making the stable workspace product feel trivial from chat-driven LLM interfaces.
diff --git a/docs/vision.md b/docs/vision.md
index 41a6600..cdf852c 100644
--- a/docs/vision.md
+++ b/docs/vision.md
@@ -170,8 +170,12 @@ Features should be prioritized in this order:
 6. Explicit secrets and network policy
 7. Secondary disk-level import/export and inspection tools
 
-The implementation roadmap that turns those priorities into release-sized
-milestones lives in [roadmap/task-workspace-ga.md](roadmap/task-workspace-ga.md).
+The completed workspace GA roadmap lives in
+[roadmap/task-workspace-ga.md](roadmap/task-workspace-ga.md).
+
+The next implementation milestones that make those workflows feel natural from
+chat-driven LLM interfaces live in
+[roadmap/llm-chat-ergonomics.md](roadmap/llm-chat-ergonomics.md).
 
 ## Naming Guidance