Add chat-first workspace roadmap
Document the post-3.1 milestones needed to make the stable workspace product feel natural in chat-driven LLM interfaces. Add a follow-on roadmap for model-native file ops, workspace naming and discovery, tool profiles, shell output cleanup, and use-case recipes with smoke coverage. Link it from the README, vision doc, and completed workspace GA roadmap so the next phase is explicit. Keep the sequence anchored to the workspace-first vision and continue to treat disk tools as secondary rather than the main chat-facing surface.
parent 287f6d100f
commit dbb71a3174
9 changed files with 326 additions and 4 deletions
@@ -16,7 +16,8 @@ It exposes the same runtime in three public forms:

 - Install: [docs/install.md](docs/install.md)
 - Vision: [docs/vision.md](docs/vision.md)
-- Workspace roadmap: [docs/roadmap/task-workspace-ga.md](docs/roadmap/task-workspace-ga.md)
+- Workspace GA roadmap: [docs/roadmap/task-workspace-ga.md](docs/roadmap/task-workspace-ga.md)
+- LLM chat roadmap: [docs/roadmap/llm-chat-ergonomics.md](docs/roadmap/llm-chat-ergonomics.md)
 - First run transcript: [docs/first-run.md](docs/first-run.md)
 - Stable workspace walkthrough GIF: [docs/assets/workspace-first-run.gif](docs/assets/workspace-first-run.gif)
 - Terminal walkthrough GIF: [docs/assets/first-run.gif](docs/assets/first-run.gif)

66  docs/roadmap/llm-chat-ergonomics.md  Normal file
@@ -0,0 +1,66 @@
# LLM Chat Ergonomics Roadmap

This roadmap picks up after the completed workspace GA plan and focuses on one
goal:

make the core agent-workspace use cases feel trivial from a chat-driven LLM
interface.

Current baseline is `3.1.0`:

- the stable workspace contract exists across CLI, SDK, and MCP
- one-shot `pyro run` still exists as the narrow entrypoint
- workspaces already support seeding, sync push, exec, export, diff, snapshots,
  reset, services, PTY shells, secrets, network policy, and published ports
- stopped-workspace disk tools now exist, but remain explicitly secondary

## What "Trivial In Chat" Means

The roadmap is done only when a chat-driven LLM can cover the main use cases
without awkward shell choreography or hidden host-side glue:

- cold-start repo validation
- repro plus fix loops
- parallel isolated workspaces for multiple issues or PRs
- unsafe or untrusted code inspection
- review and evaluation workflows

More concretely, the model should not need to:

- patch files through shell-escaped `printf` or heredoc tricks
- rely on opaque workspace IDs without a discovery surface
- consume raw terminal control sequences as normal shell output
- choose from an unnecessarily large tool surface when a smaller profile would
  work

## Locked Decisions

- keep the workspace product identity central; do not drift toward CI, queue,
  or runner abstractions
- keep disk tools secondary and do not make them the main chat-facing surface
- prefer narrow tool profiles and structured outputs over more raw shell calls
- every milestone below must update CLI, SDK, and MCP together
- every milestone below must also update docs, help text, runnable examples,
  and at least one real smoke scenario

## Milestones

1. [`3.2.0` Model-Native Workspace File Ops](llm-chat-ergonomics/3.2.0-model-native-workspace-file-ops.md)
2. [`3.3.0` Workspace Naming And Discovery](llm-chat-ergonomics/3.3.0-workspace-naming-and-discovery.md)
3. [`3.4.0` Tool Profiles And Canonical Chat Flows](llm-chat-ergonomics/3.4.0-tool-profiles-and-canonical-chat-flows.md)
4. [`3.5.0` Chat-Friendly Shell Output](llm-chat-ergonomics/3.5.0-chat-friendly-shell-output.md)
5. [`3.6.0` Use-Case Recipes And Smoke Packs](llm-chat-ergonomics/3.6.0-use-case-recipes-and-smoke-packs.md)

## Expected Outcome

After this roadmap, the product should still look like an agent workspace, not
like a CI runner with more isolation.

The intended model-facing shape is:

- one-shot work starts with `vm_run`
- persistent work moves to a small workspace-first contract
- file edits are structured and model-native
- workspace discovery is human and model-friendly
- shells are readable in chat
- the five core use cases are documented and smoke-tested end to end

@@ -0,0 +1,59 @@
# `3.2.0` Model-Native Workspace File Ops

Status: Planned

## Goal

Remove shell quoting and hidden host-temp-file choreography from normal
chat-driven workspace editing loops.

## Public API Changes

Planned additions:

- `pyro workspace file list WORKSPACE_ID [PATH] [--recursive]`
- `pyro workspace file read WORKSPACE_ID PATH [--max-bytes N]`
- `pyro workspace file write WORKSPACE_ID PATH --text TEXT`
- `pyro workspace patch apply WORKSPACE_ID --patch TEXT`
- matching Python SDK methods:
  - `list_workspace_files`
  - `read_workspace_file`
  - `write_workspace_file`
  - `apply_workspace_patch`
- matching MCP tools:
  - `workspace_file_list`
  - `workspace_file_read`
  - `workspace_file_write`
  - `workspace_patch_apply`

## Implementation Boundaries

- scope all operations strictly under `/workspace`
- keep these tools text-first and bounded in size
- make patch application explicit and deterministic
- keep `workspace export` as the host-out path for copying results back
- keep shell and exec available for process-oriented work, not as the only way
to mutate files
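
The first two boundaries above (strict `/workspace` scoping and bounded, text-first reads) could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: `resolve_workspace_path` and `truncate_text` are hypothetical helper names, the check is purely lexical, and symlinks inside the guest would need separate handling.

```python
from __future__ import annotations

import posixpath

WORKSPACE_ROOT = "/workspace"  # root named in the boundary list above


def resolve_workspace_path(path: str, root: str = WORKSPACE_ROOT) -> str:
    """Return a normalized absolute path under root, or raise ValueError.

    Hypothetical helper: lexical normalization only, no symlink resolution.
    """
    # Relative paths are interpreted relative to the workspace root.
    candidate = path if path.startswith("/") else posixpath.join(root, path)
    normalized = posixpath.normpath(candidate)
    # Accept the root itself or anything strictly below it.
    if normalized != root and not normalized.startswith(root + "/"):
        raise ValueError(f"path escapes {root}: {path!r}")
    return normalized


def truncate_text(data: bytes, max_bytes: int) -> tuple[str, bool]:
    """Decode at most max_bytes of data and report whether it was cut."""
    truncated = len(data) > max_bytes
    # errors="replace" keeps the result text-first even for partial UTF-8.
    text = data[:max_bytes].decode("utf-8", errors="replace")
    return text, truncated
```

Returning the truncation flag alongside the text lets a chat client see that a read was bounded instead of silently receiving a partial file.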

## Non-Goals

- no arbitrary host filesystem access
- no generic SFTP or file-manager product identity
- no replacement of shell or exec for process lifecycle work
- no hidden auto-merge behavior for conflicting patches

## Acceptance Scenarios

- an agent reads a file, applies a patch, reruns tests, and exports the result
  without shell-escaped editing tricks
- an agent inspects a repo tree and targeted files inside one workspace without
  relying on host-side temp paths
- a repro-plus-fix loop is practical from MCP alone, not only from a custom
  host wrapper

## Required Repo Updates

- public contract updates across CLI, SDK, and MCP
- docs and examples that show model-native file editing instead of shell-heavy
  file writes
- at least one real smoke scenario centered on a repro-plus-fix loop

@@ -0,0 +1,52 @@
# `3.3.0` Workspace Naming And Discovery

Status: Planned

## Goal

Make multiple concurrent workspaces manageable from chat without forcing the
user or model to juggle opaque IDs.

## Public API Changes

Planned additions:

- `pyro workspace create ... --name NAME`
- `pyro workspace create ... --label KEY=VALUE`
- `pyro workspace list`
- `pyro workspace update WORKSPACE_ID [--name NAME] [--label KEY=VALUE] [--clear-label KEY]`
- matching Python SDK methods:
  - `list_workspaces`
  - `update_workspace`
- matching MCP tools:
  - `workspace_list`
  - `workspace_update`

## Implementation Boundaries

- keep workspace IDs as the stable machine identifier
- treat names and labels as operator-friendly metadata and discovery aids
- surface last activity, expiry, service counts, and summary metadata in
  `workspace list`
- make name and label metadata visible in create, status, and list responses
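
One way to picture names and labels as discovery aids layered on top of stable IDs is the sketch below. It is illustrative only: `WorkspaceSummary`, `parse_label`, and `find_workspaces` are hypothetical names, and the real list response shape is not defined by this roadmap.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class WorkspaceSummary:
    """Hypothetical list entry; field names are assumptions for illustration."""

    workspace_id: str  # stays the stable machine identifier
    name: str | None = None
    labels: dict[str, str] = field(default_factory=dict)


def parse_label(raw: str) -> tuple[str, str]:
    """Split a KEY=VALUE argument at the first '=' sign."""
    key, sep, value = raw.partition("=")
    if not sep or not key:
        raise ValueError(f"expected KEY=VALUE, got {raw!r}")
    return key, value


def find_workspaces(
    workspaces: list[WorkspaceSummary],
    name: str | None = None,
    labels: dict[str, str] | None = None,
) -> list[WorkspaceSummary]:
    """Return entries matching the name (if given) and every wanted label."""
    wanted = labels or {}
    return [
        ws
        for ws in workspaces
        if (name is None or ws.name == name)
        and all(ws.labels.get(k) == v for k, v in wanted.items())
    ]
```

Because lookups only filter on metadata and always return full entries, follow-up calls can keep using `workspace_id`, which matches the boundary that IDs remain the machine identifier.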

## Non-Goals

- no scheduler or queue abstractions
- no project-wide branch manager
- no hidden background cleanup policy beyond the existing TTL model

## Acceptance Scenarios

- a user can keep separate workspaces for two issues or PRs and discover them
  again without external notes
- a chat agent can list active workspaces, choose the right one, and continue
  work after a later prompt
- review and evaluation flows can tag or name workspaces by repo, bug, or task
  intent

## Required Repo Updates

- README and install docs that show parallel named workspaces
- examples that demonstrate issue-oriented workspace naming
- smoke coverage for at least one multi-workspace flow

@@ -0,0 +1,51 @@
# `3.4.0` Tool Profiles And Canonical Chat Flows

Status: Planned

## Goal

Make the model-facing surface intentionally small for chat hosts, while keeping
the full workspace product available when needed.

## Public API Changes

Planned additions:

- `pyro mcp serve --profile {vm-run,workspace-core,workspace-full}`
- matching Python SDK and server factory configuration for the same profiles
- one canonical OpenAI Responses example that uses the workspace-core profile
- one canonical MCP/chat example that uses the same profile progression

Representative profile intent:

- `vm-run`: one-shot only
- `workspace-core`: create, status, exec, file ops, diff, reset, export, delete
- `workspace-full`: shells, services, snapshots, secrets, network policy, and
the rest of the stable workspace surface
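
The profile intent above can be read as a plain exposure filter over a tool registry. The sketch below is an assumption-heavy illustration: the capability labels are copied from the list, treating `workspace-full` as a superset of `workspace-core` is an assumption, the listed `workspace-full` capabilities are not exhaustive, and `filter_tools` plus the registry shape are hypothetical.

```python
from __future__ import annotations

# Capability labels taken from the representative profile intent above.
VM_RUN = frozenset({"vm_run"})
WORKSPACE_CORE = frozenset(
    {"create", "status", "exec", "file_ops", "diff", "reset", "export", "delete"}
)
# Assumption: the full profile extends the core profile rather than replacing it.
WORKSPACE_FULL = WORKSPACE_CORE | frozenset(
    {"shells", "services", "snapshots", "secrets", "network_policy"}
)

PROFILES = {
    "vm-run": VM_RUN,
    "workspace-core": WORKSPACE_CORE,
    "workspace-full": WORKSPACE_FULL,
}


def allowed_capabilities(profile: str) -> frozenset[str]:
    """Look up a profile by its stable name, rejecting unknown names."""
    try:
        return PROFILES[profile]
    except KeyError:
        raise ValueError(f"unknown profile: {profile!r}") from None


def filter_tools(registry: dict[str, str], profile: str) -> list[str]:
    """Return the sorted tool names whose capability the profile exposes.

    registry maps a tool name to one capability label (hypothetical shape).
    """
    allowed = allowed_capabilities(profile)
    return sorted(tool for tool, cap in registry.items() if cap in allowed)
```

Keeping the mapping as data rather than separate server code paths is one way to treat profiles as an exposure control, not a second product line.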

## Implementation Boundaries

- keep the current full surface available for advanced users
- add profiles as an exposure control, not as a second product line
- make profile behavior explicit in docs and help text
- keep profile names stable once shipped

## Non-Goals

- no framework-specific wrappers inside the core package
- no server-side planner that chooses tools on the model's behalf
- no hidden feature gating by provider or client

## Acceptance Scenarios

- a chat host can expose only `vm_run` for one-shot work
- a chat host can promote the same agent to `workspace-core` without suddenly
  dumping the full advanced surface on the model
- a new integrator can copy one example and understand the intended progression
  from one-shot to stable workspace

## Required Repo Updates

- integration docs that explain when to use each profile
- canonical chat examples for both provider tool calling and MCP
- smoke coverage for at least one profile-limited chat loop

@@ -0,0 +1,46 @@
# `3.5.0` Chat-Friendly Shell Output

Status: Planned

## Goal

Keep persistent PTY shells powerful, but make their output clean enough to feed
directly back into a chat model.

## Public API Changes

Planned additions:

- `pyro workspace shell read ... --plain`
- `pyro workspace shell read ... --wait-for-idle-ms N`
- matching Python SDK parameters:
  - `plain=True`
  - `wait_for_idle_ms=...`
- matching MCP request fields on `shell_read`

## Implementation Boundaries

- keep raw PTY reads available for advanced clients
- plain mode should strip terminal control sequences and normalize line endings
- idle waiting should batch the next useful chunk of output without turning the
  shell into a separate job scheduler
- keep cursor-based reads so polling clients stay deterministic
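
As a rough picture of what plain mode might do to a raw PTY chunk, the sketch below strips common ANSI escape sequences and normalizes line endings. It is not the planned implementation: `to_plain` is a hypothetical name, the pattern covers only CSI, OSC, and simple two-character escapes, and turning a bare carriage return into a newline is an assumption about the desired behavior.

```python
import re

# Common escape sequences emitted by interactive programs.
_ANSI_RE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"              # CSI: colors, cursor movement, erase
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"   # OSC: window titles and similar
    r"|\x1b[@-Z\\-_]"                       # remaining two-character escapes
)


def to_plain(raw: str) -> str:
    """Return raw PTY text with escape sequences removed and LF line endings."""
    text = _ANSI_RE.sub("", raw)
    # Normalize CRLF first so it does not become two newlines.
    text = text.replace("\r\n", "\n")
    # Assumption: a bare CR (progress redraws) is treated as a line break.
    return text.replace("\r", "\n")
```

The function is a pure transformation over a chunk that has already been read, so the raw read path and its cursor stay unchanged and the plain view stays a presentation choice.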

## Non-Goals

- no replacement of the PTY shell with a fake line-based shell
- no automatic command synthesis inside shell reads
- no shell-only workflow that replaces `workspace exec`, services, or file ops

## Acceptance Scenarios

- a chat agent can open a shell, write a command, and read back plain text
  output without ANSI noise
- long-running interactive setup or debugging flows are readable in chat
- shell output is useful as model input without extra client-side cleanup

## Required Repo Updates

- help text that makes raw versus plain shell reads explicit
- examples that show a clean interactive shell loop
- smoke coverage for at least one shell-driven debugging scenario

@@ -0,0 +1,42 @@
# `3.6.0` Use-Case Recipes And Smoke Packs

Status: Planned

## Goal

Turn the five target workflows into first-class documented stories and runnable
verification paths.

## Public API Changes

No new core API is required in this milestone.

The main deliverable is packaging the now-mature workspace surface into clear
recipes, examples, and smoke scenarios that prove the intended user experience.

## Implementation Boundaries

- build on the existing stable workspace contract and the earlier chat-first
  milestones
- keep the focus on user-facing flows, not internal test harness complexity
- treat the recipes as product documentation, not private maintainer notes

## Non-Goals

- no new CI or scheduler abstractions
- no speculative cloud orchestration work
- no broad expansion of disk tooling as the main story

## Acceptance Scenarios

- cold-start repo validation has a documented and smoke-tested flow
- repro-plus-fix loops have a documented and smoke-tested flow
- parallel isolated workspaces have a documented and smoke-tested flow
- unsafe or untrusted code inspection has a documented and smoke-tested flow
- review and evaluation workflows have a documented and smoke-tested flow

## Required Repo Updates

- a dedicated doc or section for each target use case
- at least one canonical example per use case in CLI, SDK, or MCP form
- smoke scenarios that prove each flow on a real Firecracker-backed path

@@ -46,4 +46,5 @@ The planned workspace roadmap is complete.

 - `3.1.0` added secondary stopped-workspace disk export and offline inspection helpers without
   changing the stable workspace-first core contract.
-- Future work, if any, is now outside the planned vision milestones tracked in this roadmap.
+- The next follow-on milestones now live in [llm-chat-ergonomics.md](llm-chat-ergonomics.md) and
+  focus on making the stable workspace product feel trivial from chat-driven LLM interfaces.

@@ -170,8 +170,12 @@ Features should be prioritized in this order:
 6. Explicit secrets and network policy
 7. Secondary disk-level import/export and inspection tools

-The implementation roadmap that turns those priorities into release-sized
-milestones lives in [roadmap/task-workspace-ga.md](roadmap/task-workspace-ga.md).
+The completed workspace GA roadmap lives in
+[roadmap/task-workspace-ga.md](roadmap/task-workspace-ga.md).
+
+The next implementation milestones that make those workflows feel natural from
+chat-driven LLM interfaces live in
+[roadmap/llm-chat-ergonomics.md](roadmap/llm-chat-ergonomics.md).

 ## Naming Guidance
