pyro-mcp/docs/roadmap/llm-chat-ergonomics.md
Thales Maciel 21a88312b6 Add chat-friendly shell read rendering
Make workspace shell reads usable as direct chat-model input without changing the PTY or cursor model. This adds optional plain rendering and idle-window batching across CLI, SDK, and MCP while keeping raw reads backward-compatible.

Implement the rendering and wait-for-idle logic in the manager layer so the existing guest/backend shell transport stays unchanged. The new helper strips ANSI and other terminal control noise, handles carriage-return overwrite and backspace, and preserves raw cursor semantics even when plain output is requested.
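
A rough sketch of that rendering pass, assuming a standalone helper (the function name, the exact escape-sequence coverage, and keeping it pure over strings are all illustrative choices, not the real manager-layer API):

```python
import re

# Matches CSI sequences (e.g. \x1b[32m), OSC sequences (e.g. window-title
# updates), and bare two-character escapes. Coverage here is illustrative.
ANSI_RE = re.compile(r"\x1b(?:\[[0-9;?]*[ -/]*[@-~]|\][^\x07]*(?:\x07|\x1b\\)|[@-Z\\-_])")

def render_plain(raw: str) -> str:
    """Strip terminal control noise, then replay CR overwrites and backspaces."""
    text = ANSI_RE.sub("", raw)
    rendered = []
    for line in text.split("\n"):
        buf = []  # characters of the rendered line
        col = 0   # cursor column within the line
        for ch in line:
            if ch == "\r":            # carriage return: overwrite from column 0
                col = 0
            elif ch == "\b":          # backspace: move the cursor left one cell
                col = max(0, col - 1)
            else:
                if col < len(buf):
                    buf[col] = ch     # overwrite in place, as a terminal would
                else:
                    buf.append(ch)
                col += 1
        rendered.append("".join(buf))
    return "\n".join(rendered)
```

With this shape, a progress bar that repaints itself via carriage returns collapses to its final state (`render_plain("step 10%\rstep 99%")` yields `"step 99%"`), which is the readability property the plain mode is after.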

Refresh the stable shell docs/examples to recommend --plain --wait-for-idle-ms 300, mark the 3.5.0 roadmap milestone done, and bump the package/catalog version to 3.5.0.

Validation: uv lock; UV_CACHE_DIR=.uv-cache make check; UV_CACHE_DIR=.uv-cache make dist-check; real guest-backed Firecracker smoke covering shell open/write/read with ANSI plus delayed output.
2026-03-13 01:10:26 -03:00


LLM Chat Ergonomics Roadmap

This roadmap picks up after the completed workspace GA plan and focuses on one goal:

make the core agent-workspace use cases feel trivial from a chat-driven LLM interface.

Current baseline is 3.5.0:

  • the stable workspace contract exists across CLI, SDK, and MCP
  • one-shot pyro run still exists as the narrow entrypoint
  • workspaces already support seeding, sync push, exec, export, diff, snapshots, reset, services, PTY shells, secrets, network policy, and published ports
  • stopped-workspace disk tools now exist, but remain explicitly secondary
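
As a sketch of why names, labels, and last_activity_at matter for chat-driven rediscovery, the selection step an agent performs over a workspace listing might look like this (the record fields mirror the roadmap's terms, but the actual list output schema is an assumption):

```python
from datetime import datetime

# Hypothetical listing shape; real field names/types may differ.
workspaces = [
    {"name": "issue-142", "labels": {"repo": "pyro"},
     "last_activity_at": "2026-03-12T10:00:00+00:00"},
    {"name": "issue-98", "labels": {"repo": "pyro"},
     "last_activity_at": "2026-03-11T09:00:00+00:00"},
]

def most_recent(items, **labels):
    """Pick the most recently active workspace matching every given label."""
    matches = [w for w in items
               if all(w["labels"].get(k) == v for k, v in labels.items())]
    return max(matches, key=lambda w: datetime.fromisoformat(w["last_activity_at"]))
```

The point is that `most_recent(workspaces, repo="pyro")` resolves to a human-readable name rather than an opaque ID the model had to keep in external notes.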

What "Trivial In Chat" Means

The roadmap is done only when a chat-driven LLM can cover the main use cases without awkward shell choreography or hidden host-side glue:

  • cold-start repo validation
  • repro plus fix loops
  • parallel isolated workspaces for multiple issues or PRs
  • unsafe or untrusted code inspection
  • review and evaluation workflows

More concretely, the model should not need to:

  • patch files through shell-escaped printf or heredoc tricks
  • rely on opaque workspace IDs without a discovery surface
  • consume raw terminal control sequences as normal shell output
  • choose from an unnecessarily large tool surface when a smaller profile would work
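
To make the first bullet concrete, here is the contrast between shell choreography and a structured edit, with a hypothetical tool name and payload shape (the real MCP file-op schema is not reproduced here):

```python
import json
import shlex

content = 'line with "quotes", $VARS, and a\ttab\n'

# Shell choreography: the model must quote the content perfectly or corrupt it.
shell_cmd = f"printf %s {shlex.quote(content)} > /workspace/notes.txt"

# Model-native alternative: content travels as structured JSON, no escaping
# burden on the model. "workspace_file_write" is an illustrative name only.
tool_call = json.dumps({
    "tool": "workspace_file_write",
    "arguments": {"path": "/workspace/notes.txt", "content": content},
})

# The payload round-trips byte-for-byte, which the printf route only achieves
# when every quoting layer is handled correctly.
assert json.loads(tool_call)["arguments"]["content"] == content
```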

Locked Decisions

  • keep the workspace product identity central; do not drift toward CI, queue, or runner abstractions
  • keep disk tools secondary and do not make them the main chat-facing surface
  • prefer narrow tool profiles and structured outputs over more raw shell calls
  • every milestone below must update CLI, SDK, and MCP together
  • every milestone below must also update docs, help text, runnable examples, and at least one real smoke scenario

Milestones

  1. 3.2.0 Model-Native Workspace File Ops - Done
  2. 3.3.0 Workspace Naming And Discovery - Done
  3. 3.4.0 Tool Profiles And Canonical Chat Flows - Done
  4. 3.5.0 Chat-Friendly Shell Output - Done
  5. 3.6.0 Use-Case Recipes And Smoke Packs

Completed so far:

  • 3.2.0 added model-native workspace file * and workspace patch apply so chat-driven agents can inspect and edit /workspace without shell-escaped file mutation flows.
  • 3.3.0 added workspace names, key/value labels, workspace list, workspace update, and last_activity_at tracking so humans and chat-driven agents can rediscover and resume the right workspace without external notes.
  • 3.4.0 added stable MCP/server tool profiles with vm-run, workspace-core, and workspace-full, plus canonical profile-based OpenAI and MCP examples so chat hosts can start narrow and widen only when needed.
  • 3.5.0 added chat-friendly shell reads with plain-text rendering and idle batching so PTY sessions are readable enough to feed directly back into a chat model.
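
The 3.5.0 idle batching can be sketched as a drain loop that returns once no new PTY chunk arrives within the idle window; the function name, the queue-based transport, and the overall-deadline guard are assumptions for illustration:

```python
import time
from queue import Empty, Queue

def read_until_idle(chunks: Queue, idle_ms: int = 300,
                    max_wait_s: float = 10.0) -> bytes:
    """Accumulate output chunks until an idle window of idle_ms elapses.

    Also bails out at max_wait_s so a chatty process cannot stall the read
    forever (illustrative safeguard, not a documented knob).
    """
    out = bytearray()
    deadline = time.monotonic() + max_wait_s
    while time.monotonic() < deadline:
        try:
            out += chunks.get(timeout=idle_ms / 1000.0)
        except Empty:
            if out:  # we saw output, then a full idle window: batch complete
                break
    return bytes(out)
```

This is why the recommended 300 ms window matters: delayed output (a compiler warming up, a test runner between files) lands in one coherent batch instead of being split across several partial reads the model has to stitch together.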

Expected Outcome

After this roadmap, the product should still look like an agent workspace, not like a CI runner with more isolation.

The intended model-facing shape is:

  • one-shot work starts with vm_run
  • persistent work moves to a small workspace-first contract
  • file edits are structured and model-native
  • workspace discovery is human and model-friendly
  • shells are readable in chat
  • the five core use cases are documented and smoke-tested end to end