pyro-mcp/docs/roadmap/llm-chat-ergonomics/4.3.0-reviewable-agent-output.md
Thales Maciel dc86d84e96 Add workspace review summaries
Add workspace summary across the CLI, SDK, and MCP, and include it in the workspace-core profile so chat hosts can review one concise view of the current session.

Persist lightweight review events for syncs, file edits, patch applies, exports, service lifecycle, and snapshot activity, then synthesize them with command history, current services, snapshot state, and current diff data since the last reset.

Update the walkthroughs, use-case docs, public contract, changelog, and roadmap for 4.3.0, and make dist-check invoke the CLI module directly so local package reinstall quirks do not break the packaging gate.

Validation: uv lock; ./.venv/bin/pytest --no-cov tests/test_vm_manager.py tests/test_cli.py tests/test_api.py tests/test_server.py tests/test_public_contract.py tests/test_workspace_use_case_smokes.py; UV_OFFLINE=1 UV_CACHE_DIR=.uv-cache make check; UV_OFFLINE=1 UV_CACHE_DIR=.uv-cache make dist-check; real guest-backed workspace create -> patch apply -> workspace summary --json -> delete smoke.
2026-03-13 19:21:11 -03:00


4.3.0 Reviewable Agent Output

Status: Done

Goal

Make it easy for a human to review what the agent actually did inside the sandbox without manually reconstructing the session from diffs, logs, and raw artifacts.

Public API Changes

The product should expose a concise workspace review surface, for example:

  • pyro workspace summary WORKSPACE_ID
  • workspace_summary on the MCP side
  • structured JSON output plus a short human-readable summary view
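
As one possible host-side use of that surface, the sketch below builds the `pyro workspace summary WORKSPACE_ID` command and decodes its structured output. It assumes the `--json` flag mentioned in the validation notes prints a single JSON document to stdout and that it may follow the workspace id. The helper names `summary_argv` and `fetch_summary` are hypothetical, and nothing here describes the payload's actual shape.

```python
import json
import subprocess


def summary_argv(workspace_id: str, as_json: bool = True) -> list[str]:
    """Build the argv for `pyro workspace summary WORKSPACE_ID`."""
    argv = ["pyro", "workspace", "summary", workspace_id]
    if as_json:
        # Assumption: --json selects the structured view and may trail the id.
        argv.append("--json")
    return argv


def fetch_summary(workspace_id: str) -> dict:
    """Run the CLI and decode stdout as JSON (assumes `pyro` is on PATH)."""
    result = subprocess.run(
        summary_argv(workspace_id),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit
    )
    return json.loads(result.stdout)
```

Keeping the argv builder separate from the subprocess call lets a host log or test the exact command without needing a live workspace.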

The summary should cover the things a chat-host user cares about:

  • commands run
  • files changed
  • diff or patch summary
  • services started
  • artifacts exported
  • final workspace outcome
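
The list above maps naturally onto a small record type. The following is a minimal sketch, not the public contract: the class name `WorkspaceSummary` and every field name are assumptions chosen to mirror the bullets, and the sample values are made up for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class WorkspaceSummary:
    """Hypothetical summary shape; field names are illustrative only."""

    workspace_id: str
    commands_run: list[str] = field(default_factory=list)
    files_changed: list[str] = field(default_factory=list)
    diff_summary: str = ""
    services_started: list[str] = field(default_factory=list)
    artifacts_exported: list[str] = field(default_factory=list)
    outcome: str = "unknown"

    def to_json(self) -> str:
        """Serialize to stable, indented JSON for the structured view."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Illustrative values only; not taken from a real session.
example = WorkspaceSummary(
    workspace_id="ws-123",
    commands_run=["pytest -q"],
    files_changed=["src/app.py"],
    diff_summary="1 file changed",
    outcome="tests passing",
)
example_json = example.to_json()
```

One record per workspace keeps the JSON view and the human-readable view derivable from the same data.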

Implementation Boundaries

  • prefer concise review surfaces over raw event firehoses
  • keep raw logs, diffs, and exported files available as drill-down tools
  • summarize only the sandbox activity the product can actually observe
  • make the summary good enough to paste into a chat, bug report, or PR comment
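
To illustrate the "concise surface over event firehose" boundary, here is a rough sketch that collapses a list of raw events into a few pasteable lines. The event shape (a dict with a `kind` key and an optional `detail` key), the function name `render_review`, and the three-detail cap are all assumptions made for the example, not the product's stored format.

```python
from collections import Counter


def render_review(events: list[dict]) -> str:
    """Collapse raw events into a short, pasteable text block.

    Each event is assumed to be a dict with a "kind" key and an optional
    "detail" key; both names are hypothetical.
    """
    counts = Counter(event["kind"] for event in events)
    lines = [f"Workspace review ({len(events)} events)"]
    for kind in sorted(counts):
        details = [
            e["detail"] for e in events if e["kind"] == kind and e.get("detail")
        ]
        # Show at most three details per kind so the text stays short.
        shown = ", ".join(details[:3])
        more = f" (+{len(details) - 3} more)" if len(details) > 3 else ""
        suffix = f": {shown}{more}" if shown else ""
        lines.append(f"- {kind} x{counts[kind]}{suffix}")
    return "\n".join(lines)


# Made-up events for illustration.
sample_events = [
    {"kind": "command", "detail": "pytest -q"},
    {"kind": "file_edit", "detail": "src/app.py"},
    {"kind": "file_edit", "detail": "tests/test_app.py"},
    {"kind": "export"},
]
review_text = render_review(sample_events)
```

Raw logs and diffs stay the drill-down path; the rendered text only points at what happened and how often.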

Non-Goals

  • no full compliance or audit product
  • no attempt to summarize the model's hidden reasoning
  • no remote storage backend for session history

Acceptance Scenarios

  • after a repro-fix or review-eval run, a user can inspect one summary and understand what changed and what to review next
  • the summary is useful enough to accompany exported patches or artifacts
  • unsafe-inspection and review-eval flows become easier to trust because the user can review agent-visible actions in one place

Required Repo Updates

  • public contract, help text, README, and recipe docs updated with the new summary path
  • at least one host-facing example showing how to ask for or export the summary
  • at least one real smoke scenario validating the review surface end to end
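
A smoke scenario for the review surface might end with a check along these lines. This is only a sketch: the section names in `REQUIRED_SECTIONS` are hypothetical stand-ins for whatever keys the public contract defines, and `missing_sections` is an illustrative helper, not part of the test suite.

```python
# Hypothetical section names; the real keys come from the public contract.
REQUIRED_SECTIONS = (
    "commands_run",
    "files_changed",
    "diff_summary",
    "services_started",
    "artifacts_exported",
    "outcome",
)


def missing_sections(payload: dict) -> list[str]:
    """Return the expected summary sections absent from a decoded payload."""
    return [name for name in REQUIRED_SECTIONS if name not in payload]


# Made-up payload standing in for decoded `workspace summary --json` output.
partial_payload = {"commands_run": ["pytest -q"], "outcome": "tests passing"}
gaps = missing_sections(partial_payload)
```

A smoke test would then assert that the returned list is empty after a create, patch apply, and summary sequence.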