Thales Maciel d0cf6d8f21 Add opinionated MCP modes for workspace workflows

Introduce explicit repro-fix, inspect, cold-start, and review-eval modes across the MCP server, CLI, and host helpers, with canonical mode-to-tool mappings, narrowed schemas, and mode-specific tool descriptions on top of the existing workspace runtime.

Reposition the docs, host onramps, and use-case recipes so named modes are the primary user-facing startup story while the generic no-mode workspace-core path remains the escape hatch, and update the shared smoke runner to validate repro-fix and cold-start through mode-backed servers.

Validation: UV_OFFLINE=1 UV_CACHE_DIR=.uv-cache uv run pytest --no-cov tests/test_api.py tests/test_server.py tests/test_host_helpers.py tests/test_public_contract.py tests/test_cli.py tests/test_workspace_use_case_smokes.py; UV_OFFLINE=1 UV_CACHE_DIR=.uv-cache make check; UV_OFFLINE=1 UV_CACHE_DIR=.uv-cache make dist-check; real guest-backed make smoke-repro-fix-loop smoke-cold-start-validation outside the sandbox.

2026-03-13 20:00:35 -03:00

950 B


Review And Evaluation Workflows

Recommended mode: review-eval

Recommended startup:

pyro host connect claude-code --mode review-eval

Smoke target:

make smoke-review-eval

Use this flow when an agent needs to read a checklist interactively, run an evaluation script, checkpoint or reset its changes, and export the final report.

Chat-host recipe:

1. Create a named snapshot before the review starts.
2. Open a readable PTY shell and inspect the checklist interactively.
3. Run the review or evaluation script in the same workspace.
4. Capture a workspace summary to review what changed and decide what to export.
5. Export the final report.
6. Reset back to the snapshot if the review branch goes sideways.
7. Delete the workspace when the evaluation is done.

This is the stable shell-facing story: readable PTY output for chat loops, checkpointed evaluation, explicit export, and reset when a review branch goes sideways.
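The recipe's snapshot, evaluate, summarize, export, reset, and delete ordering can be illustrated with a small local stand-in. This is only a sketch under stated assumptions: `LocalWorkspace`, `run_review`, the snapshot name `pre-review`, and the file names are all hypothetical and use a temporary directory on the local machine. None of it is the pyro API or the MCP tool surface, and the interactive PTY step is omitted.

```python
import shutil
import tempfile
from pathlib import Path


class LocalWorkspace:
    """Hypothetical local stand-in for a workspace (NOT the pyro API)."""

    def __init__(self) -> None:
        self._base = Path(tempfile.mkdtemp(prefix="review-eval-"))
        self.root = self._base / "workspace"
        self._snapshots = self._base / "snapshots"
        self.root.mkdir()
        self._snapshots.mkdir()

    def snapshot(self, name: str) -> None:
        """Copy the current workspace tree under a named snapshot."""
        target = self._snapshots / name
        if target.exists():
            shutil.rmtree(target)
        shutil.copytree(self.root, target)

    def summary(self, name: str) -> list[str]:
        """List files added or modified since the snapshot (deletions ignored)."""
        base = self._snapshots / name
        changed = []
        for path in sorted(self.root.rglob("*")):
            if not path.is_file():
                continue
            rel = path.relative_to(self.root)
            old = base / rel
            if not old.is_file() or old.read_bytes() != path.read_bytes():
                changed.append(rel.as_posix())
        return changed

    def export(self, rel_path: str, dest: Path) -> Path:
        """Copy one workspace file out to a destination directory."""
        dest.mkdir(parents=True, exist_ok=True)
        return Path(shutil.copy2(self.root / rel_path, dest / Path(rel_path).name))

    def reset(self, name: str) -> None:
        """Replace the workspace tree with the named snapshot."""
        shutil.rmtree(self.root)
        shutil.copytree(self._snapshots / name, self.root)

    def delete(self) -> None:
        """Remove the workspace and all of its snapshots."""
        shutil.rmtree(self._base)


def run_review(export_dir: Path) -> list[str]:
    ws = LocalWorkspace()
    try:
        (ws.root / "CHECKLIST.md").write_text("- [ ] tests pass\n")
        ws.snapshot("pre-review")  # checkpoint before the review starts
        # Stand-in for the evaluation script: it writes a report file.
        (ws.root / "report.txt").write_text("checklist: 1 item reviewed\n")
        changed = ws.summary("pre-review")  # what changed, what to export
        ws.export("report.txt", export_dir)  # export the final report
        # Reset is unconditional here only to demonstrate it; the recipe
        # resets when a review branch goes sideways.
        ws.reset("pre-review")
        assert not (ws.root / "report.txt").exists()
        return changed
    finally:
        ws.delete()  # delete the workspace when the evaluation is done
```

Exporting before the reset matters in this sketch: the report exists only in the working tree, so resetting to the snapshot first would discard it.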
