# Review And Evaluation Workflows

Recommended profile: `workspace-full`

Smoke target:

```bash
make smoke-review-eval
```

Use this flow when an agent needs to read a checklist interactively, run an
evaluation script, checkpoint or reset its changes, and export the final report.

Canonical SDK flow:

```python
from pyro_mcp import Pyro

pyro = Pyro()
created = pyro.create_workspace(environment="debian:12", seed_path="./review-fixture")
workspace_id = str(created["workspace_id"])

# Checkpoint the seeded fixture so the review can be rolled back later.
pyro.create_snapshot(workspace_id, "pre-review")

# Read the checklist through an interactive shell.
shell = pyro.open_shell(workspace_id)
pyro.write_shell(workspace_id, shell["shell_id"], input="cat CHECKLIST.md")
pyro.read_shell(
    workspace_id,
    shell["shell_id"],
    plain=True,
    wait_for_idle_ms=300,
)
pyro.close_shell(workspace_id, shell["shell_id"])

# Run the evaluation script and export its report to the host.
pyro.exec_workspace(workspace_id, command="sh review.sh")
pyro.export_workspace(workspace_id, "review-report.txt", output_path="./review-report.txt")

# Restore the pre-review state, then clean up.
pyro.reset_workspace(workspace_id, snapshot="pre-review")
pyro.delete_workspace(workspace_id)
```

This is the stable shell-facing story: readable PTY output for chat loops,
checkpointed evaluation, explicit export, and reset when a review branch goes
sideways.
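
The reset step can be sketched as a small retry wrapper around the calls shown above. This is a minimal illustration, not part of the SDK. The helper name `run_review_with_reset` and the `attempts` parameter are hypothetical. It also assumes that a failed step surfaces as a Python exception, which may not match how the SDK actually reports errors.

```python
from typing import Any


def run_review_with_reset(pyro: Any, seed_path: str, attempts: int = 2) -> bool:
    """Run review.sh, resetting to the snapshot between failed attempts.

    Hypothetical helper: `pyro` is expected to expose the methods used in the
    canonical flow. Assumes failures are raised as exceptions.
    """
    created = pyro.create_workspace(environment="debian:12", seed_path=seed_path)
    workspace_id = str(created["workspace_id"])
    try:
        # Checkpoint once, before any review attempt touches the workspace.
        pyro.create_snapshot(workspace_id, "pre-review")
        for _ in range(attempts):
            try:
                pyro.exec_workspace(workspace_id, command="sh review.sh")
                pyro.export_workspace(
                    workspace_id,
                    "review-report.txt",
                    output_path="./review-report.txt",
                )
                return True
            except Exception:
                # The attempt went sideways: roll back before retrying.
                pyro.reset_workspace(workspace_id, snapshot="pre-review")
        return False
    finally:
        # Always clean up the workspace, whether or not the review succeeded.
        pyro.delete_workspace(workspace_id)
```

With a real client this would be called as `run_review_with_reset(Pyro(), "./review-fixture")`. The `try`/`finally` keeps workspace deletion unconditional, so a failed review does not leave a workspace behind.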