pyro-mcp/docs/vision.md

# Vision

`pyro-mcp` should become the disposable sandbox where an agent can do real
development work safely, repeatedly, and reproducibly.

That is a different product from a generic VM wrapper, a secure CI runner, or a
task queue with better isolation.

## Core Thesis

The goal is not just to run one command in a microVM.

The goal is to give an LLM or coding agent a bounded workspace where it can:

- inspect a repo
- install dependencies
- edit files
- run tests
- start and inspect services
- reset and retry
- export patches and artifacts
- destroy the sandbox when the task is done

The sandbox is the execution boundary for agentic software work.

## What This Is Not

`pyro-mcp` should not drift into:

- a YAML pipeline system
- a build farm
- a generic CI job runner
- a scheduler or queueing platform
- a broad VM orchestration product

Those products optimize for queued work, throughput, automatic job retries,
matrix builds, and shared infrastructure.

`pyro-mcp` should optimize for agent loops:

- explore
- edit
- test
- observe
- reset
- export

## Why This Can Look Like CI

Any sandbox product starts to look like CI if the main abstraction is:

- submit a command
- wait
- collect logs
- fetch artifacts

That shape is useful, but it is not the center of the vision.

To stay aligned, the primary abstraction should be a workspace the agent
inhabits, not a job the agent submits.

## Product Principles

### Workspace-First

The default mental model should be "open a disposable workspace" rather than
"enqueue a task".

### Stateful Interaction

The product should support repeated interaction in one sandbox. One-shot command
execution matters, but it is the entry point, not the destination.

### Explicit Host Crossing

Anything that crosses the host boundary should be intentional and visible:

- seeding a workspace
- syncing changes in
- exporting artifacts out
- granting secrets or network access
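
One way to picture "intentional and visible" is a policy object that every crossing must pass through. The sketch below is illustrative only: `Crossing`, `CrossingPolicy`, and the audit-log shape are hypothetical names invented for this example, not part of `pyro-mcp`.

```python
from dataclasses import dataclass, field
from enum import Enum


class Crossing(Enum):
    """Hypothetical kinds of host-boundary crossing, mirroring the list above."""

    SEED = "seed"
    SYNC_IN = "sync_in"
    EXPORT_OUT = "export_out"
    GRANT_SECRET = "grant_secret"
    GRANT_NETWORK = "grant_network"


@dataclass
class CrossingPolicy:
    """Permits only crossings that were granted up front and records every attempt."""

    allowed: set[Crossing] = field(default_factory=set)
    audit_log: list[tuple[Crossing, str, bool]] = field(default_factory=list)

    def request(self, kind: Crossing, detail: str) -> bool:
        permitted = kind in self.allowed
        # Denied attempts are logged too, so nothing crosses (or tries to) silently.
        self.audit_log.append((kind, detail, permitted))
        return permitted


policy = CrossingPolicy(allowed={Crossing.SEED, Crossing.EXPORT_OUT})
seed_ok = policy.request(Crossing.SEED, "repo checkout")
secret_ok = policy.request(Crossing.GRANT_SECRET, "API_TOKEN")
```

The design choice worth noting is that the default is denial: a crossing that was never granted fails and still leaves a trace.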

### Reset Over Repair

Agents should be able to checkpoint, reset, and retry cheaply. Disposable state
is a feature, not a limitation.
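
A minimal sketch of the checkpoint, reset, and retry loop, assuming an in-memory stand-in for the sandbox filesystem. `ToyWorkspace` and `try_candidates` are hypothetical names used only for illustration; a real workspace would snapshot disk state rather than a dictionary.

```python
import copy


class ToyWorkspace:
    """In-memory stand-in for a sandbox filesystem (not the pyro-mcp API)."""

    def __init__(self, files):
        self.files = dict(files)
        self._snapshot = None

    def snapshot(self):
        self._snapshot = copy.deepcopy(self.files)

    def reset(self):
        if self._snapshot is None:
            raise RuntimeError("no snapshot to reset to")
        self.files = copy.deepcopy(self._snapshot)


def try_candidates(workspace, path, candidates, passes):
    """Try each candidate edit from the same clean baseline; keep the first that passes."""
    workspace.snapshot()
    for candidate in candidates:
        workspace.files[path] = candidate
        if passes(workspace.files):
            return candidate
        # Discard the failed attempt instead of trying to repair it.
        workspace.reset()
    return None


ws = ToyWorkspace({"answer.txt": "unknown"})
winner = try_candidates(
    ws, "answer.txt", ["41", "42", "43"], lambda files: files["answer.txt"] == "42"
)
```

Each attempt starts from the same known state, so a failed edit can never contaminate the next one.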

### Same Contract Across Surfaces

CLI, Python, and MCP should expose the same underlying workspace model so the
product feels coherent no matter how it is consumed.
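
As a rough illustration of one contract behind several surfaces, the sketch below routes a CLI-style call, a tool-style call, and a direct Python call through the same core function. Everything here is an assumption made for the example: `create_workspace`, its parameters, the `debian-12` environment name, and the returned fields are hypothetical and do not describe the real `pyro-mcp` interface.

```python
import argparse
import json


def create_workspace(environment: str, vcpu_count: int = 1) -> dict:
    """Hypothetical core operation; a real one would boot a sandbox."""
    return {"environment": environment, "vcpu_count": vcpu_count, "state": "started"}


def cli_surface(argv: list[str]) -> str:
    """CLI adapter: parses flags, then delegates to the core operation."""
    parser = argparse.ArgumentParser(prog="workspace-create")
    parser.add_argument("environment")
    parser.add_argument("--vcpu-count", type=int, default=1)
    args = parser.parse_args(argv)
    return json.dumps(create_workspace(args.environment, args.vcpu_count), sort_keys=True)


def tool_surface(arguments: dict) -> str:
    """Tool-call adapter: a dict of arguments in, JSON text out."""
    return json.dumps(create_workspace(**arguments), sort_keys=True)


from_cli = cli_surface(["debian-12", "--vcpu-count", "2"])
from_tool = tool_surface({"environment": "debian-12", "vcpu_count": 2})
from_python = json.dumps(create_workspace("debian-12", vcpu_count=2), sort_keys=True)
```

The adapters only translate input shapes. Because none of them holds workspace logic of its own, the three results are identical.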

### Agent-Native Observability

The sandbox should expose the things an agent actually needs to reason about:

- command output
- file diffs
- service status
- logs
- readiness
- exported results
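
A sketch of what a structured observation could look like, assuming the agent receives one record per step rather than a raw log stream. The `Observation` fields are hypothetical; the diff is produced with the standard library's `difflib.unified_diff`.

```python
import difflib
from dataclasses import asdict, dataclass


@dataclass
class Observation:
    """Hypothetical per-step record an agent could reason over."""

    exit_code: int
    stdout: str
    file_diff: str
    service_ready: bool


def unified_diff(path, before, after):
    """Render a unified diff between two versions of one file."""
    lines = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(lines)


obs = Observation(
    exit_code=0,
    stdout="1 passed\n",
    file_diff=unified_diff("app.py", "x = 1\n", "x = 2\n"),
    service_ready=True,
)
report = asdict(obs)
```

Returning plain fields rather than free-form text lets the agent branch on `exit_code` or `service_ready` without parsing logs.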

## The Shape Of An LLM-First Sandbox

The strongest future direction is a small, agent-native contract built around
workspaces, shells, files, services, and reset.

Representative primitives:

- `workspace.create`
- `workspace.status`
- `workspace.delete`
- `workspace.sync_push`
- `workspace.export`
- `workspace.diff`
- `workspace.snapshot`
- `workspace.reset`
- `shell.open`
- `shell.read`
- `shell.write`
- `shell.signal`
- `shell.close`
- `workspace.exec`
- `service.start`
- `service.status`
- `service.logs`
- `service.stop`

These names are illustrative, not a committed public API.

The important point is the interaction model:

- a shell session is interactive state inside the sandbox
- a workspace is durable for the life of the task
- services are first-class, not accidental background jobs
- reset is a core workflow primitive
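
To make the namespaced contract concrete, here is a minimal sketch of a dispatcher that routes primitive names to handlers. It covers only three of the names listed above and keeps state in a dictionary; the `primitive` decorator, the `call` helper, and the `"absent"` state value are assumptions for illustration, not a committed design.

```python
from typing import Callable

HANDLERS: dict[str, Callable[..., dict]] = {}
WORKSPACES: dict[str, dict] = {}


def primitive(name: str):
    """Register a handler under a dotted primitive name."""

    def register(func):
        HANDLERS[name] = func
        return func

    return register


@primitive("workspace.create")
def workspace_create(workspace_id: str) -> dict:
    WORKSPACES[workspace_id] = {"files": {}}
    return {"workspace_id": workspace_id, "state": "started"}


@primitive("workspace.status")
def workspace_status(workspace_id: str) -> dict:
    state = "started" if workspace_id in WORKSPACES else "absent"
    return {"workspace_id": workspace_id, "state": state}


@primitive("workspace.delete")
def workspace_delete(workspace_id: str) -> dict:
    WORKSPACES.pop(workspace_id, None)
    return {"workspace_id": workspace_id, "state": "absent"}


def call(name: str, **arguments) -> dict:
    """Single entry point: look up the primitive by name and invoke it."""
    if name not in HANDLERS:
        raise ValueError(f"unknown primitive: {name}")
    return HANDLERS[name](**arguments)


created = call("workspace.create", workspace_id="ws-1")
status = call("workspace.status", workspace_id="ws-1")
deleted = call("workspace.delete", workspace_id="ws-1")
after = call("workspace.status", workspace_id="ws-1")
```

The workspace persists between calls until it is deleted explicitly, which is the difference between inhabiting a workspace and submitting a job.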

## Interactive Shells And Disk Operations

Interactive shells are aligned with the vision because they make the agent feel
present inside the sandbox rather than reduced to one-shot job submission.

That does not mean `pyro-mcp` should become a raw SSH replacement. The shell
should sit inside a higher-level workspace model with structured file, service,
diff, and reset operations around it.

Disk-level operations are also useful, but they should remain supporting tools.
They are good for:

- fast workspace seeding
- snapshotting
- offline inspection
- diffing
- export/import without a full boot

They should not become the primary product identity. If the center of the
product becomes "operate on VM disks", it will read as image tooling rather
than an agent workspace.

## What To Build Next

Features should be prioritized in this order:

1. Repeated commands in one persistent workspace
2. Interactive shell sessions with PTY semantics
3. Structured workspace sync and export
4. Service lifecycle and readiness checks
5. Snapshot and reset workflows
6. Explicit secrets and network policy
7. Secondary disk-level import/export and inspection tools

The completed workspace GA roadmap lives in
[roadmap/task-workspace-ga.md](roadmap/task-workspace-ga.md).

The next implementation milestones that make those workflows feel natural from
chat-driven LLM interfaces live in
[roadmap/llm-chat-ergonomics.md](roadmap/llm-chat-ergonomics.md).

## Naming Guidance

Prefer language that reinforces the workspace model:

- `workspace`
- `sandbox`
- `shell`
- `service`
- `snapshot`
- `reset`

Avoid building names around language that makes the product feel like CI infrastructure:

- `job`
- `runner`
- `pipeline`
- `worker`
- `queue`
- `build matrix`
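
If this guidance were ever enforced mechanically, a small check over proposed names or help text might look like the sketch below. The `discouraged_terms` helper is hypothetical, and its matching rule (whole words, optional plural `s`, underscores and hyphens treated as spaces) is an assumption chosen for the example.

```python
import re

# Terms taken from the "avoid" list above.
DISCOURAGED = ("job", "runner", "pipeline", "worker", "queue", "build matrix")


def discouraged_terms(text):
    """Return the discouraged terms that appear as whole words in the text."""
    lowered = text.lower().replace("_", " ").replace("-", " ")
    return [
        term
        for term in DISCOURAGED
        if re.search(rf"\b{re.escape(term)}s?\b", lowered)
    ]


flagged = discouraged_terms("Submit a job to the build-matrix queue")
clean = discouraged_terms("workspace.reset")
```

A check like this is only a prompt for review; a flagged term can still be the right word in a given context.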

## Litmus Test

When evaluating a new feature, ask:

"Does this help an agent inhabit a safe disposable workspace and do real
software work inside it?"

If the better description is "it helps submit, schedule, and report jobs", the
feature is probably pushing the product in the wrong direction.