One-command development sandboxes on Firecracker microVMs. https://git.thaloco.com/thaloco/banger/
Thales Maciel 72882e45d7
daemon: serialise concurrent image/kernel pulls + atomic-rename seed refresh
Three concurrency bugs surfaced by `make smoke JOBS=4` that all stem
from `vm.create` paths assuming single-caller semantics:

1. **Kernel auto-pull manifest race.** Parallel `vm.create` calls that
   each need to auto-pull the same kernel ref end up running
   kernelcat.Fetch concurrently against the same
   /var/lib/banger/kernels/<name>/. Fetch writes manifest.json
   non-atomically (truncate + write); the peer reads it back mid-write
   and trips "parse manifest for X: unexpected end of JSON input".

   Fix: per-name `sync.Mutex` map on `ImageService` (kernelPullLock).
   `KernelPull` and `readOrAutoPullKernel` both acquire it and re-check
   `kernelcat.ReadLocal` after the lock, so a peer that finished while
   we waited is treated as success. `readOrAutoPullKernel` does NOT
   call `s.KernelPull`, because that path errors with "already pulled"
   on a peer-success, which would be wrong for auto-pull. Different
   kernels stay parallel (see the sketch after this list).

2. **Image auto-pull race.** Same shape as the kernel race but on the
   image side: parallel `vm.create` calls both run pullFromBundle /
   pullFromOCI for the missing image (each ~minutes of OCI fetch +
   ext4 build). The publishImage atom under imageOpsMu only protects
   the rename + UpsertImage commit, so the loser does all the work
   only to fail at the recheck with "image already exists".

   Fix: per-name `sync.Mutex` map on `ImageService` (imagePullLock).
   `findOrAutoPullImage` acquires it, re-checks FindImage, and only
   then calls PullImage. Loser short-circuits with the
   freshly-published image instead of redoing minutes of work.
   PullImage's own publishImage recheck stays as defense-in-depth
   for callers that bypass the auto-pull path.

3. **Work-seed refresh race.** When the host's SSH key has rotated
   since an image was last refreshed, `ensureAuthorizedKeyOnWorkDisk`
   triggers `refreshManagedWorkSeedFingerprint`, which used to rewrite
   the shared work-seed.ext4 in place via e2rm + e2cp. Peer `vm.create`
   calls doing parallel `MaterializeWorkDisk` rdumps observed a torn
   ext4 image — "Superblock checksum does not match superblock".

   Fix: stage the rewrite on a sibling tmpfile (`<seed>.refresh.<pid>-<ns>.tmp`)
   and atomic-rename. Concurrent readers either have the file open
   (kernel keeps the pre-rename inode alive) or open after the rename
   (see the new inode) — never observe a partial state. Two parallel
   refreshes are idempotent (same daemon, same SSH key) so unique tmp
   names are enough; whichever rename lands last wins, with identical
   content. UpsertImage runs after the rename so the recorded
   fingerprint always matches what's on disk.
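
A minimal Go sketch of the two shapes above — the per-name lock with a
post-lock re-check (fixes 1 and 2) and the stage-then-rename refresh
(fix 3). Names here (`pullLocks`, `readOrAutoPull`, `refreshSeed`) are
illustrative stand-ins, not the actual banger identifiers:

package sketch

import (
	"fmt"
	"os"
	"sync"
	"time"
)

// pullLocks hands out one mutex per resource name, so pulls of the same
// name serialise while pulls of different names stay parallel.
type pullLocks struct {
	mu sync.Mutex
	m  map[string]*sync.Mutex
}

func (p *pullLocks) lockFor(name string) *sync.Mutex {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.m == nil {
		p.m = map[string]*sync.Mutex{}
	}
	if p.m[name] == nil {
		p.m[name] = &sync.Mutex{}
	}
	return p.m[name]
}

func readOrAutoPull(locks *pullLocks, name string,
	readLocal func(string) (string, error),
	fetch func(string) (string, error),
) (string, error) {
	if m, err := readLocal(name); err == nil {
		return m, nil // fast path: already on disk
	}
	l := locks.lockFor(name) // per-name: different names stay parallel
	l.Lock()
	defer l.Unlock()
	// Re-check under the lock: a peer finishing while we waited counts
	// as success, not as an "already pulled" error.
	if m, err := readLocal(name); err == nil {
		return m, nil
	}
	return fetch(name)
}

func refreshSeed(seed string, write func(path string) error) error {
	// Stage on a unique sibling tmpfile, then rename into place: readers
	// see either the old inode or the new one, never a torn file.
	tmp := fmt.Sprintf("%s.refresh.%d-%d.tmp", seed, os.Getpid(), time.Now().UnixNano())
	if err := write(tmp); err != nil {
		_ = os.Remove(tmp)
		return err
	}
	return os.Rename(tmp, seed) // atomic within one filesystem
}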

Plus one smoke harness fix: reclassify `vm_prune` from `pure` to
`global`. `vm prune -f` removes ALL stopped VMs system-wide, not just
the ones the scenario created — so a parallel peer scenario whose VM
happens to be momentarily in `created`/`stopped` gets that VM wiped
out from under it. Moving prune to the post-pool serial phase keeps it
from racing with in-flight scenarios.

After all four fixes, `make smoke JOBS=4` passes 21/21 in 174s
(serial baseline 141s; the small overhead is the buffered-output and
`wait -n` semaphore cost — well worth the parallelism for fast-iter
work on a 32-core box).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 17:24:11 -03:00
cmd cli: maturity polish — color, error translation, tabwriter consistency 2026-04-26 22:27:07 -03:00
configs Generic kernel + init= boot path for OCI-pulled images 2026-04-16 20:12:56 -03:00
docs docs: add the privilege model document 2026-04-26 12:55:18 -03:00
images/golden supply chain: verify signatures and pins across image + kernel builds 2026-04-21 19:38:13 -03:00
internal daemon: serialise concurrent image/kernel pulls + atomic-rename seed refresh 2026-04-27 17:24:11 -03:00
scripts daemon: serialise concurrent image/kernel pulls + atomic-rename seed refresh 2026-04-27 17:24:11 -03:00
.gitignore gitignore: exclude the entire build directory 2026-04-26 12:55:11 -03:00
AGENTS.md ssh-config: narrow the legacy-dir cleanup so it can't delete a user key 2026-04-22 16:31:07 -03:00
go.mod Phase 1: imagepull package — pull, flatten, ext4 2026-04-16 17:22:13 -03:00
go.sum Phase 1: imagepull package — pull, flatten, ext4 2026-04-16 17:22:13 -03:00
LICENSE Add LICENSE, update .gitignore, add security note to README 2026-04-14 16:54:33 -03:00
Makefile smoke: discoverable scenarios + selectable runs + parallel dispatch 2026-04-27 16:56:57 -03:00
mise.toml mise: pin go and shellcheck 2026-04-26 13:11:51 -03:00
README.md daemon: split owner daemon from root helper 2026-04-26 12:43:17 -03:00

banger

One-command development sandboxes on Firecracker microVMs.

Quick start

make build
sudo ./build/bin/banger system install --owner "$USER"
banger vm run --name sandbox

That's it. banger vm run auto-pulls the default golden image (Debian bookworm with systemd, sshd, Docker CE, git, jq, mise, and the usual dev tools) and kernel, creates a VM, starts it, and drops you into an interactive ssh session. The first run takes a couple of minutes (bundle download); subsequent vm runs take seconds.

Supported host path

banger's supported host/runtime path is:

  • Linux on x86_64 / amd64
  • systemd as the host init/service manager
  • bangerd.service running as the installed owner user
  • bangerd-root.service running as the privileged host helper

Other setups may work with manual adaptation, but they are not the supported operating model for this repo.

Requirements

  • x86_64 / amd64 Linux — arm64 is not supported today. The companion binaries, the published kernel catalog, and the OCI import path all assume linux/amd64. banger doctor surfaces this as a failing check on other architectures.
  • systemd on the host — this is the supported service-management path. banger's supported install/run model is the owner-user bangerd.service plus the privileged bangerd-root.service installed by banger system install.
  • /dev/kvm
  • sudo for the install/admin commands (system install, system restart, system uninstall)
  • Firecracker on PATH, or firecracker_bin set in config
  • host tools checked by banger doctor

Build + install

make build
sudo ./build/bin/banger system install --owner "$USER"

This installs two systemd units, copies the current banger, bangerd, and banger-vsock-agent binaries into /usr/local, writes install metadata under /etc/banger, and starts both services:

  • bangerd.service runs as the configured owner user and exposes the public CLI socket at /run/banger/bangerd.sock.
  • bangerd-root.service runs as root and handles the narrow set of privileged host operations over the private helper socket at /run/banger-root/bangerd-root.sock.

After that, normal daily commands such as banger vm run and banger image pull are unprivileged.

This systemd service flow is the supported path. If you're not on a host that can run both services, you're outside the supported host model even if some pieces happen to work.

The split matters:

  • bangerd.service runs as the owner user, keeps its writable state in /var/lib/banger, /var/cache/banger, and /run/banger, and sees the owner home read-only.
  • bangerd-root.service is the only process that keeps elevated host capabilities, and that capability set is limited to the host-kernel primitives banger actually uses (CAP_CHOWN, CAP_SYS_ADMIN, CAP_NET_ADMIN).

To inspect or refresh the services:

banger system status
sudo banger system restart

To remove the system services:

sudo banger system uninstall

Add --purge if you also want to remove system-owned VM/image/cache state under /var/lib/banger, /var/cache/banger, /run/banger, and /run/banger-root. User config stays in place under your home directory:

  • ~/.config/banger/ — config, optional ssh_config
  • ~/.local/state/banger/ssh/ — user SSH key + known_hosts

Shell completion

banger ships completion scripts for bash, zsh, fish, and powershell. Tab-completion covers subcommands, flags, and live resource names (VM, image, kernel) looked up from the installed services. With the services down, resource completion silently returns nothing — no file-completion fallback.

# bash (system-wide)
banger completion bash | sudo tee /etc/bash_completion.d/banger

# zsh (user-local; ~/.zfunc must be on fpath)
banger completion zsh > ~/.zfunc/_banger

# fish
banger completion fish > ~/.config/fish/completions/banger.fish

banger completion --help shows the shell-specific loading recipes.

vm run

One command, four common shapes:

banger vm run                          # bare sandbox — drops into ssh
banger vm run ./repo                   # workspace at /root/repo — drops into ssh
banger vm run ./repo -- make test      # workspace + run command, exits with its status
banger vm run --rm -- script.sh        # ephemeral: VM is deleted on exit
  • Bare mode gives you a clean shell.
  • Workspace mode (path given) copies the repo's git-tracked files into /root/repo and kicks off a best-effort mise tooling bootstrap from the repo's .mise.toml / .tool-versions. Log: /root/.cache/banger/vm-run-tooling-<repo>.log. Untracked files (including local .env, scratch notes, and credentials that aren't gitignored) are skipped by default — pass --include-untracked to also ship them. Pass --dry-run to print the exact file list and exit without creating a VM (see the sketch after this list).
  • Command mode (-- <cmd>) runs the command in the guest; exit code propagates through banger.
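
Workspace mode's default file set is exactly git's view of the repo. A sketch of how such a list could be computed — illustrative only, not banger's actual code; --dry-run prints the authoritative list:

package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// trackedFiles lists the git-tracked paths in repo — the default set
// workspace mode ships (untracked files need --include-untracked).
func trackedFiles(repo string) ([]string, error) {
	// -z uses NUL separators so filenames with spaces survive the split.
	out, err := exec.Command("git", "-C", repo, "ls-files", "-z").Output()
	if err != nil {
		return nil, fmt.Errorf("git ls-files: %w", err)
	}
	s := strings.TrimRight(string(out), "\x00")
	if s == "" {
		return nil, nil
	}
	return strings.Split(s, "\x00"), nil
}

func main() {
	files, err := trackedFiles(".")
	if err != nil {
		panic(err)
	}
	for _, f := range files {
		fmt.Println(f) // what a --dry-run-style listing would show
	}
}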

Disconnecting from an interactive session leaves the VM running. Use vm stop / vm delete to clean up — or pass --rm so the VM auto-deletes once the session / command exits.

--branch, --from, --include-untracked, and --dry-run apply only to workspace mode. --rm skips the delete when the initial ssh wait times out, so a wedged sshd leaves the VM alive for banger vm logs inspection.

Hostnames: reaching <vm>.vm

banger's owner daemon runs a DNS server for the .vm zone. With host-side DNS routing you can curl http://sandbox.vm:3000 from anywhere on the host — no copy-pasting guest IPs. On systemd-resolved hosts the owner daemon asks the root helper to auto-wire this, and that is the supported path. Everywhere else there's a best-effort manual recipe. See docs/dns-routing.md.

Optional: ssh <name>.vm shortcut

banger vm ssh <name> works out of the box. If you'd also like plain ssh sandbox.vm from any terminal (using banger's key + known_hosts), opt in:

banger ssh-config --install    # adds `Include ~/.config/banger/ssh_config`
                               # to ~/.ssh/config in a marker-fenced block
banger ssh-config --uninstall  # reverse it
banger ssh-config              # show the include line to paste manually

banger never touches ~/.ssh/config on its own — the daemon keeps its own known_hosts under /var/lib/banger/ssh/known_hosts, while banger ssh-config keeps the user-facing file fresh at ~/.config/banger/ssh_config; whether and how it's pulled into your SSH config is up to you.

Image catalog

banger image pull <name> fetches a pre-built bundle from the embedded catalog. vm run calls this for you on demand.

Today's catalog:

Name             What it is
debian-bookworm  Debian 12 slim + sshd + docker + dev tools

See docs/image-catalog.md for the bundle format and how to publish a new entry.

Config

Config lives at ~/.config/banger/config.toml. All keys optional.

Most commonly set:

  • default_image_name — image used when --image is omitted (default debian-bookworm, auto-pulled from the catalog if not local).
  • ssh_key_path — host SSH key. If unset, banger creates ~/.local/state/banger/ssh/id_ed25519. Accepts absolute paths or ~/-anchored paths; ~/foo expands against $HOME. Relative paths are rejected at config load (see the sketch below).
  • firecracker_bin — override the auto-resolved PATH lookup.

Full key list in internal/config/config.go.
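
The path rule for ssh_key_path is easy to sketch. A hypothetical helper, not the actual internal/config code:

package sketch

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// expandKeyPath mirrors the rule described above: absolute paths pass
// through, ~/ expands against the home directory, relative paths are
// rejected outright.
func expandKeyPath(p string) (string, error) {
	if strings.HasPrefix(p, "~/") {
		home, err := os.UserHomeDir()
		if err != nil {
			return "", err
		}
		return filepath.Join(home, p[2:]), nil
	}
	if !filepath.IsAbs(p) {
		return "", fmt.Errorf("ssh_key_path: relative path %q rejected", p)
	}
	return p, nil
}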

vm_defaults — sizing for new VMs

Every vm run / vm create prints a spec: line up front showing the vCPU, RAM, and disk the VM will get. When the flags aren't set, those values come from:

  1. [vm_defaults] in config (if present, wins).
  2. Host-derived heuristics (roughly: cpus/4 capped at 4, ram/8 capped at 8 GiB, 8 GiB disk).
  3. Built-in constants (floor).

banger doctor prints the effective defaults with provenance; the layering is sketched at the end of this section.

[vm_defaults]
vcpu = 4
memory_mib = 4096
disk_size = "16G"

All keys optional — omit whichever you want banger to decide.
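
For concreteness, a sketch of how one value could resolve through those three layers — a hypothetical helper assuming only the heuristic described above, not banger's actual sizing code:

package sketch

import "runtime"

// defaultVCPU resolves one sizing value through the three layers:
// explicit [vm_defaults] wins, then the host heuristic, then the floor.
func defaultVCPU(configured int) int {
	if configured > 0 {
		return configured // layer 1: [vm_defaults] from config
	}
	v := runtime.NumCPU() / 4 // layer 2: host-derived heuristic
	if v > 4 {
		v = 4 // heuristic cap
	}
	if v < 1 {
		v = 1 // layer 3: built-in constant floor
	}
	return v
}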

file_sync — host → guest file copies

[[file_sync]]
host = "~/.aws"          # whole directory, recursive
guest = "~/.aws"

[[file_sync]]
host = "~/.config/gh/hosts.yml"
guest = "~/.config/gh/hosts.yml"

[[file_sync]]
host = "~/bin/my-script"
guest = "~/bin/my-script"
mode = "0755"            # optional; default 0600 for files

Runs at vm create time. Each entry copies host → guest onto the VM's work disk (mounted at /root in the guest). Default is no entries — add the ones you want.

  • Guest paths must live under ~/ or /root/....
  • Host paths must live under the installed owner's home directory; ~/... is the intended form, and absolute paths are accepted only when they still point inside that home.
  • A top-level symlink is followed only when its resolved target stays inside the owner home. Symlinks encountered while recursing into a synced directory are skipped with a warning — they'd otherwise leak files from outside the named tree (e.g. a symlink inside ~/.aws pointing to an unrelated credential dir).
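
The symlink rule reduces to a containment check on the resolved path. A hedged sketch of that shape (hypothetical helper, not banger's code; real code would resolve home too):

package sketch

import (
	"path/filepath"
	"strings"
)

// insideHome reports whether path, after resolving symlinks, still lives
// under home — the shape of the "resolved target stays inside the owner
// home" check described above.
func insideHome(home, path string) (bool, error) {
	resolved, err := filepath.EvalSymlinks(path)
	if err != nil {
		return false, err
	}
	rel, err := filepath.Rel(home, resolved)
	if err != nil {
		return false, err
	}
	// Anything inside home never needs to climb out with "..".
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator)), nil
}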

Advanced

The common path is vm run. Power-user flows (vm create, OCI pull for arbitrary images, image register, manual workspace prepare) are documented in docs/advanced.md.

Security

Guest VMs are single-user development sandboxes, not multi-tenant servers. Each guest's sshd is configured with:

PermitRootLogin prohibit-password
PubkeyAuthentication yes
PasswordAuthentication no
KbdInteractiveAuthentication no
AuthorizedKeysFile /root/.ssh/authorized_keys

The host SSH key is the only authentication mechanism. StrictModes is on (sshd's default); banger normalises /root, /root/.ssh, and authorized_keys perms at provisioning time so the default passes.

VMs are reachable only through the host bridge network (172.16.0.0/24 by default). Do not expose the bridge interface or guest IPs to an untrusted network.

Further reading