banger/internal/cli/ssh.go
Thales Maciel 672d7151e9
smoke: five more scenarios + fix exit-code propagation bug the new ones caught
Five new smoke scenarios layered on top of the existing bare + workspace
vm-run tests:

  - exit-code propagation: `sh -c 'exit 42'` must rc=42
  - workspace dry-run: --dry-run lists tracked files without a VM
  - workspace --include-untracked: opt-in ships files outside the git
    index (regression guard on the security-default flip from review 4)
  - concurrent vm runs: two --rm invocations in parallel both succeed
    (stresses per-VM locks, createVMMu reservation window, tap pool)
  - invalid spec rejection: --vcpu 0 must fail with no VM row left
    behind (the "cleanup on partial failure" path the review flagged)

The exit-code scenario caught a real bug on first run:

  `banger vm run --rm -- sh -c 'exit 42'` returned rc=0, not 42.

Root cause in internal/cli/ssh.go's sshCommandArgs: extra args were
appended to the ssh argv verbatim, relying on ssh(1)'s implicit
space-join to deliver the remote command. That works for single
tokens (echo hello) but re-tokenises multi-word commands on the
remote side: `ssh host sh -c 'exit 42'` becomes remote
`sh -c exit 42`, where `42` is $0 for the already-completed `exit`,
and the exit code the user asked for is lost.

Fix: shell-quote every element of extra (`'sh'` `'-c'` `'exit 42'`)
and join them into a single trailing argv entry. ssh's space-join
then produces exactly the command the user typed on the remote
shell. TestSSHCommandArgs was updated to pin the quoting; the
existing TestRunVMRunCommandModePropagatesExitCode test needed a
one-word quote tweak (`false` → `'false'`).

Smoke run after fix passes all seven scenarios in ~2 min on warm
state. cmd/banger coverage jumped to 100% (the invalid-spec
scenario hits the error-reporting path that wasn't covered
before).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 19:37:07 -03:00

138 lines
4.4 KiB
Go

package cli
import (
"context"
"errors"
"fmt"
"io"
"os/exec"
"strings"
"time"
"banger/internal/model"
"banger/internal/paths"
"banger/internal/system"
"banger/internal/vsockagent"
)
// runSSHSession executes ssh with the given args. On exit it decides
// whether to print the "vm is still running" reminder: we skip it if
// the caller asked (e.g. --rm is about to delete the VM), if the
// ctx is already done, or if the ssh error isn't the one that
// typically means "user disconnected cleanly".
func (d *deps) runSSHSession(ctx context.Context, socketPath, vmRef string, stdin io.Reader, stdout, stderr io.Writer, sshArgs []string, skipReminder bool) error {
sshErr := d.sshExec(ctx, stdin, stdout, stderr, sshArgs)
if skipReminder || !shouldCheckSSHReminder(sshErr) || ctx.Err() != nil {
return sshErr
}
pingCtx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
defer cancel()
health, err := d.vmHealth(pingCtx, socketPath, vmRef)
if err != nil {
_, _ = fmt.Fprintln(stderr, vsockagent.WarningMessage(vmRef, err))
return sshErr
}
if health.Healthy {
name := health.Name
if strings.TrimSpace(name) == "" {
name = vmRef
}
_, _ = fmt.Fprintln(stderr, vsockagent.ReminderMessage(name))
}
return sshErr
}
func shouldCheckSSHReminder(err error) bool {
if err == nil {
return true
}
var exitErr *exec.ExitError
if !errors.As(err, &exitErr) {
return false
}
return exitErr.ExitCode() != 255
}
// sshCommandArgs builds the argv for `ssh` invocations against a VM.
// Host-key verification uses a banger-owned known_hosts file
// populated by the daemon's first successful Go-SSH dial to each VM
// (trust-on-first-use). `accept-new` means: accept-and-pin on first
// contact; strict-verify afterwards. The user's own
// ~/.ssh/known_hosts is never touched.
func sshCommandArgs(cfg model.DaemonConfig, guestIP string, extra []string) ([]string, error) {
if guestIP == "" {
return nil, errors.New("vm has no guest IP")
}
args := []string{}
args = append(args, "-F", "/dev/null")
if cfg.SSHKeyPath != "" {
args = append(args, "-i", cfg.SSHKeyPath)
}
knownHosts, khErr := bangerKnownHostsPath()
args = append(
args,
"-o", "IdentitiesOnly=yes",
"-o", "BatchMode=yes",
"-o", "PreferredAuthentications=publickey",
"-o", "PasswordAuthentication=no",
"-o", "KbdInteractiveAuthentication=no",
)
if khErr == nil {
args = append(args,
"-o", "UserKnownHostsFile="+knownHosts,
"-o", "StrictHostKeyChecking=accept-new",
)
} else {
// If we can't resolve the banger path (unusual — paths.Resolve
// basically can't fail), fall through to a hard-fail posture
// rather than silently disabling verification.
args = append(args,
"-o", "StrictHostKeyChecking=yes",
)
}
args = append(args, "root@"+guestIP)
// ssh(1) concatenates every argument after the host with spaces
// before sending to the remote shell. That means passing extra
// args raw — `ssh host sh -c 'exit 42'` — re-tokenises on the
// remote side to `sh -c exit 42`, where `42` is $0 for the
// already-completed `exit`, and the rc the user asked for is
// lost. Shell-quote each element and join them ourselves so the
// remote shell sees exactly the argv the user typed locally.
if len(extra) > 0 {
quoted := make([]string, len(extra))
for i, a := range extra {
quoted[i] = shellQuote(a)
}
args = append(args, strings.Join(quoted, " "))
}
return args, nil
}
// bangerKnownHostsPath resolves the TOFU file the daemon writes into
// and the CLI reads back. Both sides must agree on the path or the
// pin doesn't round-trip.
func bangerKnownHostsPath() (string, error) {
layout, err := paths.Resolve()
if err != nil {
return "", err
}
return layout.KnownHostsPath, nil
}
func validateSSHPrereqs(cfg model.DaemonConfig) error {
checks := system.NewPreflight()
checks.RequireCommand("ssh", "install openssh-client")
if strings.TrimSpace(cfg.SSHKeyPath) != "" {
checks.RequireFile(cfg.SSHKeyPath, "ssh private key", `set "ssh_key_path" or let banger create its default key`)
}
return checks.Err("ssh preflight failed")
}
func validateVMRunPrereqs(cfg model.DaemonConfig) error {
checks := system.NewPreflight()
checks.RequireCommand("git", "install git")
if strings.TrimSpace(cfg.SSHKeyPath) != "" {
checks.RequireFile(cfg.SSHKeyPath, "ssh private key", `set "ssh_key_path" or let banger create its default key`)
}
return checks.Err("vm run preflight failed")
}