ssh: trust-on-first-use host key pinning everywhere

Guest host-key verification was off in all three SSH paths:

  * Go SSH (internal/guest/ssh.go) used ssh.InsecureIgnoreHostKey
  * `banger vm ssh` passed StrictHostKeyChecking=no
    + UserKnownHostsFile=/dev/null
  * `~/.ssh/config` Host *.vm shipped the same posture into the
    user's global config

Now each path verifies against a banger-owned known_hosts file at
`~/.local/state/banger/ssh/known_hosts` with TOFU semantics:

  * First dial to a VM pins the key.
  * Subsequent dials require an exact match. A mismatch fails with
    an explicit "possible MITM" error.
  * `vm delete` removes the entries so a future VM reusing the IP
    or name re-pins cleanly.
  * The user's `~/.ssh/known_hosts` is untouched.

Changes:

  internal/guest/known_hosts.go (new) — OpenSSH-compatible parser,
    TOFUHostKeyCallback, RemoveKnownHosts. Process-wide mutex
    around the file.
  internal/guest/ssh.go — Dial and WaitForSSH grew a knownHostsPath
    parameter threaded through the callback. Empty path keeps the
    insecure callback (tests + throwaway tools only; documented).
  internal/daemon/{guest_sessions,session_attach,session_lifecycle,
    session_stream}.go — call sites pass d.layout.KnownHostsPath.
  internal/daemon/ssh_client_config.go — the ~/.ssh/config Host *.vm
    block now points at banger's known_hosts and uses
    StrictHostKeyChecking=accept-new. Missing path → fail closed.
  internal/daemon/vm_lifecycle.go — deleteVMLocked drops known_hosts
    entries for the VM's IP and DNS name via removeVMKnownHosts.
  internal/cli/banger.go — sshCommandArgs swaps StrictHostKeyChecking
    no + /dev/null for banger's file + accept-new. Path resolution
    failure falls through to StrictHostKeyChecking=yes.
  internal/paths/paths.go — Layout gains SSHDir + KnownHostsPath;
    Ensure creates SSHDir at 0700.

Tests (internal/guest/known_hosts_test.go): pin on first use, accept
matching key on second dial, reject mismatch, empty path skips
checking, RemoveKnownHosts drops the entry, re-pin works after
remove. Existing daemon + cli tests updated to assert the new
posture and regression-guard against the old flags.

Live verified: vm run writes the pin to banger's known_hosts at 0600
inside a 0700 dir; banger vm ssh + ssh root@<vm>.vm both succeed
using the pin; vm delete clears it.
This commit is contained in:
Thales Maciel 2026-04-19 16:46:03 -03:00
parent a59958d4f5
commit ae14b9499d
No known key found for this signature in database
GPG key ID: 33112E6833C34679
14 changed files with 634 additions and 47 deletions

View file

@ -131,10 +131,12 @@ var (
return rpc.Call[api.GuestSessionSendResult](ctx, socketPath, "guest.session.send", params)
}
guestWaitForSSHFunc = func(ctx context.Context, address, privateKeyPath string, interval time.Duration) error {
return guest.WaitForSSH(ctx, address, privateKeyPath, interval)
knownHosts, _ := bangerKnownHostsPath()
return guest.WaitForSSH(ctx, address, privateKeyPath, knownHosts, interval)
}
guestDialFunc = func(ctx context.Context, address, privateKeyPath string) (vmRunGuestClient, error) {
return guest.Dial(ctx, address, privateKeyPath)
knownHosts, _ := bangerKnownHostsPath()
return guest.Dial(ctx, address, privateKeyPath, knownHosts)
}
prepareVMRunRepoCopyFunc = prepareVMRunRepoCopy
buildVMRunToolingPlanFunc = toolingplan.Build
@ -2669,6 +2671,12 @@ func sshCommandArgs(cfg model.DaemonConfig, guestIP string, extra []string) ([]s
if cfg.SSHKeyPath != "" {
args = append(args, "-i", cfg.SSHKeyPath)
}
// Host-key verification uses a banger-owned known_hosts file
// populated by the daemon's first successful Go-SSH dial to each
// VM (trust-on-first-use). `accept-new` means: accept-and-pin on
// first contact; strict-verify afterwards. The user's own
// ~/.known_hosts is untouched.
knownHosts, khErr := bangerKnownHostsPath()
args = append(
args,
"-o", "IdentitiesOnly=yes",
@ -2676,14 +2684,36 @@ func sshCommandArgs(cfg model.DaemonConfig, guestIP string, extra []string) ([]s
"-o", "PreferredAuthentications=publickey",
"-o", "PasswordAuthentication=no",
"-o", "KbdInteractiveAuthentication=no",
"-o", "StrictHostKeyChecking=no",
"-o", "UserKnownHostsFile=/dev/null",
"root@"+guestIP,
)
if khErr == nil {
args = append(args,
"-o", "UserKnownHostsFile="+knownHosts,
"-o", "StrictHostKeyChecking=accept-new",
)
} else {
// If we can't resolve the banger path (unusual — paths.Resolve
// basically can't fail), fall through to a hard-fail posture
// rather than silently disabling verification.
args = append(args,
"-o", "StrictHostKeyChecking=yes",
)
}
args = append(args, "root@"+guestIP)
args = append(args, extra...)
return args, nil
}
// bangerKnownHostsPath resolves the TOFU file the daemon writes into
// and the CLI reads back. Both sides must agree on the path or the
// pin doesn't round-trip.
func bangerKnownHostsPath() (string, error) {
layout, err := paths.Resolve()
if err != nil {
return "", err
}
return layout.KnownHostsPath, nil
}
func validateSSHPrereqs(cfg model.DaemonConfig) error {
checks := system.NewPreflight()
checks.RequireCommand("ssh", "install openssh-client")