ssh: trust-on-first-use host key pinning everywhere

Guest host-key verification was off in all three SSH paths:

  * Go SSH (internal/guest/ssh.go) used ssh.InsecureIgnoreHostKey
  * `banger vm ssh` passed StrictHostKeyChecking=no
    + UserKnownHostsFile=/dev/null
  * `~/.ssh/config` Host *.vm shipped the same posture into the
    user's global config

Now each path verifies against a banger-owned known_hosts file at
`~/.local/state/banger/ssh/known_hosts` with TOFU semantics:

  * First dial to a VM pins the key.
  * Subsequent dials require an exact match. A mismatch fails with
    an explicit "possible MITM" error.
  * `vm delete` removes the entries so a future VM reusing the IP
    or name re-pins cleanly.
  * The user's `~/.ssh/known_hosts` is untouched.

Changes:

  internal/guest/known_hosts.go (new) — OpenSSH-compatible parser,
    TOFUHostKeyCallback, RemoveKnownHosts. Process-wide mutex
    around the file.
  internal/guest/ssh.go — Dial and WaitForSSH grew a knownHostsPath
    parameter that is threaded through to the callback. An empty path
    keeps the insecure callback (tests + throwaway tools only; documented).
  internal/daemon/{guest_sessions,session_attach,session_lifecycle,
    session_stream}.go — call sites pass d.layout.KnownHostsPath.
  internal/daemon/ssh_client_config.go — the ~/.ssh/config Host *.vm
    block now points at banger's known_hosts and uses
    StrictHostKeyChecking=accept-new. Missing path → fail closed.
  internal/daemon/vm_lifecycle.go — deleteVMLocked drops known_hosts
    entries for the VM's IP and DNS name via removeVMKnownHosts.
  internal/cli/banger.go — sshCommandArgs swaps StrictHostKeyChecking
    no + /dev/null for banger's file + accept-new. If the path cannot
    be resolved, it falls back to StrictHostKeyChecking=yes.
  internal/paths/paths.go — Layout gains SSHDir + KnownHostsPath;
    Ensure creates SSHDir at 0700.

Tests (internal/guest/known_hosts_test.go): pin on first use, accept
matching key on second dial, reject mismatch, empty path skips
checking, RemoveKnownHosts drops the entry, re-pin works after
remove. Existing daemon + cli tests updated to assert the new
posture and regression-guard against the old flags.

Live verified: vm run writes the pin to banger's known_hosts at 0600
inside a 0700 dir; banger vm ssh + ssh root@<vm>.vm both succeed
using the pin; vm delete clears it.
Thales Maciel 2026-04-19 16:46:03 -03:00
parent a59958d4f5
commit ae14b9499d
GPG key ID: 33112E6833C34679
14 changed files with 634 additions and 47 deletions


@@ -9,21 +9,23 @@ import (
 )

 type Layout struct {
-	ConfigHome  string
-	StateHome   string
-	CacheHome   string
-	RuntimeHome string
-	ConfigDir   string
-	StateDir    string
-	CacheDir    string
-	RuntimeDir  string
-	SocketPath  string
-	DBPath      string
-	DaemonLog   string
-	VMsDir      string
-	ImagesDir   string
-	KernelsDir  string
-	OCICacheDir string
+	ConfigHome     string
+	StateHome      string
+	CacheHome      string
+	RuntimeHome    string
+	ConfigDir      string
+	StateDir       string
+	CacheDir       string
+	RuntimeDir     string
+	SocketPath     string
+	DBPath         string
+	DaemonLog      string
+	VMsDir         string
+	ImagesDir      string
+	KernelsDir     string
+	OCICacheDir    string
+	SSHDir         string
+	KnownHostsPath string
 }

 func Resolve() (Layout, error) {
@@ -56,6 +58,8 @@ func Resolve() (Layout, error) {
 	layout.ImagesDir = filepath.Join(layout.StateDir, "images")
 	layout.KernelsDir = filepath.Join(layout.StateDir, "kernels")
 	layout.OCICacheDir = filepath.Join(layout.CacheDir, "oci")
+	layout.SSHDir = filepath.Join(layout.StateDir, "ssh")
+	layout.KnownHostsPath = filepath.Join(layout.SSHDir, "known_hosts")
 	return layout, nil
 }
@@ -65,6 +69,15 @@ func Ensure(layout Layout) error {
 			return err
 		}
 	}
+	// SSH material (private key, known_hosts): 0700 like ~/.ssh so
+	// strict SSH clients don't complain and no other host user can
+	// read it. Empty SSHDir means the caller built a Layout by hand
+	// (tests) and doesn't need the subdir; skip silently.
+	if strings.TrimSpace(layout.SSHDir) != "" {
+		if err := os.MkdirAll(layout.SSHDir, 0o700); err != nil {
+			return err
+		}
+	}
 	return nil
 }