ssh: trust-on-first-use host key pinning everywhere

Guest host-key verification was off in all three SSH paths:

  * Go SSH (internal/guest/ssh.go) used ssh.InsecureIgnoreHostKey
  * `banger vm ssh` passed StrictHostKeyChecking=no
    + UserKnownHostsFile=/dev/null
  * The `Host *.vm` block written to `~/.ssh/config` carried the
    same settings into the user's global config

Now each path verifies against a banger-owned known_hosts file at
`~/.local/state/banger/ssh/known_hosts` with TOFU semantics:

  * First dial to a VM pins the key.
  * Subsequent dials require an exact match. A mismatch fails with
    an explicit "possible MITM" error.
  * `vm delete` removes the entries so a future VM reusing the IP
    or name re-pins cleanly.
  * The user's `~/.ssh/known_hosts` is untouched.
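
The semantics above can be sketched as a pure function over the text
of a known_hosts-style file. This is a minimal illustration, not the
code in internal/guest/known_hosts.go: checkOrPin, removeHost and
errHostKeyMismatch are hypothetical names, the key is treated as an
opaque byte blob, and only plain "<host> <keytype> <base64>" lines are
handled (no hashed hosts, host lists, or [host]:port forms).

```go
package main

import (
	"bytes"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// errHostKeyMismatch is a hypothetical sentinel. The commit only says the
// real error mentions a "possible MITM".
var errHostKeyMismatch = errors.New("host key mismatch: possible MITM")

// checkOrPin applies trust-on-first-use to the text of a known_hosts-style
// file. It returns the file text (extended with a new pin on first use), or
// an error if host is already pinned to a different key.
func checkOrPin(knownHosts, host, keyType string, key []byte) (string, error) {
	for _, line := range strings.Split(knownHosts, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 3 || fields[0] != host {
			continue // blank line, comment, or another host
		}
		pinned, err := base64.StdEncoding.DecodeString(fields[2])
		if err != nil {
			return knownHosts, fmt.Errorf("corrupt entry for %s: %w", host, err)
		}
		if fields[1] == keyType && bytes.Equal(pinned, key) {
			return knownHosts, nil // exact match: accept
		}
		return knownHosts, errHostKeyMismatch
	}
	// First dial to this host: pin the key by appending an entry.
	if knownHosts != "" && !strings.HasSuffix(knownHosts, "\n") {
		knownHosts += "\n"
	}
	entry := fmt.Sprintf("%s %s %s\n", host, keyType,
		base64.StdEncoding.EncodeToString(key))
	return knownHosts + entry, nil
}

// removeHost drops every entry for host so a later VM that reuses the
// address can be pinned again.
func removeHost(knownHosts, host string) string {
	var kept []string
	for _, line := range strings.Split(knownHosts, "\n") {
		fields := strings.Fields(line)
		if len(fields) > 0 && fields[0] == host {
			continue
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "\n")
}
```

A real callback would also read and write the file under a lock, as
the commit does with its process-wide mutex.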

Changes:

  internal/guest/known_hosts.go (new) — OpenSSH-compatible parser,
    TOFUHostKeyCallback, RemoveKnownHosts. Process-wide mutex
    around the file.
  internal/guest/ssh.go — Dial and WaitForSSH grew a knownHostsPath
    parameter that is passed to TOFUHostKeyCallback. An empty path
    keeps the insecure callback (tests + throwaway tools only;
    documented).
  internal/daemon/{guest_sessions,session_attach,session_lifecycle,
    session_stream}.go — call sites pass d.layout.KnownHostsPath.
  internal/daemon/ssh_client_config.go — the ~/.ssh/config Host *.vm
    block now points at banger's known_hosts and uses
    StrictHostKeyChecking=accept-new. Missing path → fail closed.
  internal/daemon/vm_lifecycle.go — deleteVMLocked drops known_hosts
    entries for the VM's IP and DNS name via removeVMKnownHosts.
  internal/cli/banger.go — sshCommandArgs swaps StrictHostKeyChecking
    no + /dev/null for banger's file + accept-new. Path resolution
    failure falls through to StrictHostKeyChecking=yes.
  internal/paths/paths.go — Layout gains SSHDir + KnownHostsPath;
    Ensure creates SSHDir at 0700.
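
The option swap described for the CLI path might look roughly like the
sketch below. sshHostKeyArgs is a hypothetical helper, not the real
sshCommandArgs (whose shape is not shown in this commit view), and
using an empty string to signal "path could not be resolved" is an
assumption made for the example.

```go
package main

// sshHostKeyArgs returns the host-key related -o options to pass to the
// system ssh client. Hypothetical helper for illustration only.
func sshHostKeyArgs(knownHostsPath string) []string {
	if knownHostsPath == "" {
		// Assumed signal for a failed path resolution. Fall back to strict
		// checking so ssh refuses hosts it has no key for.
		return []string{"-o", "StrictHostKeyChecking=yes"}
	}
	return []string{
		// Keep pins in the tool-owned file, not ~/.ssh/known_hosts.
		"-o", "UserKnownHostsFile=" + knownHostsPath,
		// accept-new: record unknown keys, refuse changed ones.
		"-o", "StrictHostKeyChecking=accept-new",
	}
}
```

With accept-new, OpenSSH records a key for a host it has not seen and
refuses to connect when a recorded key changes, which gives the same
TOFU behavior as the Go path.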

Tests (internal/guest/known_hosts_test.go): pin on first use, accept
matching key on second dial, reject mismatch, empty path skips
checking, RemoveKnownHosts drops the entry, re-pin works after
remove. Existing daemon + cli tests updated to assert the new
posture and regression-guard against the old flags.

Live verified: vm run writes the pin to banger's known_hosts at 0600
inside a 0700 dir; banger vm ssh + ssh root@<vm>.vm both succeed
using the pin; vm delete clears it.
Thales Maciel 2026-04-19 16:46:03 -03:00
parent a59958d4f5
commit ae14b9499d
GPG key ID: 33112E6833C34679
14 changed files with 634 additions and 47 deletions


@@ -35,12 +35,15 @@ type StreamSession struct {
 	closeOnce sync.Once
 }
 
-func WaitForSSH(ctx context.Context, address, privateKeyPath string, interval time.Duration) error {
+// WaitForSSH polls Dial until it succeeds or ctx cancels. The
+// knownHostsPath argument is the banger-owned TOFU file; empty
+// disables host-key verification (tests only).
+func WaitForSSH(ctx context.Context, address, privateKeyPath, knownHostsPath string, interval time.Duration) error {
 	if interval <= 0 {
 		interval = time.Second
 	}
 	for {
-		client, err := Dial(ctx, address, privateKeyPath)
+		client, err := Dial(ctx, address, privateKeyPath, knownHostsPath)
 		if err == nil {
 			_ = client.Close()
 			return nil
@@ -53,7 +56,11 @@ func WaitForSSH(ctx context.Context, address, privateKeyPath string, interval ti
 	}
 }
 
-func Dial(ctx context.Context, address, privateKeyPath string) (*Client, error) {
+// Dial opens an SSH client to address, authenticating with the key
+// at privateKeyPath and verifying the remote host key against the
+// TOFU known_hosts file at knownHostsPath. An empty knownHostsPath
+// disables verification (tests / one-shot tools only).
+func Dial(ctx context.Context, address, privateKeyPath, knownHostsPath string) (*Client, error) {
 	signer, err := privateKeySigner(privateKeyPath)
 	if err != nil {
 		return nil, err
@@ -61,7 +68,7 @@ func Dial(ctx context.Context, address, privateKeyPath string) (*Client, error)
 	config := &ssh.ClientConfig{
 		User:            "root",
 		Auth:            []ssh.AuthMethod{ssh.PublicKeys(signer)},
-		HostKeyCallback: ssh.InsecureIgnoreHostKey(),
+		HostKeyCallback: TOFUHostKeyCallback(knownHostsPath),
 		Timeout:         10 * time.Second,
 	}
 	dialer := &net.Dialer{Timeout: 10 * time.Second}