claude: bash tool smorgasbord

This is a combination of a bunch of changes and bug fixes
accumulated over a week. I decided not to spend a bunch of time
teasing it apart into its components. I apologize.

Significant changes include:

- clean up background bash execution
- improve documentation
- remove TODO we're not gonna do
- add a "process completed" note to stderr
- convert background result printing to xml-ish
- combine stdout and stderr in background bash to match foreground use
- hint to sketch to kill the process group
- add missing timeout propagation
- tell agent that bash calls are stateless
- thread pwd through explicitly
- unify command creation
- speed up missing command installation
    I tried a bunch of different ways to prompt engineer this to be faster.
    But Claude will be Claude. Solution: switch from agent to one-shot.

    This is marginally more brittle, and can only use a package manager,
    but that also prevents a bunch of possible curl|bash mess, etc.
    And this was always best-effort anyway.

    It's now MUCH faster to fail on non-existent commands, and about 2x
    faster on the real commands I tried (yamllint, htop)...now mostly down
    to the irreducible work involved in actually doing the installation.
- remove SKETCH_ from bash env, except SKETCH_PROXY_ID
- delay kill instructions until actually needed
- add simple GIT_SEQUENCE_EDITOR
- overhaul cancellation
- explicitly disable EDITOR to prevent hangs
    I have big plans here, but this will do for now.
- simplify and unify handling of long outputs
- switch to center truncation of long outputs
- add zombie process cleanup using unix.Wait4
    Wow, I tried a bunch of things here.

    When running as PID 1, we are responsible for reaping zombies.
    Unfortunately, we can't do this in the simple/obvious way,
    because simply listening for SIGCHLD and reaping races
    with running cmd.Wait. We can't use a separate init process
    or double-init sketch, because then we lose our seccomp
    protection, and there's no particularly good way to extend it.

    Instead (h/t to Philip for asking a good question), observe
    that we are in a very controlled environment, and pretty much
    the only way to get zombies is via the bash tool.
    So we add reaping tied specifically to process groups started
    by the bash tool, with an explicit understanding of their lifecycle.

    Auto-installation of tools still creates zombies.
    We now know how to fix it, but it is rare, so who cares.
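
For reference, a rough sketch of the lifecycle described in the last item.
This is not the code in this commit. It assumes the bash tool starts each
command in its own process group, so the group ID is known up front and
cleanup can target exactly that group. runInOwnGroup and groupAlive are
hypothetical names, and probing the group with signal 0 is only one way a
check like processGroupHasProcesses could be written.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"syscall"
)

// groupAlive reports whether any process is still a member of process
// group pgid. Hypothetical helper: signal 0 delivers nothing and only
// checks that a target exists.
func groupAlive(pgid int) bool {
	err := syscall.Kill(-pgid, syscall.Signal(0))
	// EPERM means a member exists but we are not allowed to signal it.
	return err == nil || errors.Is(err, syscall.EPERM)
}

// runInOwnGroup (hypothetical) runs script via sh in a fresh process
// group, waits for the shell, then kills whatever is left in the group.
func runInOwnGroup(script string) (pgid int, err error) {
	cmd := exec.Command("sh", "-c", script)
	// Setpgid with a zero Pgid makes the child a group leader, so the
	// group ID equals the child's PID.
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		return 0, err
	}
	pgid = cmd.Process.Pid
	err = cmd.Wait() // reaps only the shell itself
	// Kill stragglers such as background jobs. ESRCH (group already
	// empty) is expected and ignored.
	_ = syscall.Kill(-pgid, syscall.SIGKILL)
	// When running as PID 1, this is the point where a reaper scoped to
	// pgid would take over: the shell has already been waited on, so the
	// reaper should not race with cmd.Wait.
	return pgid, err
}

func main() {
	pgid, err := runInOwnGroup("sleep 30 & exit 0")
	fmt.Println("pgid:", pgid, "err:", err, "group alive:", groupAlive(pgid))
}
```

When not running as PID 1, killed stragglers are reparented and reaped by
the real init, which is why the reaper can bail out early in that case.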
diff --git a/claudetool/bash_zombies_linux.go b/claudetool/bash_zombies_linux.go
new file mode 100644
index 0000000..c832caa
--- /dev/null
+++ b/claudetool/bash_zombies_linux.go
@@ -0,0 +1,66 @@
+//go:build linux
+
+package claudetool
+
+import (
+	"log/slog"
+	"os"
+	"syscall"
+	"time"
+
+	"golang.org/x/sys/unix"
+)
+
+// reapZombies attempts to reap zombie child processes from the specified
+// process group that may have been left behind after a process group cleanup.
+// This is important when running as PID 1 (init process) since no other process
+// will reap zombies.
+//
+// Reaping happens in a background goroutine, which polls until the process
+// group is empty, no children remain, or the attempt limit is reached.
+func reapZombies(pgid int) {
+	if os.Getpid() != 1 {
+		return // not running as init (e.g. -unsafe), no need to reap
+	}
+	// Quick exit for the common case.
+	if !processGroupHasProcesses(pgid) {
+		return // no processes in the group, nothing to reap
+	}
+
+	// Reap in the background.
+	go func() {
+		maxAttempts := 1000 // shouldn't ever hit this, so be generous, this isn't particularly expensive
+
+		for range maxAttempts {
+			if !processGroupHasProcesses(pgid) {
+				return
+			}
+
+			var wstatus unix.WaitStatus
+			pid, err := unix.Wait4(-pgid, &wstatus, unix.WNOHANG, nil)
+
+			switch err {
+			case syscall.EINTR:
+				// interrupted, retry
+				continue
+			case syscall.ECHILD:
+				// no children, therefore no zombies
+				return
+			case nil:
+				// fall through to handle pid
+			default:
+				slog.Debug("unexpected error in reapZombies", "error", err, "pgid", pgid)
+				return
+			}
+
+			if pid == 0 {
+				// No zombies available right now, wait and check again
+				// There's no great rush, so give it some time.
+				time.Sleep(100 * time.Millisecond)
+				continue
+			}
+
+			slog.Debug("reaped zombie process", "pid", pid, "pgid", pgid, "status", wstatus)
+		}
+	}()
+}