claude: bash tool smorgasbord
This is a combination of a bunch of changes and bug fixes
accumulated over a week. I decided not to spend a bunch of time
teasing it apart into its components. I apologize.
Significant changes include:
- clean up background bash execution
- improve documentation
- remove TODO we're not gonna do
- add a "process completed" note to stderr
- convert background result printing to xml-ish
- combine stdout and stderr in background bash to match foreground use
- hint to sketch to kill the process group
- add missing timeouts propagation
- tell agent that bash calls are stateless
- thread pwd through explicitly
- unify command creation
- speed up missing command installation
I tried a bunch of different ways to prompt engineer this to be faster.
But Claude will be Claude. Solution: switch from agent to one-shot.
This is marginally more brittle, and can only use a package manager,
but that also prevents a bunch of possible curl|bash mess, etc.
And this was always best-effort anyway.
It's now MUCH faster to fail on non-existent commands, and about 2x
faster on the real commands I tried (yamllint, htop)...now mostly down
to the irreducible work involved in actually doing the installation.
- remove SKETCH_ from bash env, except SKETCH_PROXY_ID
- delay kill instructions until actually needed
- add simple GIT_SEQUENCE_EDITOR
- overhaul cancellation
- explicitly disable EDITOR to prevent hangs
I have big plans here, but this will do for now.
- simplify and unify handling of long outputs
- switch to center truncation of long outputs
- add zombie process cleanup using unix.Wait4
Wow, I tried a bunch of things here.
When running as PID 1, we are responsible for reaping zombies.
Unfortunately, we can't do this in the simple/obvious way,
because simply listening for SIGCHLD and reaping races
with running cmd.Wait. We can't use a separate init process
or double-init sketch, because then we lose our seccomp
protection, and there's no particularly good way to extend it.
Instead (h/t to Philip for asking a good question), observe
that we are in a very controlled environment, and pretty much
the only way to get zombies is via the bash tool.
So we add reaping tied specifically to process groups started
by the bash tool, with an explicit understanding of their lifecycle.
Auto-installation of tools still creates zombies.
We now know how to fix it, but it is rare, so who cares.
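
For reference, the "center truncation of long outputs" item above can be
sketched roughly as follows. This is an illustration only, not the code in
this change: the helper name truncateMiddle, the marker text, and the
byte-based limit are all assumptions made for the example.

```go
package main

import "fmt"

// truncateMiddle is a hypothetical helper (not taken from this commit).
// If s is longer than max bytes, it keeps roughly the first and last
// max/2 bytes and replaces the middle with a marker line.
// Note: it slices by bytes, so it may split a multi-byte UTF-8 rune.
func truncateMiddle(s string, max int) string {
	if max < 0 {
		max = 0
	}
	if len(s) <= max {
		return s
	}
	head := max / 2
	tail := max - head
	omitted := len(s) - max
	// The marker format is an assumption chosen for this sketch.
	return fmt.Sprintf("%s\n[... %d bytes omitted ...]\n%s",
		s[:head], omitted, s[len(s)-tail:])
}

func main() {
	fmt.Println(truncateMiddle("abcdefghij", 4))
}
```

Keeping both ends tends to preserve the command's first lines and its final
error or summary, which is usually what a reader of long output needs.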
diff --git a/loop/testdata/agent_loop.httprr b/loop/testdata/agent_loop.httprr
index 82e0423..fce2958 100644
--- a/loop/testdata/agent_loop.httprr
+++ b/loop/testdata/agent_loop.httprr
@@ -1,9 +1,9 @@
httprr trace v1
-18170 2455
+18228 2477
POST https://api.anthropic.com/v1/messages HTTP/1.1
Host: api.anthropic.com
User-Agent: Go-http-client/1.1
-Content-Length: 17972
+Content-Length: 18030
Anthropic-Version: 2023-06-01
Content-Type: application/json
@@ -27,7 +27,7 @@
"tools": [
{
"name": "bash",
- "description": "Executes shell commands via bash -c, returning combined stdout/stderr.\n\nWith background=true, returns immediately while process continues running\nwith output redirected to files. Kill process group when done.\nUse background for servers/demos that need to stay running.\n\nMUST set slow_ok=true for potentially slow commands: builds, downloads,\ninstalls, tests, or any other substantive operation.",
+ "description": "Executes shell commands via bash -c, returning combined stdout/stderr.\nBash state changes (working dir, variables, aliases) don't persist between calls.\n\nWith background=true, returns immediately, with output redirected to a file.\nUse background for servers/demos that need to stay running.\n\nMUST set slow_ok=true for potentially slow commands: builds, downloads,\ninstalls, tests, or any other substantive operation.\n\n\u003cpwd\u003e/\u003c/pwd\u003e",
"input_schema": {
"type": "object",
"required": [
@@ -557,24 +557,24 @@
Anthropic-Organization-Id: 3c473a21-7208-450a-a9f8-80aebda45c1b
Anthropic-Ratelimit-Input-Tokens-Limit: 2000000
Anthropic-Ratelimit-Input-Tokens-Remaining: 2000000
-Anthropic-Ratelimit-Input-Tokens-Reset: 2025-07-26T04:46:24Z
+Anthropic-Ratelimit-Input-Tokens-Reset: 2025-07-30T18:29:45Z
Anthropic-Ratelimit-Output-Tokens-Limit: 400000
Anthropic-Ratelimit-Output-Tokens-Remaining: 400000
-Anthropic-Ratelimit-Output-Tokens-Reset: 2025-07-26T04:46:30Z
+Anthropic-Ratelimit-Output-Tokens-Reset: 2025-07-30T18:29:50Z
Anthropic-Ratelimit-Requests-Limit: 4000
Anthropic-Ratelimit-Requests-Remaining: 3999
-Anthropic-Ratelimit-Requests-Reset: 2025-07-26T04:46:23Z
+Anthropic-Ratelimit-Requests-Reset: 2025-07-30T18:29:44Z
Anthropic-Ratelimit-Tokens-Limit: 2400000
Anthropic-Ratelimit-Tokens-Remaining: 2400000
-Anthropic-Ratelimit-Tokens-Reset: 2025-07-26T04:46:24Z
+Anthropic-Ratelimit-Tokens-Reset: 2025-07-30T18:29:45Z
Cf-Cache-Status: DYNAMIC
-Cf-Ray: 965140c378c0fc54-SJC
+Cf-Ray: 9676ec577ae3eb24-SJC
Content-Type: application/json
-Date: Sat, 26 Jul 2025 04:46:30 GMT
-Request-Id: req_011CRV2HDgCqjHTZ6M9m9CS6
+Date: Wed, 30 Jul 2025 18:29:50 GMT
+Request-Id: req_011CRdgJS1bQMPR17iFwT9eP
Server: cloudflare
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Via: 1.1 google
X-Robots-Tag: none
-{"id":"msg_01Q3vJkUd9kohDxvJvG7Nmjt","type":"message","role":"assistant","model":"claude-sonnet-4-20250514","content":[{"type":"text","text":"Here are the tools available to me:\n\n**File & Code Management:**\n- `bash` - Execute shell commands\n- `patch` - Edit files with precise text operations\n- `keyword_search` - Search codebase by keywords/concepts\n\n**Task Management:**\n- `todo_read` - Read current task list\n- `todo_write` - Create/update structured task list\n- `think` - Record thoughts and plans\n\n**Development Workflow:**\n- `codereview` - Run automated code review\n- `done` - Complete work with checklist verification\n\n**Browser Automation:**\n- `browser_navigate` - Navigate to URLs\n- `browser_click` - Click elements\n- `browser_type` - Type into inputs\n- `browser_wait_for` - Wait for elements\n- `browser_get_text` - Read element text\n- `browser_eval` - Execute JavaScript\n- `browser_scroll_into_view` - Scroll to elements\n- `browser_resize` - Resize browser window\n- `browser_recent_console_logs` - Get console logs\n- `browser_clear_console_logs` - Clear console logs\n- `browser_take_screenshot` - Take screenshots\n\n**Utility:**\n- `read_image` - Read and encode image files\n- `about_sketch` - Get help with Sketch functionality"}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":3,"cache_creation_input_tokens":4161,"cache_read_input_tokens":0,"output_tokens":302,"service_tier":"standard"}}
\ No newline at end of file
+{"id":"msg_01Gj3ha6GnJh81AQoF5vUENq","type":"message","role":"assistant","model":"claude-sonnet-4-20250514","content":[{"type":"text","text":"Here are the tools available to me:\n\n**File & Code Operations:**\n- `bash` - Execute shell commands\n- `patch` - Modify files with precise text edits\n- `keyword_search` - Search for files in unfamiliar codebases\n\n**Task Management:**\n- `todo_read` - Read current todo list\n- `todo_write` - Create and manage task lists\n- `think` - Think out loud, take notes, form plans\n\n**Development Workflow:**\n- `codereview` - Run automated code review\n- `done` - Mark work as complete with checklist verification\n\n**Browser Automation:**\n- `browser_navigate` - Navigate to URLs\n- `browser_click` - Click elements\n- `browser_type` - Type into input fields\n- `browser_wait_for` - Wait for elements\n- `browser_get_text` - Read page text\n- `browser_eval` - Execute JavaScript\n- `browser_scroll_into_view` - Scroll to elements\n- `browser_resize` - Resize browser window\n- `browser_take_screenshot` - Capture screenshots\n- `browser_recent_console_logs` - Get console logs\n- `browser_clear_console_logs` - Clear console logs\n\n**Utility:**\n- `about_sketch` - Get help with Sketch functionality\n- `read_image` - Read and encode image files"}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":3,"cache_creation_input_tokens":4179,"cache_read_input_tokens":0,"output_tokens":314,"service_tier":"standard"}}
\ No newline at end of file