loop: add todo checklist

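Add todo_read and todo_write tools, and update the system prompt to
direct the agent to use them to organize and track its work.
This should improve Sketch's executive function and user communication.
The recorded agent_loop trace is regenerated to match.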
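
The todo_write description in this change states several rules for a task list: each update replaces the whole list, ids are unique, status is one of queued, in-progress, or completed, and only one task may be in-progress at a time. The following is a minimal sketch of how such a list could be checked. It is an illustration only and is not taken from this commit; the Task type and the validateTasks function are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Task mirrors the item schema of the todo_write tool (id, task, status).
// The Go type itself is hypothetical and not part of this commit.
type Task struct {
	ID     string `json:"id"`
	Task   string `json:"task"`
	Status string `json:"status"`
}

// validateTasks checks a complete replacement task list against the rules
// stated in the todo_write description: required fields, unique ids,
// a known status, and at most one in-progress task.
func validateTasks(tasks []Task) error {
	seen := make(map[string]bool)
	inProgress := 0
	for _, t := range tasks {
		if t.ID == "" || t.Task == "" {
			return errors.New("task id and task text are required")
		}
		if seen[t.ID] {
			return fmt.Errorf("duplicate task id %q", t.ID)
		}
		seen[t.ID] = true
		switch t.Status {
		case "queued", "completed":
			// Valid, nothing to count.
		case "in-progress":
			inProgress++
		default:
			return fmt.Errorf("task %q has invalid status %q", t.ID, t.Status)
		}
	}
	if inProgress > 1 {
		return fmt.Errorf("%d tasks are in-progress; at most one is allowed", inProgress)
	}
	return nil
}

func main() {
	// Example ids and task text are made up for illustration.
	tasks := []Task{
		{ID: "add-todo-tools", Task: "Add todo tools", Status: "completed"},
		{ID: "update-system-prompt", Task: "Update system prompt", Status: "in-progress"},
		{ID: "rerecord-trace", Task: "Re-record test trace", Status: "queued"},
	}
	if err := validateTasks(tasks); err != nil {
		fmt.Println("invalid task list:", err)
		return
	}
	fmt.Println("task list ok")
}
```

The sketch does not cover the "never modify or delete completed tasks" rule, since checking it would require comparing against the previously stored list.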
diff --git a/loop/testdata/agent_loop.httprr b/loop/testdata/agent_loop.httprr
index e7e5bb7..fbafd31 100644
--- a/loop/testdata/agent_loop.httprr
+++ b/loop/testdata/agent_loop.httprr
@@ -1,9 +1,9 @@
 httprr trace v1
-18179 2458
+20299 2495
 POST https://api.anthropic.com/v1/messages HTTP/1.1

 Host: api.anthropic.com

 User-Agent: Go-http-client/1.1

-Content-Length: 17981

+Content-Length: 20101

 Anthropic-Version: 2023-06-01

 Content-Type: application/json

 

@@ -90,6 +90,57 @@
    }
   },
   {
+   "name": "todo_read",
+   "description": "Reads the current todo list. Use frequently to track progress and understand what's pending.",
+   "input_schema": {
+    "type": "object",
+    "properties": {}
+   }
+  },
+  {
+   "name": "todo_write",
+   "description": "todo_write: Creates and manages a structured task list for tracking work and communicating progress to users. Use early and often.\n\nUse for:\n- multi-step tasks\n- complex work\n- when users provide multiple requests\n- conversations that start trivial but grow in scope\n- when users request additional work (directly or via feedback)\n\nSkip for:\n- trivial single-step tasks\n- purely conversational exchanges\n\nUpdate dynamically as work evolves - conversations can spawn tasks, simple tasks can become complex, and new discoveries may require additional work.\n\nRules:\n- Update immediately when task states or task list changes\n- Only one task \"in-progress\" at any time\n- Each update completely replaces the task list - include all tasks (past and present)\n- Never modify or delete completed tasks\n- Queued and in-progress tasks may be restructured as understanding evolves\n- Tasks should be atomic, clear, precise, and actionable\n- If the user adds new tasks: append, don't replace\n",
+   "input_schema": {
+    "type": "object",
+    "required": [
+     "tasks"
+    ],
+    "properties": {
+     "tasks": {
+      "type": "array",
+      "description": "Array of tasks to write",
+      "items": {
+       "type": "object",
+       "required": [
+        "id",
+        "task",
+        "status"
+       ],
+       "properties": {
+        "id": {
+         "type": "string",
+         "description": "stable, unique hyphenated slug"
+        },
+        "task": {
+         "type": "string",
+         "description": "actionable step in active tense, sentence case, plain text only, displayed to user"
+        },
+        "status": {
+         "type": "string",
+         "enum": [
+          "queued",
+          "in-progress",
+          "completed"
+         ],
+         "description": "current task status"
+        }
+       }
+      }
+     }
+    }
+   }
+  },
+  {
    "name": "title",
    "description": "Sets the conversation title.",
    "input_schema": {
@@ -543,7 +594,7 @@
  ],
  "system": [
   {
-   "text": "You are the expert software engineer and architect powering Sketch,\nan agentic coding environment that helps users accomplish coding tasks through autonomous analysis and implementation.\n\n\u003cworkflow\u003e\nStart by asking concise clarifying questions as needed.\nOnce the intent is clear, work autonomously.\nWhenever possible, do end-to-end testing, to ensure fully working functionality.\nAim for a small diff size while thoroughly completing the requested task.\nPrioritize thoughtful analysis and critical engagement over agreeability.\n\nCall the title tool as soon as the topic of conversation is clear, often immediately.\n\nBreak down the overall goal into a series of smaller steps.\n(The first step is often: \"Make a plan.\")\nThen execute each step using tools.\nUpdate the plan if you have encountered problems or learned new information.\n\nWhen in doubt about a step, follow this broad workflow:\n\n- Think about how the current step fits into the overall plan.\n- Do research. Good tool choices: bash, think, keyword_search\n- Make edits.\n- Repeat.\n\nTo make edits reliably and efficiently, first think about the intent of the edit,\nand what set of patches will achieve that intent.\nThen use the patch tool to make those edits. Combine all edits to any given file into a single patch tool call.\n\nComplete every task exhaustively - no matter how repetitive or tedious.\nPartial work, pattern demonstrations, or stubs with TODOs are not acceptable, unless explicitly permitted by the user.\n\nThe done tool provides a checklist of items you MUST verify and\nreview before declaring that you are done. Before executing\nthe done tool, run all the tools the done tool checklist asks\nfor, including creating a git commit. 
Do not forget to run tests.\n\u003c/workflow\u003e\n\n\u003cstyle\u003e\nDefault coding guidelines:\n- Clear is better than clever.\n- Minimal inline comments: non-obvious logic and key decisions only.\n\u003c/style\u003e\n\n\u003csystem_info\u003e\n\u003cplatform\u003e\nlinux/amd64\n\u003c/platform\u003e\n\u003cpwd\u003e\n/\n\u003c/pwd\u003e\n\u003c/system_info\u003e\n\n\u003cgit_info\u003e\n\u003cgit_root\u003e\n\n\u003c/git_root\u003e\n\u003cHEAD\u003e\nHEAD\n\u003c/HEAD\u003e\n\u003c/git_info\u003e\n\n",
+   "text": "You are the expert software engineer and architect powering Sketch,\nan agentic coding environment that helps users accomplish coding tasks through autonomous analysis and implementation.\n\n\u003cworkflow\u003e\nStart by asking concise clarifying questions as needed.\nOnce the intent is clear, work autonomously.\nWhenever possible, do end-to-end testing, to ensure fully working functionality.\nAim for a small diff size while thoroughly completing the requested task.\nPrioritize thoughtful analysis and critical engagement over agreeability.\n\nCall the title tool as soon as the topic of conversation is clear, often immediately.\n\nBreak down the overall goal into a series of smaller steps.\nUse the todo_read and todo_write tools to organize and track your work systematically.\n\nFollow this broad workflow:\n\n- Think about how the current step fits into the overall plan.\n- Do research. Good tool choices: bash, think, keyword_search\n- Make edits.\n- If you have completed a standalone chunk of work, make a git commit.\n- Update your todo task list.\n- Repeat.\n\nTo make edits reliably and efficiently, first think about the intent of the edit,\nand what set of patches will achieve that intent.\nThen use the patch tool to make those edits. Combine all edits to any given file into a single patch tool call.\n\nComplete every task exhaustively - no matter how repetitive or tedious.\nPartial work, pattern demonstrations, or stubs with TODOs are not acceptable, unless explicitly permitted by the user.\n\nThe done tool provides a checklist of items you MUST verify and\nreview before declaring that you are done. Before executing\nthe done tool, run all the tools the done tool checklist asks\nfor, including creating a git commit. 
Do not forget to run tests.\n\u003c/workflow\u003e\n\n\u003cstyle\u003e\nDefault coding guidelines:\n- Clear is better than clever.\n- Minimal inline comments: non-obvious logic and key decisions only.\n\u003c/style\u003e\n\n\u003csystem_info\u003e\n\u003cplatform\u003e\nlinux/amd64\n\u003c/platform\u003e\n\u003cpwd\u003e\n/\n\u003c/pwd\u003e\n\u003c/system_info\u003e\n\n\u003cgit_info\u003e\n\u003cgit_root\u003e\n\n\u003c/git_root\u003e\n\u003cHEAD\u003e\nHEAD\n\u003c/HEAD\u003e\n\u003c/git_info\u003e\n\n",
    "type": "text",
    "cache_control": {
     "type": "ephemeral"
@@ -554,24 +605,24 @@
 Anthropic-Organization-Id: 3c473a21-7208-450a-a9f8-80aebda45c1b

 Anthropic-Ratelimit-Input-Tokens-Limit: 200000

 Anthropic-Ratelimit-Input-Tokens-Remaining: 200000

-Anthropic-Ratelimit-Input-Tokens-Reset: 2025-05-29T14:23:03Z

+Anthropic-Ratelimit-Input-Tokens-Reset: 2025-05-29T17:27:03Z

 Anthropic-Ratelimit-Output-Tokens-Limit: 80000

 Anthropic-Ratelimit-Output-Tokens-Remaining: 80000

-Anthropic-Ratelimit-Output-Tokens-Reset: 2025-05-29T14:23:09Z

+Anthropic-Ratelimit-Output-Tokens-Reset: 2025-05-29T17:27:08Z

 Anthropic-Ratelimit-Requests-Limit: 4000

 Anthropic-Ratelimit-Requests-Remaining: 3999

-Anthropic-Ratelimit-Requests-Reset: 2025-05-29T14:23:02Z

+Anthropic-Ratelimit-Requests-Reset: 2025-05-29T17:27:01Z

 Anthropic-Ratelimit-Tokens-Limit: 280000

 Anthropic-Ratelimit-Tokens-Remaining: 280000

-Anthropic-Ratelimit-Tokens-Reset: 2025-05-29T14:23:03Z

+Anthropic-Ratelimit-Tokens-Reset: 2025-05-29T17:27:03Z

 Cf-Cache-Status: DYNAMIC

-Cf-Ray: 9476a5b83ed85c1b-SJC

+Cf-Ray: 9477b339e89417ee-SJC

 Content-Type: application/json

-Date: Thu, 29 May 2025 14:23:09 GMT

-Request-Id: req_011CPbyLZAnxoL5v6A5ZJhjY

+Date: Thu, 29 May 2025 17:27:09 GMT

+Request-Id: req_011CPcDNHrEZY5uavYRd4Wsk

 Server: cloudflare

 Strict-Transport-Security: max-age=31536000; includeSubDomains; preload

 Via: 1.1 google

 X-Robots-Tag: none

 

-{"id":"msg_01HQ9GtFkx6LSC7JabvDUyYF","type":"message","role":"assistant","model":"claude-sonnet-4-20250514","content":[{"type":"text","text":"Here are the tools available to me:\n\n**Core Development:**\n- `bash` - Execute shell commands\n- `patch` - File modification and editing\n- `keyword_search` - Search codebase for files/content\n- `think` - Internal note-taking and planning\n\n**Git & Code Quality:**\n- `precommit` - Create git branch for work tracking\n- `codereview` - Run automated code review\n- `done` - Completion checklist verification\n\n**Browser Automation:**\n- `browser_navigate` - Navigate to URLs\n- `browser_click` - Click elements\n- `browser_type` - Type into inputs\n- `browser_wait_for` - Wait for elements\n- `browser_get_text` - Read page content\n- `browser_eval` - Execute JavaScript\n- `browser_scroll_into_view` - Scroll to elements\n- `browser_resize` - Resize browser window\n- `browser_take_screenshot` - Capture screenshots\n- `browser_read_image` - Read image files\n- `browser_recent_console_logs` - Get console logs\n- `browser_clear_console_logs` - Clear console logs\n\n**User Interaction:**\n- `title` - Set conversation title\n- `multiplechoice` - Present multiple choice questions\n- `about_sketch` - Get Sketch environment help"}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":3,"cache_creation_input_tokens":4182,"cache_read_input_tokens":0,"output_tokens":306,"service_tier":"standard"}}
\ No newline at end of file
+{"id":"msg_01PbteLNxXTMFWyxMUWxhpPM","type":"message","role":"assistant","model":"claude-sonnet-4-20250514","content":[{"type":"text","text":"Here are the tools available to me:\n\n**File & Code Operations:**\n- `patch` - Modify files with precise text edits\n- `keyword_search` - Search codebase with keywords\n- `bash` - Execute shell commands\n- `codereview` - Run automated code review\n\n**Project Management:**\n- `todo_read` / `todo_write` - Manage task lists\n- `think` - Record thoughts and plans\n- `precommit` - Create git branch for work\n- `done` - Complete work with checklist verification\n\n**Browser Automation:**\n- `browser_navigate` - Navigate to URLs\n- `browser_click` - Click elements\n- `browser_type` - Type into inputs\n- `browser_wait_for` - Wait for elements\n- `browser_get_text` - Read page text\n- `browser_eval` - Execute JavaScript\n- `browser_scroll_into_view` - Scroll elements into view\n- `browser_resize` - Resize browser window\n- `browser_take_screenshot` - Capture screenshots\n- `browser_read_image` - Read image files\n- `browser_recent_console_logs` / `browser_clear_console_logs` - Manage console logs\n\n**User Interface:**\n- `title` - Set conversation title\n- `multiplechoice` - Present multiple choice questions\n- `about_sketch` - Get Sketch platform information"}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":3,"cache_creation_input_tokens":4648,"cache_read_input_tokens":0,"output_tokens":320,"service_tier":"standard"}}
\ No newline at end of file