loop: encourage e2e testing
diff --git a/loop/testdata/agent_loop.httprr b/loop/testdata/agent_loop.httprr
index 9da5ea3..e7e5bb7 100644
--- a/loop/testdata/agent_loop.httprr
+++ b/loop/testdata/agent_loop.httprr
@@ -1,9 +1,9 @@
httprr trace v1
-18097 2496
+18179 2458
POST https://api.anthropic.com/v1/messages HTTP/1.1
Host: api.anthropic.com
User-Agent: Go-http-client/1.1
-Content-Length: 17899
+Content-Length: 17981
Anthropic-Version: 2023-06-01
Content-Type: application/json
@@ -543,7 +543,7 @@
],
"system": [
{
- "text": "You are the expert software engineer and architect powering Sketch,\nan agentic coding environment that helps users accomplish coding tasks through autonomous analysis and implementation.\n\n\u003cworkflow\u003e\nStart by asking concise clarifying questions as needed.\nOnce the intent is clear, work autonomously.\nAim for a small diff size while thoroughly completing the requested task.\nPrioritize thoughtful analysis and critical engagement over agreeability.\n\nCall the title tool as soon as the topic of conversation is clear, often immediately.\n\nBreak down the overall goal into a series of smaller steps.\n(The first step is often: \"Make a plan.\")\nThen execute each step using tools.\nUpdate the plan if you have encountered problems or learned new information.\n\nWhen in doubt about a step, follow this broad workflow:\n\n- Think about how the current step fits into the overall plan.\n- Do research. Good tool choices: bash, think, keyword_search\n- Make edits.\n- Repeat.\n\nTo make edits reliably and efficiently, first think about the intent of the edit,\nand what set of patches will achieve that intent.\nThen use the patch tool to make those edits. Combine all edits to any given file into a single patch tool call.\n\nComplete every task exhaustively - no matter how repetitive or tedious.\nPartial work, pattern demonstrations, or stubs with TODOs are not acceptable, unless explicitly permitted by the user.\n\nThe done tool provides a checklist of items you MUST verify and\nreview before declaring that you are done. Before executing\nthe done tool, run all the tools the done tool checklist asks\nfor, including creating a git commit. Do not forget to run tests.\n\u003c/workflow\u003e\n\n\u003cstyle\u003e\nDefault coding guidelines:\n- Clear is better than clever.\n- Minimal inline comments: non-obvious logic and key decisions only.\n\u003c/style\u003e\n\n\u003csystem_info\u003e\n\u003cplatform\u003e\nlinux/amd64\n\u003c/platform\u003e\n\u003cpwd\u003e\n/\n\u003c/pwd\u003e\n\u003c/system_info\u003e\n\n\u003cgit_info\u003e\n\u003cgit_root\u003e\n\n\u003c/git_root\u003e\n\u003cHEAD\u003e\nHEAD\n\u003c/HEAD\u003e\n\u003c/git_info\u003e\n\n",
+ "text": "You are the expert software engineer and architect powering Sketch,\nan agentic coding environment that helps users accomplish coding tasks through autonomous analysis and implementation.\n\n\u003cworkflow\u003e\nStart by asking concise clarifying questions as needed.\nOnce the intent is clear, work autonomously.\nWhenever possible, do end-to-end testing, to ensure fully working functionality.\nAim for a small diff size while thoroughly completing the requested task.\nPrioritize thoughtful analysis and critical engagement over agreeability.\n\nCall the title tool as soon as the topic of conversation is clear, often immediately.\n\nBreak down the overall goal into a series of smaller steps.\n(The first step is often: \"Make a plan.\")\nThen execute each step using tools.\nUpdate the plan if you have encountered problems or learned new information.\n\nWhen in doubt about a step, follow this broad workflow:\n\n- Think about how the current step fits into the overall plan.\n- Do research. Good tool choices: bash, think, keyword_search\n- Make edits.\n- Repeat.\n\nTo make edits reliably and efficiently, first think about the intent of the edit,\nand what set of patches will achieve that intent.\nThen use the patch tool to make those edits. Combine all edits to any given file into a single patch tool call.\n\nComplete every task exhaustively - no matter how repetitive or tedious.\nPartial work, pattern demonstrations, or stubs with TODOs are not acceptable, unless explicitly permitted by the user.\n\nThe done tool provides a checklist of items you MUST verify and\nreview before declaring that you are done. Before executing\nthe done tool, run all the tools the done tool checklist asks\nfor, including creating a git commit. Do not forget to run tests.\n\u003c/workflow\u003e\n\n\u003cstyle\u003e\nDefault coding guidelines:\n- Clear is better than clever.\n- Minimal inline comments: non-obvious logic and key decisions only.\n\u003c/style\u003e\n\n\u003csystem_info\u003e\n\u003cplatform\u003e\nlinux/amd64\n\u003c/platform\u003e\n\u003cpwd\u003e\n/\n\u003c/pwd\u003e\n\u003c/system_info\u003e\n\n\u003cgit_info\u003e\n\u003cgit_root\u003e\n\n\u003c/git_root\u003e\n\u003cHEAD\u003e\nHEAD\n\u003c/HEAD\u003e\n\u003c/git_info\u003e\n\n",
"type": "text",
"cache_control": {
"type": "ephemeral"
@@ -554,24 +554,24 @@
Anthropic-Organization-Id: 3c473a21-7208-450a-a9f8-80aebda45c1b
Anthropic-Ratelimit-Input-Tokens-Limit: 200000
Anthropic-Ratelimit-Input-Tokens-Remaining: 200000
-Anthropic-Ratelimit-Input-Tokens-Reset: 2025-05-24T19:19:39Z
+Anthropic-Ratelimit-Input-Tokens-Reset: 2025-05-29T14:23:03Z
Anthropic-Ratelimit-Output-Tokens-Limit: 80000
Anthropic-Ratelimit-Output-Tokens-Remaining: 80000
-Anthropic-Ratelimit-Output-Tokens-Reset: 2025-05-24T19:19:45Z
+Anthropic-Ratelimit-Output-Tokens-Reset: 2025-05-29T14:23:09Z
Anthropic-Ratelimit-Requests-Limit: 4000
Anthropic-Ratelimit-Requests-Remaining: 3999
-Anthropic-Ratelimit-Requests-Reset: 2025-05-24T19:19:37Z
+Anthropic-Ratelimit-Requests-Reset: 2025-05-29T14:23:02Z
Anthropic-Ratelimit-Tokens-Limit: 280000
Anthropic-Ratelimit-Tokens-Remaining: 280000
-Anthropic-Ratelimit-Tokens-Reset: 2025-05-24T19:19:39Z
+Anthropic-Ratelimit-Tokens-Reset: 2025-05-29T14:23:03Z
Cf-Cache-Status: DYNAMIC
-Cf-Ray: 944f254d6862171a-SJC
+Cf-Ray: 9476a5b83ed85c1b-SJC
Content-Type: application/json
-Date: Sat, 24 May 2025 19:19:45 GMT
-Request-Id: req_011CPStukaCC4JCpQWvySWrY
+Date: Thu, 29 May 2025 14:23:09 GMT
+Request-Id: req_011CPbyLZAnxoL5v6A5ZJhjY
Server: cloudflare
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Via: 1.1 google
X-Robots-Tag: none
-{"id":"msg_01LmNpBXK7SGcuZNpqBXDbY8","type":"message","role":"assistant","model":"claude-sonnet-4-20250514","content":[{"type":"text","text":"Here are the tools available to me:\n\n**File & Code Operations:**\n- `bash` - Execute shell commands\n- `patch` - Modify files with precise text edits\n- `keyword_search` - Search codebase for relevant files\n- `codereview` - Run automated code review\n\n**Git Operations:**\n- `precommit` - Create git branch and get commit guidance\n\n**Browser Automation:**\n- `browser_navigate` - Navigate to URLs\n- `browser_click` - Click elements\n- `browser_type` - Type into inputs\n- `browser_wait_for` - Wait for elements\n- `browser_get_text` - Read element text\n- `browser_eval` - Execute JavaScript\n- `browser_scroll_into_view` - Scroll to elements\n- `browser_resize` - Resize browser window\n- `browser_take_screenshot` - Capture screenshots\n- `browser_read_image` - Read image files\n- `browser_recent_console_logs` - Get console logs\n- `browser_clear_console_logs` - Clear console logs\n\n**Workflow & Communication:**\n- `think` - Internal reasoning and planning\n- `title` - Set conversation title\n- `multiplechoice` - Present multiple choice questions\n- `about_sketch` - Get help with Sketch functionality\n- `done` - Complete tasks with verification checklist"}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":3,"cache_creation_input_tokens":4163,"cache_read_input_tokens":0,"output_tokens":311,"service_tier":"standard"}}
\ No newline at end of file
+{"id":"msg_01HQ9GtFkx6LSC7JabvDUyYF","type":"message","role":"assistant","model":"claude-sonnet-4-20250514","content":[{"type":"text","text":"Here are the tools available to me:\n\n**Core Development:**\n- `bash` - Execute shell commands\n- `patch` - File modification and editing\n- `keyword_search` - Search codebase for files/content\n- `think` - Internal note-taking and planning\n\n**Git & Code Quality:**\n- `precommit` - Create git branch for work tracking\n- `codereview` - Run automated code review\n- `done` - Completion checklist verification\n\n**Browser Automation:**\n- `browser_navigate` - Navigate to URLs\n- `browser_click` - Click elements\n- `browser_type` - Type into inputs\n- `browser_wait_for` - Wait for elements\n- `browser_get_text` - Read page content\n- `browser_eval` - Execute JavaScript\n- `browser_scroll_into_view` - Scroll to elements\n- `browser_resize` - Resize browser window\n- `browser_take_screenshot` - Capture screenshots\n- `browser_read_image` - Read image files\n- `browser_recent_console_logs` - Get console logs\n- `browser_clear_console_logs` - Clear console logs\n\n**User Interaction:**\n- `title` - Set conversation title\n- `multiplechoice` - Present multiple choice questions\n- `about_sketch` - Get Sketch environment help"}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":3,"cache_creation_input_tokens":4182,"cache_read_input_tokens":0,"output_tokens":306,"service_tier":"standard"}}
\ No newline at end of file