)]}'
{
  "log": [
    {
      "commit": "72252cbcb97840d724133be67c4c69cc69ebb2d3",
      "tree": "a361499dc3fa6b9af2be3e74cfd59fd8ba34690e",
      "parents": [
        "7ce5fb76d8748ebf73c5adf9d6cd8eb67716fba8"
      ],
      "author": {
        "name": "Philip Zeyliger",
        "email": "philip@bold.dev",
        "time": "Sat May 10 17:00:08 2025 -0700"
      },
      "committer": {
        "name": "Philip Zeyliger",
        "email": "philip@bold.dev",
        "time": "Sat May 10 17:00:08 2025 -0700"
      },
      "message": "llm and everything: Update ToolResult to use []Content instead of string for multimodal support\n\nThis was a journey. The sketch-generated summary below is acceptable,\nbut I want to tell you about it in my voice too. The goal was to send\nscreenshots to Claude, so that it could... look at them. Currently\nthe take screenshot and read screenshot tools are different, and they\u0027ll\nneed to be renamed/prompt-engineered a bit, but that\u0027s all fine.\n\nThe miserable part was that we had to change the return value\nof tool from string to Content[], and this crosses several layers:\n - llm.Tool\n - llm.Content\n - ant.Content \u0026 openai and gemini friends\n - AgentMessage [we left this alone]\n\nExtra fun is that Claude\u0027s API for sending images has nested Content\nfields, and empty string and missing needs to be distinguished for the\nText field (because lots of shell commands return the empty string!).\n\nFor the UI, I made us transform the results into a string, dropping\nimages. This would have been yet more churn for not much obvious\nbenefit. Plus, it was going to break skaband\u0027s compatibility, and ...\nyet more work.\n\nOpenAI and Gemini don\u0027t obviously support images in this same way,\nso they just don\u0027t get the tools.\n\n~~~~~~~~~~ Sketch said:\n\nThis architectural change transforms tool results from plain strings to []Content arrays, enabling multimodal interaction in the system. Key changes include:\n\n- Core structural changes:\n  - Modified ToolResult type from string to []Content across all packages\n  - Added MediaType field to Content struct for MIME type support\n  - Created TextContent and ImageContent helper functions\n  - Updated all tool.Run implementations to return []Content\n\n- Image handling:\n  - Implemented base64 image support in Anthropic adapter\n  - Added proper media type detection and content formatting\n  - Created browser_read_image tool for displaying screenshots\n  - Updated browser_screenshot to provide usable image paths\n\n- Adapter improvements:\n  - Updated all LLM adapters (ANT, OAI, GEM) to handle content arrays\n  - Added specialized image content handling in the Anthropic adapter\n  - Ensured proper JSON serialization/deserialization for all content types\n  - Improved test coverage for content arrays\n\n- UI enhancements:\n  - Added omitempty tags to reduce JSON response size\n  - Updated TypeScript types to handle array content\n  - Made field naming consistent (tool_error vs is_error)\n  - Preserved backward compatibility for existing consumers\n\nCo-Authored-By: sketch \u003chello@sketch.dev\u003e\nChange-ID: s1a2b3c4d5e6f7g8h\n"
    }
  ]
}
