Git Concurrency Solution: Per-Agent Repository Clones

Problem Statement

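Git does not support concurrent operations on a single working tree. It guards the index and refs with lock files, so when multiple AI agents run Git commands in the same repository at once, they fail or overwrite each other's state: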

  • Inconsistent Repository State: Multiple agents modifying the same .git directory and working tree simultaneously
  • Branch Conflicts: Agents creating branches with the same names or overwriting each other's work
  • Push Failures: Concurrent pushes to the same branch being rejected as non-fast-forward
  • Index Lock Errors: Git index.lock conflicts when multiple processes update the index of the same repository

Solution: Per-Agent Git Clones

Instead of using mutexes (which would serialize all Git operations and hurt performance), we give each agent its own Git repository clone:

workspace/
├── agent-backend-engineer/     # Backend engineer's clone
│   ├── .git/
│   ├── tasks/
│   └── ...
├── agent-frontend-engineer/    # Frontend engineer's clone  
│   ├── .git/
│   ├── tasks/
│   └── ...
└── agent-qa-engineer/         # QA engineer's clone
    ├── .git/
    ├── tasks/
    └── ...

Key Benefits

🚀 True Concurrency

  • Multiple agents can work simultaneously without blocking each other
  • No waiting for Git lock releases
  • Scales with agent count, limited by disk space and remote bandwidth rather than lock contention

🛡️ Complete Isolation

  • Each agent has its own .git directory and working tree
  • No shared local state, so no local race conditions; the remote is the only shared resource
  • Agent failures don't affect other agents

🔄 Automatic Synchronization

  • Each clone automatically pulls the latest changes before creating branches
  • All branches push to the same remote repository
  • PRs are created against the main repository
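
The synchronization step can be sketched as an ordered list of Git subcommands run inside the agent's clone. This is a minimal illustration under stated assumptions: the base branch is called main, the remote is called origin, and branchSteps is a hypothetical helper, not part of the project.

```go
package main

import (
	"fmt"
	"strings"
)

// branchSteps lists the git subcommands for starting a task branch from
// an up-to-date base branch. The base branch name is an assumption.
func branchSteps(baseBranch, taskBranch string) [][]string {
	return [][]string{
		{"checkout", baseBranch},
		{"pull", "--ff-only", "origin", baseBranch}, // refuse to create merge commits
		{"checkout", "-b", taskBranch},
	}
}

func main() {
	for _, step := range branchSteps("main", "task-123-auth-backend") {
		fmt.Println("git " + strings.Join(step, " "))
	}
}
```

Using --ff-only here is a design choice of the sketch: an agent's base branch should never diverge from the remote, so a pull that cannot fast-forward is better reported as an error than merged silently.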

🧹 Easy Cleanup

  • staff cleanup-clones removes all agent workspaces
  • Clones are recreated on-demand when agents start working
  • No manual Git state management required

Implementation Details

CloneManager (git/clone_manager.go)

type CloneManager struct {
    baseRepoURL   string            // Source repository URL
    workspacePath string            // Base workspace directory
    agentClones   map[string]string // agent name -> clone path
    mu            sync.RWMutex      // Thread-safe map access
}

Key Methods:

  • GetAgentClonePath(agentName) - Get/create agent's clone directory
  • RefreshAgentClone(agentName) - Pull latest changes for an agent
  • CleanupAgentClone(agentName) - Remove specific agent's clone
  • CleanupAllClones() - Remove all agent clones

Agent Integration

Each agent's Git operations are automatically routed to its dedicated clone:

// Get agent's dedicated Git clone
clonePath, err := am.cloneManager.GetAgentClonePath(agent.Name)
if err != nil {
    return fmt.Errorf("failed to get agent clone: %w", err)
}

// All Git operations use the agent's clone directory
gitCmd := func(args ...string) *exec.Cmd {
    return exec.CommandContext(ctx, "git", append([]string{"-C", clonePath}, args...)...)
}

Workflow Example

  1. Agent Starts Task:

    Agent backend-engineer gets task: "Add user authentication"
    Creating clone: workspace/agent-backend-engineer/
    
  2. Concurrent Operations:

    # These happen simultaneously without conflicts:
    Agent backend-engineer:  git clone -> workspace/agent-backend-engineer/
    Agent frontend-engineer: git clone -> workspace/agent-frontend-engineer/  
    Agent qa-engineer:       git clone -> workspace/agent-qa-engineer/
    
  3. Branch Creation:

    # Each agent creates branches in their own clone:
    backend-engineer:  git checkout -b task-123-auth-backend
    frontend-engineer: git checkout -b task-124-auth-ui
    qa-engineer:      git checkout -b task-125-auth-tests
    
  4. Concurrent Pushes:

    # All agents push to origin simultaneously:
    git push -u origin task-123-auth-backend    # ✅ Success
    git push -u origin task-124-auth-ui         # ✅ Success  
    git push -u origin task-125-auth-tests      # ✅ Success
    

Management Commands

List Agent Clones

staff list-agents  # Shows which agents are running and their clone status

Cleanup All Clones

staff cleanup-clones  # Removes all agent workspace directories

Monitor Disk Usage

du -sh workspace/  # Check total workspace disk usage

Resource Considerations

Disk Space

  • Each clone takes roughly the size of the repository (typically 10-100 MB per agent)
  • For example, 10 agents with a 50 MB repository use about 500 MB in total
  • Use staff cleanup-clones to free space when needed

Network Usage

  • Initial clone downloads full repository
  • Subsequent git pull operations are incremental
  • All agents share the same remote repository

Performance

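  • Clone creation: ~2-5 seconds per agent (one-time cost; varies with repository size and network speed)
  • Git operations: Full speed, no waiting for locks
  • Parallel processing: Throughput grows roughly linearly with agent count until disk, CPU, or the remote becomes the bottleneck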

Comparison to Alternatives

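Solution              | Concurrency | Complexity   | Performance | Risk
----------------------|-------------|--------------|-------------|----------
Per-Agent Clones      | ✅ Full     | 🟡 Medium    | ✅ High     | 🟢 Low
Global Git Mutex      | ❌ None     | 🟢 Low       | ❌ Poor     | 🟡 Medium
File Locking          | 🟡 Limited  | 🔴 High      | 🟡 Medium   | 🔴 High
Separate Repositories | ✅ Full     | 🔴 Very High | ✅ High     | 🔴 High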

Future Enhancements

  • Lazy Cleanup: Auto-remove unused clones after N days
  • Clone Sharing: Share clones between agents with similar tasks
  • Shallow Clones: Use git clone --depth=1 to fetch only the latest commit and save disk space
  • Remote Workspaces: Support for distributed agent execution

The per-agent clone solution offers a practical balance of performance, safety, and maintainability for concurrent AI agent operations, at the cost of extra disk space per agent.