Merge Coordinator¶
Comprehensive documentation for the merge coordinator component used by ll-parallel and ll-auto.
Overview¶
The MergeCoordinator is a production-grade git operations state machine that handles sequential integration of parallel worker changes back to the main branch. It goes far beyond simple merge queueing, implementing sophisticated error detection, recovery mechanisms, and adaptive strategies to handle real-world edge cases in concurrent git workflows.
Location: scripts/little_loops/parallel/merge_coordinator.py
Key Characteristics: - Sequential processing to avoid merge conflicts - Background daemon thread for non-blocking operation - Comprehensive error detection and recovery - Adaptive strategies based on failure patterns - Preserves user's local changes through stash management - Coordinates with orchestrator to avoid git index races
Architecture¶
Threading Model¶
┌─────────────────────────────────────────┐
│ Main Thread │
│ ┌────────────────────────────────┐ │
│ │ Orchestrator │ │
│ │ - Dispatches workers │ │
│ │ - Manages lifecycle │ │
│ └────────┬───────────────────────┘ │
│ │ queue_merge() │
│ ▼ │
│ ┌────────────────────────────────┐ │
│ │ Merge Queue (FIFO) │ │
│ └────────┬───────────────────────┘ │
└───────────┼──────────────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ Background Thread (daemon) │
│ ┌────────────────────────────────┐ │
│ │ Merge Loop │ │
│ │ - Process one merge at a time │ │
│ │ - Stash/unstash local changes│ │
│ │ - Detect/recover from errors │ │
│ │ - Track results │ │
│ └─────────────────────────────────┘ │
└─────────────────────────────────────────┘
Git Lock Coordination¶
The merge coordinator shares a GitLock with the orchestrator to serialize operations on the main repository:
# Shared lock prevents index.lock race conditions
git_lock = GitLock(logger)
orchestrator = ParallelOrchestrator(..., git_lock=git_lock)
merge_coordinator = MergeCoordinator(..., git_lock=git_lock)
This prevents scenarios like: - Worker commits to main repo while merge coordinator is pulling - Orchestrator updates state file while merge coordinator is checking status - Multiple components trying to modify the git index simultaneously
Core Features¶
1. Sequential Merge Queue¶
Issues are merged one at a time to avoid conflicts:
coordinator = MergeCoordinator(config, logger, repo_path)
coordinator.start() # Start background thread
# Queue merges (non-blocking)
for worker_result in completed_workers:
coordinator.queue_merge(worker_result)
# Wait for all merges to complete
coordinator.wait_for_completion(timeout=300)
coordinator.shutdown()
2. Automatic Stash Management¶
Handles user's uncommitted local changes transparently:
Stash Before Merge:
# Automatically stashes tracked changes, excluding:
# - State file (orchestrator manages it)
# - Lifecycle file moves (to prevent pop conflicts)
# - Files in completed/ directory
# - Claude Code context state file
Pop After Merge:
# Restores stashed changes after merge completes
# Preserves merge even if pop fails (never does reset --hard)
# Tracks pop failures for user notification
Exclusion Logic (BUG-008, BUG-018): - Lifecycle file moves are excluded because they can conflict with the merge when popping - State file is excluded because orchestrator continuously updates it - Completed directory files are excluded because they're lifecycle-managed
3. Lifecycle File Move Coordination¶
Auto-commits pending issue file moves to completed/ directory:
# Before merge, commit any uncommitted lifecycle moves
# Prevents "local changes would be overwritten" errors
self._commit_pending_lifecycle_moves()
Why This Matters (BUG-018):
- Orchestrator moves issue files to completed/ after successful merge
- These moves are excluded from stash (to prevent pop conflicts)
- But uncommitted moves block subsequent merges
- Solution: Auto-commit them with descriptive message
4. State File Protection¶
Uses git update-index --assume-unchanged to hide state file modifications:
# Mark state file as assume-unchanged before pull
self._mark_state_file_assume_unchanged()
# Pull with rebase (won't see state file changes)
git pull --rebase origin main
# Restore normal tracking after merge
self._restore_state_file_tracking()
Why This Matters:
- Orchestrator continuously updates the state file during processing
- git pull --rebase would fail if state file has uncommitted changes
- assume-unchanged tells git to ignore the modifications temporarily
Sophisticated Error Handling¶
Error Detection Methods¶
The coordinator implements 7 specialized error detectors:
| Method | Purpose |
|---|---|
_is_local_changes_error() |
Detects "local changes would be overwritten" |
_is_untracked_files_error() |
Detects untracked files blocking merge |
_is_index_error() |
Detects corrupted git index |
_is_unmerged_files_error() |
Detects pre-existing unmerged files |
_is_rebase_in_progress() |
Checks for incomplete rebase |
_detect_conflict_commit() |
Extracts commit hash from rebase conflict output |
_is_lifecycle_file_move() |
Identifies issue file renames to completed/ or deferred/ (legacy path; post-ENH-1418 completion is frontmatter-based) |
Index Recovery System¶
Automatically detects and recovers from corrupted git state:
def _check_and_recover_index() -> bool:
"""Health check before every merge."""
# 1. Check for incomplete merge (MERGE_HEAD exists)
if merge_head.exists():
git merge --abort
# 2. Check for incomplete rebase (rebase-merge/ exists)
if rebase_dir.exists():
git rebase --abort
git reset --hard HEAD # Defensive after abort
# 3. Check for unmerged files (UU, AA, DD, AU, UA, DU, UD)
if has_unmerged:
git reset --hard HEAD
# 4. Final safety check
if MERGE_HEAD still exists:
git reset --hard HEAD
When It Runs:
- At the start of every _process_merge() call
- Before merge attempts (to ensure clean state)
- After detecting index errors during operations
Why It Matters: - Previous operations can leave git in dirty state - Merge/rebase aborts don't always clean up completely - Prevents cascading failures from one bad merge
Adaptive Pull Strategy¶
Tracks commits that cause repeated rebase conflicts and adapts:
self._problematic_commits: set[str] = set()
# First conflict with commit abc123...
conflict_commit = self._detect_conflict_commit(error_output)
self._problematic_commits.add(conflict_commit)
git rebase --abort
# Continue without pull (existing behavior)
# Second conflict with same commit abc123...
if conflict_commit in self._problematic_commits:
# Use merge strategy instead of rebase
git pull --no-rebase origin main
Commit Detection (ENH-037):
# Extracts 40-character commit hash from rebase output
# Pattern: "dropping ae3b85ec...f6e5da362be37be1c99801 feat(ai): add stall detectio"
match = re.search(r"dropping\s+([a-f0-9]{40})\s+", error_output)
Benefits: - Avoids repeated failed rebase attempts - Reduces log noise - Ensures upstream changes are integrated (merge instead of skip)
Untracked File Conflict Handling¶
Backs up conflicting untracked files and retries:
# Parse conflicting files from error message
conflicting_files = ["path/to/file.txt", ...]
# Create backup directory
backup_dir = .ll-backup/{issue_id}/
# Move conflicting files to backup
for file_path in conflicting_files:
shutil.move(src, backup_dir / file_path)
# Retry merge (now succeeds without conflicts)
Why This Matters: - Worker branches may create files that don't exist in main - Git refuses to merge if it would overwrite untracked files - Backing up preserves the files for user review
Circuit Breaker¶
Pauses after consecutive failures to prevent cascading issues:
self._consecutive_failures = 0
self._paused = False
# On merge failure
self._consecutive_failures += 1
if self._consecutive_failures >= 3:
self._paused = True
# Skip all subsequent merges until manual intervention
# On merge success
self._consecutive_failures = 0 # Reset counter
Why This Matters: - Prevents hundreds of failed merge attempts - Signals that manual intervention is needed - Preserves log readability
Conflict Retry Logic¶
Attempts rebase in worktree on merge conflicts:
# Merge failed with conflict
git merge --abort
# Stash worktree changes if any
cd {worktree_path}
git stash push
# Fetch latest main before rebase (BUG-180)
git fetch origin main
# Rebase branch onto latest main
git rebase origin/main
# If successful, restore stash and retry merge
if success:
git stash pop
coordinator.queue_merge(request) # Re-queue for retry
else:
git rebase --abort
git stash pop
mark_as_failed()
Retry Limit: Configurable via ParallelConfig.max_merge_retries (default: 3)
Skip Retry When (BUG-079): - Merge strategy was used during pull (rebase would fail on same conflicts) - Max retries exceeded
Stash Pop Failure Handling¶
Preserves successful merge even if stash restoration fails:
def _pop_stash() -> bool:
"""Restore stashed changes."""
result = git stash pop
if result.returncode != 0:
# Check for conflicts from stash pop
if has_unmerged:
# Clean up conflicted stash pop WITHOUT affecting merge
git checkout --theirs . # Restore post-merge state
git reset HEAD
# Leave stash intact for manual recovery
# NEVER do: git reset --hard HEAD (would undo the merge!)
# Track failure for user notification
self._stash_pop_failures[issue_id] = "Run 'git stash pop' to recover"
return False # Pop failed, but merge preserved
Key Principle: A successful merge is never undone, even if cleanup fails.
Configuration¶
ParallelConfig¶
@dataclass
class ParallelConfig:
max_workers: int = 3
max_merge_retries: int = 3
state_file: str = ".ll-parallel-state.json"
# ... other fields
Constructor¶
MergeCoordinator(
config: ParallelConfig,
logger: Logger,
repo_path: Path | None = None,
git_lock: GitLock | None = None,
)
Parameters:
- config: Configuration for parallel processing
- logger: Logger instance for output
- repo_path: Path to git repository (default: current directory)
- git_lock: Shared lock for git operations (created if not provided)
API Reference¶
Methods¶
| Method | Description |
|---|---|
start() |
Start background merge thread |
shutdown(wait=True, timeout=30.0) |
Stop background thread |
queue_merge(worker_result) |
Queue a worker result for merging |
wait_for_completion(timeout=None) |
Wait for all pending merges |
Properties¶
| Property | Type | Description |
|---|---|---|
merged_ids |
list[str] |
Successfully merged issue IDs |
failed_merges |
dict[str, str] |
Failed issue IDs → error messages |
pending_count |
int |
Number of pending merge requests |
stash_pop_failures |
dict[str, str] |
Issue IDs where stash pop failed (merge succeeded) |
State Tracking¶
# Check results after completion
print(f"Merged: {coordinator.merged_ids}")
print(f"Failed: {coordinator.failed_merges}")
print(f"Stash issues: {coordinator.stash_pop_failures}")
Usage Example¶
Basic Usage¶
from little_loops.parallel import MergeCoordinator, ParallelConfig
from little_loops.logger import Logger
config = ParallelConfig(max_merge_retries=3)
logger = Logger(verbose=True)
coordinator = MergeCoordinator(config, logger)
coordinator.start()
# Process worker results
for result in completed_workers:
coordinator.queue_merge(result)
# Wait for completion
if coordinator.wait_for_completion(timeout=300):
print(f"Successfully merged: {coordinator.merged_ids}")
print(f"Failed: {coordinator.failed_merges}")
# Check for stash pop failures
if coordinator.stash_pop_failures:
print("Warning: Some stashes could not be restored:")
for issue_id, msg in coordinator.stash_pop_failures.items():
print(f" {issue_id}: {msg}")
else:
print("Timeout waiting for merges")
coordinator.shutdown()
Integration with Orchestrator¶
from little_loops.parallel import ParallelOrchestrator, ParallelConfig
from little_loops.config import BRConfig
# Orchestrator creates and manages merge coordinator
br_config = BRConfig(Path.cwd())
parallel_config = ParallelConfig(max_workers=3)
orchestrator = ParallelOrchestrator(parallel_config, br_config)
# Merge coordinator is automatically started and used
exit_code = orchestrator.run()
Troubleshooting¶
"Circuit breaker tripped" Message¶
Cause: 3+ consecutive merge failures
Resolution:
1. Check git status for conflicts: git status
2. Review recent merge failures in logs
3. Manually resolve any git state issues
4. Restart ll-parallel (circuit breaker resets)
"Stash could not be restored" Warning¶
Cause: Stash pop failed due to conflicts with merged changes
Resolution:
# View stashed changes
git stash show
# Attempt manual restore
git stash pop
# If conflicts, resolve and continue
git add .
git stash drop
Note: The merge was successful; only the restoration of your local changes failed.
Repeated "Pull --rebase failed with conflicts"¶
Cause: Specific commits consistently cause rebase conflicts
Expected Behavior: After first occurrence, should switch to merge strategy
Check: - Look for "Using merge strategy (known conflict: abc12345)" in logs - If not switching, may indicate commit hash detection failure
"Merge blocked by local changes despite stash"¶
Cause: Lifecycle file moves or state file modifications not excluded from stash
Check:
1. Look for .issues/ modified issue files in git status (status: done updates)
2. Check if state file has modifications
3. Verify stash exclusion logic is working
Resolution: These should be auto-committed; if not, file a bug report
Implementation Notes¶
Design Decisions¶
Why sequential processing? - Merging in parallel causes frequent conflicts - Sequential processing eliminates race conditions - Slight performance cost is acceptable vs. conflict resolution complexity
Why not just use git worktree merge? - Worktrees can have stale base branches - Main repo merge ensures consistency - Allows pull from remote before merge
Why so many error detectors? - Git error messages vary across versions - Different failure modes require different recovery strategies - Defensive programming prevents cascading failures
Why preserve merge on stash pop failure? - The work is done; the merge is valuable - User's local changes are safely in stash - Better to preserve merge + manual recovery than lose work
Testing¶
The merge coordinator has 80%+ test coverage with comprehensive integration tests:
# Run merge coordinator tests
python -m pytest scripts/tests/test_merge_coordinator.py -v
# Integration tests (require git)
python -m pytest scripts/tests/test_merge_coordinator.py -m integration
Test Categories: - Stash/pop operations - Conflict handling and retry - Index recovery - Error detection - Lifecycle file coordination - Adaptive pull strategy - Circuit breaker behavior
Battle-Tested Evolution¶
This component has evolved through real-world usage with issues discovered and fixed:
- BUG-007: Worktree files leaking to main repo
- BUG-008: Stash pop failure losing local changes
- BUG-018: Merge blocked by lifecycle file moves (reopened, fixed twice)
- BUG-038: Gitignored leaked files causing cascading pull failures
- ENH-037: Smarter pull strategy for repeated rebase conflicts
- BUG-079: Post-merge rebase causing unnecessary failures
- BUG-140: Worktree merge race condition
- BUG-141: Context state JSON not excluded from stash
- BUG-180: Stale worktree base causing merge failures
- BUG-930: Done/Active counts stuck at 0 in parallel merge path — fixed by calling
mark_completed()/mark_failed()afterwait_for_completion()in the parallel path, mirroring the sequential merge pattern - BUG-931: Commit leak recovery skips main reset when main has advanced — replaced warning-and-return in
_recover_committed_leaks()with a surgicalgit rebase --ontoto reapply work commits on top of the new main baseline - BUG-968:
_is_lifecycle_file_movesubstring check too broad — replaced unanchoredinchecks withstartswithin both_is_lifecycle_file_moveand the secondary block in_stash_local_changes, preventing false positives on paths likemy-issues/completed/file.py
Each issue added sophistication to handle edge cases that appear in production use.
See Also¶
- ARCHITECTURE.md - Overall system architecture
- API.md - Complete API reference
- TROUBLESHOOTING.md - Common issues and solutions
scripts/little_loops/parallel/merge_coordinator.py- Source codescripts/tests/test_merge_coordinator.py- Test suite