Session Handoff

Automatic context management and session continuation for long-running tasks.

Overview

Claude Code sessions have a limited context window. On complex tasks the context can fill up, and progress made earlier in the session can be lost. Session handoff addresses this by:

  1. Monitoring context usage in real-time
  2. Warning when approaching the limit (80% by default)
  3. Preserving session state to a continuation prompt
  4. Resuming seamlessly in a fresh session

Quick Start

Prerequisites

The context monitor hook requires jq to parse the config file at runtime:

# macOS
brew install jq

# Debian/Ubuntu
sudo apt-get install jq

# Verify
which jq

Enable Context Monitoring

Add to .ll/ll-config.json:

{
  "context_monitor": {
    "enabled": true,
    "auto_handoff_threshold": 80
  }
}

Manual Handoff

When you want to preserve your work:

/ll:handoff                           # Auto-detect context
/ll:handoff "Working on auth module"  # With explicit description

Resume in New Session

/ll:resume

How It Works

Interactive Sessions

┌─────────────────────────────────────────────────────────────────┐
│ Session starts                                                  │
│     ↓                                                           │
│ PostToolUse hook monitors each tool call                        │
│     ↓                                                           │
│ Estimates tokens: Read (10/line), Bash (0.3/char), etc.         │
│     ↓                                                           │
│ When usage >= 80%:                                              │
│     "[ll] Context ~82% used. Run /ll:handoff..."                │
│     ↓                                                           │
│ Reminder repeats on every tool call until handoff executed      │
│     ↓                                                           │
│ /ll:handoff writes .ll/ll-continue-prompt.md                    │
│     ↓                                                           │
│ Start new session → /ll:resume → Continue working               │
└─────────────────────────────────────────────────────────────────┘
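The threshold step in the flow above can be sketched as follows. This is a minimal illustration, not the actual hook: the function name `context_warning` is hypothetical, and the reminder text simply reuses the truncated message shown in the diagram.

```python
from typing import Optional


def context_warning(
    estimated_tokens: int,
    context_limit: int = 1_000_000,
    threshold_pct: int = 80,
) -> Optional[str]:
    """Return a reminder once usage reaches the threshold, else None.

    Hypothetical helper: the real hook's logic may differ.
    """
    if context_limit <= 0:
        raise ValueError("context_limit must be positive")
    # Integer comparison avoids floating-point edge cases at the boundary.
    if estimated_tokens * 100 < threshold_pct * context_limit:
        return None
    usage_pct = round(estimated_tokens * 100 / context_limit)
    return f"[ll] Context ~{usage_pct}% used. Run /ll:handoff..."
```

With the defaults, `context_warning(820_000)` produces the reminder shown in the diagram, while any value below 800,000 tokens returns `None`.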

Automation Tools (ll-auto, ll-parallel)

Automation tools handle handoff automatically:

┌─────────────────────────────────────────────────────────────────┐
│ Worker processes issue                                          │
│     ↓                                                           │
│ Context threshold reached                                       │
│     ↓                                                           │
│ PostToolUse hook outputs to stderr (exit 2)                     │
│ "[ll] Context ~82% used. Run /ll:handoff..."                    │
│     ↓                                                           │
│ Claude receives feedback, autonomously runs /ll:handoff         │
│     ↓                                                           │
│ /ll:handoff outputs: "CONTEXT_HANDOFF: Ready for fresh session" │
│     ↓                                                           │
│ CLI detects signal, reads .ll/ll-continue-prompt.md             │
│     ↓                                                           │
│ Spawns fresh Claude session with continuation prompt            │
│     ↓                                                           │
│ Work continues (up to 3 continuations per issue)                │
└─────────────────────────────────────────────────────────────────┘
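The loop in the diagram above might look roughly like this. It is a sketch under stated assumptions: `run_with_continuations`, `run_session`, and `read_prompt` are hypothetical names, and the session runner is injected as a callable so the example stays self-contained rather than invoking a real CLI.

```python
import re
from typing import Callable

# Same signal text as documented for /ll:handoff.
HANDOFF_SIGNAL = re.compile(r"CONTEXT_HANDOFF:\s*Ready for fresh session")


def run_with_continuations(
    run_session: Callable[[str], str],
    read_prompt: Callable[[], str],
    initial_prompt: str,
    max_continuations: int = 3,
) -> int:
    """Run sessions until one ends without a handoff signal.

    Returns the number of continuations used. Raises RuntimeError when
    another continuation would exceed max_continuations.
    """
    prompt = initial_prompt
    continuations = 0
    while True:
        output = run_session(prompt)
        if not HANDOFF_SIGNAL.search(output):
            return continuations  # Session finished normally.
        if continuations >= max_continuations:
            raise RuntimeError("Reached max continuations")
        # Fresh session starts from the saved continuation prompt.
        prompt = read_prompt()
        continuations += 1
```

Injecting the runner also makes the cap easy to exercise in tests with a fake session that always emits the signal.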

Commands

/ll:handoff

Generates a continuation prompt that captures the current session state. By default, it summarizes the conversation history already in context and does not run external tools.

Usage:

/ll:handoff                              # Conversation summary (default, fast)
/ll:handoff "Refactoring auth module"    # With explicit context hint
/ll:handoff --deep                       # With artifact validation
/ll:handoff "Working on BUG-042" --deep  # Context + artifact validation

Modes:

| Mode | Command | What's Captured | Speed |
|------|---------|-----------------|-------|
| Default | /ll:handoff | Conversation summary, decisions, errors, code changes | Fast (no disk I/O) |
| Deep | /ll:handoff --deep | Default + git status, todos, discrepancy detection | Slower (runs git) |

Default Output: Writes to .ll/ll-continue-prompt.md:

# Session Continuation: Refactoring auth module

## Conversation Summary

### Primary Intent
Refactoring the authentication module to support OAuth2 providers.

### What Happened
1. Analyzed existing auth middleware
2. Discussed JWT vs session-based approach - chose JWT for statelessness
3. Implemented token validation utility
4. Encountered CORS issue with refresh endpoint - fixed with credentials flag

### User Feedback
- User clarified that refresh tokens should use HTTP-only cookies for security

### Errors and Resolutions
| Error | How Fixed | User Feedback |
|-------|-----------|---------------|
| CORS error on /refresh | Added credentials: 'include' | None |
| Token expiry too short | Increased to 15 minutes | User confirmed 15min is acceptable |

### Code Changes
| File | Changes Made | Discussion Context |
|------|--------------|-------------------|
| `src/middleware/auth.ts:45` | Added token validation | Core auth flow |
| `src/utils/tokens.ts:12` | New token utility | Extracted for reuse |

## Resume Point

### What Was Being Worked On
Implementing the refresh token endpoint callback handler

### Direct Quote
> "Now let's implement the callback handler for the refresh flow"

### Next Step
Add the /auth/refresh endpoint with cookie handling

## Important Context

### Decisions Made
- **JWT over sessions**: Chosen for statelessness and microservice compatibility
- **HTTP-only cookies**: For refresh tokens to prevent XSS

### Gotchas Discovered
- **CORS with credentials**: Must set credentials: 'include' on fetch requests

### User-Specified Constraints
- Refresh tokens must use HTTP-only cookies (security requirement)
- Token expiry: 15 minutes

### Patterns Being Followed
- Following pattern from `src/middleware/rate-limit.ts` for middleware structure

Deep Mode Output: Includes all sections above, plus:

## Artifact Validation

### Current Git Status
```
M  src/middleware/auth.ts
M  src/utils/tokens.ts
?? src/utils/cookies.ts
```

### Discrepancies
No discrepancies detected between conversation and artifacts

### Todo List State
| Status | Task |
|--------|------|
| in_progress | Implement refresh token endpoint |
| pending | Add token rotation on refresh |
| completed | Create token validation utility |

### Plan Files
- Active plan: `thoughts/shared/plans/2024-01-15-auth-refactor.md`

/ll:resume

Loads a continuation prompt and restores session context.

Usage:

/ll:resume                              # From default location
/ll:resume path/to/custom-prompt.md     # From custom location

Output:

Resuming from previous session
─────────────────────────────────────────────────────────────────
[Continuation prompt content displayed]
─────────────────────────────────────────────────────────────────

Ready to continue. What would you like to do next?

Error — file not found:

If the specified path does not exist, /ll:resume reports an error and stops:

Error: No continuation prompt found at .ll/ll-continue-prompt.md

Run /ll:handoff to generate one, or specify a custom path:
  /ll:resume path/to/custom-prompt.md

Configuration

Full Configuration Options

{
  "context_monitor": {
    "enabled": true,
    "auto_handoff_threshold": 80,
    "context_limit_estimate": 1000000,
    "estimate_weights": {
      "read_per_line": 10,
      "tool_call_base": 100,
      "bash_output_per_char": 0.3
    },
    "use_transcript_baseline": true,
    "state_file": ".ll/ll-context-state.json"
  },
  "continuation": {
    "enabled": true,
    "auto_detect_on_session_start": true,
    "include_todos": true,
    "include_git_status": true,
    "include_recent_files": true,
    "max_continuations": 3,
    "prompt_expiry_hours": 24
  }
}

Configuration Reference

| Setting | Default | Description |
|---------|---------|-------------|
| context_monitor.enabled | false | Enable automatic context monitoring |
| context_monitor.auto_handoff_threshold | 80 | Percentage (50-95) to trigger warnings |
| context_monitor.context_limit_estimate | 1000000 | Token limit estimate for usage percentage calculation. Default matches Sonnet 4.6/Opus 4.6 (1M). Set to 200000 for Haiku 4.5. |
| context_monitor.estimate_weights.read_per_line | 10 | Token cost per line for Read tool calls |
| context_monitor.estimate_weights.tool_call_base | 100 | Base token overhead per tool call |
| context_monitor.estimate_weights.bash_output_per_char | 0.3 | Token cost per character for Bash output |
| context_monitor.use_transcript_baseline | true | Use JSONL transcript token counts as an API-exact baseline (one-turn lag). Improves accuracy from ±30–50% to ±5–15%. Falls back to pure heuristics when unavailable. |
| continuation.enabled | true | Enable session continuation features |
| continuation.auto_detect_on_session_start | true | Automatically detect and offer to resume an existing continuation prompt when a new session starts |
| continuation.include_todos | true | Include current todo list state in deep mode handoff output |
| continuation.include_git_status | true | Include git status in deep mode handoff output |
| continuation.include_recent_files | true | Include recently modified files in deep mode handoff output |
| continuation.max_continuations | 3 | Max auto-continuations per issue (automation) |
| continuation.prompt_expiry_hours | 24 | Hours before prompt marked stale |
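To see which values apply when a config file sets only a few keys, the defaults in the table above can be merged with the user's JSON. The sketch below is illustrative: `DEFAULTS`, `deep_merge`, and `effective_config` are hypothetical names, and both the merge behavior and the rejection of out-of-range thresholds are assumptions rather than documented behavior.

```python
import copy
import json

# Defaults copied from the configuration reference table.
DEFAULTS = {
    "context_monitor": {
        "enabled": False,
        "auto_handoff_threshold": 80,
        "context_limit_estimate": 1_000_000,
        "estimate_weights": {
            "read_per_line": 10,
            "tool_call_base": 100,
            "bash_output_per_char": 0.3,
        },
        "use_transcript_baseline": True,
    },
    "continuation": {
        "enabled": True,
        "auto_detect_on_session_start": True,
        "include_todos": True,
        "include_git_status": True,
        "include_recent_files": True,
        "max_continuations": 3,
        "prompt_expiry_hours": 24,
    },
}


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def effective_config(user_json: str) -> dict:
    """Merge user settings over the defaults (assumed behavior)."""
    config = deep_merge(DEFAULTS, json.loads(user_json))
    threshold = config["context_monitor"]["auto_handoff_threshold"]
    # The table lists 50-95 as the valid range; rejecting other values
    # here is an assumption for illustration.
    if not 50 <= threshold <= 95:
        raise ValueError(f"auto_handoff_threshold out of range: {threshold}")
    return config
```

For example, passing the Quick Start config yields `enabled: true` with every other setting left at its default.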

The threshold can also be overridden per-run via the --handoff-threshold CLI flag (1-100), which takes precedence over the config value:

ll-auto --handoff-threshold 90      # Trigger handoff at 90% for this run
ll-parallel --handoff-threshold 70  # Earlier warnings for parallel runs
ll-sprint run my-sprint --handoff-threshold 85
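The precedence rule (CLI flag over config value) can be illustrated with `argparse`. This is a sketch only: `resolve_threshold` and `threshold_type` are hypothetical names, and the real CLI's argument handling may differ.

```python
import argparse
from typing import List


def threshold_type(text: str) -> int:
    """Parse a percentage in the documented 1-100 range."""
    value = int(text)
    if not 1 <= value <= 100:
        raise argparse.ArgumentTypeError("threshold must be between 1 and 100")
    return value


def resolve_threshold(argv: List[str], config: dict) -> int:
    """Prefer --handoff-threshold; fall back to the config, then to 80."""
    parser = argparse.ArgumentParser(prog="ll-auto")
    parser.add_argument("--handoff-threshold", type=threshold_type, default=None)
    args = parser.parse_args(argv)
    if args.handoff_threshold is not None:
        return args.handoff_threshold
    return config.get("context_monitor", {}).get("auto_handoff_threshold", 80)
```

Here `resolve_threshold(["--handoff-threshold", "90"], config)` returns 90 regardless of the configured value, and an empty argument list falls back to the config.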

Auto-Detect on Session Start

When continuation.auto_detect_on_session_start is true (the default), little-loops checks for an existing .ll/ll-continue-prompt.md at the beginning of each session. If a prompt file is found that is not yet expired, a notice is printed prompting you to run /ll:resume. Set this to false to suppress automatic detection and only resume manually.
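A session-start check along these lines could be sketched as below. The function name `resume_notice` and the notice wording are hypothetical, and measuring expiry from the file's modification time is an assumption; the source only states that expired prompts do not trigger the notice.

```python
import time
from pathlib import Path
from typing import Optional


def resume_notice(
    prompt_path: Path,
    expiry_hours: float = 24,
    now: Optional[float] = None,
) -> Optional[str]:
    """Return a notice if an unexpired continuation prompt exists."""
    if not prompt_path.is_file():
        return None
    current = time.time() if now is None else now
    # Assumption: prompt age is derived from the file's mtime.
    age_hours = (current - prompt_path.stat().st_mtime) / 3600
    if age_hours > expiry_hours:
        return None
    return f"Continuation prompt found ({age_hours:.1f}h old). Run /ll:resume to continue."
```

A caller would pass `Path(".ll/ll-continue-prompt.md")` and the configured `continuation.prompt_expiry_hours`.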

Transcript Baseline Mode (Default)

Advanced: This section explains internal token estimation mechanics. Most users can skip it. It is useful if you are tuning the threshold for maximum accuracy.

By default (use_transcript_baseline: true), the monitor uses the JSONL transcript at transcript_path (provided by the PostToolUse hook payload) as an API-exact baseline:

  1. Read input_tokens + cache_creation_input_tokens + cache_read_input_tokens + output_tokens from the last assistant entry in the transcript
  2. Add the current-turn heuristic delta (single tool call estimate) on top
  3. Divide by context_limit_estimate for the usage percentage

This shifts accuracy from ±30–50% (pure heuristics) to ±5–15% (API-exact baseline + small current-turn delta).

| Mode | Accuracy | When Active |
|------|----------|-------------|
| Transcript baseline | ±5–15% | use_transcript_baseline: true (default) and transcript available |
| Pure heuristics | ±30–50% | Fallback when transcript is absent or parse fails |

The transcript has a one-turn lag (it reflects the last completed API call). The current-turn estimate bridges the gap.
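The three steps above can be sketched in a few lines. The helper names are hypothetical, and the entry layout (`"type": "assistant"` with usage counts under `message.usage`) is an assumption about the transcript format, not something this page specifies.

```python
import json
from typing import Iterable

USAGE_FIELDS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
    "output_tokens",
)


def transcript_baseline(lines: Iterable[str]) -> int:
    """Sum the usage fields of the last assistant entry (0 if none)."""
    baseline = 0
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # Skip lines that are not valid JSON.
        if not isinstance(entry, dict) or entry.get("type") != "assistant":
            continue
        message = entry.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue
        # Later assistant entries overwrite earlier ones.
        baseline = sum(int(usage.get(field) or 0) for field in USAGE_FIELDS)
    return baseline


def usage_percent(baseline: int, current_turn_delta: int,
                  context_limit: int = 1_000_000) -> float:
    """Baseline plus current-turn estimate, as a percentage of the limit."""
    return (baseline + current_turn_delta) * 100 / context_limit
```

Returning 0 when nothing parses mirrors the documented fallback to pure heuristics when the transcript is unavailable.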

Token Estimation Weights

Advanced: These weights are pre-tuned and rarely need adjustment.

The context monitor estimates the current-turn delta based on tool activity:

| Tool | Estimation | Rationale |
|------|------------|-----------|
| Read | lines × 10 | File content is verbose |
| Grep | matches × 5 | Summarized search results |
| Bash | chars × 0.3 | Command output varies |
| Glob | files × 20 | File lists are compact |
| Write/Edit | 300 | Base cost × 3 for edits |
| Task | 2000 | Agent responses are summarized |
| WebFetch | 1500 | Web content is processed |
| WebSearch | 1000 | Search results summary |
| Other | 100 | Base overhead per call |
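As a rough illustration, the table above translates into a per-call estimator like the one below. `estimate_tool_tokens` is a hypothetical name, and the sketch assumes the variable-cost tools do not also add the base overhead, which the table does not state either way.

```python
def estimate_tool_tokens(
    tool: str,
    *,
    lines: int = 0,
    matches: int = 0,
    chars: int = 0,
    files: int = 0,
) -> int:
    """Estimate the token cost of one tool call using the table's weights."""
    if tool == "Read":
        return lines * 10
    if tool == "Grep":
        return matches * 5
    if tool == "Bash":
        return round(chars * 0.3)
    if tool == "Glob":
        return files * 20
    fixed_costs = {
        "Write": 300,
        "Edit": 300,
        "Task": 2000,
        "WebFetch": 1500,
        "WebSearch": 1000,
    }
    # Any other tool falls back to the base overhead per call.
    return fixed_costs.get(tool, 100)
```

For instance, reading a 200-line file is estimated at 2,000 tokens, and a Bash command with 1,000 characters of output at 300.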

Files

| File | Purpose |
|------|---------|
| .ll/ll-continue-prompt.md | Generated continuation prompt |
| .ll/ll-context-state.json | Running context usage state |
| .ll/ll-session-state.json | Session metadata (fallback) |

State File Format

.ll/ll-context-state.json:

You'll rarely need to inspect this directly, but it's useful for debugging stuck continuations.

{
  "session_start": "2024-01-15T10:30:00Z",
  "estimated_tokens": 125000,
  "transcript_baseline_tokens": 122000,
  "result_token_count": 124500,
  "tool_calls": 63,
  "threshold_crossed_at": "2024-01-15T11:45:00Z",
  "handoff_complete": false,
  "breakdown": {
    "read": 60000,
    "bash": 30000,
    "grep": 15000,
    "glob": 5000,
    "task": 15000
  }
}
  • transcript_baseline_tokens: The raw API token sum from the last assistant entry in the JSONL transcript (0 when unavailable or use_transcript_baseline: false). Useful for diagnosing estimation accuracy.
  • result_token_count: The authoritative input_tokens + cache_read_input_tokens + output_tokens total from the most recent stream-json result event, written by the on_usage callback in process_issue_inplace. When non-zero, the context monitor uses this value directly instead of heuristics or the transcript baseline (zero lag, maximum accuracy).
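When debugging, a small script can summarize the state file. The sketch below uses a hypothetical `summarize_state` helper; preferring a non-zero `result_token_count` follows the field description above, while the `handoff_pending` interpretation is an assumption.

```python
import json


def summarize_state(state_json: str, context_limit: int = 1_000_000) -> dict:
    """Summarize a context state document for debugging."""
    state = json.loads(state_json)
    # A non-zero result_token_count takes priority over the estimate.
    tokens = state.get("result_token_count") or state.get("estimated_tokens", 0)
    return {
        "tokens_used": tokens,
        "usage_pct": tokens * 100 / context_limit,
        "breakdown_total": sum(state.get("breakdown", {}).values()),
        # Assumption: a recorded crossing without handoff_complete means
        # reminders are still active.
        "handoff_pending": (
            state.get("threshold_crossed_at") is not None
            and not state.get("handoff_complete", False)
        ),
    }
```

A caller would typically pass the contents of `.ll/ll-context-state.json`, read with `Path.read_text()`.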

Troubleshooting

Context monitor not triggering

  1. Check if enabled:

    jq '.context_monitor.enabled' .ll/ll-config.json
    

  2. Verify jq is installed (required for the hook):

    which jq
    

  3. Check state file:

    cat .ll/ll-context-state.json
    

Reminders keep appearing after handoff

The monitor checks whether .ll/ll-continue-prompt.md was modified after the threshold was crossed. If reminders persist, verify that:

  1. /ll:handoff was actually run (manually creating the file is not enough)
  2. The file's modification time is recent
  3. handoff_complete in the state file reflects the completed handoff
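The timestamp comparison can be reproduced by hand when diagnosing this. The sketch below assumes the monitor compares the prompt file's modification time against `threshold_crossed_at` from the state file; `handoff_recorded` is a hypothetical name.

```python
from datetime import datetime, timezone


def handoff_recorded(threshold_crossed_at: str, prompt_mtime: float) -> bool:
    """True if the prompt was modified after the threshold was crossed."""
    # Normalize a trailing "Z" so older Python versions can parse it.
    crossed = datetime.fromisoformat(threshold_crossed_at.replace("Z", "+00:00"))
    if crossed.tzinfo is None:
        # Assumption: timestamps without an offset are UTC.
        crossed = crossed.replace(tzinfo=timezone.utc)
    return prompt_mtime > crossed.timestamp()
```

In practice, `prompt_mtime` would come from `os.path.getmtime(".ll/ll-continue-prompt.md")`.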

Resume shows stale prompt

Prompts older than prompt_expiry_hours (default: 24) are marked stale. The content is still shown, but a warning appears. You can:

  1. Run /ll:handoff to generate a fresh prompt
  2. Increase prompt_expiry_hours in config

Automation not detecting handoff

Ensure the handoff command outputs the signal:

CONTEXT_HANDOFF: Ready for fresh session

Check the detection pattern in subprocess_utils.py:

CONTEXT_HANDOFF_PATTERN = re.compile(r"CONTEXT_HANDOFF:\s*Ready for fresh session")
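To confirm what the pattern accepts, it can be exercised directly. The pattern is copied from above; the body of `detect_context_handoff` is an assumed minimal implementation, not the project's actual code.

```python
import re

CONTEXT_HANDOFF_PATTERN = re.compile(r"CONTEXT_HANDOFF:\s*Ready for fresh session")


def detect_context_handoff(output: str) -> bool:
    """Return True if the handoff signal appears anywhere in the output."""
    return CONTEXT_HANDOFF_PATTERN.search(output) is not None
```

The match is case-sensitive and tolerates any amount of whitespace after the colon, so a signal with altered capitalization or wording would go undetected.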

Max continuations reached

If you see "Reached max continuations", the issue required more session restarts than continuation.max_continuations allows (3 by default). Options:

  1. Increase continuation.max_continuations in config
  2. Break the issue into smaller tasks
  3. Run remaining work manually

Best Practices

When to Use Manual Handoff

  • Before taking a break on a long task
  • When you notice context filling up
  • Before switching to a different task
  • When the hook starts warning you

Writing Good Continuation Prompts

The /ll:handoff command auto-generates prompts from conversation history. You can improve them:

  1. Provide explicit context hints when running handoff:

    /ll:handoff "Implementing OAuth2 flow - finished provider setup, starting callback handler"
    

  2. Use --deep for complex situations when you need to verify disk state:

    /ll:handoff --deep
    

  3. Keep todos updated - they're included in deep mode validation

  4. Discuss decisions in conversation - the conversation summary captures reasoning and trade-offs

For Automation

  1. Set appropriate thresholds - Use a lower threshold (e.g. 70%) for complex issues
  2. Monitor logs - Check for CONTEXT_HANDOFF signals
  3. Review continuation count - High counts may indicate issues that need splitting

Integration

With Other Hooks

  • PostToolUse hook: Monitors context usage and triggers handoff reminders
  • Stop hook: Cleans up context state when the session ends by deleting .ll/ll-context-state.json and .ll/ll-session-state.json; the continuation prompt .ll/ll-continue-prompt.md is preserved for use by /ll:resume

With Automation Tools

Both ll-auto and ll-parallel support automatic continuation:

# In issue_manager.py and worker_pool.py
if detect_context_handoff(result.stdout):
    prompt_content = read_continuation_prompt(working_dir)
    # Spawn fresh session with prompt

See Also