Session Handoff

Automatic context management and session continuation for long-running tasks.

Table of Contents

Overview
Quick Start
Prerequisites
Enable Context Monitoring
Manual Handoff
Resume in New Session
How It Works
Interactive Sessions
Automation Tools
Commands
/ll:handoff
/ll:resume
Configuration
Full Configuration Options
Configuration Reference
Auto-Detect on Session Start
Token Estimation Weights
Files
Troubleshooting
Best Practices
Integration
See Also

Overview

Claude Code sessions have a limited context window. On complex tasks the context can fill up, and progress can be lost. Session handoff addresses this by:

- Monitoring context usage in real time
- Warning when usage approaches the limit (80% by default)
- Preserving session state in a continuation prompt
- Resuming seamlessly in a fresh session

Quick Start

Prerequisites

The context monitor hook requires jq to parse the config file at runtime:

# macOS
brew install jq

# Debian/Ubuntu
sudo apt-get install jq

# Verify
which jq

Enable Context Monitoring

Add to .ll/ll-config.json:

{
  "context_monitor": {
    "enabled": true,
    "auto_handoff_threshold": 80
  }
}
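If the config file already holds other settings, the two keys can be merged in rather than pasted by hand. The following is a minimal sketch, not part of little-loops: the helper name `enable_context_monitor` is hypothetical, and it assumes the file, when present, contains valid JSON.

```python
import json
from pathlib import Path


def enable_context_monitor(config_path: Path, threshold: int = 80) -> dict:
    """Merge the context_monitor settings into the config, keeping other keys.

    Hypothetical helper for illustration; assumes an existing file is valid JSON.
    """
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    monitor = config.setdefault("context_monitor", {})
    monitor["enabled"] = True
    monitor["auto_handoff_threshold"] = threshold
    # Create the .ll directory if it does not exist yet.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `enable_context_monitor(Path(".ll/ll-config.json"))` from the project root would leave any unrelated sections of the file untouched.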

Manual Handoff

When you want to preserve your work:

/ll:handoff                           # Auto-detect context
/ll:handoff "Working on auth module"  # With explicit description

Resume in New Session

/ll:resume

How It Works

Interactive Sessions

┌─────────────────────────────────────────────────────────────────┐
│ Session starts                                                  │
│     ↓                                                           │
│ PostToolUse hook monitors each tool call                        │
│     ↓                                                           │
│ Estimates tokens: Read (10/line), Bash (0.3/char), etc.         │
│     ↓                                                           │
│ When usage >= 80%:                                              │
│     "[ll] Context ~82% used. Run /ll:handoff..."                │
│     ↓                                                           │
│ Reminder repeats on every tool call until handoff executed      │
│     ↓                                                           │
│ /ll:handoff writes .ll/ll-continue-prompt.md                    │
│     ↓                                                           │
│ Start new session → /ll:resume → Continue working               │
└─────────────────────────────────────────────────────────────────┘

Automation Tools (ll-auto, ll-parallel)

Automation tools handle handoff automatically:

┌─────────────────────────────────────────────────────────────────┐
│ Worker processes issue                                          │
│     ↓                                                           │
│ Context threshold reached                                       │
│     ↓                                                           │
│ PostToolUse hook outputs to stderr (exit 2)                     │
│ "[ll] Context ~82% used. Run /ll:handoff..."                    │
│     ↓                                                           │
│ Claude receives feedback, autonomously runs /ll:handoff         │
│     ↓                                                           │
│ /ll:handoff outputs: "CONTEXT_HANDOFF: Ready for fresh session" │
│     ↓                                                           │
│ CLI detects signal, reads .ll/ll-continue-prompt.md             │
│     ↓                                                           │
│ Spawns fresh Claude session with continuation prompt            │
│     ↓                                                           │
│ Work continues (up to 3 continuations per issue)                │
└─────────────────────────────────────────────────────────────────┘
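The automation flow above can be sketched as a small loop. This is an illustrative outline only, not the actual ll-auto or ll-parallel code: `run_session` and `read_prompt` are hypothetical callables standing in for "spawn a Claude session and capture its output" and "read .ll/ll-continue-prompt.md". The signal pattern is the one quoted in the Troubleshooting section.

```python
import re
from typing import Callable

CONTEXT_HANDOFF_PATTERN = re.compile(r"CONTEXT_HANDOFF:\s*Ready for fresh session")


def run_with_continuations(
    run_session: Callable[[str], str],
    read_prompt: Callable[[], str],
    initial_prompt: str,
    max_continuations: int = 3,
) -> int:
    """Run sessions until one finishes without emitting the handoff signal.

    Returns the number of continuations used and raises RuntimeError once
    the limit is exhausted. Both callables are hypothetical placeholders.
    """
    prompt = initial_prompt
    continuations = 0
    while True:
        output = run_session(prompt)
        if not CONTEXT_HANDOFF_PATTERN.search(output):
            return continuations  # session ended normally
        if continuations >= max_continuations:
            raise RuntimeError("Reached max continuations")
        # Hand the saved state to a fresh session.
        prompt = read_prompt()
        continuations += 1
```

With the default limit, at most four sessions run for one issue: the initial session plus three continuations.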

Commands

`/ll:handoff`

Generates a continuation prompt that captures the current session state. By default it summarizes the conversation history already in context and runs no external tools.

Usage:

/ll:handoff                              # Conversation summary (default, fast)
/ll:handoff "Refactoring auth module"    # With explicit context hint
/ll:handoff --deep                       # With artifact validation
/ll:handoff "Working on BUG-042" --deep  # Context + artifact validation

Modes:

| Mode | Command | What's Captured | Speed |
|------|---------|-----------------|-------|
| Default | `/ll:handoff` | Conversation summary, decisions, errors, code changes | Fast (no external tools) |
| Deep | `/ll:handoff --deep` | Default + git status, todos, discrepancy detection | Slower (runs git) |

Default Output: Writes to .ll/ll-continue-prompt.md:

# Session Continuation: Refactoring auth module

## Conversation Summary

### Primary Intent
Refactoring the authentication module to support OAuth2 providers.

### What Happened
1. Analyzed existing auth middleware
2. Discussed JWT vs session-based approach - chose JWT for statelessness
3. Implemented token validation utility
4. Encountered CORS issue with refresh endpoint - fixed with credentials flag

### User Feedback
- User clarified that refresh tokens should use HTTP-only cookies for security

### Errors and Resolutions
| Error | How Fixed | User Feedback |
|-------|-----------|---------------|
| CORS error on /refresh | Added credentials: 'include' | None |
| Token expiry too short | Increased to 15 minutes | User confirmed 15min is acceptable |

### Code Changes
| File | Changes Made | Discussion Context |
|------|--------------|-------------------|
| `src/middleware/auth.ts:45` | Added token validation | Core auth flow |
| `src/utils/tokens.ts:12` | New token utility | Extracted for reuse |

## Resume Point

### What Was Being Worked On
Implementing the refresh token endpoint callback handler

### Direct Quote
> "Now let's implement the callback handler for the refresh flow"

### Next Step
Add the /auth/refresh endpoint with cookie handling

## Important Context

### Decisions Made
- **JWT over sessions**: Chosen for statelessness and microservice compatibility
- **HTTP-only cookies**: For refresh tokens to prevent XSS

### Gotchas Discovered
- **CORS with credentials**: Must set credentials: 'include' on fetch requests

### User-Specified Constraints
- Refresh tokens must use HTTP-only cookies (security requirement)
- Token expiry: 15 minutes

### Patterns Being Followed
- Following pattern from `src/middleware/rate-limit.ts` for middleware structure

Deep Mode Output: Includes all sections above, plus:

## Artifact Validation

### Current Git Status
```
M  src/middleware/auth.ts
M  src/utils/tokens.ts
?? src/utils/cookies.ts
```

### Discrepancies
No discrepancies detected between conversation and artifacts

### Todo List State
| Status | Task |
|--------|------|
| in_progress | Implement refresh token endpoint |
| pending | Add token rotation on refresh |
| completed | Create token validation utility |

### Plan Files
- Active plan: `thoughts/shared/plans/2024-01-15-auth-refactor.md`

`/ll:resume`

Loads a continuation prompt and restores session context.

Usage:

/ll:resume                              # From default location
/ll:resume path/to/custom-prompt.md     # From custom location

Output:

Resuming from previous session
─────────────────────────────────────────────────────────────────
[Continuation prompt content displayed]
─────────────────────────────────────────────────────────────────

Ready to continue. What would you like to do next?

Error — file not found:

If no continuation prompt exists at the default location or at the path you specify, /ll:resume reports an error and stops:

Error: No continuation prompt found at .ll/ll-continue-prompt.md

Run /ll:handoff to generate one, or specify a custom path:
  /ll:resume path/to/custom-prompt.md

Configuration

Full Configuration Options

{
  "context_monitor": {
    "enabled": true,
    "auto_handoff_threshold": 80,
    "context_limit_estimate": 1000000,
    "estimate_weights": {
      "read_per_line": 10,
      "tool_call_base": 100,
      "bash_output_per_char": 0.3
    },
    "use_transcript_baseline": true,
    "state_file": ".ll/ll-context-state.json"
  },
  "continuation": {
    "enabled": true,
    "auto_detect_on_session_start": true,
    "include_todos": true,
    "include_git_status": true,
    "include_recent_files": true,
    "max_continuations": 3,
    "prompt_expiry_hours": 24
  }
}

Configuration Reference

| Setting | Default | Description |
|---------|---------|-------------|
| `context_monitor.enabled` | `false` | Enable automatic context monitoring |
| `context_monitor.auto_handoff_threshold` | `80` | Percentage (50-95) to trigger warnings |
| `context_monitor.context_limit_estimate` | `1000000` | Token limit estimate for usage percentage calculation. Default matches Sonnet 4.6/Opus 4.6 (1M). Set to `200000` for Haiku 4.5. |
| `context_monitor.estimate_weights.read_per_line` | `10` | Token cost per line for Read tool calls |
| `context_monitor.estimate_weights.tool_call_base` | `100` | Base token overhead per tool call |
| `context_monitor.estimate_weights.bash_output_per_char` | `0.3` | Token cost per character for Bash output |
| `context_monitor.use_transcript_baseline` | `true` | Use JSONL transcript token counts as an API-exact baseline (one-turn lag). Improves accuracy from ±30–50% to ±5–15%. Falls back to pure heuristics when unavailable. |
| `continuation.enabled` | `true` | Enable session continuation features |
| `continuation.auto_detect_on_session_start` | `true` | Automatically detect and offer to resume an existing continuation prompt when a new session starts |
| `continuation.include_todos` | `true` | Include current todo list state in deep mode handoff output |
| `continuation.include_git_status` | `true` | Include git status in deep mode handoff output |
| `continuation.include_recent_files` | `true` | Include recently modified files in deep mode handoff output |
| `continuation.max_continuations` | `3` | Max auto-continuations per issue (automation) |
| `continuation.prompt_expiry_hours` | `24` | Hours before prompt marked stale |

The threshold can also be overridden per-run via the --handoff-threshold CLI flag (1-100), which takes precedence over the config value:

ll-auto --handoff-threshold 90      # Trigger handoff at 90% for this run
ll-parallel --handoff-threshold 70  # Earlier warnings for parallel runs
ll-sprint run my-sprint --handoff-threshold 85
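The precedence rule can be expressed in a few lines. This is a sketch under stated assumptions rather than the tools' real argument handling: `resolve_threshold` is a hypothetical name, and the only validation shown is the 1-100 range documented for the flag.

```python
from typing import Optional

DEFAULT_THRESHOLD = 80  # documented default for auto_handoff_threshold


def resolve_threshold(config: dict, cli_value: Optional[int] = None) -> int:
    """Return the effective threshold: CLI flag first, then config, then default."""
    if cli_value is not None:
        if not 1 <= cli_value <= 100:
            raise ValueError("--handoff-threshold must be between 1 and 100")
        return cli_value
    monitor = config.get("context_monitor", {})
    return monitor.get("auto_handoff_threshold", DEFAULT_THRESHOLD)
```

For example, a config value of 70 combined with `--handoff-threshold 90` resolves to 90 for that run.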

Auto-Detect on Session Start

When continuation.auto_detect_on_session_start is true (the default), little-loops checks for an existing .ll/ll-continue-prompt.md at the beginning of each session. If a prompt file is found that is not yet expired, a notice is printed prompting you to run /ll:resume. Set this to false to suppress automatic detection and only resume manually.
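The check described above can be approximated as follows. This sketch assumes that prompt age is measured from the file's modification time, which the documentation does not state explicitly, and `find_fresh_prompt` is a hypothetical name.

```python
import time
from pathlib import Path
from typing import Optional


def find_fresh_prompt(
    prompt_path: Path,
    expiry_hours: float = 24,
    now: Optional[float] = None,
) -> Optional[Path]:
    """Return the prompt path if it exists and is not expired, else None.

    Assumption: age is derived from the file's mtime.
    """
    if not prompt_path.is_file():
        return None
    current = time.time() if now is None else now
    age_hours = (current - prompt_path.stat().st_mtime) / 3600
    return prompt_path if age_hours < expiry_hours else None
```

A session-start hook could call this with `Path(".ll/ll-continue-prompt.md")` and print the notice only when a path is returned.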

Transcript Baseline Mode (Default)

Advanced: this section explains internal token estimation mechanics. Most users can skip it. It is useful if you are tuning the threshold for maximum accuracy.

By default (use_transcript_baseline: true), the monitor uses the JSONL transcript at transcript_path (provided by the PostToolUse hook payload) as an API-exact baseline:

1. Read input_tokens + cache_creation_input_tokens + cache_read_input_tokens + output_tokens from the last assistant entry in the transcript
2. Add the current-turn heuristic delta (single tool call estimate) on top
3. Divide by context_limit_estimate for the usage percentage

This shifts accuracy from ±30–50% (pure heuristics) to ±5–15% (API-exact baseline + small current-turn delta).

| Mode | Accuracy | When Active |
|------|----------|-------------|
| Transcript baseline | ±5–15% | `use_transcript_baseline: true` (default) and transcript available |
| Pure heuristics | ±30–50% | Fallback when transcript is absent or parse fails |

The transcript has a one-turn lag (it reflects the last completed API call). The current-turn estimate bridges the gap.
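The three steps can be sketched as below. The four usage field names come from the description above; the layout of a transcript entry (`type: "assistant"` with usage under `message.usage`) is an assumption made for illustration, and the real hook is a shell script using jq rather than Python.

```python
import json
from typing import Iterable

USAGE_FIELDS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
    "output_tokens",
)


def baseline_tokens(transcript_lines: Iterable[str]) -> int:
    """Sum the usage fields of the last assistant entry; 0 if none is found.

    Assumed entry shape: {"type": "assistant", "message": {"usage": {...}}}.
    """
    baseline = 0
    for line in transcript_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip unparsable lines instead of failing
        if not isinstance(entry, dict) or entry.get("type") != "assistant":
            continue
        usage = (entry.get("message") or {}).get("usage") or {}
        baseline = sum(int(usage.get(field, 0)) for field in USAGE_FIELDS)
    return baseline


def usage_percent(baseline: int, current_turn_delta: float,
                  context_limit: int = 1_000_000) -> float:
    """Baseline plus current-turn estimate, as a percentage of the limit."""
    return 100.0 * (baseline + current_turn_delta) / context_limit
```

With a baseline of 122,000 tokens, a current-turn delta of 3,000, and the default 1,000,000 limit, the estimated usage is 12.5%.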

Token Estimation Weights

Advanced: these weights are pre-tuned and rarely need adjustment.

The context monitor estimates the current-turn delta based on tool activity:

| Tool | Estimation | Rationale |
|------|------------|-----------|
| Read | `lines × 10` | File content is verbose |
| Grep | `matches × 5` | Summarized search results |
| Bash | `chars × 0.3` | Command output varies |
| Glob | `files × 20` | File lists are compact |
| Write/Edit | `300` | Base cost × 3 for edits |
| Task | `2000` | Agent responses are summarized |
| WebFetch | `1500` | Web content is processed |
| WebSearch | `1000` | Search results summary |
| Other | `100` | Base overhead per call |
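Read literally, the table amounts to a lookup like the one below. This is a sketch with the default weights hard-coded; it assumes the per-unit estimates are used as listed, without adding the base overhead on top, which the table does not specify.

```python
# Per-unit weights (lines, matches, characters, files) from the table.
PER_UNIT = {"Read": 10, "Grep": 5, "Bash": 0.3, "Glob": 20}
# Flat per-call estimates from the table.
FLAT = {"Write": 300, "Edit": 300, "Task": 2000, "WebFetch": 1500, "WebSearch": 1000}
BASE = 100  # any other tool


def estimate_tokens(tool: str, units: int = 0) -> int:
    """Estimate the token cost of one tool call using the default weights."""
    if tool in PER_UNIT:
        return round(units * PER_UNIT[tool])
    return FLAT.get(tool, BASE)
```

For instance, reading a 200-line file is estimated at 2,000 tokens, and 1,000 characters of Bash output at 300 tokens.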

Files

| File | Purpose |
|------|---------|
| `.ll/ll-continue-prompt.md` | Generated continuation prompt |
| `.ll/ll-context-state.json` | Running context usage state |
| `.ll/ll-session-state.json` | Session metadata (fallback) |

State File Format

.ll/ll-context-state.json:

You'll rarely need to inspect this directly, but it's useful for debugging stuck continuations.

{
  "session_start": "2024-01-15T10:30:00Z",
  "estimated_tokens": 125000,
  "transcript_baseline_tokens": 122000,
  "result_token_count": 124500,
  "tool_calls": 63,
  "threshold_crossed_at": "2024-01-15T11:45:00Z",
  "handoff_complete": false,
  "breakdown": {
    "read": 60000,
    "bash": 30000,
    "grep": 15000,
    "glob": 5000,
    "task": 15000
  }
}

- transcript_baseline_tokens: The raw API token sum from the last assistant entry in the JSONL transcript (0 when unavailable or use_transcript_baseline: false). Useful for diagnosing estimation accuracy.
- result_token_count: The authoritative input_tokens + cache_read_input_tokens + output_tokens total from the most recent stream-json result event, written by the on_usage callback in process_issue_inplace. When non-zero, the context monitor uses this value directly instead of heuristics or the transcript baseline (zero lag, maximum accuracy).
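When debugging, a short summary of the state file is often easier to read than the raw JSON. The helper below is a hypothetical sketch that uses only the fields shown in the example above; the derived keys (`breakdown_total`, `heuristic_delta`, and so on) are names invented for this illustration.

```python
def summarize_state(state: dict) -> dict:
    """Derive a few debugging figures from a parsed ll-context-state.json."""
    breakdown = state.get("breakdown", {})
    estimated = state.get("estimated_tokens", 0)
    baseline = state.get("transcript_baseline_tokens", 0)
    return {
        # Sum of the per-tool heuristic estimates.
        "breakdown_total": sum(breakdown.values()),
        # Tool category contributing the most estimated tokens.
        "largest_source": max(breakdown, key=breakdown.get) if breakdown else None,
        # How much of the estimate comes from the current-turn heuristic.
        "heuristic_delta": estimated - baseline,
        # Threshold was crossed but no handoff has been recorded yet.
        "handoff_pending": bool(state.get("threshold_crossed_at"))
        and not state.get("handoff_complete", False),
    }
```

Applied to the example state above, this reports a breakdown total of 125000, `read` as the largest source, a heuristic delta of 3000, and a pending handoff.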

Troubleshooting

Context monitor not triggering

Check if enabled:

jq '.context_monitor.enabled' .ll/ll-config.json

Verify jq is installed (required for the hook):

which jq

Check state file:

cat .ll/ll-context-state.json

Reminders keep appearing after handoff

The monitor checks whether .ll/ll-continue-prompt.md was modified after the threshold was crossed. Ensure that:

- /ll:handoff was actually run (creating the file by hand is not enough)
- The file's modification time is later than the time the threshold was crossed

Also check handoff_complete in the state file.
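To verify the timing by hand, the prompt file's modification time can be compared with threshold_crossed_at from the state file. This is a sketch of that comparison, not the hook's actual logic; `handoff_written_after_threshold` is a hypothetical name, and the timestamp is assumed to be ISO 8601 in UTC as in the state file example.

```python
from datetime import datetime, timezone


def handoff_written_after_threshold(prompt_mtime: float,
                                    threshold_crossed_at: str) -> bool:
    """True if the prompt file was modified after the threshold was crossed.

    prompt_mtime is a POSIX timestamp, e.g. from Path.stat().st_mtime.
    """
    # Replace the trailing "Z" so older Python versions can parse the value.
    crossed = datetime.fromisoformat(threshold_crossed_at.replace("Z", "+00:00"))
    written = datetime.fromtimestamp(prompt_mtime, tz=timezone.utc)
    return written > crossed
```

If this returns False for the current files, the reminders are expected to continue until /ll:handoff is run again.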

Resume shows stale prompt

Prompts older than prompt_expiry_hours (default: 24) are marked stale. The content is still shown, but a warning appears. You can:

- Run /ll:handoff to generate a fresh prompt
- Increase prompt_expiry_hours in config

Automation not detecting handoff

Ensure the handoff command outputs the signal:

CONTEXT_HANDOFF: Ready for fresh session

Check the detection pattern in subprocess_utils.py:

CONTEXT_HANDOFF_PATTERN = re.compile(r"CONTEXT_HANDOFF:\s*Ready for fresh session")
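To see whether a captured log actually contains a matching signal, the pattern can be run over the log line by line. The helper `find_handoff_signals` is a hypothetical diagnostic, not part of subprocess_utils.py. Note that the pattern is case-sensitive and allows any amount of whitespace after the colon.

```python
import re
from typing import List

CONTEXT_HANDOFF_PATTERN = re.compile(r"CONTEXT_HANDOFF:\s*Ready for fresh session")


def find_handoff_signals(log_text: str) -> List[int]:
    """Return the 1-based line numbers on which the handoff signal appears."""
    return [
        number
        for number, line in enumerate(log_text.splitlines(), start=1)
        if CONTEXT_HANDOFF_PATTERN.search(line)
    ]
```

An empty result for a run that should have handed off suggests the signal was reworded, truncated, or never written to the captured output.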

Max continuations reached

If you see "Reached max continuations", the issue needed more session restarts than continuation.max_continuations allows (3 by default). Options:

- Increase continuation.max_continuations in config
- Break the issue into smaller tasks
- Run remaining work manually

Best Practices

When to Use Manual Handoff

- Before taking a break on a long task
- When you notice context filling up
- Before switching to a different task
- When the hook starts warning you

Writing Good Continuation Prompts

The /ll:handoff command auto-generates prompts from conversation history. You can improve them:

Provide explicit context hints when running handoff:

/ll:handoff "Implementing OAuth2 flow - finished provider setup, starting callback handler"

Use --deep for complex situations when you need to verify disk state:

/ll:handoff --deep

Keep todos updated: they are included in deep mode validation.

Discuss decisions in conversation: the conversation summary captures reasoning and trade-offs.

For Automation

- Set appropriate thresholds: use a lower threshold (70%) for complex issues
- Monitor logs: check for CONTEXT_HANDOFF signals
- Review continuation count: high counts may indicate issues that need splitting

Integration

With Other Hooks

- PostToolUse hook: Monitors context usage and triggers handoff reminders
- Stop hook: Cleans up context state when the session ends by deleting .ll/ll-context-state.json and .ll/ll-session-state.json. The continuation prompt .ll/ll-continue-prompt.md is preserved for use by /ll:resume

With Automation Tools

Both ll-auto and ll-parallel support automatic continuation:

# In issue_manager.py and worker_pool.py
if detect_context_handoff(result.stdout):
    prompt_content = read_continuation_prompt(working_dir)
    # Spawn fresh session with prompt