Session Handoff¶
Automatic context management and session continuation for long-running tasks.
Table of Contents¶
- Overview
- Quick Start
- Prerequisites
- Enable Context Monitoring
- Manual Handoff
- Resume in New Session
- How It Works
- Interactive Sessions
- Automation Tools
- Commands
- /ll:handoff
- /ll:resume
- Configuration
- Full Configuration Options
- Configuration Reference
- Auto-Detect on Session Start
- Token Estimation Weights
- Files
- Troubleshooting
- Best Practices
- Integration
- See Also
Overview¶
Claude Code sessions have a limited context window. On complex tasks the context can fill up, and progress may be lost. Session handoff addresses this by:
- Monitoring context usage in real-time
- Warning when approaching the limit (80% by default)
- Preserving session state to a continuation prompt
- Resuming seamlessly in a fresh session
Quick Start¶
Prerequisites¶
The context monitor hook requires jq to parse the config file at runtime. Install it with your system package manager if it is not already available.
Enable Context Monitoring¶
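In .ll/ll-config.json, set context_monitor.enabled to true (it defaults to false). See Full Configuration Options for the complete file layout.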
Manual Handoff¶
When you want to preserve your work, run /ll:handoff.
Resume in New Session¶
Start a new session and run /ll:resume to load the continuation prompt.
How It Works¶
Interactive Sessions¶
┌─────────────────────────────────────────────────────────────────┐
│ Session starts │
│ ↓ │
│ PostToolUse hook monitors each tool call │
│ ↓ │
│ Estimates tokens: Read (10/line), Bash (0.3/char), etc. │
│ ↓ │
│ When usage >= 80%: │
│ "[ll] Context ~82% used. Run /ll:handoff..." │
│ ↓ │
│ Reminder repeats on every tool call until handoff executed │
│ ↓ │
│ /ll:handoff writes .ll/ll-continue-prompt.md │
│ ↓ │
│ Start new session → /ll:resume → Continue working │
└─────────────────────────────────────────────────────────────────┘
Automation Tools (ll-auto, ll-parallel)¶
Automation tools handle handoff automatically:
┌─────────────────────────────────────────────────────────────────┐
│ Worker processes issue │
│ ↓ │
│ Context threshold reached │
│ ↓ │
│ PostToolUse hook outputs to stderr (exit 2) │
│ "[ll] Context ~82% used. Run /ll:handoff..." │
│ ↓ │
│ Claude receives feedback, autonomously runs /ll:handoff │
│ ↓ │
│ /ll:handoff outputs: "CONTEXT_HANDOFF: Ready for fresh session" │
│ ↓ │
│ CLI detects signal, reads .ll/ll-continue-prompt.md │
│ ↓ │
│ Spawns fresh Claude session with continuation prompt │
│ ↓ │
│ Work continues (up to 3 continuations per issue) │
└─────────────────────────────────────────────────────────────────┘
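The automation flow above can be sketched as a small loop. This is only an illustration under stated assumptions, not the actual ll-auto or ll-parallel code: `run_with_continuations`, `run_session`, and `read_prompt` are hypothetical names, and the only details taken from this page are the `CONTEXT_HANDOFF:` signal prefix and the default limit of 3 continuations.

```python
from typing import Callable, Optional

HANDOFF_SIGNAL = "CONTEXT_HANDOFF:"  # prefix of the signal line shown in the diagram


def run_with_continuations(
    run_session: Callable[[Optional[str]], str],  # hypothetical: runs one session, returns stdout
    read_prompt: Callable[[], str],               # hypothetical: reads .ll/ll-continue-prompt.md
    max_continuations: int = 3,
) -> int:
    """Run sessions until one ends without signalling a handoff.

    Returns the number of continuation sessions that were spawned.
    """
    prompt: Optional[str] = None  # the first session has no continuation prompt
    continuations = 0
    while True:
        stdout = run_session(prompt)
        if HANDOFF_SIGNAL not in stdout:
            return continuations  # session finished normally
        if continuations >= max_continuations:
            raise RuntimeError("Reached max continuations")
        prompt = read_prompt()
        continuations += 1
```

Passing the session runner and prompt reader as callables keeps the loop easy to exercise without spawning real sessions.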
Commands¶
/ll:handoff¶
Generates a continuation prompt capturing the current session state. By default it summarizes the conversation history already in context, without running external tools.
Usage:
/ll:handoff # Conversation summary (default, fast)
/ll:handoff "Refactoring auth module" # With explicit context hint
/ll:handoff --deep # With artifact validation
/ll:handoff "Working on BUG-042" --deep # Context + artifact validation
Modes:
| Mode | Command | What's Captured | Speed |
|---|---|---|---|
| Default | /ll:handoff | Conversation summary, decisions, errors, code changes | Fast (no disk I/O) |
| Deep | /ll:handoff --deep | Default + git status, todos, discrepancy detection | Slower (runs git) |
Default Output: Writes to .ll/ll-continue-prompt.md:
# Session Continuation: Refactoring auth module
## Conversation Summary
### Primary Intent
Refactoring the authentication module to support OAuth2 providers.
### What Happened
1. Analyzed existing auth middleware
2. Discussed JWT vs session-based approach - chose JWT for statelessness
3. Implemented token validation utility
4. Encountered CORS issue with refresh endpoint - fixed with credentials flag
### User Feedback
- User clarified that refresh tokens should use HTTP-only cookies for security
### Errors and Resolutions
| Error | How Fixed | User Feedback |
|-------|-----------|---------------|
| CORS error on /refresh | Added credentials: 'include' | None |
| Token expiry too short | Increased to 15 minutes | User confirmed 15min is acceptable |
### Code Changes
| File | Changes Made | Discussion Context |
|------|--------------|-------------------|
| `src/middleware/auth.ts:45` | Added token validation | Core auth flow |
| `src/utils/tokens.ts:12` | New token utility | Extracted for reuse |
## Resume Point
### What Was Being Worked On
Implementing the refresh token endpoint callback handler
### Direct Quote
> "Now let's implement the callback handler for the refresh flow"
### Next Step
Add the /auth/refresh endpoint with cookie handling
## Important Context
### Decisions Made
- **JWT over sessions**: Chosen for statelessness and microservice compatibility
- **HTTP-only cookies**: For refresh tokens to prevent XSS
### Gotchas Discovered
- **CORS with credentials**: Must set credentials: 'include' on fetch requests
### User-Specified Constraints
- Refresh tokens must use HTTP-only cookies (security requirement)
- Token expiry: 15 minutes
### Patterns Being Followed
- Following pattern from `src/middleware/rate-limit.ts` for middleware structure
Deep Mode Output: Includes all sections above, plus:
## Artifact Validation
### Current Git Status
```
M src/middleware/auth.ts
M src/utils/tokens.ts
?? src/utils/cookies.ts
```
### Discrepancies
No discrepancies detected between conversation and artifacts
### Todo List State
| Status | Task |
|--------|------|
| in_progress | Implement refresh token endpoint |
| pending | Add token rotation on refresh |
| completed | Create token validation utility |
### Plan Files
- Active plan: `thoughts/shared/plans/2024-01-15-auth-refactor.md`
/ll:resume¶
Loads a continuation prompt and restores session context.
Usage:
/ll:resume # Load .ll/ll-continue-prompt.md (default)
/ll:resume path/to/custom-prompt.md # Load a custom prompt file
Output:
Resuming from previous session
─────────────────────────────────────────────────────────────────
[Continuation prompt content displayed]
─────────────────────────────────────────────────────────────────
Ready to continue. What would you like to do next?
Error — file not found:
If the specified path does not exist, /ll:resume reports an error and stops:
Error: No continuation prompt found at .ll/ll-continue-prompt.md
Run /ll:handoff to generate one, or specify a custom path:
/ll:resume path/to/custom-prompt.md
Configuration¶
Full Configuration Options¶
{
"context_monitor": {
"enabled": true,
"auto_handoff_threshold": 80,
"context_limit_estimate": 1000000,
"estimate_weights": {
"read_per_line": 10,
"tool_call_base": 100,
"bash_output_per_char": 0.3
},
"use_transcript_baseline": true,
"state_file": ".ll/ll-context-state.json"
},
"continuation": {
"enabled": true,
"auto_detect_on_session_start": true,
"include_todos": true,
"include_git_status": true,
"include_recent_files": true,
"max_continuations": 3,
"prompt_expiry_hours": 24
}
}
Configuration Reference¶
| Setting | Default | Description |
|---|---|---|
| context_monitor.enabled | false | Enable automatic context monitoring |
| context_monitor.auto_handoff_threshold | 80 | Percentage (50-95) to trigger warnings |
| context_monitor.context_limit_estimate | 1000000 | Token limit estimate for usage percentage calculation. Default matches Sonnet 4.6/Opus 4.6 (1M). Set to 200000 for Haiku 4.5. |
| context_monitor.estimate_weights.read_per_line | 10 | Token cost per line for Read tool calls |
| context_monitor.estimate_weights.tool_call_base | 100 | Base token overhead per tool call |
| context_monitor.estimate_weights.bash_output_per_char | 0.3 | Token cost per character for Bash output |
| context_monitor.use_transcript_baseline | true | Use JSONL transcript token counts as an API-exact baseline (one-turn lag). Improves accuracy from ±30–50% to ±5–15%. Falls back to pure heuristics when unavailable. |
| continuation.enabled | true | Enable session continuation features |
| continuation.auto_detect_on_session_start | true | Automatically detect and offer to resume an existing continuation prompt when a new session starts |
| continuation.include_todos | true | Include current todo list state in deep mode handoff output |
| continuation.include_git_status | true | Include git status in deep mode handoff output |
| continuation.include_recent_files | true | Include recently modified files in deep mode handoff output |
| continuation.max_continuations | 3 | Max auto-continuations per issue (automation) |
| continuation.prompt_expiry_hours | 24 | Hours before prompt marked stale |
The threshold can also be overridden per-run via the --handoff-threshold CLI flag (1-100), which takes precedence over the config value:
ll-auto --handoff-threshold 90 # Trigger handoff at 90% for this run
ll-parallel --handoff-threshold 70 # Earlier warnings for parallel runs
ll-sprint run my-sprint --handoff-threshold 85
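The precedence rule for the threshold can be sketched as follows. This is a minimal illustration, assuming the config has already been loaded into a dict; `effective_threshold` is a hypothetical helper, not a function from the little-loops codebase.

```python
from typing import Optional


def effective_threshold(config: dict, cli_value: Optional[int] = None) -> int:
    """Return the handoff threshold percentage to use for this run."""
    if cli_value is not None:
        # --handoff-threshold accepts 1-100 and overrides the config value.
        if not 1 <= cli_value <= 100:
            raise ValueError("--handoff-threshold must be between 1 and 100")
        return cli_value
    monitor = config.get("context_monitor", {})
    return monitor.get("auto_handoff_threshold", 80)  # documented default
```

For example, `effective_threshold({"context_monitor": {"auto_handoff_threshold": 70}}, cli_value=90)` yields 90 because the flag wins over the config.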
Auto-Detect on Session Start¶
When continuation.auto_detect_on_session_start is true (the default), little-loops checks for an existing .ll/ll-continue-prompt.md at the beginning of each session. If a prompt file is found that is not yet expired, a notice is printed prompting you to run /ll:resume. Set this to false to suppress automatic detection and only resume manually.
Transcript Baseline Mode (Default)¶
Advanced: This section explains internal token estimation mechanics. Most users can skip this. It's useful if you're tuning threshold for maximum accuracy.
By default (use_transcript_baseline: true), the monitor uses the JSONL transcript at transcript_path (provided by the PostToolUse hook payload) as an API-exact baseline:
- Read input_tokens + cache_creation_input_tokens + cache_read_input_tokens + output_tokens from the last assistant entry in the transcript
- Add the current-turn heuristic delta (single tool call estimate) on top
- Divide by context_limit_estimate for the usage percentage
This shifts accuracy from ±30–50% (pure heuristics) to ±5–15% (API-exact baseline + small current-turn delta).
| Mode | Accuracy | When Active |
|---|---|---|
| Transcript baseline | ±5–15% | use_transcript_baseline: true (default) and transcript available |
| Pure heuristics | ±30–50% | Fallback when transcript is absent or parse fails |
The transcript has a one-turn lag (it reflects the last completed API call). The current-turn estimate bridges the gap.
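The baseline calculation described in this section can be written out as a short function. This is a sketch of the arithmetic only: `usage_percent` is a hypothetical name, and the assumption is that the usage counters of the last assistant entry have already been parsed into a dict.

```python
USAGE_KEYS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
    "output_tokens",
)


def usage_percent(
    last_assistant_usage: dict,
    current_turn_delta: float,
    context_limit_estimate: int = 1_000_000,
) -> float:
    """Baseline from the transcript plus the current-turn estimate, as a percentage."""
    baseline = sum(last_assistant_usage.get(key, 0) for key in USAGE_KEYS)
    return 100.0 * (baseline + current_turn_delta) / context_limit_estimate
```

With a baseline of 122,000 tokens and a current-turn delta of 3,000, the result is 12.5% of a 1,000,000-token limit, or 62.5% of a 200,000-token limit.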
Token Estimation Weights¶
The context monitor estimates the current-turn delta based on tool activity:
Advanced — These weights are pre-tuned and rarely need adjustment.
| Tool | Estimation | Rationale |
|---|---|---|
| Read | lines × 10 | File content is verbose |
| Grep | matches × 5 | Summarized search results |
| Bash | chars × 0.3 | Command output varies |
| Glob | files × 20 | File lists are compact |
| Write/Edit | 300 | Base cost × 3 for edits |
| Task | 2000 | Agent responses are summarized |
| WebFetch | 1500 | Web content is processed |
| WebSearch | 1000 | Search results summary |
| Other | 100 | Base overhead per call |
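The table above maps directly onto a lookup. The sketch below follows the table literally; `estimate_tool_tokens` is a hypothetical helper, and whether the real hook also adds tool_call_base on top of the per-unit costs is not stated here, so this version does not.

```python
def estimate_tool_tokens(
    tool: str,
    *,
    lines: int = 0,
    matches: int = 0,
    chars: int = 0,
    files: int = 0,
) -> float:
    """Heuristic token estimate for a single tool call, per the weights table."""
    per_unit = {
        "Read": lines * 10,
        "Grep": matches * 5,
        "Bash": chars * 0.3,
        "Glob": files * 20,
    }
    if tool in per_unit:
        return per_unit[tool]
    flat = {
        "Write": 300,
        "Edit": 300,
        "Task": 2000,
        "WebFetch": 1500,
        "WebSearch": 1000,
    }
    return flat.get(tool, 100)  # any other tool: base overhead per call
```

Under these weights, reading a 200-line file is estimated at 2,000 tokens, the same as one Task call.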
Files¶
| File | Purpose |
|---|---|
| .ll/ll-continue-prompt.md | Generated continuation prompt |
| .ll/ll-context-state.json | Running context usage state |
| .ll/ll-session-state.json | Session metadata (fallback) |
State File Format¶
.ll/ll-context-state.json:
You'll rarely need to inspect this directly, but it's useful for debugging stuck continuations.
{
"session_start": "2024-01-15T10:30:00Z",
"estimated_tokens": 125000,
"transcript_baseline_tokens": 122000,
"result_token_count": 124500,
"tool_calls": 63,
"threshold_crossed_at": "2024-01-15T11:45:00Z",
"handoff_complete": false,
"breakdown": {
"read": 60000,
"bash": 30000,
"grep": 15000,
"glob": 5000,
"task": 15000
}
}
- transcript_baseline_tokens: The raw API token sum from the last assistant entry in the JSONL transcript (0 when unavailable or use_transcript_baseline: false). Useful for diagnosing estimation accuracy.
- result_token_count: The authoritative input_tokens + cache_read_input_tokens + output_tokens total from the most recent stream-json result event, written by the on_usage callback in process_issue_inplace. When non-zero, the context monitor uses this value directly instead of heuristics or the transcript baseline (zero lag, maximum accuracy).
Troubleshooting¶
Context monitor not triggering¶
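- Check that context_monitor.enabled is true in .ll/ll-config.json (it defaults to false)
- Verify jq is installed (required for the hook)
- Check that the state file .ll/ll-context-state.json exists and is being updated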
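The three checks above can be bundled into a small diagnostic script. This is a sketch for convenience, not part of little-loops: `diagnose` is a hypothetical helper, and the file paths are the ones listed under Files.

```python
import json
import shutil
from pathlib import Path
from typing import Callable, List, Optional


def diagnose(
    project_dir: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of likely reasons the context monitor is not triggering."""
    root = Path(project_dir)
    problems: List[str] = []

    # 1. Is monitoring enabled in the config?
    config_path = root / ".ll" / "ll-config.json"
    try:
        config = json.loads(config_path.read_text())
    except (OSError, json.JSONDecodeError):
        config = {}
        problems.append(f"cannot read {config_path}")
    if not config.get("context_monitor", {}).get("enabled", False):
        problems.append("context_monitor.enabled is not true")

    # 2. Is jq available to the hook?
    if which("jq") is None:
        problems.append("jq is not on PATH")

    # 3. Has the hook written any state yet?
    if not (root / ".ll" / "ll-context-state.json").exists():
        problems.append("no state file yet (.ll/ll-context-state.json)")
    return problems
```

Calling `diagnose(".")` from the project root returns an empty list when all three checks pass.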
Reminders keep appearing after handoff¶
The monitor checks whether .ll/ll-continue-prompt.md was modified after the threshold was crossed. If reminders persist:
- Confirm /ll:handoff was run (manually creating the file is not enough)
- Confirm the file's modification time is recent
- Check handoff_complete in the state file
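The modification-time check can be reproduced by hand when debugging. The sketch below assumes threshold_crossed_at is stored as a UTC ISO 8601 timestamp, as in the state file example; `handoff_happened` is a hypothetical helper and the monitor's actual comparison may differ.

```python
from datetime import datetime, timezone


def handoff_happened(threshold_crossed_at: str, prompt_mtime: float) -> bool:
    """True if the prompt file was modified after the threshold was crossed.

    prompt_mtime is a POSIX timestamp, e.g. from os.path.getmtime().
    """
    # Replace a trailing "Z" so fromisoformat also works on older Python versions.
    crossed = datetime.fromisoformat(threshold_crossed_at.replace("Z", "+00:00"))
    modified = datetime.fromtimestamp(prompt_mtime, tz=timezone.utc)
    return modified > crossed
```

If this returns False for the current prompt file, the reminders are expected to continue until /ll:handoff rewrites it.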
Resume shows stale prompt¶
Prompts older than prompt_expiry_hours (default: 24) are marked stale. The content is still shown, but a warning appears. You can:
- Run /ll:handoff to generate a fresh prompt
- Increase prompt_expiry_hours in config
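The staleness rule amounts to an age comparison. A minimal sketch, assuming age is measured from the prompt file's modification time (this page does not say which timestamp is used); `is_stale` is a hypothetical name.

```python
def is_stale(prompt_mtime: float, now: float, prompt_expiry_hours: float = 24) -> bool:
    """True if the prompt is older than prompt_expiry_hours.

    Both timestamps are POSIX seconds, e.g. os.path.getmtime() and time.time().
    """
    return (now - prompt_mtime) > prompt_expiry_hours * 3600
```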
Automation not detecting handoff¶
Ensure the handoff command outputs the signal line CONTEXT_HANDOFF: Ready for fresh session.
Then check that the detection pattern in subprocess_utils.py matches that output.
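To test captured output against the signal by hand, a pattern along these lines can help. It is a hypothetical stand-in, not the pattern from subprocess_utils.py; the only detail taken from this page is the `CONTEXT_HANDOFF:` prefix.

```python
import re
from typing import Optional

# Hypothetical pattern: a line that starts with the documented signal prefix.
HANDOFF_PATTERN = re.compile(r"^CONTEXT_HANDOFF:[ \t]*(.*)$", re.MULTILINE)


def find_handoff_signal(stdout: str) -> Optional[str]:
    """Return the message after the signal prefix, or None if no signal is present."""
    match = HANDOFF_PATTERN.search(stdout)
    return match.group(1).strip() if match else None
```

If this finds the signal in a worker's captured stdout but the CLI still does not continue, the mismatch is more likely in how the output is captured than in the handoff command itself.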
Max continuations reached¶
If you see "Reached max continuations", the issue required more than 3 session restarts. Options:
- Increase continuation.max_continuations in config
- Break the issue into smaller tasks
- Run remaining work manually
Best Practices¶
When to Use Manual Handoff¶
- Before taking a break on a long task
- When you notice context filling up
- Before switching to a different task
- When the hook starts warning you
Writing Good Continuation Prompts¶
The /ll:handoff command auto-generates prompts from conversation history. You can improve them:
- Provide explicit context hints when running handoff, for example /ll:handoff "Refactoring auth module"
- Use --deep for complex situations when you need to verify disk state, for example /ll:handoff --deep
- Keep todos updated - they're included in deep mode validation
- Discuss decisions in conversation - the conversation summary captures reasoning and trade-offs
For Automation¶
- Set appropriate thresholds - Lower threshold (70%) for complex issues
- Monitor logs - Check for CONTEXT_HANDOFF signals
- Review continuation count - High counts may indicate issues that need splitting
Integration¶
With Other Hooks¶
- PostToolUse hook: Monitors context usage and triggers handoff reminders
- Stop hook: Cleans up context state when the session ends by deleting .ll/ll-context-state.json and .ll/ll-session-state.json; the continuation prompt .ll/ll-continue-prompt.md is preserved for use by /ll:resume
With Automation Tools¶
Both ll-auto and ll-parallel support automatic continuation:
# In issue_manager.py and worker_pool.py
if detect_context_handoff(result.stdout):
    prompt_content = read_continuation_prompt(working_dir)
    # Spawn fresh session with prompt
See Also¶
- ARCHITECTURE.md - Technical details
- commands/handoff.md - Command reference
- commands/resume.md - Command reference