Troubleshooting Guide¶

Common Issues and Solutions¶

Installation Issues¶

Agent Not Found¶

Problem: ralph: command 'claude' not found

Solutions:

Verify agent installation:

which claude
which gemini
which q

Install missing agent:

# Claude
npm install -g @anthropic-ai/claude-code

# Gemini
npm install -g @google/gemini-cli

Add to PATH:

export PATH="$PATH:/usr/local/bin"
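The checks above can be folded into one helper that reports every agent at once. This is a minimal sketch: `check_agents` is a hypothetical name, not part of Ralph, and it only tests whether a command is reachable on `PATH`, not whether the agent is configured correctly.

```shell
# check_agents NAME...
# Report which agent CLIs are reachable on PATH.
# check_agents is a hypothetical helper; the agent names come from this guide.
check_agents() {
  missing=0
  for agent in "$@"; do
    if agent_path=$(command -v "$agent" 2>/dev/null); then
      echo "found: $agent -> $agent_path"
    else
      echo "missing: $agent"
      missing=$((missing + 1))
    fi
  done
  # The exit status is non-zero when at least one agent is missing.
  [ "$missing" -eq 0 ]
}

# Example: check_agents claude gemini q
```

Because the function returns a non-zero status when anything is missing, it can gate a run, for example `check_agents claude && ralph run`.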

Permission Denied¶

Problem: Permission denied: './ralph'

Solution:

chmod +x ralph ralph_orchestrator.py

Execution Issues¶

Task Running Too Long¶

Problem: Ralph reaches the maximum number of iterations without achieving its goals

Possible Causes:

Unclear or overly complex task description
Agent not making progress towards objectives
Task scope too large for iteration limits

Solutions:

Check iteration progress and logs:

ralph status

Break down complex tasks:

# Instead of:

Build a complete web application

# Try:

Create a Flask app with one endpoint that returns "Hello World"

Increase iteration limits or try a different agent:

ralph run --max-iterations 200
ralph run --agent gemini

Agent Timeout¶

Problem: Agent execution timed out

Solutions:

Increase timeout:

# In ralph.json
{
  "timeout_per_iteration": 600
}

Reduce prompt complexity:

Break large tasks into smaller ones
Remove unnecessary context

Check system resources:

htop
free -h

Repeated Errors¶

Problem: Same error occurs in multiple iterations

Solutions:

Check error pattern:

jq '.errors' .agent/metrics/state_*.json

Clear workspace and retry:

ralph clean
ralph run

Manual intervention:

Fix the specific issue
Add clarification to PROMPT.md
Resume execution
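To see which error repeats most often, it can help to count messages across the log files. The sketch below assumes that logs live in a directory of `*.log` files and that each error line contains the word `ERROR` after an optional timestamp; both are assumptions about the log format, and `top_errors` is a hypothetical helper.

```shell
# top_errors LOG_DIR [COUNT]
# Print the COUNT most frequent ERROR messages found in LOG_DIR/*.log.
# Everything before the word ERROR (assumed to be a timestamp) is ignored,
# so the same message logged at different times is counted together.
top_errors() {
  log_dir=$1
  count=${2:-5}
  grep -h 'ERROR' "$log_dir"/*.log 2>/dev/null |
    sed 's/^.*ERROR/ERROR/' |
    sort | uniq -c | sort -rn |
    head -n "$count" |
    awk '{$1 = $1; print}'   # normalize the padding added by uniq -c
}

# Example: top_errors .agent/logs 3
```

A message that dominates this list is usually the one worth addressing in PROMPT.md before resuming.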

Loop Detection Issues¶

Problem: Loop detected: XX% similarity to previous output

Ralph's loop detection triggers when agent output is ≥90% similar to any of the last 5 outputs.

Possible Causes:

Agent is stuck on the same subtask
Agent producing similar "working on it" messages
API errors causing identical retry messages
Task requires same action repeatedly (false positive)

Solutions:

Check if it's a legitimate loop:

# Review recent outputs
ls -lt .agent/prompts/ | head -10
diff .agent/prompts/prompt_N.md .agent/prompts/prompt_N-1.md
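A raw `diff` can be hard to read when most lines are identical. The sketch below puts a rough number on the overlap by counting how many lines of one file appear verbatim in another. It is a line-based approximation for manual inspection only: it is not how Ralph computes its similarity score, and `line_similarity` is a hypothetical helper.

```shell
# line_similarity FILE_A FILE_B
# Print the percentage of lines in FILE_A that also appear verbatim in FILE_B.
# This is a crude, line-based estimate, not Ralph's own similarity measure.
line_similarity() {
  total=$(grep -c '' "$1" || true)             # number of lines in FILE_A
  if [ "$total" -eq 0 ]; then
    echo 0
    return 0
  fi
  shared=$(grep -Fxc -f "$2" "$1" || true)     # lines of FILE_A found in FILE_B
  echo $(( shared * 100 / total ))
}

# Example: line_similarity older_output.md newer_output.md
```

A result close to 100 for consecutive outputs suggests the agent really is repeating itself rather than triggering a false positive.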

Improve prompt to encourage variety:

# Add explicit progress tracking

## Current Status

Document what step you're on and what has changed since last iteration.

Break down the task:

If the agent keeps doing the same thing, the task may need restructuring
Split it into smaller, more distinct subtasks

Check for underlying issues:

API errors causing retries
Permission issues blocking progress
Missing dependencies

Completion Marker Not Detected¶

Problem: Ralph continues running despite TASK_COMPLETE marker

Possible Causes:

Incorrect marker format
Invisible characters or encoding issues
Marker buried in code block

Solutions:

Use exact format:

# Correct formats:

- [x] TASK_COMPLETE
[x] TASK_COMPLETE

# Incorrect (won't trigger):

- [ ] TASK_COMPLETE # Not checked
TASK_COMPLETE # No checkbox
- [X] TASK_COMPLETE # Capital X

Check for hidden characters:

cat -A PROMPT.md | grep TASK_COMPLETE

Ensure marker is on its own line:

# Good - on its own line

- [x] TASK_COMPLETE

# Bad - inside code block

```markdown
- [x] TASK_COMPLETE # Inside code block - won't work
```

Verify encoding:

file PROMPT.md
# Should report ASCII or UTF-8 text (the exact wording varies by file version)
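The format rules above can be checked mechanically before a run. The pattern in this sketch is inferred from the correct and incorrect examples in this section and is an assumption about what Ralph accepts, not its actual matching code; `has_completion_marker` is a hypothetical helper, and it does not skip markers that sit inside fenced code blocks.

```shell
# has_completion_marker FILE
# Succeed if FILE contains a line consisting only of a checked, lowercase-x
# TASK_COMPLETE marker, with or without the leading "- ".
# The pattern is an assumption derived from the examples in this guide.
has_completion_marker() {
  grep -Eq '^[[:space:]]*(- )?\[x\] TASK_COMPLETE[[:space:]]*$' "$1"
}

# Example:
#   has_completion_marker PROMPT.md && echo "marker found" || echo "marker missing"
```

If the marker looks right in an editor but this check fails, hidden characters such as a zero-width space are a likely cause.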

Git Issues¶

Checkpoint Failed¶

Problem: Failed to create checkpoint

Solutions:

Initialize Git repository:

git init
git add .
git commit -m "Initial commit"

Check Git status:

git status

Fix Git configuration:

git config user.email "you@example.com"
git config user.name "Your Name"
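The three checks above can be combined into a quick readiness test. This is a sketch under the assumption that a checkpoint is an ordinary Git commit, which needs a repository and a configured identity; `git_checkpoint_ready` is a hypothetical helper, not a Ralph command.

```shell
# git_checkpoint_ready
# Check the preconditions for committing in the current directory:
# it must be inside a Git work tree, and user.email and user.name must be set.
git_checkpoint_ready() {
  if ! git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    echo "not a Git repository (run: git init)"
    return 1
  fi
  if [ -z "$(git config user.email)" ] || [ -z "$(git config user.name)" ]; then
    echo "Git identity missing (set user.email and user.name)"
    return 1
  fi
  echo "ready"
}

# Example: git_checkpoint_ready && ralph run
```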

Uncommitted Changes Warning¶

Problem: Uncommitted changes detected

Solutions:

Commit changes:

git add .
git commit -m "Save work"

Stash changes:

git stash
ralph run
git stash pop

Disable Git operations:

ralph run --no-git

Context Issues¶

Context Window Exceeded¶

Problem: Context window limit exceeded

Symptoms:

Agent forgets earlier instructions
Incomplete responses
Errors about missing information

Solutions:

Reduce file sizes:

# Split large files
split -l 500 large_file.py part_

Use more concise prompt:

# Remove unnecessary details

# Focus on current task

Switch to higher-context agent:

# Claude has 200K context
ralph run --agent claude

Clear iteration history:

rm .agent/prompts/prompt_*.md
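Before splitting files, it helps to know which ones are actually large. The sketch below ranks files by line count; `largest_files` is a hypothetical helper, and line count is only a rough proxy for how much context a file consumes.

```shell
# largest_files DIR [COUNT]
# List the COUNT files under DIR with the most lines, largest first.
# Files inside .git are skipped; output is "<lines> <path>" as printed by wc.
largest_files() {
  dir=$1
  count=${2:-10}
  find "$dir" -type f ! -path '*/.git/*' -exec wc -l {} + |
    grep -v ' total$' |      # drop the summary line wc adds for several files
    sort -rn |
    head -n "$count"
}

# Example: largest_files . 5
```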

Performance Issues¶

Slow Execution¶

Problem: Iterations taking too long

Solutions:

Check system resources:

top
df -h
iostat

Reduce parallel operations:

Close other applications
Limit background processes

Use a faster agent:

# Q is typically faster
ralph run --agent q

High Memory Usage¶

Problem: Ralph consuming excessive memory

Solutions:

Set resource limits:

# In ralph.json
{
  "resource_limits": {
    "memory_mb": 2048
  }
}

Clean old state files:

find .agent -name "*.json" -mtime +7 -delete

Restart Ralph:

pkill -f ralph_orchestrator
ralph run
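Because `find ... -delete` removes files without confirmation, it is worth previewing the matches first. The sketch below runs the same search with `-print` instead of `-delete`; `stale_state_files` is a hypothetical helper. Note that the pattern matches every `*.json` file under the directory, not only state files.

```shell
# stale_state_files DIR [DAYS]
# List the JSON files under DIR last modified more than DAYS days ago
# (default 7). Nothing is deleted; this is a dry run of the cleanup above.
stale_state_files() {
  find "$1" -name '*.json' -mtime +"${2:-7}" -print
}

# Example: stale_state_files .agent 7
```

Once the list looks right, the original `find` command with `-delete` removes the same set of files.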

State and Metrics Issues¶

Corrupted State File¶

Problem: Invalid state file

Solutions:

Remove corrupted file:

rm .agent/metrics/state_latest.json

Restore from backup:

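# Copy only the most recent backup (cp cannot copy several files onto one target)
cp "$(ls -t .agent/metrics/state_*.json | grep -v state_latest | head -n 1)" .agent/metrics/state_latest.json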

Reset state:

ralph clean
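When restoring from a backup, the newest file may be corrupted too. The sketch below picks the most recent state file that still parses as JSON. It assumes backups follow the `state_*.json` naming used above, that file names contain no whitespace, and that either `jq` or `python3` is installed; both function names are hypothetical.

```shell
# is_valid_json FILE
# Succeed if FILE parses as JSON, using jq if available, else python3.
is_valid_json() {
  if command -v jq >/dev/null 2>&1; then
    jq empty "$1" >/dev/null 2>&1
  elif command -v python3 >/dev/null 2>&1; then
    python3 -m json.tool "$1" >/dev/null 2>&1
  else
    echo "is_valid_json: need jq or python3" >&2
    return 2
  fi
}

# newest_valid_state DIR
# Print the most recently modified DIR/state_*.json that is valid JSON,
# skipping state_latest.json itself. Fails if no valid file exists.
newest_valid_state() {
  # ls -t lists newest first; file names are assumed to contain no whitespace.
  for candidate in $(ls -t "$1"/state_*.json 2>/dev/null); do
    case $candidate in
      */state_latest.json) continue ;;
    esac
    if is_valid_json "$candidate"; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# Example:
#   good=$(newest_valid_state .agent/metrics) && cp "$good" .agent/metrics/state_latest.json
```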

Missing Metrics¶

Problem: No metrics being collected

Solutions:

Check metrics directory:

ls -la .agent/metrics/

Create directory if missing:

mkdir -p .agent/metrics

Check permissions:

chmod 755 .agent/metrics

Error Messages¶

Common Error Codes¶

| Error | Meaning | Solution |
|-------|---------|----------|
| `Exit code 1` | General failure | Check logs for details |
| `Exit code 130` | Interrupted (Ctrl+C) | Normal interruption |
| `Exit code 137` | Killed (out of memory) | Increase memory limits |
| `Exit code 124` | Timeout | Increase timeout value |
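The table can be turned into a small lookup for use in wrapper scripts. The sketch below covers the four codes listed above and adds the general shell convention that a status above 128 means the process was terminated by signal number (status minus 128); `explain_exit_code` is a hypothetical helper.

```shell
# explain_exit_code STATUS
# Translate an exit status into the meaning and solution from the table above.
# Statuses above 128 follow the shell convention 128 + signal number.
explain_exit_code() {
  case $1 in
    0)   echo "success" ;;
    1)   echo "general failure: check logs for details" ;;
    124) echo "timeout: increase timeout value" ;;
    130) echo "interrupted (Ctrl+C): normal interruption" ;;
    137) echo "killed (often out of memory): increase memory limits" ;;
    *)
      if [ "$1" -gt 128 ]; then
        echo "terminated by signal $(( $1 - 128 ))"
      else
        echo "unrecognized exit code $1"
      fi
      ;;
  esac
}

# Example:
#   ralph run
#   explain_exit_code "$?"
```

Under that convention, 130 is 128 + 2 (SIGINT) and 137 is 128 + 9 (SIGKILL), which is why an out-of-memory kill shows up as 137.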

Agent-Specific Errors¶

Claude Errors¶

"Rate limit exceeded"

Solution: Add delay between iterations or upgrade API plan

"Invalid API key"

Solution: Check Claude CLI configuration

Gemini Errors¶

"Quota exceeded"

Solution: Wait for quota reset or upgrade plan

"Model not available"

Solution: Check Gemini CLI version and update

Q Chat Errors¶

"Connection refused"

Solution: Ensure Q service is running
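For rate-limit and quota errors, one way to add a delay is to retry from an outer script with increasing waits. The sketch below is a generic shell wrapper, not a Ralph feature: `retry_with_backoff` is a hypothetical helper that retries any command and doubles the delay after each failure. Whether retrying a whole run is appropriate depends on the task.

```shell
# retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or MAX_ATTEMPTS is reached, sleeping between
# attempts and doubling the delay each time. Returns the last exit status.
retry_with_backoff() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while true; do
    status=0
    "$@" || status=$?
    [ "$status" -eq 0 ] && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return "$status"
    fi
    echo "attempt $attempt failed (exit $status); retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example: retry_with_backoff 4 30 ralph run --agent claude
```

With these example values, the waits between attempts are 30, 60, and 120 seconds.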

Debug Mode¶

Enable Verbose Logging¶

# Maximum verbosity
ralph run --verbose

# With debug environment
DEBUG=1 ralph run

# Save logs
ralph run --verbose 2>&1 | tee debug.log
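One caveat with piping into `tee`: in a plain POSIX shell the pipeline's exit status is that of `tee`, so a failed run can look successful to a calling script. The sketch below keeps both the live log and the real exit status; `run_logged` is a hypothetical helper.

```shell
# run_logged LOG_FILE COMMAND [ARGS...]
# Run COMMAND, stream its stdout and stderr to the terminal and to LOG_FILE,
# and return COMMAND's own exit status instead of tee's.
run_logged() {
  log_file=$1
  shift
  status_file=$(mktemp)
  # Record the command's status inside the pipeline, where it is still known.
  { "$@" 2>&1; echo "$?" > "$status_file"; } | tee "$log_file"
  status=$(cat "$status_file")
  rm -f "$status_file"
  return "$status"
}

# Example: run_logged debug.log ralph run --verbose
```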

Inspect Execution¶

# Add debug points in PROMPT.md
print("DEBUG: Reached checkpoint 1")

Trace Execution¶

# Trace system calls
strace -o trace.log ralph run

# Profile Python execution
python -m cProfile ralph_orchestrator.py

Recovery Procedures¶

From Failed State¶

Save current state:

cp -r .agent .agent.backup

Analyze failure:

tail -n 100 .agent/logs/ralph.log

Fix issue:

Update PROMPT.md
Fix code errors
Clear problematic files

Resume or restart:

# Resume from checkpoint
ralph run

# Or start fresh
ralph clean && ralph run
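One caveat with the backup step: if `.agent.backup` already exists, `cp -r .agent .agent.backup` copies into it as `.agent.backup/.agent` rather than replacing it. The sketch below avoids that by giving each backup a timestamped name; `backup_agent_dir` and the naming scheme are hypothetical.

```shell
# backup_agent_dir [DIR]
# Copy DIR (default .agent) to DIR.backup.<timestamp> and print the new path,
# so repeated backups never nest inside an earlier one.
backup_agent_dir() {
  src=${1:-.agent}
  if [ ! -d "$src" ]; then
    echo "no such directory: $src" >&2
    return 1
  fi
  dest="$src.backup.$(date +%Y%m%d-%H%M%S)"
  cp -R "$src" "$dest" && echo "$dest"
}

# Example: backup_agent_dir .agent
```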

From Git Checkpoint¶

# List checkpoints
git log --oneline | grep checkpoint

# Reset to checkpoint (this discards all uncommitted changes)
git reset --hard <commit-hash>

# Resume execution
ralph run

Getting Help¶

Self-Diagnosis¶

Run the diagnostic script:

cat > diagnose.sh << 'EOF'
#!/bin/bash
echo "Ralph Orchestrator Diagnostic"
echo "============================"
echo "Agents available:"
which claude && echo "  ✓ Claude" || echo "  ✗ Claude"
which gemini && echo "  ✓ Gemini" || echo "  ✗ Gemini"
which q && echo "  ✓ Q" || echo "  ✗ Q"
echo ""
echo "Git status:"
git status --short
echo ""
echo "Ralph status:"
./ralph status
echo ""
echo "Recent errors:"
grep ERROR .agent/logs/*.log 2>/dev/null | tail -5
EOF
chmod +x diagnose.sh
./diagnose.sh

Community Support¶

GitHub Issues: Report bugs
Discussions: Ask questions
Discord: Join the community chat

Reporting Bugs¶

Include in bug reports:

Ralph version: ralph --version
Agent versions
Error messages
PROMPT.md content
Diagnostic output
Steps to reproduce

Prevention Tips¶

Best Practices¶

Start simple: Test with basic tasks first
Regular checkpoints: Use the default 5-iteration interval
Monitor progress: Check status frequently
Version control: Commit before running Ralph
Resource limits: Set appropriate limits
Clear requirements: Write specific, testable criteria

Pre-flight Checklist¶

Before running Ralph: