Troubleshooting Guide

Common Issues and Solutions

Installation Issues

Agent Not Found

Problem: ralph: command 'claude' not found

Solutions:

  1. Verify agent installation:
which claude
which gemini
which q
  2. Install missing agent:
# Claude
npm install -g @anthropic-ai/claude-code

# Gemini
npm install -g @google/gemini-cli
  3. Add to PATH:
export PATH=$PATH:/usr/local/bin
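
The `which` checks above can also be scripted. The following is a minimal sketch, assuming the agent executables are named `claude`, `gemini`, and `q` as in this guide; the helper names `find_agents` and `missing_agents` are illustrative and not part of Ralph.

```python
import shutil

# Agent executables named in this guide; adjust to match your setup.
AGENT_COMMANDS = ["claude", "gemini", "q"]


def find_agents(commands=AGENT_COMMANDS):
    """Map each command name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in commands}


def missing_agents(found):
    """Return the sorted names of agents that could not be located."""
    return sorted(name for name, path in found.items() if path is None)
```

Calling `missing_agents(find_agents())` returns the agents that still need to be installed or added to PATH.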

Permission Denied

Problem: Permission denied: './ralph'

Solution:

chmod +x ralph ralph_orchestrator.py

Execution Issues

Task Running Too Long

Problem: Ralph runs maximum iterations without achieving goals

Possible Causes:

  1. Unclear or overly complex task description
  2. Agent not making progress towards objectives
  3. Task scope too large for iteration limits

Solutions:

  1. Check iteration progress and logs:
ralph status
  2. Break down complex tasks:
# Instead of:

Build a complete web application

# Try:

Create a Flask app with one endpoint that returns "Hello World"
  3. Increase iteration limits or try a different agent:
ralph run --max-iterations 200
ralph run --agent gemini

Agent Timeout

Problem: Agent execution timed out

Solutions:

  1. Increase timeout:
# In ralph.json
{
  "timeout_per_iteration": 600
}
  2. Reduce prompt complexity:
      • Break large tasks into smaller ones
      • Remove unnecessary context
  3. Check system resources:
htop
free -h
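
If you prefer not to edit `ralph.json` by hand, the timeout can be changed with a small script. This is a sketch under assumptions: it treats `ralph.json` as a flat JSON object and leaves all other keys untouched; `set_timeout` is an illustrative helper, not a Ralph API, and the unit of `timeout_per_iteration` is taken to be whatever Ralph expects (the guide does not state it).

```python
import json
from pathlib import Path


def set_timeout(config_path, value):
    """Set timeout_per_iteration in a JSON config, preserving other keys."""
    path = Path(config_path)
    # Start from the existing config if there is one, else an empty object.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config["timeout_per_iteration"] = int(value)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `set_timeout("ralph.json", 600)` writes the value shown above.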

Repeated Errors

Problem: Same error occurs in multiple iterations

Solutions:

  1. Check error pattern:
cat .agent/metrics/state_*.json | jq '.errors'
  2. Clear workspace and retry:
ralph clean
ralph run
  3. Manual intervention:
      • Fix the specific issue
      • Add clarification to PROMPT.md
      • Resume execution
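
To see which errors actually repeat, it can help to tally them across state files rather than reading raw `jq` output. The sketch below assumes each state file is a JSON object with an `errors` list (the key used in the `jq` command above); the shape of each entry is not documented here, so entries are normalized to text. `count_errors` is a hypothetical helper.

```python
import glob
import json
from collections import Counter


def count_errors(pattern=".agent/metrics/state_*.json"):
    """Tally error entries across state files, skipping unreadable ones."""
    counts = Counter()
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, encoding="utf-8") as fh:
                state = json.load(fh)
        except (OSError, ValueError):
            continue  # unreadable or corrupted state file
        errors = state.get("errors") if isinstance(state, dict) else None
        if not isinstance(errors, list):
            continue
        for err in errors:
            # Assumption: entries are strings or JSON objects; normalize to text.
            key = err if isinstance(err, str) else json.dumps(err, sort_keys=True)
            counts[key] += 1
    return counts
```

`count_errors().most_common(3)` then lists the three most frequent entries, which is usually enough to tell a recurring failure from one-off noise.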

Loop Detection Issues

Problem: Loop detected: XX% similarity to previous output

Ralph's loop detection triggers when agent output is ≥90% similar to any of the last 5 outputs.
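
The rule above can be illustrated with a short sketch. It uses the 90% threshold and five-output window from the text, but the similarity measure is an assumption: this guide does not say how Ralph computes similarity, so Python's `difflib.SequenceMatcher` stands in for it. `LoopDetector` is an illustrative class, not Ralph's implementation.

```python
from collections import deque
from difflib import SequenceMatcher


class LoopDetector:
    """Flag outputs that closely match any recent output."""

    def __init__(self, threshold=0.90, history_size=5):
        self.threshold = threshold
        # Only the most recent `history_size` outputs are compared.
        self.history = deque(maxlen=history_size)

    def check(self, output):
        """Return (is_loop, best_similarity), then record the output."""
        best = max(
            (SequenceMatcher(None, output, prev).ratio() for prev in self.history),
            default=0.0,
        )
        self.history.append(output)
        return best >= self.threshold, best
```

Because the comparison is against any of the last five outputs, an agent that alternates between two near-identical messages is flagged just like one that repeats a single message.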

Possible Causes:

  1. Agent is stuck on the same subtask
  2. Agent producing similar "working on it" messages
  3. API errors causing identical retry messages
  4. Task requires same action repeatedly (false positive)

Solutions:

  1. Check if it's a legitimate loop:
# Review recent outputs
ls -lt .agent/prompts/ | head -10
# N and N-1 are placeholders for two consecutive iteration numbers
diff .agent/prompts/prompt_N.md .agent/prompts/prompt_N-1.md
  2. Improve prompt to encourage variety:
# Add explicit progress tracking

## Current Status

Document what step you're on and what has changed since last iteration.
  3. Break down the task:
      • If the agent keeps doing the same thing, the task may need restructuring
      • Split into smaller, more distinct subtasks
  4. Check for underlying issues:
      • API errors causing retries
      • Permission issues blocking progress
      • Missing dependencies

Completion Marker Not Detected

Problem: Ralph continues running despite TASK_COMPLETE marker

Possible Causes:

  1. Incorrect marker format
  2. Invisible characters or encoding issues
  3. Marker buried in code block

Solutions:

  1. Use exact format:
# Correct formats:

- [x] TASK_COMPLETE
      [x] TASK_COMPLETE

# Incorrect (won't trigger):

- [ ] TASK_COMPLETE # Not checked
      TASK_COMPLETE # No checkbox
- [X] TASK_COMPLETE # Capital X
  2. Check for hidden characters:
cat -A PROMPT.md | grep TASK_COMPLETE
  3. Ensure marker is on its own line:
# Good - on its own line

- [x] TASK_COMPLETE

# Bad - inside code block

```markdown
- [x] TASK_COMPLETE # Inside code block - won't work
```

  4. Verify encoding:
file PROMPT.md
# Should report UTF-8 or ASCII text (exact wording varies by `file` version)
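
The format rules above can be checked locally before a run. The following sketch encodes them as stated in this section (checked lowercase `x`, optional leading `- `, marker alone on its line, not inside a fenced block); it is an approximation written for this guide, not Ralph's actual parser, and `has_completion_marker` is a hypothetical name.

```python
import re

FENCE = "`" * 3  # Markdown code-fence delimiter

# "- [x] TASK_COMPLETE" or "[x] TASK_COMPLETE", alone on a line, lowercase x only.
MARKER_RE = re.compile(r"^\s*(?:-\s+)?\[x\]\s+TASK_COMPLETE\s*$")


def has_completion_marker(text):
    """Return True if a checked marker sits on its own line outside code fences."""
    in_fence = False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # entering or leaving a fenced block
            continue
        if not in_fence and MARKER_RE.match(line):
            return True
    return False
```

A zero-width character after the marker makes the line fail the match, which is the kind of invisible-character problem the `cat -A` check is meant to expose.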

Git Issues

Checkpoint Failed

Problem: Failed to create checkpoint

Solutions:

  1. Initialize Git repository:
git init
git add .
git commit -m "Initial commit"
  2. Check Git status:
git status
  3. Fix Git configuration:
git config user.email "you@example.com"
git config user.name "Your Name"

Uncommitted Changes Warning

Problem: Uncommitted changes detected

Solutions:

  1. Commit changes:
git add .
git commit -m "Save work"
  2. Stash changes:
git stash
ralph run
git stash pop
  3. Disable Git operations:
ralph run --no-git
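
To check for uncommitted changes from a script before starting a run, `git status --porcelain` gives stable, machine-readable output. This is a minimal sketch; the helper names are illustrative, and it only handles the default (v1) porcelain format, where each line is a two-character status, a space, and a path.

```python
import subprocess


def parse_porcelain(output):
    """Split `git status --porcelain` output into (status, path) pairs."""
    entries = []
    for line in output.splitlines():
        if len(line) > 3:
            entries.append((line[:2], line[3:]))
    return entries


def uncommitted_changes(repo="."):
    """Return (status, path) pairs for a repository; empty list means clean."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)
```

An empty list from `uncommitted_changes()` means the working tree is clean and the warning should not appear.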

Context Issues

Context Window Exceeded

Problem: Context window limit exceeded

Symptoms:

  • Agent forgets earlier instructions
  • Incomplete responses
  • Errors about missing information

Solutions:

  1. Reduce file sizes:
# Split large files
split -l 500 large_file.py part_
  2. Use a more concise prompt:
# Remove unnecessary details

# Focus on current task
  3. Switch to a higher-context agent:
# Claude has 200K context
ralph run --agent claude
  4. Clear iteration history:
rm .agent/prompts/prompt_*.md
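
A rough size estimate can show whether the prompt and its files are anywhere near the limit. The sketch below is only a heuristic: it assumes about four characters per token, which is a common rule of thumb rather than a real tokenizer, and uses the 200K figure quoted above as the default limit. Both function names are hypothetical.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4             # rough rule of thumb, not a real tokenizer
CONTEXT_LIMIT_TOKENS = 200_000  # the Claude figure quoted above


def estimate_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Approximate token count by character length, rounded up."""
    return -(-len(text) // chars_per_token)  # ceiling division


def context_usage(paths, limit=CONTEXT_LIMIT_TOKENS):
    """Return (estimated_tokens, fraction_of_limit) for a set of files."""
    total = sum(
        estimate_tokens(Path(p).read_text(encoding="utf-8", errors="replace"))
        for p in paths
    )
    return total, total / limit
```

For example, `context_usage(["PROMPT.md"])` returns the estimate for the prompt alone; treat the result as an order-of-magnitude indication only.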

Performance Issues

Slow Execution

Problem: Iterations taking too long

Solutions:

  1. Check system resources:
top
df -h
iostat
  2. Reduce parallel operations:
      • Close other applications
      • Limit background processes
  3. Use a faster agent:
# Q is typically faster
ralph run --agent q

High Memory Usage

Problem: Ralph consuming excessive memory

Solutions:

  1. Set resource limits:
# In ralph.json
{
  "resource_limits": {
    "memory_mb": 2048
  }
}
  2. Clean old state files:
find .agent -name "*.json" -mtime +7 -delete
  3. Restart Ralph:
pkill -f ralph_orchestrator
ralph run

State and Metrics Issues

Corrupted State File

Problem: Invalid state file

Solutions:

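  1. Remove corrupted file:
rm .agent/metrics/state_latest.json
  2. Restore from backup:
# Copy the newest other state file (cp fails if the glob matches several files)
cp "$(ls -t .agent/metrics/state_*.json | grep -v state_latest | head -n 1)" .agent/metrics/state_latest.json
  3. Reset state:
ralph clean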
Missing Metrics

Problem: No metrics being collected

Solutions:

  1. Check metrics directory:
ls -la .agent/metrics/
  2. Create directory if missing:
mkdir -p .agent/metrics
  3. Check permissions:
chmod 755 .agent/metrics

Error Messages

Common Error Codes

| Error | Meaning | Solution |
|---|---|---|
| Exit code 1 | General failure | Check logs for details |
| Exit code 130 | Interrupted (Ctrl+C) | Normal interruption |
| Exit code 137 | Killed (out of memory) | Increase memory limits |
| Exit code 124 | Timeout | Increase timeout value |
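
When wrapping Ralph in a script, the codes listed above can be turned into hints automatically. The sketch below takes its messages from this table and adds one general fact: shells report a process killed by signal N as exit code 128 + N, which is why 130 corresponds to SIGINT and 137 to SIGKILL. `explain_exit_code` is a hypothetical helper.

```python
import signal

# Meanings taken from the table above.
EXIT_CODE_HINTS = {
    1: "General failure: check logs for details",
    124: "Timeout: increase timeout value",
    130: "Interrupted (Ctrl+C): normal interruption",
    137: "Killed (out of memory): increase memory limits",
}


def explain_exit_code(code):
    """Return a hint for a shell-style exit code (0-255)."""
    if code == 0:
        return "Success"
    if code in EXIT_CODE_HINTS:
        return EXIT_CODE_HINTS[code]
    if code > 128:
        # Shells report death-by-signal as 128 + signal number.
        try:
            return f"Terminated by {signal.Signals(code - 128).name}"
        except ValueError:
            pass  # not a known signal number on this platform
    return "Unknown exit code"
```

Note that exit code 137 only says the process received SIGKILL; out-of-memory is the most common cause but not the only one.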

Agent-Specific Errors

Claude Errors

"Rate limit exceeded"

Solution: Add delay between iterations or upgrade API plan

"Invalid API key"

Solution: Check Claude CLI configuration

Gemini Errors

"Quota exceeded"

Solution: Wait for quota reset or upgrade plan

"Model not available"

Solution: Check Gemini CLI version and update

Q Chat Errors

"Connection refused"

Solution: Ensure Q service is running

Debug Mode

Enable Verbose Logging

# Maximum verbosity
ralph run --verbose

# With debug environment
DEBUG=1 ralph run

# Save logs
ralph run --verbose 2>&1 | tee debug.log

Inspect Execution

# Add debug points in PROMPT.md
print("DEBUG: Reached checkpoint 1")

Trace Execution

# Trace system calls
strace -o trace.log ralph run

# Profile Python execution
python -m cProfile ralph_orchestrator.py

Recovery Procedures

From Failed State

  1. Save current state:
cp -r .agent .agent.backup
  2. Analyze failure:
tail -n 100 .agent/logs/ralph.log
  3. Fix issue:
      • Update PROMPT.md
      • Fix code errors
      • Clear problematic files
  4. Resume or restart:
# Resume from checkpoint
ralph run

# Or start fresh
ralph clean && ralph run

From Git Checkpoint

# List checkpoints
git log --oneline | grep checkpoint

# Reset to checkpoint
git reset --hard <commit-hash>

# Resume execution
ralph run

Getting Help

Self-Diagnosis

Create and run a diagnostic script:

cat > diagnose.sh << 'EOF'
#!/bin/bash
echo "Ralph Orchestrator Diagnostic"
echo "============================"
echo "Agents available:"
which claude && echo "  ✓ Claude" || echo "  ✗ Claude"
which gemini && echo "  ✓ Gemini" || echo "  ✗ Gemini"
which q && echo "  ✓ Q" || echo "  ✗ Q"
echo ""
echo "Git status:"
git status --short
echo ""
echo "Ralph status:"
./ralph status
echo ""
echo "Recent errors:"
grep ERROR .agent/logs/*.log 2>/dev/null | tail -5
EOF
chmod +x diagnose.sh
./diagnose.sh

Community Support

  1. GitHub Issues: Report bugs
  2. Discussions: Ask questions
  3. Discord: Join the community chat

Reporting Bugs

Include in bug reports:

  1. Ralph version: ralph --version
  2. Agent versions
  3. Error messages
  4. PROMPT.md content
  5. Diagnostic output
  6. Steps to reproduce

Prevention Tips

Best Practices

  1. Start simple: Test with basic tasks first
  2. Regular checkpoints: Use default 5-iteration interval
  3. Monitor progress: Check status frequently
  4. Version control: Commit before running Ralph
  5. Resource limits: Set appropriate limits
  6. Clear requirements: Write specific, testable criteria

Pre-flight Checklist

Before running Ralph:

  • PROMPT.md is clear and specific
  • Git repository is clean
  • Agents are installed and working
  • Sufficient disk space available
  • No sensitive data in prompt
  • Backup important files