Testing & Validation¶

Comprehensive testing approaches for Ralph development and validation.

Test Types¶

Type	Purpose	Speed	Cost
Unit Tests	Test individual functions	Fast	Free
Smoke Tests	Replay recorded sessions	Fast	Free
E2E Tests	Validate against real backends	Slow	API costs
TUI Validation	Verify terminal rendering	Medium	Free

Running Tests¶

All Tests¶

cargo test

This includes unit tests and smoke tests (344+ tests total).

Smoke Tests Only¶

cargo test -p ralph-core smoke_runner

Kiro-Specific Tests¶

cargo test -p ralph-core kiro

E2E Tests¶

# All backends
cargo run -p ralph-e2e -- all

# Specific backend
cargo run -p ralph-e2e -- claude

# List scenarios
cargo run -p ralph-e2e -- --list

Smoke Tests¶

Smoke tests use recorded JSONL fixtures instead of live API calls — fast, free, deterministic.

How They Work¶

Record a session to JSONL
Replay during tests
Verify expected behavior

Fixture Locations¶

crates/ralph-core/tests/fixtures/
├── basic_session.jsonl          # Claude CLI session
└── kiro/                         # Kiro sessions
    ├── basic.jsonl
    ├── tool_use.jsonl
    └── autonomous.jsonl

Recording New Fixtures¶

# Record a session
ralph run -c ralph.yml --record-session session.jsonl -p "your prompt"

# Or capture raw CLI output
claude -p "your prompt" 2>&1 | tee output.txt

Fixture Format¶

JSONL with one event per line:

{"type":"output","content":"Starting task...","timestamp":"2024-01-21T10:00:00Z"}
{"type":"tool_call","tool":"read_file","args":{"path":"src/lib.rs"}}
{"type":"tool_result","result":"...contents..."}
{"type":"output","content":"LOOP_COMPLETE"}

E2E Tests¶

End-to-end tests validate against real AI backends.

Test Tiers¶

Tier	Focus	Scenarios
1	Connectivity	Backend availability, auth
2	Orchestration	Single/multi iteration
3	Events	Parsing, routing
4	Capabilities	Tool use, streaming
5	Hat Collections	Workflows, routing
6	Memory	Add, search, inject
7	Error Handling	Timeout, limits

Running E2E Tests¶

# All tests for Claude
cargo run -p ralph-e2e -- claude

# All available backends
cargo run -p ralph-e2e -- all

# Fast mode (skip analysis)
cargo run -p ralph-e2e -- claude --skip-analysis

# Debug mode (keep workspaces)
cargo run -p ralph-e2e -- claude --keep-workspace --verbose

E2E Reports¶

Generated in .e2e-tests/:

.e2e-tests/
├── report.md      # Human-readable Markdown
├── report.json    # Machine-readable JSON
└── claude-connect/  # Test workspace (with --keep-workspace)

E2E Orchestration¶

For E2E test development, use isolated config:

# E2E test development
ralph run -c ralph.e2e.yml -p "fix e2e tests"

This uses separate scratchpad to avoid pollution.

TUI Validation¶

Validate Terminal UI rendering using LLM-as-judge.

Quick Start¶

# Validate from captured output
/tui-validate file:output.txt criteria:ralph-header

# Validate live TUI via tmux
/tui-validate tmux:ralph-session criteria:ralph-full

# Custom criteria
/tui-validate command:"cargo run --example tui" criteria:"Shows header"

Built-in Criteria¶

Criteria	Validates
`ralph-header`	Iteration count, elapsed time, hat display
`ralph-footer`	Activity indicator, event topic
`ralph-full`	Complete layout and hierarchy
`tui-basic`	Has content, no artifacts

Live TUI Capture¶

# 1. Start TUI in tmux
tmux new-session -d -s ralph-test -x 100 -y 30
tmux send-keys -t ralph-test "ralph run -p 'test'" Enter

# 2. Wait for render
sleep 3

# 3. Capture
tmux capture-pane -t ralph-test -p -e > tui-capture.txt

# 4. Validate
/tui-validate file:tui-capture.txt criteria:ralph-header

Prerequisites¶

brew install charmbracelet/tap/freeze  # Screenshot tool
brew install tmux                       # Live capture

Linting¶

# Check formatting
cargo fmt --check

# Run clippy
cargo clippy --all-targets --all-features

Pre-commit Hooks¶

Install hooks:

./scripts/setup-hooks.sh

Hooks run cargo fmt --check and cargo clippy before each commit.

Testing Best Practices¶

1. Run Tests After Changes¶

cargo test  # Always run before declaring done

2. Prefer Smoke Tests¶

For new features, create replay fixtures rather than relying on live APIs.

3. Use E2E for Integration¶

E2E tests are expensive but catch integration issues.

4. Validate TUI Changes¶

After modifying ralph-tui, use TUI validation.

5. Keep Fixtures Updated¶

When behavior changes, update corresponding fixtures.

Creating New Tests¶

Unit Test¶

#[test]
fn test_event_parsing() {
    let input = r#"ralph emit "build.done" "tests pass""#;
    let event = parse_event(input).unwrap();
    assert_eq!(event.topic, "build.done");
}

Smoke Test¶

Record session: --record-session fixture.jsonl
Place in tests/fixtures/
Add test case referencing fixture

E2E Scenario¶

pub struct MyScenario;

impl E2EScenario for MyScenario {
    fn name(&self) -> &str { "my-scenario" }
    fn tier(&self) -> u8 { 3 }

    async fn run(&self, ctx: &E2EContext) -> E2EResult {
        // Test implementation
    }
}

Next Steps¶

Explore Diagnostics for debugging
Learn about Architecture
See the Contributing Guide