Writing Tests
Testing is central to Eigenvue’s reliability. Every algorithm has both TypeScript and Python generators, and both must produce identical output. This page explains every layer of the testing strategy.
Test Fixture Format
Section titled “Test Fixture Format”Test fixtures are golden JSON files that define the expected step output for a specific set of inputs. They are the single source of truth that both TypeScript and Python tests validate against.
Fixture Location
Section titled “Fixture Location”Fixtures live alongside the TypeScript generator they test:
algorithms/classical/binary-search/tests/ found.fixture.json not-found.fixture.json single-element.fixture.json edge-left.fixture.json edge-right.fixture.json generator.test.tsFixture Schema
Section titled “Fixture Schema”{ "description": "Human-readable description of this test case.", "inputs": { "array": [1, 3, 5, 7, 9, 11, 13, 15, 17, 19], "target": 13 }, "expected": { "stepCount": 9, "terminalStepId": "found", "result": 6, "stepIds": [ "initialize", "calculate_mid", "search_right", "calculate_mid", "search_left", "calculate_mid", "search_right", "calculate_mid", "found" ], "keyStates": [ { "stepIndex": 0, "field": "left", "value": 0 }, { "stepIndex": 0, "field": "right", "value": 9 }, { "stepIndex": 1, "field": "mid", "value": 4 }, { "stepIndex": 8, "field": "result", "value": 6 } ] }}Field Reference
Section titled “Field Reference”| Field | Type | Description |
|---|---|---|
description | string | What this test case verifies |
inputs | object | Algorithm input parameters |
expected.stepCount | number | Total number of steps |
expected.terminalStepId | string | The id of the final (terminal) step |
expected.result | any | The result field in the terminal step’s state |
expected.stepIds | string[] | Ordered list of step IDs |
expected.keyStates | array | Spot-checks on specific state fields |
Fixture Design Guidelines
Section titled “Fixture Design Guidelines”- Cover the core paths. At minimum: success case, failure case, edge cases.
- Keep arrays small. 5—10 elements. Easy to compute expected values by hand.
- Use surgical state checks. Only assert the fields that matter for the test. Writing out the entire expected state of every step is brittle and hard to maintain.
- Name files descriptively.
found.fixture.json,not-found.fixture.json,single-element.fixture.json,edge-left.fixture.json.
TypeScript Generator Tests (Vitest)
Section titled “TypeScript Generator Tests (Vitest)”Test File Location
Section titled “Test File Location”algorithms/<category>/<algorithm>/tests/generator.test.tsTest Structure
Section titled “Test Structure”Every generator test file follows the same pattern:
import { describe, it, expect } from "vitest";import { runGenerator } from "@/engine/generator";import myGenerator from "../generator";
// Import fixturesimport foundFixture from "./found.fixture.json";import notFoundFixture from "./not-found.fixture.json";
// ── Helper ────────────────────────────────────────────────────────────────
function assertFixture(fixture: { description: string; inputs: Record<string, unknown>; expected: { stepCount: number; terminalStepId: string; result: unknown; stepIds: string[]; keyStates: Array<{ stepIndex: number; field: string; value: unknown }>; };}) { const result = runGenerator(myGenerator, fixture.inputs); const { steps } = result;
expect(steps.length).toBe(fixture.expected.stepCount);
const terminal = steps[steps.length - 1]!; expect(terminal.id).toBe(fixture.expected.terminalStepId); expect(terminal.isTerminal).toBe(true); expect(terminal.state.result).toBe(fixture.expected.result);
expect(steps.map((s) => s.id)).toEqual(fixture.expected.stepIds);
for (const check of fixture.expected.keyStates) { expect(steps[check.stepIndex]!.state[check.field]).toEqual(check.value); }}Test Categories
Section titled “Test Categories”Every generator test file should include these four categories:
1. Fixture Tests
Section titled “1. Fixture Tests”Validate against golden fixtures:
describe("Fixture Tests", () => { it("found: default example", () => assertFixture(foundFixture)); it("not found: target absent", () => assertFixture(notFoundFixture));});2. Format Compliance Tests
Section titled “2. Format Compliance Tests”Verify structural invariants of the step format:
describe("Format Compliance", () => { const result = runGenerator(myGenerator, defaultInputs);
it("produces a valid StepSequence", () => { expect(result.formatVersion).toBe(1); expect(result.algorithmId).toBe("my-algorithm"); expect(result.generatedBy).toBe("typescript"); });
it("all step indices are contiguous", () => { result.steps.forEach((step, i) => expect(step.index).toBe(i)); });
it("exactly one terminal step, it is the last", () => { const terminals = result.steps.filter((s) => s.isTerminal); expect(terminals.length).toBe(1); expect(terminals[0]!.index).toBe(result.steps.length - 1); });
it("all step IDs match pattern", () => { for (const step of result.steps) { expect(step.id).toMatch(/^[a-z0-9][a-z0-9_-]*$/); } });
it("all code highlight lines are positive integers", () => { for (const step of result.steps) { for (const line of step.codeHighlight.lines) { expect(Number.isInteger(line)).toBe(true); expect(line).toBeGreaterThanOrEqual(1); } } });});3. State Immutability Tests
Section titled “3. State Immutability Tests”Verify that state snapshots are independent:
describe("State Immutability", () => { it("step states are independent snapshots", () => { const result = runGenerator(myGenerator, someInputs); const arrays = result.steps.map((s) => s.state.array as number[]);
// All arrays should contain the same values. for (const arr of arrays) { expect(arr).toEqual(someInputs.array); }
// But they should NOT be the same reference. for (let i = 0; i < arrays.length - 1; i++) { expect(arrays[i]).not.toBe(arrays[i + 1]); } });
it("mutating one step's state does not affect others", () => { const result = runGenerator(myGenerator, someInputs); const step0Array = result.steps[0]!.state.array as number[]; step0Array[0] = 9999; expect((result.steps[1]!.state.array as number[])[0]).not.toBe(9999); });});4. Edge Case Tests
Section titled “4. Edge Case Tests”Cover boundary conditions specific to your algorithm:
describe("Edge Cases", () => { it("single element, target found", () => { const result = runGenerator(myGenerator, { array: [42], target: 42 }); const terminal = result.steps[result.steps.length - 1]!; expect(terminal.state.result).toBe(0); });
it("single element, target not found", () => { const result = runGenerator(myGenerator, { array: [42], target: 99 }); const terminal = result.steps[result.steps.length - 1]!; expect(terminal.state.result).toBe(-1); });});Running TypeScript Tests
Section titled “Running TypeScript Tests”cd web
# Run all testsnpm run test
# Run tests for a specific algorithmnpx vitest run algorithms/classical/binary-search/tests/
# Watch mode (re-runs on file changes)npx vitest watch algorithms/classical/binary-search/tests/
# With coverage reportnpx vitest run --coveragePython Generator Tests (pytest)
Section titled “Python Generator Tests (pytest)”Test Location
Section titled “Test Location”python/tests/generators/classical/test_binary_search.pyTest Structure
Section titled “Test Structure”"""Binary Search Generator — Python Tests."""
import pytestfrom eigenvue.runner import run_generator
class TestBinarySearchFound: """Test cases where the target is found."""
def test_default_example(self) -> None: steps = run_generator("binary-search", { "array": [1, 3, 5, 7, 9, 11, 13, 15, 17, 19], "target": 13, }) assert len(steps) == 9 assert steps[-1]["id"] == "found" assert steps[-1]["isTerminal"] is True assert steps[-1]["state"]["result"] == 6
def test_single_element(self) -> None: steps = run_generator("binary-search", { "array": [42], "target": 42, }) assert steps[-1]["id"] == "found" assert steps[-1]["state"]["result"] == 0
class TestBinarySearchNotFound: """Test cases where the target is not found."""
def test_target_absent(self) -> None: steps = run_generator("binary-search", { "array": [2, 4, 6, 8, 10], "target": 5, }) assert steps[-1]["id"] == "not_found" assert steps[-1]["state"]["result"] == -1
class TestBinarySearchInvariants: """Test structural invariants of the step sequence."""
def test_index_contiguity(self) -> None: steps = run_generator("binary-search", { "array": [1, 3, 5, 7, 9], "target": 5, }) for i, step in enumerate(steps): assert step["index"] == i
def test_single_terminal(self) -> None: steps = run_generator("binary-search") terminals = [s for s in steps if s.get("isTerminal", False)] assert len(terminals) == 1 assert terminals[0] is steps[-1]Running Python Tests
Section titled “Running Python Tests”cd python
# Run all testspytest
# Run tests for a specific modulepytest tests/generators/classical/test_binary_search.py
# Verbose outputpytest -v
# Stop on first failurepytest -x
# With coveragepytest --cov=eigenvueCross-Language Parity Tests
Section titled “Cross-Language Parity Tests”The parity test verifies that Python and TypeScript produce identical JSON output. This is the most important test in the project.
How It Works
Section titled “How It Works”The script scripts/verify-step-parity.py:
- Loads each golden fixture (already in camelCase JSON format).
- Deserializes it into Python
Stepdataclasses viafrom_dict(). - Re-serializes it back to camelCase dicts via
to_dict(). - Compares the re-serialized output to the original JSON byte-for-byte.
If the round-trip is clean, the Python dataclasses are proven wire-compatible with the TypeScript interfaces.
Running the Parity Test
Section titled “Running the Parity Test”python scripts/verify-step-parity.pyExpected output:
============================================================Cross-Language Step Parity Verification============================================================
Verifying: binary-search-found.fixture.json PASSVerifying: binary-search-not-found.fixture.json PASSVerifying: single-element.fixture.json PASSVerifying: minimal-valid.fixture.json PASS
============================================================ALL FIXTURES PASSED============================================================When Parity Fails
Section titled “When Parity Fails”See Cross-Language Parity for debugging strategies for common parity failures.
Layout Tests
Section titled “Layout Tests”Layout tests verify that layout functions produce correct primitives for known step data.
Test Location
Section titled “Test Location”web/src/engine/layouts/__tests__/What to Test
Section titled “What to Test”import { describe, it, expect } from "vitest";
describe("array-with-pointers layout", () => { it("produces N elements for an N-element array", () => { // Create a mock step with a known array // Call the layout function // Assert the number of element primitives });
it("all primitive IDs are unique", () => { // IDs must be unique for animation diffing const ids = scene.primitives.map((p) => p.id); expect(new Set(ids).size).toBe(ids.length); });
it("elements fit within canvas bounds", () => { for (const p of scene.primitives) { if (p.kind === "element") { expect(p.x).toBeGreaterThanOrEqual(0); expect(p.x).toBeLessThanOrEqual(canvasSize.width); expect(p.y).toBeGreaterThanOrEqual(0); expect(p.y).toBeLessThanOrEqual(canvasSize.height); } } });
it("handles empty array gracefully", () => { // Should return an empty primitives array, not throw });
it("processes highlightElement visual action", () => { // Create a step with a highlightElement action // Verify the corresponding element has the correct fill color });});E2E Tests (Playwright)
Section titled “E2E Tests (Playwright)”End-to-end tests verify that algorithms render correctly in the browser and that user interactions (stepping forward/backward, changing inputs) work.
Test Location
Section titled “Test Location”tests/e2e/Example E2E Test
Section titled “Example E2E Test”import { test, expect } from "@playwright/test";
test("binary search visualization loads and steps forward", async ({ page }) => { await page.goto("/algorithms/binary-search");
// Wait for the visualization to load await expect(page.locator("canvas")).toBeVisible();
// Verify the step counter starts at 1 await expect(page.locator("[data-testid='step-counter']")).toContainText("1");
// Click the "Next" button await page.click("[data-testid='next-step']");
// Verify the step counter advances await expect(page.locator("[data-testid='step-counter']")).toContainText("2");});
test("binary search handles custom inputs", async ({ page }) => { await page.goto("/algorithms/binary-search");
// Open the input editor await page.click("[data-testid='edit-inputs']");
// Modify the target value await page.fill("[data-testid='input-target']", "99");
// Run the algorithm await page.click("[data-testid='run-algorithm']");
// Should reach "not found" terminal state // Navigate to the last step await page.click("[data-testid='last-step']"); await expect(page.locator("[data-testid='step-title']")).toContainText("Not Found");});Running E2E Tests
Section titled “Running E2E Tests”# Install Playwright browsers (first time only)npx playwright install
# Run E2E testsnpx playwright test
# Run with UI mode (interactive)npx playwright test --ui
# Run a specific test filenpx playwright test tests/e2e/binary-search.spec.tsFloating-Point Tolerance
Section titled “Floating-Point Tolerance”Many algorithms involve floating-point arithmetic (attention weights, gradient values, loss computations). When comparing floating-point values across TypeScript and Python, use a tolerance of +/-1e-9.
In Vitest (TypeScript)
Section titled “In Vitest (TypeScript)”// For individual valuesexpect(actual).toBeCloseTo(expected, 9); // 9 decimal digits
// For arrays of floatsfor (let i = 0; i < actual.length; i++) { expect(actual[i]).toBeCloseTo(expected[i], 9);}In pytest (Python)
Section titled “In pytest (Python)”import pytest
# For individual valuesassert actual == pytest.approx(expected, abs=1e-9)
# For lists of floatsassert actual_list == pytest.approx(expected_list, abs=1e-9)Why 1e-9?
Section titled “Why 1e-9?”JavaScript uses IEEE 754 double-precision (64-bit) floats, and Python’s float
is also IEEE 754 double-precision. Both have about 15—17 significant decimal
digits of precision. A tolerance of 1e-9 allows for small accumulation errors
from different ordering of operations while still catching meaningful numerical
bugs.
Test Checklist
Section titled “Test Checklist”Before submitting a PR, verify:
- All Vitest tests pass:
cd web && npm run test - All pytest tests pass:
cd python && pytest - Cross-language parity passes:
python scripts/verify-step-parity.py - New algorithm has fixtures for: found, not found, single element
- Format compliance tests are included
- State immutability tests are included
- Edge case tests cover algorithm-specific boundaries
- No test uses random data (all tests are deterministic)