bmad initialization

2025-11-01 19:22:39 +08:00
parent 5b21dc0bd5
commit 426ae41f54
447 changed files with 80633 additions and 0 deletions


@@ -0,0 +1,675 @@
# CI Pipeline and Burn-In Strategy
## Principle
CI pipelines must execute tests reliably and quickly, and they must surface clear feedback. Burn-in testing (running changed tests multiple times) flushes out flakiness before merge. Stage jobs strategically: install/cache once, run changed specs first for fast feedback, then shard full suites with fail-fast disabled to preserve evidence.
## Rationale
CI is the quality gate for production. A poorly configured pipeline either wastes developer time (slow feedback, false positives) or ships broken code (false negatives, insufficient coverage). Burn-in testing ensures reliability by stress-testing changed code, while parallel execution and intelligent test selection optimize speed without sacrificing thoroughness.
## Pattern Examples
### Example 1: GitHub Actions Workflow with Parallel Execution
**Context**: Production-ready CI/CD pipeline for E2E tests with caching, parallelization, and burn-in testing.
**Implementation**:
```yaml
# .github/workflows/e2e-tests.yml
name: E2E Tests
on:
pull_request:
push:
branches: [main, develop]
env:
NODE_VERSION_FILE: '.nvmrc'
CACHE_KEY: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
jobs:
install-dependencies:
name: Install & Cache Dependencies
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version-file: ${{ env.NODE_VERSION_FILE }}
cache: 'npm'
- name: Cache node modules
uses: actions/cache@v4
id: npm-cache
with:
path: |
~/.npm
node_modules
~/.cache/Cypress
~/.cache/ms-playwright
key: ${{ env.CACHE_KEY }}
restore-keys: |
${{ runner.os }}-node-
- name: Install dependencies
if: steps.npm-cache.outputs.cache-hit != 'true'
run: npm ci --prefer-offline --no-audit
- name: Install Playwright browsers
if: steps.npm-cache.outputs.cache-hit != 'true'
run: npx playwright install --with-deps chromium
test-changed-specs:
name: Test Changed Specs First (Burn-In)
needs: install-dependencies
runs-on: ubuntu-latest
timeout-minutes: 15
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0 # Full history for accurate diff
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version-file: ${{ env.NODE_VERSION_FILE }}
cache: 'npm'
- name: Restore dependencies
uses: actions/cache@v4
with:
path: |
~/.npm
node_modules
~/.cache/ms-playwright
key: ${{ env.CACHE_KEY }}
- name: Detect changed test files
id: changed-tests
run: |
CHANGED_SPECS=$(git diff --name-only origin/main...HEAD | grep -E '\.(spec|test)\.(ts|js|tsx|jsx)$' | tr '\n' ' ' || echo "")
echo "changed_specs=${CHANGED_SPECS}" >> $GITHUB_OUTPUT
echo "Changed specs: ${CHANGED_SPECS}"
- name: Run burn-in on changed specs (10 iterations)
if: steps.changed-tests.outputs.changed_specs != ''
run: |
SPECS="${{ steps.changed-tests.outputs.changed_specs }}"
echo "Running burn-in: 10 iterations on changed specs"
for i in {1..10}; do
echo "Burn-in iteration $i/10"
npm run test -- $SPECS || {
echo "❌ Burn-in failed on iteration $i"
exit 1
}
done
echo "✅ Burn-in passed - 10/10 successful runs"
- name: Upload artifacts on failure
if: failure()
uses: actions/upload-artifact@v4
with:
name: burn-in-failure-artifacts
path: |
test-results/
playwright-report/
screenshots/
retention-days: 7
test-e2e-sharded:
name: E2E Tests (Shard ${{ matrix.shard }}/${{ strategy.job-total }})
needs: [install-dependencies, test-changed-specs]
runs-on: ubuntu-latest
timeout-minutes: 30
strategy:
fail-fast: false # Run all shards even if one fails
matrix:
shard: [1, 2, 3, 4]
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version-file: ${{ env.NODE_VERSION_FILE }}
cache: 'npm'
- name: Restore dependencies
uses: actions/cache@v4
with:
path: |
~/.npm
node_modules
~/.cache/ms-playwright
key: ${{ env.CACHE_KEY }}
- name: Run E2E tests (shard ${{ matrix.shard }})
run: npm run test:e2e -- --shard=${{ matrix.shard }}/4
env:
TEST_ENV: staging
CI: true
- name: Upload test results
if: always()
uses: actions/upload-artifact@v4
with:
name: test-results-shard-${{ matrix.shard }}
path: |
test-results/
playwright-report/
retention-days: 30
- name: Upload JUnit report
if: always()
uses: actions/upload-artifact@v4
with:
name: junit-results-shard-${{ matrix.shard }}
path: test-results/junit.xml
retention-days: 30
merge-test-results:
name: Merge Test Results & Generate Report
needs: test-e2e-sharded
runs-on: ubuntu-latest
if: always()
steps:
- name: Download all shard results
uses: actions/download-artifact@v4
with:
pattern: test-results-shard-*
path: all-results/
- name: Merge HTML reports
run: |
# Note: merge-reports consumes blob reports; run shards with --reporter=blob so this step has input
npx playwright merge-reports --reporter=html all-results/
echo "Merged report available in playwright-report/"
- name: Upload merged report
uses: actions/upload-artifact@v4
with:
name: merged-playwright-report
path: playwright-report/
retention-days: 30
- name: Comment PR with results
if: github.event_name == 'pull_request'
uses: daun/playwright-report-comment@v3
with:
report-path: playwright-report/
```
**Key Points**:
- **Install once, reuse everywhere**: Dependencies cached across all jobs
- **Burn-in first**: Changed specs run 10x before full suite
- **Fail-fast disabled**: All shards run to completion for full evidence
- **Parallel execution**: 4 shards cut execution time by ~75%
- **Artifact retention**: 30 days for reports, 7 days for failure debugging
---
### Example 2: Burn-In Loop Pattern (Standalone Script)
**Context**: Reusable bash script for burn-in testing changed specs locally or in CI.
**Implementation**:
```bash
#!/bin/bash
# scripts/burn-in-changed.sh
# Usage: ./scripts/burn-in-changed.sh [iterations] [base-branch]
set -eo pipefail # Exit on error; propagate failures through pipes (needed for the tee below)
# Configuration
ITERATIONS=${1:-10}
BASE_BRANCH=${2:-main}
SPEC_PATTERN='\.(spec|test)\.(ts|js|tsx|jsx)$'
echo "🔥 Burn-In Test Runner"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Iterations: $ITERATIONS"
echo "Base branch: $BASE_BRANCH"
echo ""
# Detect changed test files
echo "📋 Detecting changed test files..."
CHANGED_SPECS=$(git diff --name-only $BASE_BRANCH...HEAD | grep -E "$SPEC_PATTERN" || echo "")
if [ -z "$CHANGED_SPECS" ]; then
echo "✅ No test files changed. Skipping burn-in."
exit 0
fi
echo "Changed test files:"
echo "$CHANGED_SPECS" | sed 's/^/ - /'
echo ""
# Count specs
SPEC_COUNT=$(echo "$CHANGED_SPECS" | wc -l | xargs)
echo "Running burn-in on $SPEC_COUNT test file(s)..."
echo ""
# Burn-in loop
FAILURES=()
for i in $(seq 1 $ITERATIONS); do
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "🔄 Iteration $i/$ITERATIONS"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
# Run tests with explicit file list
if npm run test -- $CHANGED_SPECS 2>&1 | tee "burn-in-log-$i.txt"; then
echo "✅ Iteration $i passed"
else
echo "❌ Iteration $i failed"
FAILURES+=($i)
# Save failure artifacts
mkdir -p burn-in-failures/iteration-$i
cp -r test-results/ burn-in-failures/iteration-$i/ 2>/dev/null || true
cp -r screenshots/ burn-in-failures/iteration-$i/ 2>/dev/null || true
echo ""
echo "🛑 BURN-IN FAILED on iteration $i"
echo "Failure artifacts saved to: burn-in-failures/iteration-$i/"
echo "Logs saved to: burn-in-log-$i.txt"
echo ""
exit 1
fi
echo ""
done
# Success summary
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "🎉 BURN-IN PASSED"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "All $ITERATIONS iterations passed for $SPEC_COUNT test file(s)"
echo "Changed specs are stable and ready to merge."
echo ""
# Cleanup logs
rm -f burn-in-log-*.txt
exit 0
```
**Usage**:
```bash
# Run locally with default settings (10 iterations, compare to main)
./scripts/burn-in-changed.sh
# Custom iterations and base branch
./scripts/burn-in-changed.sh 20 develop
# Add to package.json
{
"scripts": {
"test:burn-in": "bash scripts/burn-in-changed.sh",
"test:burn-in:strict": "bash scripts/burn-in-changed.sh 20"
}
}
```
**Key Points**:
- **Exit on first failure**: Flaky tests caught immediately
- **Failure artifacts**: Saved per-iteration for debugging
- **Flexible configuration**: Iterations and base branch customizable
- **CI/local parity**: Same script runs in both environments (see the CI invocation sketch below)
- **Clear output**: Visual feedback on progress and results
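For CI/local parity, the script can be invoked as-is in a workflow step. One nuance: pass the remote-tracking ref, since a fresh CI clone usually has no local `main` (a sketch, assuming a `fetch-depth: 0` checkout as in Example 1):
```bash
# CI invocation: diff against origin/main because local main may not exist in a fresh clone
./scripts/burn-in-changed.sh 10 origin/main
```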
---
### Example 3: Shard Orchestration with Result Aggregation
**Context**: Advanced sharding strategy for large test suites with intelligent result merging.
**Implementation**:
```javascript
// scripts/run-sharded-tests.js
const { spawn } = require('child_process');
const fs = require('fs');
const path = require('path');
/**
* Run tests across multiple shards and aggregate results
* Usage: SHARD_COUNT=4 TEST_ENV=staging node scripts/run-sharded-tests.js
*/
const SHARD_COUNT = parseInt(process.env.SHARD_COUNT || '4');
const TEST_ENV = process.env.TEST_ENV || 'local';
const RESULTS_DIR = path.join(__dirname, '../test-results');
console.log(`🚀 Running tests across ${SHARD_COUNT} shards`);
console.log(`Environment: ${TEST_ENV}`);
console.log('━'.repeat(50));
// Ensure results directory exists
if (!fs.existsSync(RESULTS_DIR)) {
fs.mkdirSync(RESULTS_DIR, { recursive: true });
}
/**
* Run a single shard
*/
function runShard(shardIndex) {
return new Promise((resolve, reject) => {
const shardId = `${shardIndex}/${SHARD_COUNT}`;
console.log(`\n📦 Starting shard ${shardId}...`);
const child = spawn('npx', ['playwright', 'test', `--shard=${shardId}`, '--reporter=json'], {
env: { ...process.env, TEST_ENV, SHARD_INDEX: shardIndex },
stdio: 'pipe',
});
let stdout = '';
let stderr = '';
child.stdout.on('data', (data) => {
stdout += data.toString();
process.stdout.write(data);
});
child.stderr.on('data', (data) => {
stderr += data.toString();
process.stderr.write(data);
});
child.on('close', (code) => {
// Save shard results
const resultFile = path.join(RESULTS_DIR, `shard-${shardIndex}.json`);
try {
const result = JSON.parse(stdout);
fs.writeFileSync(resultFile, JSON.stringify(result, null, 2));
console.log(`✅ Shard ${shardId} completed (exit code: ${code})`);
resolve({ shardIndex, code, result });
} catch (error) {
console.error(`❌ Shard ${shardId} failed to parse results:`, error.message);
reject({ shardIndex, code, error });
}
});
child.on('error', (error) => {
console.error(`❌ Shard ${shardId} process error:`, error.message);
reject({ shardIndex, error });
});
});
}
/**
* Aggregate results from all shards
*/
function aggregateResults() {
console.log('\n📊 Aggregating results from all shards...');
const shardResults = [];
let totalTests = 0;
let totalPassed = 0;
let totalFailed = 0;
let totalSkipped = 0;
let totalFlaky = 0;
for (let i = 1; i <= SHARD_COUNT; i++) {
const resultFile = path.join(RESULTS_DIR, `shard-${i}.json`);
if (fs.existsSync(resultFile)) {
const result = JSON.parse(fs.readFileSync(resultFile, 'utf8'));
shardResults.push(result);
// Aggregate stats
totalTests +=
  (result.stats?.expected || 0) + (result.stats?.unexpected || 0) + (result.stats?.skipped || 0) + (result.stats?.flaky || 0);
totalPassed += result.stats?.expected || 0;
totalFailed += result.stats?.unexpected || 0;
totalSkipped += result.stats?.skipped || 0;
totalFlaky += result.stats?.flaky || 0;
}
}
const summary = {
totalShards: SHARD_COUNT,
environment: TEST_ENV,
totalTests,
passed: totalPassed,
failed: totalFailed,
skipped: totalSkipped,
flaky: totalFlaky,
duration: shardResults.reduce((acc, r) => acc + (r.stats?.duration || 0), 0),
timestamp: new Date().toISOString(),
};
// Save aggregated summary
fs.writeFileSync(path.join(RESULTS_DIR, 'summary.json'), JSON.stringify(summary, null, 2));
console.log('\n━'.repeat(50));
console.log('📈 Test Results Summary');
console.log('━'.repeat(50));
console.log(`Total tests: ${totalTests}`);
console.log(`✅ Passed: ${totalPassed}`);
console.log(`❌ Failed: ${totalFailed}`);
console.log(`⏭️ Skipped: ${totalSkipped}`);
console.log(`⚠️ Flaky: ${totalFlaky}`);
console.log(`⏱️ Duration: ${(summary.duration / 1000).toFixed(2)}s`);
console.log('━'.repeat(50));
return summary;
}
/**
* Main execution
*/
async function main() {
const startTime = Date.now();
const shardPromises = [];
// Run all shards in parallel
for (let i = 1; i <= SHARD_COUNT; i++) {
shardPromises.push(runShard(i));
}
// Promise.allSettled never rejects, so inspect the settled results directly
const settled = await Promise.allSettled(shardPromises);
const failures = settled.filter((result) => result.status === 'rejected');
if (failures.length > 0) {
  console.error(`❌ ${failures.length} shard(s) failed to complete`);
}
// Aggregate results
const summary = aggregateResults();
const totalTime = ((Date.now() - startTime) / 1000).toFixed(2);
console.log(`\n⏱ Total execution time: ${totalTime}s`);
// Exit with failure if any tests failed
if (summary.failed > 0) {
console.error('\n❌ Test suite failed');
process.exit(1);
}
console.log('\n✅ All tests passed');
process.exit(0);
}
main().catch((error) => {
console.error('Fatal error:', error);
process.exit(1);
});
```
**package.json integration**:
```json
{
"scripts": {
"test:sharded": "node scripts/run-sharded-tests.js",
"test:sharded:ci": "SHARD_COUNT=8 TEST_ENV=staging node scripts/run-sharded-tests.js"
}
}
```
**Key Points**:
- **Parallel shard execution**: All shards run simultaneously
- **Result aggregation**: Unified summary across shards
- **Failure detection**: Exit code reflects overall test status
- **Artifact preservation**: Individual shard results saved for debugging
- **CI/local compatibility**: Same script works in both environments
---
### Example 4: Selective Test Execution (Changed Files + Tags)
**Context**: Optimize CI by running only relevant tests based on file changes and tags.
**Implementation**:
```bash
#!/bin/bash
# scripts/selective-test-runner.sh
# Intelligent test selection based on changed files and test tags
set -e
BASE_BRANCH=${BASE_BRANCH:-main}
TEST_ENV=${TEST_ENV:-local}
echo "🎯 Selective Test Runner"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Base branch: $BASE_BRANCH"
echo "Environment: $TEST_ENV"
echo ""
# Detect changed files (all types, not just tests)
CHANGED_FILES=$(git diff --name-only $BASE_BRANCH...HEAD)
if [ -z "$CHANGED_FILES" ]; then
echo "✅ No files changed. Skipping tests."
exit 0
fi
echo "Changed files:"
echo "$CHANGED_FILES" | sed 's/^/ - /'
echo ""
# Determine test strategy based on changes
run_smoke_only=false
run_all_tests=false
affected_specs=""
# Critical files = run all tests
if echo "$CHANGED_FILES" | grep -qE '(package\.json|package-lock\.json|playwright\.config|cypress\.config|\.github/workflows)'; then
echo "⚠️ Critical configuration files changed. Running ALL tests."
run_all_tests=true
# Auth/security changes = run all auth + smoke tests
elif echo "$CHANGED_FILES" | grep -qE '(auth|login|signup|security)'; then
echo "🔒 Auth/security files changed. Running auth + smoke tests."
npm run test -- --grep "@auth|@smoke"
exit $?
# API changes = run integration + smoke tests
elif echo "$CHANGED_FILES" | grep -qE '(api|service|controller)'; then
echo "🔌 API files changed. Running integration + smoke tests."
npm run test -- --grep "@integration|@smoke"
exit $?
# UI component changes = run related component tests
elif echo "$CHANGED_FILES" | grep -qE '\.(tsx|jsx|vue)$'; then
echo "🎨 UI components changed. Running component + smoke tests."
# Extract component names and find related tests
components=$(echo "$CHANGED_FILES" | grep -E '\.(tsx|jsx|vue)$' | xargs -I {} basename {} | sed 's/\.[^.]*$//')
for component in $components; do
# Find tests matching component name
affected_specs+=" $(find tests -name "*${component}*" -type f 2>/dev/null || true)"
done
if [ -n "$affected_specs" ]; then
echo "Running tests for: $affected_specs"
npm run test -- $affected_specs --grep "@smoke"
else
echo "No specific tests found. Running smoke tests only."
npm run test -- --grep "@smoke"
fi
exit $?
# Documentation/config only = run smoke tests
elif echo "$CHANGED_FILES" | grep -qE '\.(md|txt|json|yml|yaml)$'; then
echo "📝 Documentation/config files changed. Running smoke tests only."
run_smoke_only=true
else
echo "⚙️ Other files changed. Running smoke tests."
run_smoke_only=true
fi
# Execute selected strategy
if [ "$run_all_tests" = true ]; then
echo ""
echo "Running full test suite..."
npm run test
elif [ "$run_smoke_only" = true ]; then
echo ""
echo "Running smoke tests..."
npm run test -- --grep "@smoke"
fi
```
**Usage in GitHub Actions**:
```yaml
# .github/workflows/selective-tests.yml
name: Selective Tests
on: pull_request
jobs:
selective-tests:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Run selective tests
run: bash scripts/selective-test-runner.sh
env:
BASE_BRANCH: ${{ github.base_ref }}
TEST_ENV: staging
```
**Key Points**:
- **Intelligent routing**: Tests selected based on changed file types
- **Tag-based filtering**: Use @smoke, @auth, @integration tags (see the tagging sketch below)
- **Fast feedback**: Only relevant tests run on most PRs
- **Safety net**: Critical changes trigger full suite
- **Component mapping**: UI changes run related component tests
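The grep filters above assume specs carry tags in their titles. A minimal sketch of such tagging in Playwright (routes and names are illustrative):
```typescript
// tests/checkout.spec.ts - tags in titles let --grep "@smoke" / "@integration" select tests
import { test, expect } from '@playwright/test';

test('checkout page renders @smoke', async ({ page }) => {
  await page.goto('/checkout');
  await expect(page.getByRole('heading', { name: 'Checkout' })).toBeVisible();
});

test('checkout shows payment status @integration', async ({ page }) => {
  // Selected only when the runner greps for @integration
  await page.goto('/checkout');
  await expect(page.getByTestId('payment-status')).toBeVisible();
});
```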
---
## CI Configuration Checklist
Before deploying your CI pipeline, verify:
- [ ] **Caching strategy**: node_modules, npm cache, browser binaries cached
- [ ] **Timeout budgets**: Each job has reasonable timeout (10-30 min)
- [ ] **Artifact retention**: 30 days for reports, 7 days for failure artifacts
- [ ] **Parallelization**: Matrix strategy uses fail-fast: false
- [ ] **Burn-in enabled**: Changed specs run 5-10x before merge
- [ ] **wait-on app startup**: CI waits for the app (e.g., wait-on http://localhost:3000; see the sketch below)
- [ ] **Secrets documented**: README lists required secrets (API keys, tokens)
- [ ] **Local parity**: CI scripts runnable locally (npm run test:ci)
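For the wait-on item, a common pattern is to boot the app in the background and block until it responds before launching tests (a sketch; the port and npm scripts are assumptions):
```bash
# Start the app, wait until it accepts connections, then run E2E tests
npm run start &
npx wait-on http://localhost:3000 --timeout 60000
npm run test:e2e
```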
## Integration Points
- Used in workflows: `*ci` (CI/CD pipeline setup)
- Related fragments: `selective-testing.md`, `playwright-config.md`, `test-quality.md`
- CI tools: GitHub Actions, GitLab CI, CircleCI, Jenkins
_Source: Murat CI/CD strategy blog, Playwright/Cypress workflow examples, SEON production pipelines_


@@ -0,0 +1,486 @@
# Component Test-Driven Development Loop
## Principle
Start every UI change with a failing component test (`cy.mount`, Playwright component test, or RTL `render`). Follow the Red-Green-Refactor cycle: write a failing test (red), make it pass with minimal code (green), then improve the implementation (refactor). Ship only after the cycle completes. Keep component tests under 100 lines, isolated with fresh providers per test, and validate accessibility alongside functionality.
## Rationale
Component TDD provides immediate feedback during development. Failing tests (red) clarify requirements before writing code. Minimal implementations (green) prevent over-engineering. Refactoring with passing tests ensures changes don't break functionality. Isolated tests with fresh providers prevent state bleed in parallel runs. Accessibility assertions catch usability issues early. Visual debugging (Cypress runner, Storybook, Playwright trace viewer) accelerates diagnosis when tests fail.
## Pattern Examples
### Example 1: Red-Green-Refactor Loop
**Context**: When building a new component, start with a failing test that describes the desired behavior. Implement just enough to pass, then refactor for quality.
**Implementation**:
```typescript
// Step 1: RED - Write failing test
// Button.cy.tsx (Cypress Component Test)
import { Button } from './Button';
describe('Button Component', () => {
it('should render with label', () => {
cy.mount(<Button label="Click Me" />);
cy.contains('Click Me').should('be.visible');
});
it('should call onClick when clicked', () => {
const onClickSpy = cy.stub().as('onClick');
cy.mount(<Button label="Submit" onClick={onClickSpy} />);
cy.get('button').click();
cy.get('@onClick').should('have.been.calledOnce');
});
});
// Run test: FAILS - Button component doesn't exist yet
// Error: "Cannot find module './Button'"
// Step 2: GREEN - Minimal implementation
// Button.tsx
type ButtonProps = {
label: string;
onClick?: () => void;
};
export const Button = ({ label, onClick }: ButtonProps) => {
return <button onClick={onClick}>{label}</button>;
};
// Run test: PASSES - Component renders and handles clicks
// Step 3: REFACTOR - Improve implementation
// Add disabled state, loading state, variants
import { Spinner } from './Spinner'; // assumption: renders an element with data-testid="spinner"
type ButtonProps = {
label: string;
onClick?: () => void;
disabled?: boolean;
loading?: boolean;
variant?: 'primary' | 'secondary' | 'danger';
};
export const Button = ({
label,
onClick,
disabled = false,
loading = false,
variant = 'primary'
}: ButtonProps) => {
return (
<button
onClick={onClick}
disabled={disabled || loading}
className={`btn btn-${variant}`}
data-testid="button"
>
{loading ? <Spinner /> : label}
</button>
);
};
// Step 4: Expand tests for new features
describe('Button Component', () => {
it('should render with label', () => {
cy.mount(<Button label="Click Me" />);
cy.contains('Click Me').should('be.visible');
});
it('should call onClick when clicked', () => {
const onClickSpy = cy.stub().as('onClick');
cy.mount(<Button label="Submit" onClick={onClickSpy} />);
cy.get('button').click();
cy.get('@onClick').should('have.been.calledOnce');
});
it('should be disabled when disabled prop is true', () => {
cy.mount(<Button label="Submit" disabled={true} />);
cy.get('button').should('be.disabled');
});
it('should show spinner when loading', () => {
cy.mount(<Button label="Submit" loading={true} />);
cy.get('[data-testid="spinner"]').should('be.visible');
cy.get('button').should('be.disabled');
});
it('should apply variant styles', () => {
cy.mount(<Button label="Delete" variant="danger" />);
cy.get('button').should('have.class', 'btn-danger');
});
});
// Run tests: ALL PASS - Refactored component still works
// Playwright Component Test equivalent
import { test, expect } from '@playwright/experimental-ct-react';
import { Button } from './Button';
test.describe('Button Component', () => {
test('should call onClick when clicked', async ({ mount }) => {
let clicked = false;
const component = await mount(
<Button label="Submit" onClick={() => { clicked = true; }} />
);
await component.getByRole('button').click();
expect(clicked).toBe(true);
});
test('should be disabled when loading', async ({ mount }) => {
const component = await mount(<Button label="Submit" loading={true} />);
await expect(component.getByRole('button')).toBeDisabled();
await expect(component.getByTestId('spinner')).toBeVisible();
});
});
```
**Key Points**:
- Red: Write failing test first - clarifies requirements before coding
- Green: Implement minimal code to pass - prevents over-engineering
- Refactor: Improve code quality while keeping tests green
- Expand: Add tests for new features after refactoring
- Cycle repeats: Each new feature starts with a failing test
### Example 2: Provider Isolation Pattern
**Context**: When testing components that depend on context providers (React Query, Auth, Router), wrap them with required providers in each test to prevent state bleed between tests.
**Implementation**:
```typescript
// test-utils/AllTheProviders.tsx
import { FC, ReactNode } from 'react';
import { QueryClient, QueryClientProvider } from '@tanstack/react-query';
import { BrowserRouter } from 'react-router-dom';
import { AuthProvider } from '../contexts/AuthContext';
import type { User } from '../src/types'; // assumption: wherever the app defines User
type Props = {
children: ReactNode;
initialAuth?: { user: User | null; token: string | null };
};
export const AllTheProviders: FC<Props> = ({ children, initialAuth }) => {
// Create NEW QueryClient per test (prevent state bleed)
const queryClient = new QueryClient({
defaultOptions: {
queries: { retry: false },
mutations: { retry: false }
}
});
return (
<QueryClientProvider client={queryClient}>
<BrowserRouter>
<AuthProvider initialAuth={initialAuth}>
{children}
</AuthProvider>
</BrowserRouter>
</QueryClientProvider>
);
};
// Cypress custom mount command
// cypress/support/component.tsx
import { mount } from 'cypress/react18';
import { AllTheProviders } from '../../test-utils/AllTheProviders';
Cypress.Commands.add('wrappedMount', (component, options = {}) => {
const { initialAuth, ...mountOptions } = options;
return mount(
<AllTheProviders initialAuth={initialAuth}>
{component}
</AllTheProviders>,
mountOptions
);
});
// Usage in tests
// UserProfile.cy.tsx
import { UserProfile } from './UserProfile';
describe('UserProfile Component', () => {
it('should display user when authenticated', () => {
const user = { id: 1, name: 'John Doe', email: 'john@example.com' };
cy.wrappedMount(<UserProfile />, {
initialAuth: { user, token: 'fake-token' }
});
cy.contains('John Doe').should('be.visible');
cy.contains('john@example.com').should('be.visible');
});
it('should show login prompt when not authenticated', () => {
cy.wrappedMount(<UserProfile />, {
initialAuth: { user: null, token: null }
});
cy.contains('Please log in').should('be.visible');
});
});
// Playwright Component Test with providers
import { test, expect } from '@playwright/experimental-ct-react';
import { QueryClient, QueryClientProvider } from '@tanstack/react-query';
import { UserProfile } from './UserProfile';
import { AuthProvider } from '../contexts/AuthContext';
test.describe('UserProfile Component', () => {
test('should display user when authenticated', async ({ mount }) => {
const user = { id: 1, name: 'John Doe', email: 'john@example.com' };
const queryClient = new QueryClient();
const component = await mount(
<QueryClientProvider client={queryClient}>
<AuthProvider initialAuth={{ user, token: 'fake-token' }}>
<UserProfile />
</AuthProvider>
</QueryClientProvider>
);
await expect(component.getByText('John Doe')).toBeVisible();
await expect(component.getByText('john@example.com')).toBeVisible();
});
});
```
**Key Points**:
- Create NEW providers per test (QueryClient, Router, Auth)
- Prevents state pollution between tests
- `initialAuth` prop allows testing different auth states
- Custom mount command (`wrappedMount`) reduces boilerplate (typing sketch below)
- Providers wrap component, not the entire test suite
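In a TypeScript suite, the custom `wrappedMount` command also needs a type declaration. A minimal sketch, assuming the options shape used above (`User` import path is an assumption):
```typescript
// cypress/support/component.d.ts - hypothetical typing for the wrappedMount command
import { MountOptions, MountReturn } from 'cypress/react18';
import { ReactNode } from 'react';
import { User } from '../../src/types'; // assumption: wherever the app defines User

declare global {
  namespace Cypress {
    interface Chainable {
      wrappedMount(
        component: ReactNode,
        options?: Partial<MountOptions> & { initialAuth?: { user: User | null; token: string | null } },
      ): Chainable<MountReturn>;
    }
  }
}
export {};
```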
### Example 3: Accessibility Assertions
**Context**: When testing components, validate accessibility alongside functionality using axe-core, ARIA roles, labels, and keyboard navigation.
**Implementation**:
```typescript
// Cypress with axe-core
// cypress/support/component.tsx
import 'cypress-axe';
// Form.cy.tsx
import { Form } from './Form';
describe('Form Component Accessibility', () => {
beforeEach(() => {
cy.wrappedMount(<Form />);
cy.injectAxe(); // Inject axe-core
});
it('should have no accessibility violations', () => {
cy.checkA11y(); // Run axe scan
});
it('should have proper ARIA labels', () => {
cy.get('input[name="email"]').should('have.attr', 'aria-label', 'Email address');
cy.get('input[name="password"]').should('have.attr', 'aria-label', 'Password');
cy.get('button[type="submit"]').should('have.attr', 'aria-label', 'Submit form');
});
it('should support keyboard navigation', () => {
// Tab through form fields
cy.get('input[name="email"]').focus().type('test@example.com');
cy.realPress('Tab'); // cypress-real-events plugin
cy.focused().should('have.attr', 'name', 'password');
cy.focused().type('password123');
cy.realPress('Tab');
cy.focused().should('have.attr', 'type', 'submit');
cy.realPress('Enter'); // Submit via keyboard
cy.contains('Form submitted').should('be.visible');
});
it('should announce errors to screen readers', () => {
cy.get('button[type="submit"]').click(); // Submit without data
// Error has role="alert" and aria-live="polite"
cy.get('[role="alert"]')
.should('be.visible')
.and('have.attr', 'aria-live', 'polite')
.and('contain', 'Email is required');
});
it('should have sufficient color contrast', () => {
cy.checkA11y(null, {
rules: {
'color-contrast': { enabled: true }
}
});
});
});
// Playwright with axe-playwright
import { test, expect } from '@playwright/experimental-ct-react';
import AxeBuilder from '@axe-core/playwright';
import { Form } from './Form';
test.describe('Form Component Accessibility', () => {
test('should have no accessibility violations', async ({ mount, page }) => {
await mount(<Form />);
const accessibilityScanResults = await new AxeBuilder({ page })
.analyze();
expect(accessibilityScanResults.violations).toEqual([]);
});
test('should support keyboard navigation', async ({ mount, page }) => {
const component = await mount(<Form />);
await component.getByLabel('Email address').fill('test@example.com');
await page.keyboard.press('Tab');
await expect(component.getByLabel('Password')).toBeFocused();
await component.getByLabel('Password').fill('password123');
await page.keyboard.press('Tab');
await expect(component.getByRole('button', { name: 'Submit form' })).toBeFocused();
await page.keyboard.press('Enter');
await expect(component.getByText('Form submitted')).toBeVisible();
});
});
```
**Key Points**:
- Use `cy.checkA11y()` (Cypress) or `AxeBuilder` (Playwright) for automated accessibility scanning
- Validate ARIA roles, labels, and live regions
- Test keyboard navigation (Tab, Enter, Escape) via cypress-real-events (registration shown below)
- Ensure errors are announced to screen readers (`role="alert"`, `aria-live`)
- Check color contrast meets WCAG standards
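The `cy.realPress`/`cy.realHover` calls come from the cypress-real-events plugin, which must be registered in the support file:
```typescript
// cypress/support/component.ts - registers cy.realPress / cy.realHover
import 'cypress-real-events';
```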
### Example 4: Visual Regression Test
**Context**: When testing components, capture screenshots to detect unintended visual changes. Use Playwright visual comparison or Cypress snapshot plugins.
**Implementation**:
```typescript
// Playwright visual regression
import { test, expect } from '@playwright/experimental-ct-react';
import { Button } from './Button';
test.describe('Button Visual Regression', () => {
test('should match primary button snapshot', async ({ mount }) => {
const component = await mount(<Button label="Primary" variant="primary" />);
// Capture and compare screenshot
await expect(component).toHaveScreenshot('button-primary.png');
});
test('should match secondary button snapshot', async ({ mount }) => {
const component = await mount(<Button label="Secondary" variant="secondary" />);
await expect(component).toHaveScreenshot('button-secondary.png');
});
test('should match disabled button snapshot', async ({ mount }) => {
const component = await mount(<Button label="Disabled" disabled={true} />);
await expect(component).toHaveScreenshot('button-disabled.png');
});
test('should match loading button snapshot', async ({ mount }) => {
const component = await mount(<Button label="Loading" loading={true} />);
await expect(component).toHaveScreenshot('button-loading.png');
});
});
// Cypress visual regression with percy or snapshot plugins
import { Button } from './Button';
describe('Button Visual Regression', () => {
it('should match primary button snapshot', () => {
cy.wrappedMount(<Button label="Primary" variant="primary" />);
// Option 1: Percy (cloud-based visual testing)
cy.percySnapshot('Button - Primary');
// Option 2: cypress-plugin-snapshots (local snapshots)
cy.get('button').toMatchImageSnapshot({
name: 'button-primary',
threshold: 0.01 // 1% threshold for pixel differences
});
});
it('should match hover state', () => {
cy.wrappedMount(<Button label="Hover Me" />);
cy.get('button').realHover(); // cypress-real-events
cy.percySnapshot('Button - Hover State');
});
it('should match focus state', () => {
cy.wrappedMount(<Button label="Focus Me" />);
cy.get('button').focus();
cy.percySnapshot('Button - Focus State');
});
});
// Playwright configuration for visual regression
// playwright.config.ts
export default defineConfig({
expect: {
toHaveScreenshot: {
maxDiffPixels: 100, // Allow 100 pixels difference
threshold: 0.2 // 20% threshold
}
},
use: {
screenshot: 'only-on-failure'
}
});
// Update snapshots when intentional changes are made
// npx playwright test --update-snapshots
```
**Key Points**:
- Playwright: Use `toHaveScreenshot()` for built-in visual comparison
- Cypress: Use Percy (cloud) or snapshot plugins (local) for visual testing (registration shown below)
- Capture different states: default, hover, focus, disabled, loading
- Set threshold for acceptable pixel differences (avoid false positives)
- Update snapshots when visual changes are intentional
- Visual tests catch unintended CSS/layout regressions
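The `cy.percySnapshot` calls above assume Percy is registered in the component support file; registration is a single import:
```typescript
// cypress/support/component.ts - registers cy.percySnapshot (requires @percy/cypress)
import '@percy/cypress';
```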
## Integration Points
- **Used in workflows**: `*atdd` (component test generation), `*automate` (component test expansion), `*framework` (component testing setup)
- **Related fragments**:
- `test-quality.md` - Keep component tests <100 lines, isolated, focused
- `fixture-architecture.md` - Provider wrapping patterns, custom mount commands
- `data-factories.md` - Factory functions for component props
- `test-levels-framework.md` - When to use component tests vs E2E tests
## TDD Workflow Summary
**Red-Green-Refactor Cycle**:
1. **Red**: Write failing test describing desired behavior
2. **Green**: Implement minimal code to make test pass
3. **Refactor**: Improve code quality, tests stay green
4. **Repeat**: Each new feature starts with failing test
**Component Test Checklist**:
- [ ] Test renders with required props
- [ ] Test user interactions (click, type, submit)
- [ ] Test different states (loading, error, disabled)
- [ ] Test accessibility (ARIA, keyboard navigation)
- [ ] Test visual regression (snapshots)
- [ ] Isolate with fresh providers (no state bleed)
- [ ] Keep tests <100 lines (split by intent)
_Source: CCTDD repository, Murat component testing talks, Playwright/Cypress component testing docs._


@@ -0,0 +1,957 @@
# Contract Testing Essentials (Pact)
## Principle
Contract testing validates API contracts between consumer and provider services without requiring integrated end-to-end tests. Store consumer contracts alongside integration specs, version contracts semantically, and publish on every CI run. Provider verification before merge surfaces breaking changes immediately, while explicit fallback behavior (timeouts, retries, error payloads) captures resilience guarantees in contracts.
## Rationale
Traditional integration testing requires running both consumer and provider simultaneously, creating slow, flaky tests with complex setup. Contract testing decouples services: consumers define expectations (pact files), providers verify against those expectations independently. This enables parallel development, catches breaking changes early, and documents API behavior as executable specifications. Pair contract tests with API smoke tests to validate data mapping and UI rendering in tandem.
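As a companion to the contracts below, an API smoke test against the deployed provider can stay tiny. A sketch (endpoint and fields mirror the examples that follow; the base URL is assumed to be configured):
```typescript
// tests/api/users.smoke.spec.ts - hypothetical smoke check pairing with the contract suite
import { test, expect } from '@playwright/test';

test('GET /users/:id returns a renderable user @smoke', async ({ request }) => {
  const response = await request.get('/users/1');
  expect(response.ok()).toBeTruthy();

  const user = await response.json();
  // Assert only the fields the UI maps, mirroring the contract's matchers
  expect(user).toMatchObject({
    id: expect.any(Number),
    name: expect.any(String),
    email: expect.any(String),
  });
});
```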
## Pattern Examples
### Example 1: Pact Consumer Test (Frontend → Backend API)
**Context**: React application consuming a user management API, defining expected interactions.
**Implementation**:
```typescript
// tests/contract/user-api.pact.spec.ts
import { PactV3, MatchersV3 } from '@pact-foundation/pact';
import { getUserById, createUser, User } from '@/api/user-service';
const { like, eachLike, string, integer } = MatchersV3;
/**
* Consumer-Driven Contract Test
* - Consumer (React app) defines expected API behavior
* - Generates pact file for provider to verify
* - Runs in isolation (no real backend required)
*/
const provider = new PactV3({
consumer: 'user-management-web',
provider: 'user-api-service',
dir: './pacts', // Output directory for pact files
logLevel: 'warn',
});
describe('User API Contract', () => {
describe('GET /users/:id', () => {
it('should return user when user exists', async () => {
// Arrange: Define expected interaction
await provider
.given('user with id 1 exists') // Provider state
.uponReceiving('a request for user 1')
.withRequest({
method: 'GET',
path: '/users/1',
headers: {
Accept: 'application/json',
Authorization: like('Bearer token123'), // Matcher: any string
},
})
.willRespondWith({
status: 200,
headers: {
'Content-Type': 'application/json',
},
body: like({
id: integer(1),
name: string('John Doe'),
email: string('john@example.com'),
role: string('user'),
createdAt: string('2025-01-15T10:00:00Z'),
}),
})
.executeTest(async (mockServer) => {
// Act: Call consumer code against mock server
const user = await getUserById(1, {
baseURL: mockServer.url,
headers: { Authorization: 'Bearer token123' },
});
// Assert: Validate consumer behavior
expect(user).toEqual(
expect.objectContaining({
id: 1,
name: 'John Doe',
email: 'john@example.com',
role: 'user',
}),
);
});
});
it('should handle 404 when user does not exist', async () => {
await provider
.given('user with id 999 does not exist')
.uponReceiving('a request for non-existent user')
.withRequest({
method: 'GET',
path: '/users/999',
headers: { Accept: 'application/json' },
})
.willRespondWith({
status: 404,
headers: { 'Content-Type': 'application/json' },
body: {
error: 'User not found',
code: 'USER_NOT_FOUND',
},
})
.executeTest(async (mockServer) => {
// Act & Assert: Consumer handles 404 gracefully
await expect(getUserById(999, { baseURL: mockServer.url })).rejects.toThrow('User not found');
});
});
});
describe('POST /users', () => {
it('should create user and return 201', async () => {
const newUser: Omit<User, 'id' | 'createdAt'> = {
name: 'Jane Smith',
email: 'jane@example.com',
role: 'admin',
};
await provider
.given('no users exist')
.uponReceiving('a request to create a user')
.withRequest({
method: 'POST',
path: '/users',
headers: {
'Content-Type': 'application/json',
Accept: 'application/json',
},
body: like(newUser),
})
.willRespondWith({
status: 201,
headers: { 'Content-Type': 'application/json' },
body: like({
id: integer(2),
name: string('Jane Smith'),
email: string('jane@example.com'),
role: string('admin'),
createdAt: string('2025-01-15T11:00:00Z'),
}),
})
.executeTest(async (mockServer) => {
const createdUser = await createUser(newUser, {
baseURL: mockServer.url,
});
expect(createdUser).toEqual(
expect.objectContaining({
id: expect.any(Number),
name: 'Jane Smith',
email: 'jane@example.com',
role: 'admin',
}),
);
});
});
});
});
```
**package.json scripts**:
```json
{
"scripts": {
"test:contract": "jest tests/contract --testTimeout=30000",
"pact:publish": "pact-broker publish ./pacts --consumer-app-version=$GIT_SHA --broker-base-url=$PACT_BROKER_URL --broker-token=$PACT_BROKER_TOKEN"
}
}
```
**Key Points**:
- **Consumer-driven**: Frontend defines expectations, not backend
- **Matchers**: `like`, `string`, `integer` for flexible matching
- **Provider states**: given() sets up test preconditions
- **Isolation**: No real backend needed, runs fast
- **Pact generation**: Automatically creates JSON pact files
---
### Example 2: Pact Provider Verification (Backend validates contracts)
**Context**: Node.js/Express API verifying pacts published by consumers.
**Implementation**:
```typescript
// tests/contract/user-api.provider.spec.ts
import { Verifier, VerifierOptions } from '@pact-foundation/pact';
import { server } from '../../src/server'; // Your Express/Fastify app
import { seedDatabase, resetDatabase } from '../support/db-helpers';
/**
* Provider Verification Test
* - Provider (backend API) verifies against published pacts
* - State handlers setup test data for each interaction
* - Runs before merge to catch breaking changes
*/
describe('Pact Provider Verification', () => {
let serverInstance;
const PORT = 3001;
beforeAll(async () => {
// Start provider server
serverInstance = server.listen(PORT);
console.log(`Provider server running on port ${PORT}`);
});
afterAll(async () => {
// Cleanup
await serverInstance.close();
});
it('should verify pacts from all consumers', async () => {
const opts: VerifierOptions = {
// Provider details
provider: 'user-api-service',
providerBaseUrl: `http://localhost:${PORT}`,
// Pact Broker configuration
pactBrokerUrl: process.env.PACT_BROKER_URL,
pactBrokerToken: process.env.PACT_BROKER_TOKEN,
publishVerificationResult: process.env.CI === 'true',
providerVersion: process.env.GIT_SHA || 'dev',
// State handlers: Setup provider state for each interaction
stateHandlers: {
'user with id 1 exists': async () => {
await seedDatabase({
users: [
{
id: 1,
name: 'John Doe',
email: 'john@example.com',
role: 'user',
createdAt: '2025-01-15T10:00:00Z',
},
],
});
return 'User seeded successfully';
},
'user with id 999 does not exist': async () => {
// Ensure user doesn't exist
await resetDatabase();
return 'Database reset';
},
'no users exist': async () => {
await resetDatabase();
return 'Database empty';
},
},
// Request filters: Add auth headers to all requests
requestFilter: (req, res, next) => {
// Mock authentication for verification
req.headers['x-user-id'] = 'test-user';
req.headers['authorization'] = 'Bearer valid-test-token';
next();
},
// Timeout for verification
timeout: 30000,
};
// Run verification
await new Verifier(opts).verifyProvider();
});
});
```
**CI integration**:
```yaml
# .github/workflows/pact-provider.yml
name: Pact Provider Verification
on:
pull_request:
push:
branches: [main]
jobs:
verify-contracts:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version-file: '.nvmrc'
- name: Install dependencies
run: npm ci
- name: Start database
run: docker-compose up -d postgres
- name: Run migrations
run: npm run db:migrate
- name: Verify pacts
run: npm run test:contract:provider
env:
PACT_BROKER_URL: ${{ secrets.PACT_BROKER_URL }}
PACT_BROKER_TOKEN: ${{ secrets.PACT_BROKER_TOKEN }}
GIT_SHA: ${{ github.sha }}
CI: true
- name: Can I Deploy?
run: |
npx pact-broker can-i-deploy \
--pacticipant user-api-service \
--version ${{ github.sha }} \
--to-environment production
env:
PACT_BROKER_BASE_URL: ${{ secrets.PACT_BROKER_URL }}
PACT_BROKER_TOKEN: ${{ secrets.PACT_BROKER_TOKEN }}
```
**Key Points**:
- **State handlers**: Setup provider data for each given() state
- **Request filters**: Add auth/headers for verification requests
- **CI publishing**: Verification results sent to broker
- **can-i-deploy**: Safety check before production deployment
- **Database isolation**: Reset between state handlers (helpers sketched below)
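The `seedDatabase`/`resetDatabase` helpers imported above are assumptions; a minimal sketch against Postgres (table and columns are illustrative):
```typescript
// tests/support/db-helpers.ts - hypothetical helpers backing the state handlers
import { Pool } from 'pg';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

export async function resetDatabase(): Promise<void> {
  // TRUNCATE keeps the schema but clears all rows between verifications
  await pool.query('TRUNCATE TABLE users RESTART IDENTITY CASCADE');
}

export async function seedDatabase(data: { users: Array<Record<string, unknown>> }): Promise<void> {
  await resetDatabase();
  for (const user of data.users) {
    await pool.query(
      'INSERT INTO users (id, name, email, role, created_at) VALUES ($1, $2, $3, $4, $5)',
      [user.id, user.name, user.email, user.role, user.createdAt],
    );
  }
}
```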
---
### Example 3: Contract CI Integration (Consumer & Provider Workflow)
**Context**: Complete CI/CD workflow coordinating consumer pact publishing and provider verification.
**Implementation**:
```yaml
# .github/workflows/pact-consumer.yml (Consumer side)
name: Pact Consumer Tests
on:
pull_request:
push:
branches: [main]
jobs:
consumer-tests:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version-file: '.nvmrc'
- name: Install dependencies
run: npm ci
- name: Run consumer contract tests
run: npm run test:contract
- name: Publish pacts to broker
if: github.ref == 'refs/heads/main' || github.event_name == 'pull_request'
run: |
npx pact-broker publish ./pacts \
--consumer-app-version ${{ github.sha }} \
--branch ${{ github.head_ref || github.ref_name }} \
--broker-base-url ${{ secrets.PACT_BROKER_URL }} \
--broker-token ${{ secrets.PACT_BROKER_TOKEN }}
- name: Tag pact with environment (main branch only)
if: github.ref == 'refs/heads/main'
run: |
npx pact-broker create-version-tag \
--pacticipant user-management-web \
--version ${{ github.sha }} \
--tag production \
--broker-base-url ${{ secrets.PACT_BROKER_URL }} \
--broker-token ${{ secrets.PACT_BROKER_TOKEN }}
```
```yaml
# .github/workflows/pact-provider.yml (Provider side)
name: Pact Provider Verification
on:
pull_request:
push:
branches: [main]
repository_dispatch:
types: [pact_changed] # Webhook from Pact Broker
jobs:
verify-contracts:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version-file: '.nvmrc'
- name: Install dependencies
run: npm ci
- name: Start dependencies
run: docker-compose up -d
- name: Run provider verification
run: npm run test:contract:provider
env:
PACT_BROKER_URL: ${{ secrets.PACT_BROKER_URL }}
PACT_BROKER_TOKEN: ${{ secrets.PACT_BROKER_TOKEN }}
GIT_SHA: ${{ github.sha }}
CI: true
- name: Publish verification results
if: always()
run: echo "Verification results published to broker"
- name: Can I Deploy to Production?
if: github.ref == 'refs/heads/main'
run: |
npx pact-broker can-i-deploy \
--pacticipant user-api-service \
--version ${{ github.sha }} \
--to-environment production \
--broker-base-url ${{ secrets.PACT_BROKER_URL }} \
--broker-token ${{ secrets.PACT_BROKER_TOKEN }} \
--retry-while-unknown 6 \
--retry-interval 10
- name: Record deployment (if can-i-deploy passed)
if: success() && github.ref == 'refs/heads/main'
run: |
npx pact-broker record-deployment \
--pacticipant user-api-service \
--version ${{ github.sha }} \
--environment production \
--broker-base-url ${{ secrets.PACT_BROKER_URL }} \
--broker-token ${{ secrets.PACT_BROKER_TOKEN }}
```
**Pact Broker Webhook Configuration**:
```json
{
"events": [
{
"name": "contract_content_changed"
}
],
"request": {
"method": "POST",
"url": "https://api.github.com/repos/your-org/user-api/dispatches",
"headers": {
"Authorization": "Bearer ${user.githubToken}",
"Content-Type": "application/json",
"Accept": "application/vnd.github.v3+json"
},
"body": {
"event_type": "pact_changed",
"client_payload": {
"pact_url": "${pactbroker.pactUrl}",
"consumer": "${pactbroker.consumerName}",
"provider": "${pactbroker.providerName}"
}
}
}
}
```
**Key Points**:
- **Automatic trigger**: Consumer pact changes trigger provider verification via webhook
- **Branch tracking**: Pacts published per branch for feature testing
- **can-i-deploy**: Safety gate before production deployment
- **Record deployment**: Track which version is in each environment
- **Parallel dev**: Consumer and provider teams work independently
---
### Example 4: Resilience Coverage (Testing Fallback Behavior)
**Context**: Capture timeout, retry, and error handling behavior explicitly in contracts.
**Implementation**:
```typescript
// tests/contract/user-api-resilience.pact.spec.ts
import { PactV3, MatchersV3 } from '@pact-foundation/pact';
import { getUserById, ApiError } from '@/api/user-service';
const { like, string, integer } = MatchersV3;
const provider = new PactV3({
consumer: 'user-management-web',
provider: 'user-api-service',
dir: './pacts',
});
describe('User API Resilience Contract', () => {
/**
* Test 500 error handling
* Verifies consumer handles server errors gracefully
*/
it('should handle 500 errors with retry logic', async () => {
await provider
.given('server is experiencing errors')
.uponReceiving('a request that returns 500')
.withRequest({
method: 'GET',
path: '/users/1',
headers: { Accept: 'application/json' },
})
.willRespondWith({
status: 500,
headers: { 'Content-Type': 'application/json' },
body: {
error: 'Internal server error',
code: 'INTERNAL_ERROR',
retryable: true,
},
})
.executeTest(async (mockServer) => {
// Consumer should retry on 500
try {
await getUserById(1, {
baseURL: mockServer.url,
retries: 3,
retryDelay: 100,
});
throw new Error('Should have thrown error after retries');
} catch (error) {
expect(error).toBeInstanceOf(ApiError);
expect((error as ApiError).code).toBe('INTERNAL_ERROR');
expect((error as ApiError).retryable).toBe(true);
}
});
});
/**
* Test 429 rate limiting
* Verifies consumer respects rate limits
*/
it('should handle 429 rate limit with backoff', async () => {
await provider
.given('rate limit exceeded for user')
.uponReceiving('a request that is rate limited')
.withRequest({
method: 'GET',
path: '/users/1',
})
.willRespondWith({
status: 429,
headers: {
'Content-Type': 'application/json',
'Retry-After': '60', // Retry after 60 seconds
},
body: {
error: 'Too many requests',
code: 'RATE_LIMIT_EXCEEDED',
},
})
.executeTest(async (mockServer) => {
try {
await getUserById(1, {
baseURL: mockServer.url,
respectRateLimit: true,
});
throw new Error('Should have thrown rate limit error');
} catch (error) {
expect(error).toBeInstanceOf(ApiError);
expect((error as ApiError).code).toBe('RATE_LIMIT_EXCEEDED');
expect((error as ApiError).retryAfter).toBe(60);
}
});
});
/**
* Test timeout handling
* Verifies consumer has appropriate timeout configuration
*/
it('should timeout after 10 seconds', async () => {
await provider
.given('server is slow to respond')
.uponReceiving('a request that times out')
.withRequest({
method: 'GET',
path: '/users/1',
})
.willRespondWith({
status: 200,
headers: { 'Content-Type': 'application/json' },
body: like({ id: 1, name: 'John' }),
})
.withDelay(15000) // Simulate 15 second delay
.executeTest(async (mockServer) => {
try {
await getUserById(1, {
baseURL: mockServer.url,
timeout: 10000, // 10 second timeout
});
throw new Error('Should have timed out');
} catch (error) {
expect(error).toBeInstanceOf(ApiError);
expect((error as ApiError).code).toBe('TIMEOUT');
}
});
});
/**
* Test partial response (optional fields)
* Verifies consumer handles missing optional data
*/
it('should handle response with missing optional fields', async () => {
await provider
.given('user exists with minimal data')
.uponReceiving('a request for user with partial data')
.withRequest({
method: 'GET',
path: '/users/1',
})
.willRespondWith({
status: 200,
headers: { 'Content-Type': 'application/json' },
body: {
id: integer(1),
name: string('John Doe'),
email: string('john@example.com'),
// role, createdAt, etc. omitted (optional fields)
},
})
.executeTest(async (mockServer) => {
const user = await getUserById(1, { baseURL: mockServer.url });
// Consumer handles missing optional fields gracefully
expect(user.id).toBe(1);
expect(user.name).toBe('John Doe');
expect(user.role).toBeUndefined(); // Optional field
expect(user.createdAt).toBeUndefined(); // Optional field
});
});
});
```
**API client with retry logic**:
```typescript
// src/api/user-service.ts
import axios, { AxiosInstance, AxiosRequestConfig } from 'axios';
export class ApiError extends Error {
constructor(
message: string,
public code: string,
public retryable: boolean = false,
public retryAfter?: number,
) {
super(message);
}
}
/**
* User API client with retry and error handling
*/
export async function getUserById(
id: number,
config?: AxiosRequestConfig & { retries?: number; retryDelay?: number; respectRateLimit?: boolean },
): Promise<User> {
const { retries = 3, retryDelay = 1000, respectRateLimit = true, ...axiosConfig } = config || {};
let lastError: Error;
for (let attempt = 1; attempt <= retries; attempt++) {
try {
const response = await axios.get(`/users/${id}`, axiosConfig);
return response.data;
} catch (error: any) {
lastError = error;
// Handle rate limiting
if (error.response?.status === 429) {
const retryAfter = parseInt(error.response.headers['retry-after'] || '60');
throw new ApiError('Too many requests', 'RATE_LIMIT_EXCEEDED', false, retryAfter);
}
// Retry on 500 errors
if (error.response?.status === 500 && attempt < retries) {
await new Promise((resolve) => setTimeout(resolve, retryDelay * 2 ** (attempt - 1))); // exponential backoff
continue;
}
// Handle 404
if (error.response?.status === 404) {
throw new ApiError('User not found', 'USER_NOT_FOUND', false);
}
// Handle timeout
if (error.code === 'ECONNABORTED') {
throw new ApiError('Request timeout', 'TIMEOUT', true);
}
break;
}
}
throw new ApiError('Request failed after retries', 'INTERNAL_ERROR', true);
}
```
**Key Points**:
- **Resilience contracts**: Timeouts, retries, errors explicitly tested
- **State handlers**: Provider sets up each test scenario
- **Error handling**: Consumer validates graceful degradation
- **Retry logic**: Exponential backoff tested
- **Optional fields**: Consumer handles partial responses
---
### Example 5: Pact Broker Housekeeping & Lifecycle Management
**Context**: Automated broker maintenance to prevent contract sprawl and noise.
**Implementation**:
```typescript
// scripts/pact-broker-housekeeping.ts
/**
* Pact Broker Housekeeping Script
* - Archive superseded contracts
* - Expire unused pacts
* - Tag releases for environment tracking
*/
import { execSync } from 'child_process';
const PACT_BROKER_URL = process.env.PACT_BROKER_URL!;
const PACT_BROKER_TOKEN = process.env.PACT_BROKER_TOKEN!;
const PACTICIPANT = 'user-api-service';
/**
* Tag release with environment
*/
function tagRelease(version: string, environment: 'staging' | 'production') {
console.log(`🏷️ Tagging ${PACTICIPANT} v${version} as ${environment}`);
execSync(
`npx pact-broker create-version-tag \
--pacticipant ${PACTICIPANT} \
--version ${version} \
--tag ${environment} \
--broker-base-url ${PACT_BROKER_URL} \
--broker-token ${PACT_BROKER_TOKEN}`,
{ stdio: 'inherit' },
);
}
/**
* Record deployment to environment
*/
function recordDeployment(version: string, environment: 'staging' | 'production') {
console.log(`📝 Recording deployment of ${PACTICIPANT} v${version} to ${environment}`);
execSync(
`npx pact-broker record-deployment \
--pacticipant ${PACTICIPANT} \
--version ${version} \
--environment ${environment} \
--broker-base-url ${PACT_BROKER_URL} \
--broker-token ${PACT_BROKER_TOKEN}`,
{ stdio: 'inherit' },
);
}
/**
* Clean up old pact versions (retention policy)
* Keep: last 30 days, all production tags, latest from each branch
*/
function cleanupOldPacts() {
console.log(`🧹 Cleaning up old pacts for ${PACTICIPANT}`);
execSync(
`npx pact-broker clean \
--pacticipant ${PACTICIPANT} \
--broker-base-url ${PACT_BROKER_URL} \
--broker-token ${PACT_BROKER_TOKEN} \
--keep-latest-for-branch 1 \
--keep-min-age 30`,
{ stdio: 'inherit' },
);
}
/**
* Check deployment compatibility
*/
function canIDeploy(version: string, toEnvironment: string): boolean {
console.log(`🔍 Checking if ${PACTICIPANT} v${version} can deploy to ${toEnvironment}`);
try {
execSync(
`npx pact-broker can-i-deploy \
--pacticipant ${PACTICIPANT} \
--version ${version} \
--to-environment ${toEnvironment} \
--broker-base-url ${PACT_BROKER_URL} \
--broker-token ${PACT_BROKER_TOKEN} \
--retry-while-unknown 6 \
--retry-interval 10`,
{ stdio: 'inherit' },
);
return true;
} catch (error) {
console.error(`❌ Cannot deploy to ${toEnvironment}`);
return false;
}
}
/**
* Main housekeeping workflow
*/
async function main() {
const command = process.argv[2];
const version = process.argv[3];
const environment = process.argv[4] as 'staging' | 'production';
switch (command) {
case 'tag-release':
tagRelease(version, environment);
break;
case 'record-deployment':
recordDeployment(version, environment);
break;
case 'can-i-deploy': {
  const canDeploy = canIDeploy(version, environment);
  process.exit(canDeploy ? 0 : 1);
}
case 'cleanup':
cleanupOldPacts();
break;
default:
console.error('Unknown command. Use: tag-release | record-deployment | can-i-deploy | cleanup');
process.exit(1);
}
}
main();
```
**package.json scripts**:
```json
{
"scripts": {
"pact:tag": "ts-node scripts/pact-broker-housekeeping.ts tag-release",
"pact:record": "ts-node scripts/pact-broker-housekeeping.ts record-deployment",
"pact:can-deploy": "ts-node scripts/pact-broker-housekeeping.ts can-i-deploy",
"pact:cleanup": "ts-node scripts/pact-broker-housekeeping.ts cleanup"
}
}
```
**Deployment workflow integration**:
```yaml
# .github/workflows/deploy-production.yml
name: Deploy to Production
on:
push:
tags:
- 'v*'
jobs:
verify-contracts:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Check pact compatibility
run: npm run pact:can-deploy -- ${{ github.ref_name }} production
env:
PACT_BROKER_URL: ${{ secrets.PACT_BROKER_URL }}
PACT_BROKER_TOKEN: ${{ secrets.PACT_BROKER_TOKEN }}
deploy:
needs: verify-contracts
runs-on: ubuntu-latest
steps:
- name: Deploy to production
run: ./scripts/deploy.sh production
- name: Record deployment in Pact Broker
run: npm run pact:record -- ${{ github.ref_name }} production
env:
PACT_BROKER_URL: ${{ secrets.PACT_BROKER_URL }}
PACT_BROKER_TOKEN: ${{ secrets.PACT_BROKER_TOKEN }}
```
**Scheduled cleanup**:
```yaml
# .github/workflows/pact-housekeeping.yml
name: Pact Broker Housekeeping
on:
schedule:
- cron: '0 2 * * 0' # Weekly on Sunday at 2 AM
jobs:
cleanup:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Cleanup old pacts
run: npm run pact:cleanup
env:
PACT_BROKER_URL: ${{ secrets.PACT_BROKER_URL }}
PACT_BROKER_TOKEN: ${{ secrets.PACT_BROKER_TOKEN }}
```
**Key Points**:
- **Automated tagging**: Releases tagged with environment
- **Deployment tracking**: Broker knows which version is where
- **Safety gate**: can-i-deploy blocks incompatible deployments
- **Retention policy**: Keep recent, production, and branch-latest pacts
- **Webhook triggers**: Provider verification runs on consumer changes
---
## Contract Testing Checklist
Before implementing contract testing, verify:
- [ ] **Pact Broker setup**: Hosted (Pactflow) or self-hosted broker configured
- [ ] **Consumer tests**: Generate pacts in CI, publish to broker on merge
- [ ] **Provider verification**: Runs on PR, verifies all consumer pacts
- [ ] **State handlers**: Provider implements all given() states
- [ ] **can-i-deploy**: Blocks deployment if contracts incompatible
- [ ] **Webhooks configured**: Consumer changes trigger provider verification
- [ ] **Retention policy**: Old pacts archived (keep 30 days, all production tags)
- [ ] **Resilience tested**: Timeouts, retries, error codes in contracts
## Integration Points
- Used in workflows: `*automate` (integration test generation), `*ci` (contract CI setup)
- Related fragments: `test-levels-framework.md`, `ci-burn-in.md`
- Tools: Pact.js, Pact Broker (Pactflow or self-hosted), Pact CLI
_Source: Pact consumer/provider sample repos, Murat contract testing blog, Pact official documentation_

# Data Factories and API-First Setup
## Principle
Prefer factory functions that accept overrides and return complete objects (`createUser(overrides)`). Seed test state through APIs, tasks, or direct DB helpers before visiting the UI—never via slow UI interactions. UI is for validation only, not setup.
## Rationale
Static fixtures (JSON files, hardcoded objects) create brittle tests that:
- Fail when schemas evolve (missing new required fields)
- Cause collisions in parallel execution (same user IDs)
- Hide test intent (what matters for _this_ test?)
Dynamic factories with overrides provide:
- **Parallel safety**: UUIDs and timestamps prevent collisions
- **Schema evolution**: Defaults adapt to schema changes automatically
- **Explicit intent**: Overrides show what matters for each test
- **Speed**: API setup is 10-50x faster than UI
## Pattern Examples
### Example 1: Factory Function with Overrides
**Context**: When creating test data, build factory functions with sensible defaults and explicit overrides. Use `faker` for dynamic values that prevent collisions.
**Implementation**:
```typescript
// test-utils/factories/user-factory.ts
import { faker } from '@faker-js/faker';
export type User = {
id: string;
email: string;
name: string;
role: 'user' | 'admin' | 'moderator';
createdAt: Date;
isActive: boolean;
};
export const createUser = (overrides: Partial<User> = {}): User => ({
id: faker.string.uuid(),
email: faker.internet.email(),
name: faker.person.fullName(),
role: 'user',
createdAt: new Date(),
isActive: true,
...overrides,
});
// test-utils/factories/product-factory.ts
import { faker } from '@faker-js/faker';
export type Product = {
id: string;
name: string;
price: number;
stock: number;
category: string;
};
export const createProduct = (overrides: Partial<Product> = {}): Product => ({
id: faker.string.uuid(),
name: faker.commerce.productName(),
price: parseFloat(faker.commerce.price()),
stock: faker.number.int({ min: 0, max: 100 }),
category: faker.commerce.department(),
...overrides,
});
// Usage in tests:
test('admin can delete users', async ({ page, apiRequest }) => {
// Default user
const user = createUser();
// Admin user (explicit override shows intent)
const admin = createUser({ role: 'admin' });
// Seed via API (fast!)
await apiRequest({ method: 'POST', url: '/api/users', data: user });
await apiRequest({ method: 'POST', url: '/api/users', data: admin });
// Now test UI behavior
await page.goto('/admin/users');
await page.click(`[data-testid="delete-user-${user.id}"]`);
await expect(page.getByText(`User ${user.name} deleted`)).toBeVisible();
});
```
**Key Points**:
- `Partial<User>` allows overriding any field without breaking type safety
- Faker generates unique values—no collisions in parallel tests
- Override shows test intent: `createUser({ role: 'admin' })` is explicit
- Factory lives in `test-utils/factories/` for easy reuse
### Example 2: Nested Factory Pattern
**Context**: When testing relationships (orders with users and products), nest factories to create complete object graphs. Control relationship data explicitly.
**Implementation**:
```typescript
// test-utils/factories/order-factory.ts
import { faker } from '@faker-js/faker';
import { User, createUser } from './user-factory';
import { Product, createProduct } from './product-factory';
type OrderItem = {
product: Product;
quantity: number;
price: number;
};
type Order = {
id: string;
user: User;
items: OrderItem[];
total: number;
status: 'pending' | 'paid' | 'shipped' | 'delivered';
createdAt: Date;
};
export const createOrderItem = (overrides: Partial<OrderItem> = {}): OrderItem => {
  // Use ?? (not ||) so explicit falsy overrides such as quantity: 0 are respected
  const product = overrides.product ?? createProduct();
  const quantity = overrides.quantity ?? faker.number.int({ min: 1, max: 5 });
  return {
    product,
    quantity,
    price: product.price * quantity,
    ...overrides,
  };
};
export const createOrder = (overrides: Partial<Order> = {}): Order => {
  const items = overrides.items ?? [createOrderItem(), createOrderItem()];
  const total = items.reduce((sum, item) => sum + item.price, 0);
  return {
    id: faker.string.uuid(),
    user: overrides.user ?? createUser(),
    items,
    total,
    status: 'pending',
    createdAt: new Date(),
    ...overrides,
  };
};
// Usage in tests:
test('user can view order details', async ({ page, apiRequest }) => {
const user = createUser({ email: 'test@example.com' });
const product1 = createProduct({ name: 'Widget A', price: 10.0 });
const product2 = createProduct({ name: 'Widget B', price: 15.0 });
// Explicit relationships
const order = createOrder({
user,
items: [
createOrderItem({ product: product1, quantity: 2 }), // $20
createOrderItem({ product: product2, quantity: 1 }), // $15
],
});
// Seed via API
await apiRequest({ method: 'POST', url: '/api/users', data: user });
await apiRequest({ method: 'POST', url: '/api/products', data: product1 });
await apiRequest({ method: 'POST', url: '/api/products', data: product2 });
await apiRequest({ method: 'POST', url: '/api/orders', data: order });
// Test UI
await page.goto(`/orders/${order.id}`);
await expect(page.getByText('Widget A x 2')).toBeVisible();
await expect(page.getByText('Widget B x 1')).toBeVisible();
await expect(page.getByText('Total: $35.00')).toBeVisible();
});
```
**Key Points**:
- Nested factories handle relationships (order → user, order → products)
- Overrides cascade: provide custom user/products or use defaults
- Calculated fields (total) derived automatically from nested data
- Explicit relationships make test data clear and maintainable
### Example 3: Factory with API Seeding
**Context**: When tests need data setup, always use API calls or database tasks—never UI navigation. Wrap factory usage with seeding utilities for clean test setup.
**Implementation**:
```typescript
// playwright/support/helpers/seed-helpers.ts
import { APIRequestContext } from '@playwright/test';
import { User, createUser } from '../../test-utils/factories/user-factory';
import { Product, createProduct } from '../../test-utils/factories/product-factory';
export async function seedUser(request: APIRequestContext, overrides: Partial<User> = {}): Promise<User> {
const user = createUser(overrides);
const response = await request.post('/api/users', {
data: user,
});
if (!response.ok()) {
throw new Error(`Failed to seed user: ${response.status()}`);
}
return user;
}
export async function seedProduct(request: APIRequestContext, overrides: Partial<Product> = {}): Promise<Product> {
const product = createProduct(overrides);
const response = await request.post('/api/products', {
data: product,
});
if (!response.ok()) {
throw new Error(`Failed to seed product: ${response.status()}`);
}
return product;
}
// Playwright globalSetup for shared data
// playwright/support/global-setup.ts
import { chromium, FullConfig } from '@playwright/test';
import { seedUser } from './helpers/seed-helpers';
async function globalSetup(config: FullConfig) {
const browser = await chromium.launch();
const page = await browser.newPage();
const context = page.context();
// Seed admin user for all tests
const admin = await seedUser(context.request, {
email: 'admin@example.com',
role: 'admin',
});
  // Save auth state for reuse (the admin sign-in steps are app-specific and omitted here)
await context.storageState({ path: 'playwright/.auth/admin.json' });
await browser.close();
}
export default globalSetup;
// Cypress equivalent with cy.task
// cypress/support/tasks.ts
export const seedDatabase = async ({ entity, data }: { entity: string; data: unknown }) => {
  // Direct database insert or API call (assumes a db client available in the Node process)
  if (entity === 'users') {
    await db.users.create(data);
  }
  return null; // cy.task handlers must return a value; null signals "done"
};
// Usage in Cypress tests:
beforeEach(() => {
const user = createUser({ email: 'test@example.com' });
cy.task('db:seed', { entity: 'users', data: user });
});
```
**Key Points**:
- API seeding is 10-50x faster than UI-based setup
- `globalSetup` seeds shared data once (e.g., admin user)
- Per-test seeding uses `seedUser()` helpers for isolation
- Cypress `cy.task` allows direct database access for speed
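The `cy.task('db:seed', ...)` call above only works once the task is registered in the Cypress config. A minimal wiring sketch (file paths are assumptions for this example):
```typescript
// cypress.config.ts
import { defineConfig } from 'cypress';
import { seedDatabase } from './cypress/support/tasks';

export default defineConfig({
  e2e: {
    setupNodeEvents(on) {
      on('task', {
        // The key must match the name passed to cy.task()
        'db:seed': (payload: { entity: string; data: unknown }) => seedDatabase(payload),
      });
    },
  },
});
```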
### Example 4: Anti-Pattern - Hardcoded Test Data
**Problem**:
```typescript
// ❌ BAD: Hardcoded test data
test('user can login', async ({ page }) => {
await page.goto('/login');
await page.fill('[data-testid="email"]', 'test@test.com'); // Hardcoded
await page.fill('[data-testid="password"]', 'password123'); // Hardcoded
await page.click('[data-testid="submit"]');
// What if this user already exists? Test fails in parallel runs.
// What if schema adds required fields? Test breaks.
});
// ❌ BAD: Static JSON fixtures
// fixtures/users.json
{
"users": [
{ "id": 1, "email": "user1@test.com", "name": "User 1" },
{ "id": 2, "email": "user2@test.com", "name": "User 2" }
]
}
test('admin can delete user', async ({ page }) => {
const users = require('../fixtures/users.json');
// Brittle: IDs collide in parallel, schema drift breaks tests
});
```
**Why It Fails**:
- **Parallel collisions**: Hardcoded IDs (`id: 1`, `email: 'test@test.com'`) cause failures when tests run concurrently
- **Schema drift**: Adding required fields (`phoneNumber`, `address`) breaks all tests using fixtures
- **Hidden intent**: Does this test need `email: 'test@test.com'` specifically, or any email?
- **Slow setup**: UI-based data creation is 10-50x slower than API
**Better Approach**: Use factories
```typescript
// ✅ GOOD: Factory-based data
test('user can login', async ({ page, apiRequest }) => {
const user = createUser({ email: 'unique@example.com', password: 'secure123' });
// Seed via API (fast, parallel-safe)
await apiRequest({ method: 'POST', url: '/api/users', data: user });
// Test UI
await page.goto('/login');
await page.fill('[data-testid="email"]', user.email);
await page.fill('[data-testid="password"]', user.password);
await page.click('[data-testid="submit"]');
await expect(page).toHaveURL('/dashboard');
});
// ✅ GOOD: Factories adapt to schema changes automatically
// When `phoneNumber` becomes required, update factory once:
export const createUser = (overrides: Partial<User> = {}): User => ({
id: faker.string.uuid(),
email: faker.internet.email(),
name: faker.person.fullName(),
phoneNumber: faker.phone.number(), // NEW field, all tests get it automatically
role: 'user',
...overrides,
});
```
**Key Points**:
- Factories generate unique, parallel-safe data
- Schema evolution handled in one place (factory), not every test
- Test intent explicit via overrides
- API seeding is fast and reliable
### Example 5: Factory Composition
**Context**: When building specialized factories, compose simpler factories instead of duplicating logic. Layer overrides for specific test scenarios.
**Implementation**:
```typescript
// test-utils/factories/user-factory.ts (base)
export const createUser = (overrides: Partial<User> = {}): User => ({
id: faker.string.uuid(),
email: faker.internet.email(),
name: faker.person.fullName(),
role: 'user',
createdAt: new Date(),
isActive: true,
...overrides,
});
// Compose specialized factories
export const createAdminUser = (overrides: Partial<User> = {}): User => createUser({ role: 'admin', ...overrides });
export const createModeratorUser = (overrides: Partial<User> = {}): User => createUser({ role: 'moderator', ...overrides });
export const createInactiveUser = (overrides: Partial<User> = {}): User => createUser({ isActive: false, ...overrides });
// Account-level factories with feature flags
type Account = {
id: string;
owner: User;
plan: 'free' | 'pro' | 'enterprise';
features: string[];
maxUsers: number;
};
export const createAccount = (overrides: Partial<Account> = {}): Account => ({
id: faker.string.uuid(),
owner: overrides.owner || createUser(),
plan: 'free',
features: [],
maxUsers: 1,
...overrides,
});
export const createProAccount = (overrides: Partial<Account> = {}): Account =>
createAccount({
plan: 'pro',
features: ['advanced-analytics', 'priority-support'],
maxUsers: 10,
...overrides,
});
export const createEnterpriseAccount = (overrides: Partial<Account> = {}): Account =>
createAccount({
plan: 'enterprise',
features: ['advanced-analytics', 'priority-support', 'sso', 'audit-logs'],
maxUsers: 100,
...overrides,
});
// Usage in tests:
test('pro accounts can access analytics', async ({ page, apiRequest }) => {
const admin = createAdminUser({ email: 'admin@company.com' });
const account = createProAccount({ owner: admin });
await apiRequest({ method: 'POST', url: '/api/users', data: admin });
await apiRequest({ method: 'POST', url: '/api/accounts', data: account });
await page.goto('/analytics');
await expect(page.getByText('Advanced Analytics')).toBeVisible();
});
test('free accounts cannot access analytics', async ({ page, apiRequest }) => {
const user = createUser({ email: 'user@company.com' });
const account = createAccount({ owner: user }); // Defaults to free plan
await apiRequest({ method: 'POST', url: '/api/users', data: user });
await apiRequest({ method: 'POST', url: '/api/accounts', data: account });
await page.goto('/analytics');
await expect(page.getByText('Upgrade to Pro')).toBeVisible();
});
```
**Key Points**:
- Compose specialized factories from base factories (`createAdminUser` → `createUser`)
- Defaults cascade: `createProAccount` sets plan + features automatically
- Still allow overrides: `createProAccount({ maxUsers: 50 })` works
- Test intent clear: `createProAccount()` vs `createAccount({ plan: 'pro', features: [...] })`
## Integration Points
- **Used in workflows**: `*atdd` (test generation), `*automate` (test expansion), `*framework` (factory setup)
- **Related fragments**:
- `fixture-architecture.md` - Pure functions and fixtures for factory integration
- `network-first.md` - API-first setup patterns
- `test-quality.md` - Parallel-safe, deterministic test design
## Cleanup Strategy
Ensure factories work with cleanup patterns:
```typescript
// Track created IDs for cleanup
const createdUsers: string[] = [];
test.afterEach(async ({ apiRequest }) => {
// Clean up all users created during test
for (const userId of createdUsers) {
await apiRequest({ method: 'DELETE', url: `/api/users/${userId}` });
}
createdUsers.length = 0;
});
test('user registration flow', async ({ page, apiRequest }) => {
const user = createUser();
createdUsers.push(user.id);
await apiRequest({ method: 'POST', url: '/api/users', data: user });
// ... test logic
});
```
## Feature Flag Integration
When working with feature flags, layer them into factories:
```typescript
export const createUserWithFlags = (
overrides: Partial<User> = {},
flags: Record<string, boolean> = {},
): User & { flags: Record<string, boolean> } => ({
...createUser(overrides),
flags: {
'new-dashboard': false,
'beta-features': false,
...flags,
},
});
// Usage:
const user = createUserWithFlags(
{ email: 'test@example.com' },
{
'new-dashboard': true,
'beta-features': true,
},
);
```
_Source: Murat Testing Philosophy (lines 94-120), API-first testing patterns, faker.js documentation._

# Email-Based Authentication Testing
## Principle
Email-based authentication (magic links, one-time codes, passwordless login) requires specialized testing with email capture services like Mailosaur or Ethereal. Extract magic links via HTML parsing or use built-in link extraction, preserve browser storage (local/session/cookies) when processing links, cache email payloads to avoid exhausting inbox quotas, and cover negative cases (expired links, reused links, multiple rapid requests). Log email IDs and links for troubleshooting, but scrub PII before committing artifacts.
## Rationale
Email authentication introduces unique challenges: asynchronous email delivery, quota limits (AWS Cognito: 50/day), cost per email, and complex state management (session preservation across link clicks). Without proper patterns, tests become slow (wait for email each time), expensive (quota exhaustion), and brittle (timing issues, missing state). Using email capture services + session caching + state preservation patterns makes email auth tests fast, reliable, and cost-effective.
## Pattern Examples
### Example 1: Magic Link Extraction with Mailosaur
**Context**: Passwordless login flow where user receives magic link via email, clicks it, and is authenticated.
**Implementation**:
```typescript
// tests/e2e/magic-link-auth.spec.ts
import { test, expect } from '@playwright/test';
/**
* Magic Link Authentication Flow
* 1. User enters email
* 2. Backend sends magic link
* 3. Test retrieves email via Mailosaur
* 4. Extract and visit magic link
* 5. Verify user is authenticated
*/
// Mailosaur configuration
const MAILOSAUR_API_KEY = process.env.MAILOSAUR_API_KEY!;
const MAILOSAUR_SERVER_ID = process.env.MAILOSAUR_SERVER_ID!;
/**
* Extract href from HTML email body
 * JSDOM provides HTML/DOM parsing in Node.js (no browser DOMParser needed)
*/
function extractMagicLink(htmlString: string): string | null {
const { JSDOM } = require('jsdom');
const dom = new JSDOM(htmlString);
const link = dom.window.document.querySelector('#magic-link-button');
return link ? (link as HTMLAnchorElement).href : null;
}
/**
* Alternative: Use Mailosaur's built-in link extraction
* Mailosaur automatically parses links - no regex needed!
*/
async function getMagicLinkFromEmail(email: string): Promise<string> {
const MailosaurClient = require('mailosaur');
const mailosaur = new MailosaurClient(MAILOSAUR_API_KEY);
// Wait for email (timeout: 30 seconds)
const message = await mailosaur.messages.get(
MAILOSAUR_SERVER_ID,
{
sentTo: email,
},
{
timeout: 30000, // 30 seconds
},
);
// Mailosaur extracts links automatically - no parsing needed!
const magicLink = message.html?.links?.[0]?.href;
if (!magicLink) {
throw new Error(`Magic link not found in email to ${email}`);
}
console.log(`📧 Email received. Magic link extracted: ${magicLink}`);
return magicLink;
}
test.describe('Magic Link Authentication', () => {
test('should authenticate user via magic link', async ({ page, context }) => {
// Arrange: Generate unique test email
const randomId = Math.floor(Math.random() * 1000000);
const testEmail = `user-${randomId}@${MAILOSAUR_SERVER_ID}.mailosaur.net`;
// Act: Request magic link
await page.goto('/login');
await page.getByTestId('email-input').fill(testEmail);
await page.getByTestId('send-magic-link').click();
// Assert: Success message
await expect(page.getByTestId('check-email-message')).toBeVisible();
await expect(page.getByTestId('check-email-message')).toContainText('Check your email');
// Retrieve magic link from email
const magicLink = await getMagicLinkFromEmail(testEmail);
// Visit magic link
await page.goto(magicLink);
// Assert: User is authenticated
await expect(page.getByTestId('user-menu')).toBeVisible();
await expect(page.getByTestId('user-email')).toContainText(testEmail);
// Verify session storage preserved
const localStorage = await page.evaluate(() => JSON.stringify(window.localStorage));
expect(localStorage).toContain('authToken');
});
test('should handle expired magic link', async ({ page }) => {
// Use pre-expired link (older than 15 minutes)
const expiredLink = 'http://localhost:3000/auth/verify?token=expired-token-123';
await page.goto(expiredLink);
// Assert: Error message displayed
await expect(page.getByTestId('error-message')).toBeVisible();
await expect(page.getByTestId('error-message')).toContainText('link has expired');
// Assert: User NOT authenticated
await expect(page.getByTestId('user-menu')).not.toBeVisible();
});
test('should prevent reusing magic link', async ({ page }) => {
const randomId = Math.floor(Math.random() * 1000000);
const testEmail = `user-${randomId}@${MAILOSAUR_SERVER_ID}.mailosaur.net`;
// Request magic link
await page.goto('/login');
await page.getByTestId('email-input').fill(testEmail);
await page.getByTestId('send-magic-link').click();
const magicLink = await getMagicLinkFromEmail(testEmail);
// Visit link first time (success)
await page.goto(magicLink);
await expect(page.getByTestId('user-menu')).toBeVisible();
// Sign out
await page.getByTestId('sign-out').click();
// Try to reuse same link (should fail)
await page.goto(magicLink);
await expect(page.getByTestId('error-message')).toBeVisible();
await expect(page.getByTestId('error-message')).toContainText('link has already been used');
});
});
```
**Cypress equivalent with Mailosaur plugin**:
```javascript
// cypress/e2e/magic-link-auth.cy.ts
describe('Magic Link Authentication', () => {
it('should authenticate user via magic link', () => {
const serverId = Cypress.env('MAILOSAUR_SERVERID');
const randomId = Cypress._.random(1e6);
const testEmail = `user-${randomId}@${serverId}.mailosaur.net`;
// Request magic link
cy.visit('/login');
cy.get('[data-cy="email-input"]').type(testEmail);
cy.get('[data-cy="send-magic-link"]').click();
cy.get('[data-cy="check-email-message"]').should('be.visible');
// Retrieve and visit magic link
cy.mailosaurGetMessage(serverId, { sentTo: testEmail })
.its('html.links.0.href') // Mailosaur extracts links automatically!
.should('exist')
.then((magicLink) => {
cy.log(`Magic link: ${magicLink}`);
cy.visit(magicLink);
});
// Verify authenticated
cy.get('[data-cy="user-menu"]').should('be.visible');
cy.get('[data-cy="user-email"]').should('contain', testEmail);
});
});
```
**Key Points**:
- **Mailosaur auto-extraction**: `html.links[0].href` or `html.codes[0].value`
- **Unique emails**: Random ID prevents collisions
- **Negative testing**: Expired and reused links tested
- **State verification**: localStorage/session checked
- **Fast email retrieval**: 30 second timeout typical
---
### Example 2: State Preservation Pattern with cy.session / Playwright storageState
**Context**: Cache authenticated session to avoid requesting magic link on every test.
**Implementation**:
```typescript
// playwright/fixtures/email-auth-fixture.ts
import fs from 'node:fs';
import { test as base, expect } from '@playwright/test';
import { getMagicLinkFromEmail } from '../support/mailosaur-helpers';
type EmailAuthFixture = {
  authenticatedUser: { email: string; token: string };
};
export const test = base.extend<EmailAuthFixture>({
  authenticatedUser: async ({ page, context }, use) => {
    // Use a stable address so the cached state file is actually found on later runs
    const testEmail = process.env.TEST_USER_EMAIL || `fixture-user@${process.env.MAILOSAUR_SERVER_ID}.mailosaur.net`;
    const storageStatePath = `./test-results/auth-state-${testEmail}.json`;
    // Try to reuse a previously saved session. Note: context.storageState() only
    // *writes* state, so restore the saved cookies manually (token-in-localStorage
    // apps would also need to replay saved.origins).
    if (fs.existsSync(storageStatePath)) {
      const saved = JSON.parse(fs.readFileSync(storageStatePath, 'utf-8'));
      await context.addCookies(saved.cookies || []);
      await page.goto('/dashboard');
      const isAuthenticated = await page
        .getByTestId('user-menu')
        .isVisible({ timeout: 2000 })
        .catch(() => false);
      if (isAuthenticated) {
        console.log(`✅ Reusing cached session for ${testEmail}`);
        await use({ email: testEmail, token: 'cached' });
        return;
      }
    }
    console.log(`📧 No cached session, requesting magic link for ${testEmail}`);
    // Request new magic link
    await page.goto('/login');
    await page.getByTestId('email-input').fill(testEmail);
    await page.getByTestId('send-magic-link').click();
    // Get magic link from email
    const magicLink = await getMagicLinkFromEmail(testEmail);
    // Visit link and authenticate
    await page.goto(magicLink);
    await expect(page.getByTestId('user-menu')).toBeVisible();
    // Extract auth token from localStorage
    const authToken = await page.evaluate(() => localStorage.getItem('authToken'));
    // Save session state (cookies + origins) for the next run
    await context.storageState({ path: storageStatePath });
    console.log(`💾 Cached session for ${testEmail}`);
    await use({ email: testEmail, token: authToken || '' });
  },
});
```
**Cypress equivalent with cy.session + data-session**:
```javascript
// cypress/support/commands/email-auth.js
import 'cypress-data-session'; // registers the cy.dataSession command
/**
* Authenticate via magic link with session caching
* - First run: Requests email, extracts link, authenticates
* - Subsequent runs: Reuses cached session (no email)
*/
Cypress.Commands.add('authViaMagicLink', (email) => {
  return cy.dataSession({
name: `magic-link-${email}`,
// First-time setup: Request and process magic link
setup: () => {
cy.visit('/login');
cy.get('[data-cy="email-input"]').type(email);
cy.get('[data-cy="send-magic-link"]').click();
// Get magic link from Mailosaur
cy.mailosaurGetMessage(Cypress.env('MAILOSAUR_SERVERID'), {
sentTo: email,
})
.its('html.links.0.href')
.should('exist')
.then((magicLink) => {
cy.visit(magicLink);
});
// Wait for authentication
cy.get('[data-cy="user-menu"]', { timeout: 10000 }).should('be.visible');
// Preserve authentication state
return cy.getAllLocalStorage().then((storage) => {
return { storage, email };
});
},
// Validate cached session is still valid
validate: (cached) => {
return cy.wrap(Boolean(cached?.storage));
},
    // Recreate session from cache (no email needed)
    recreate: (cached) => {
      // Replay localStorage captured by cy.getAllLocalStorage() (keyed by origin)
      cy.visit('/dashboard', {
        onBeforeLoad(win) {
          const entries = Object.values(cached.storage)[0] || {};
          Object.entries(entries).forEach(([key, value]) => {
            win.localStorage.setItem(key, String(value));
          });
        },
      });
      cy.get('[data-cy="user-menu"]', { timeout: 5000 }).should('be.visible');
    },
shareAcrossSpecs: true, // Share session across all tests
});
});
```
**Usage in tests**:
```javascript
// cypress/e2e/dashboard.cy.ts
describe('Dashboard', () => {
const serverId = Cypress.env('MAILOSAUR_SERVERID');
const testEmail = `test-user@${serverId}.mailosaur.net`;
beforeEach(() => {
// First test: Requests magic link
// Subsequent tests: Reuses cached session (no email!)
cy.authViaMagicLink(testEmail);
});
it('should display user dashboard', () => {
cy.get('[data-cy="dashboard-content"]').should('be.visible');
});
it('should show user profile', () => {
cy.get('[data-cy="user-email"]').should('contain', testEmail);
});
// Both tests share same session - only 1 email consumed!
});
```
**Key Points**:
- **Session caching**: First test requests email, rest reuse session
- **State preservation**: localStorage/cookies saved and restored
- **Validation**: Check cached session is still valid
- **Quota optimization**: Massive reduction in email consumption
- **Fast tests**: Cached auth takes seconds vs. minutes
---
### Example 3: Negative Flow Tests (Expired, Invalid, Reused Links)
**Context**: Comprehensive negative testing for email authentication edge cases.
**Implementation**:
```typescript
// tests/e2e/email-auth-negative.spec.ts
import { test, expect } from '@playwright/test';
import { getMagicLinkFromEmail } from '../support/mailosaur-helpers';
const MAILOSAUR_SERVER_ID = process.env.MAILOSAUR_SERVER_ID!;
test.describe('Email Auth Negative Flows', () => {
test('should reject expired magic link', async ({ page }) => {
// Generate expired link (simulate 24 hours ago)
const expiredToken = Buffer.from(
JSON.stringify({
email: 'test@example.com',
exp: Date.now() - 24 * 60 * 60 * 1000, // 24 hours ago
}),
).toString('base64');
const expiredLink = `http://localhost:3000/auth/verify?token=${expiredToken}`;
// Visit expired link
await page.goto(expiredLink);
// Assert: Error displayed
await expect(page.getByTestId('error-message')).toBeVisible();
await expect(page.getByTestId('error-message')).toContainText(/link.*expired|expired.*link/i);
// Assert: Link to request new one
await expect(page.getByTestId('request-new-link')).toBeVisible();
// Assert: User NOT authenticated
await expect(page.getByTestId('user-menu')).not.toBeVisible();
});
test('should reject invalid magic link token', async ({ page }) => {
const invalidLink = 'http://localhost:3000/auth/verify?token=invalid-garbage';
await page.goto(invalidLink);
// Assert: Error displayed
await expect(page.getByTestId('error-message')).toBeVisible();
await expect(page.getByTestId('error-message')).toContainText(/invalid.*link|link.*invalid/i);
// Assert: User not authenticated
await expect(page.getByTestId('user-menu')).not.toBeVisible();
});
test('should reject already-used magic link', async ({ page, context }) => {
const randomId = Math.floor(Math.random() * 1000000);
const testEmail = `user-${randomId}@${MAILOSAUR_SERVER_ID}.mailosaur.net`;
// Request magic link
await page.goto('/login');
await page.getByTestId('email-input').fill(testEmail);
await page.getByTestId('send-magic-link').click();
const magicLink = await getMagicLinkFromEmail(testEmail);
// Visit link FIRST time (success)
await page.goto(magicLink);
await expect(page.getByTestId('user-menu')).toBeVisible();
// Sign out
await page.getByTestId('user-menu').click();
await page.getByTestId('sign-out').click();
await expect(page.getByTestId('user-menu')).not.toBeVisible();
// Try to reuse SAME link (should fail)
await page.goto(magicLink);
// Assert: Link already used error
await expect(page.getByTestId('error-message')).toBeVisible();
await expect(page.getByTestId('error-message')).toContainText(/already.*used|link.*used/i);
// Assert: User not authenticated
await expect(page.getByTestId('user-menu')).not.toBeVisible();
});
test('should handle rapid successive link requests', async ({ page }) => {
const randomId = Math.floor(Math.random() * 1000000);
const testEmail = `user-${randomId}@${MAILOSAUR_SERVER_ID}.mailosaur.net`;
// Request magic link 3 times rapidly
for (let i = 0; i < 3; i++) {
await page.goto('/login');
await page.getByTestId('email-input').fill(testEmail);
await page.getByTestId('send-magic-link').click();
await expect(page.getByTestId('check-email-message')).toBeVisible();
}
// Only the LATEST link should work
const MailosaurClient = require('mailosaur');
const mailosaur = new MailosaurClient(process.env.MAILOSAUR_API_KEY);
    const messages = await mailosaur.messages.search(MAILOSAUR_SERVER_ID, {
      sentTo: testEmail,
    });
// Should receive 3 emails
expect(messages.items.length).toBeGreaterThanOrEqual(3);
// Get the LATEST magic link
const latestMessage = messages.items[0]; // Most recent first
const latestLink = latestMessage.html.links[0].href;
// Latest link works
await page.goto(latestLink);
await expect(page.getByTestId('user-menu')).toBeVisible();
// Older links should NOT work (if backend invalidates previous)
await page.getByTestId('sign-out').click();
const olderLink = messages.items[1].html.links[0].href;
await page.goto(olderLink);
await expect(page.getByTestId('error-message')).toBeVisible();
});
test('should rate-limit excessive magic link requests', async ({ page }) => {
const randomId = Math.floor(Math.random() * 1000000);
const testEmail = `user-${randomId}@${MAILOSAUR_SERVER_ID}.mailosaur.net`;
// Request magic link 10 times rapidly (should hit rate limit)
for (let i = 0; i < 10; i++) {
await page.goto('/login');
await page.getByTestId('email-input').fill(testEmail);
await page.getByTestId('send-magic-link').click();
// After N requests, should show rate limit error
const errorVisible = await page
.getByTestId('rate-limit-error')
.isVisible({ timeout: 1000 })
.catch(() => false);
if (errorVisible) {
console.log(`Rate limit hit after ${i + 1} requests`);
await expect(page.getByTestId('rate-limit-error')).toContainText(/too many.*requests|rate.*limit/i);
return;
}
}
// If no rate limit after 10 requests, log warning
console.warn('⚠️ No rate limit detected after 10 requests');
});
});
```
**Key Points**:
- **Expired links**: Test 24+ hour old tokens
- **Invalid tokens**: Malformed or garbage tokens rejected
- **Reuse prevention**: Same link can't be used twice
- **Rapid requests**: Multiple requests handled gracefully
- **Rate limiting**: Excessive requests blocked
---
### Example 4: Caching Strategy with cypress-data-session / Playwright Projects
**Context**: Minimize email consumption by sharing authentication state across tests and specs.
**Implementation**:
```javascript
// cypress/support/commands/register-and-sign-in.js
import 'cypress-data-session'; // registers the cy.dataSession command
/**
* Email Authentication Caching Strategy
* - One email per test run (not per spec, not per test)
* - First spec: Full registration flow (form → email → code → sign in)
* - Subsequent specs: Only sign in (reuse user)
* - Subsequent tests in same spec: Session already active (no sign in)
*/
// Helper: Fill registration form
function fillRegistrationForm({ fullName, userName, email, password }) {
  const [firstName, lastName] = fullName.split(' ');
  cy.intercept('POST', 'https://cognito-idp*').as('cognito');
  cy.contains('Register').click();
  cy.get('#reg-dialog-form').should('be.visible');
  cy.get('#first-name').type(firstName, { delay: 0 });
  cy.get('#last-name').type(lastName, { delay: 0 });
cy.get('#email').type(email, { delay: 0 });
cy.get('#username').type(userName, { delay: 0 });
cy.get('#password').type(password, { delay: 0 });
cy.contains('button', 'Create an account').click();
cy.wait('@cognito').its('response.statusCode').should('equal', 200);
}
// Helper: Confirm registration with email code
function confirmRegistration(email) {
return cy
.mailosaurGetMessage(Cypress.env('MAILOSAUR_SERVERID'), { sentTo: email })
.its('html.codes.0.value') // Mailosaur auto-extracts codes!
.then((code) => {
cy.intercept('POST', 'https://cognito-idp*').as('cognito');
cy.get('#verification-code').type(code, { delay: 0 });
cy.contains('button', 'Confirm registration').click();
cy.wait('@cognito');
cy.contains('You are now registered!').should('be.visible');
cy.contains('button', /ok/i).click();
return cy.wrap(code); // Return code for reference
});
}
// Helper: Full registration (form + email)
function register({ fullName, userName, email, password }) {
fillRegistrationForm({ fullName, userName, email, password });
return confirmRegistration(email);
}
// Helper: Sign in
function signIn({ userName, password }) {
cy.intercept('POST', 'https://cognito-idp*').as('cognito');
cy.contains('Sign in').click();
cy.get('#sign-in-username').type(userName, { delay: 0 });
cy.get('#sign-in-password').type(password, { delay: 0 });
cy.contains('button', 'Sign in').click();
cy.wait('@cognito');
cy.contains('Sign out').should('be.visible');
}
/**
* Register and sign in with email caching
* ONE EMAIL PER MACHINE (cypress run or cypress open)
*/
Cypress.Commands.add('registerAndSignIn', ({ fullName, userName, email, password }) => {
  return cy.dataSession({
name: email, // Unique session per email
// First time: Full registration (form → email → code)
init: () => register({ fullName, userName, email, password }),
// Subsequent specs: Just check email exists (code already used)
setup: () => confirmRegistration(email),
// Always runs after init/setup: Sign in
recreate: () => signIn({ userName, password }),
// Share across ALL specs (one email for entire test run)
shareAcrossSpecs: true,
});
});
```
**Usage across multiple specs**:
```javascript
// cypress/e2e/place-order.cy.ts
describe('Place Order', () => {
beforeEach(() => {
cy.visit('/');
cy.registerAndSignIn({
fullName: Cypress.env('fullName'), // From cypress.config
userName: Cypress.env('userName'),
email: Cypress.env('email'), // SAME email across all specs
password: Cypress.env('password'),
});
});
it('should place order', () => {
/* ... */
});
it('should view order history', () => {
/* ... */
});
});
// cypress/e2e/profile.cy.ts
describe('User Profile', () => {
beforeEach(() => {
cy.visit('/');
cy.registerAndSignIn({
fullName: Cypress.env('fullName'),
userName: Cypress.env('userName'),
email: Cypress.env('email'), // SAME email - no new email sent!
password: Cypress.env('password'),
});
});
it('should update profile', () => {
/* ... */
});
});
```
**Playwright equivalent with storageState**:
```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';
export default defineConfig({
projects: [
{
name: 'setup',
testMatch: /global-setup\.ts/,
},
{
name: 'authenticated',
testMatch: /.*\.spec\.ts/,
dependencies: ['setup'],
use: {
storageState: '.auth/user-session.json', // Reuse auth state
},
},
],
});
```
```typescript
// tests/global-setup.ts (runs once)
import { test as setup, expect } from '@playwright/test';
import { getMagicLinkFromEmail } from './support/mailosaur-helpers';
const authFile = '.auth/user-session.json';
setup('authenticate via magic link', async ({ page }) => {
const testEmail = process.env.TEST_USER_EMAIL!;
// Request magic link
await page.goto('/login');
await page.getByTestId('email-input').fill(testEmail);
await page.getByTestId('send-magic-link').click();
// Get and visit magic link
const magicLink = await getMagicLinkFromEmail(testEmail);
await page.goto(magicLink);
// Verify authenticated
await expect(page.getByTestId('user-menu')).toBeVisible();
// Save authenticated state (ONE TIME for all tests)
await page.context().storageState({ path: authFile });
console.log('✅ Authentication state saved to', authFile);
});
```
**Key Points**:
- **One email per run**: Global setup authenticates once
- **State reuse**: All tests use cached storageState
- **cypress-data-session**: Intelligently manages cache lifecycle
- **shareAcrossSpecs**: Session shared across all spec files
- **Massive savings**: 500 tests = 1 email (not 500!)
---
## Email Authentication Testing Checklist
Before implementing email auth tests, verify:
- [ ] **Email service**: Mailosaur/Ethereal/MailHog configured with API keys
- [ ] **Link extraction**: Use built-in parsing (html.links[0].href) over regex
- [ ] **State preservation**: localStorage/session/cookies saved and restored
- [ ] **Session caching**: cypress-data-session or storageState prevents redundant emails
- [ ] **Negative flows**: Expired, invalid, reused, rapid requests tested
- [ ] **Quota awareness**: One email per run (not per test)
- [ ] **PII scrubbing**: Email IDs logged for debug, but scrubbed from artifacts (helper sketch after this list)
- [ ] **Timeout handling**: 30 second email retrieval timeout configured
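For the PII-scrubbing item above, a small helper can mask inbox addresses before logs land in shared artifacts. A minimal sketch, assuming log lines are plain strings:
```typescript
// test-utils/scrub-pii.ts
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

/** Mask email addresses so artifacts stay shareable: user-123@x.mailosaur.net -> u***@x.mailosaur.net */
export function scrubEmails(line: string): string {
  return line.replace(EMAIL_PATTERN, (email) => {
    const [local, domain] = email.split('@');
    return `${local[0]}***@${domain}`;
  });
}

// Usage: console.log(scrubEmails(`📧 Email received for user-123456@abc.mailosaur.net`));
```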
## Integration Points
- Used in workflows: `*framework` (email auth setup), `*automate` (email auth test generation)
- Related fragments: `fixture-architecture.md`, `test-quality.md`
- Email services: Mailosaur (recommended), Ethereal (free), MailHog (self-hosted)
- Plugins: cypress-mailosaur, cypress-data-session
_Source: Email authentication blog, Murat testing toolkit, Mailosaur documentation_

# Error Handling and Resilience Checks
## Principle
Treat expected failures explicitly: intercept network errors, assert UI fallbacks (error messages visible, retries triggered), and use scoped exception handling to ignore known errors while catching regressions. Test retry/backoff logic by forcing sequential failures (500 → timeout → success) and validate telemetry logging. Log captured errors with context (request payload, user/session) but redact secrets to keep artifacts safe for sharing.
## Rationale
Tests fail for two reasons: genuine bugs or poor error handling in the test itself. Without explicit error handling patterns, tests become noisy (uncaught exceptions cause false failures) or silent (swallowing all errors hides real bugs). Scoped exception handling (Cypress.on('uncaught:exception'), page.on('pageerror')) allows tests to ignore documented, expected errors while surfacing unexpected ones. Resilience testing (retry logic, graceful degradation) ensures applications handle failures gracefully in production.
## Pattern Examples
### Example 1: Scoped Exception Handling (Expected Errors Only)
**Context**: Handle known errors (Network failures, expected 500s) without masking unexpected bugs.
**Implementation**:
```typescript
// tests/e2e/error-handling.spec.ts
import { test, expect } from '@playwright/test';
/**
* Scoped Error Handling Pattern
* - Only ignore specific, documented errors
* - Rethrow everything else to catch regressions
* - Validate error UI and user experience
*/
test.describe('API Error Handling', () => {
test('should display error message when API returns 500', async ({ page }) => {
// Scope error handling to THIS test only
const consoleErrors: string[] = [];
page.on('pageerror', (error) => {
// Only swallow documented NetworkError
if (error.message.includes('NetworkError: Failed to fetch')) {
consoleErrors.push(error.message);
return; // Swallow this specific error
}
// Rethrow all other errors (catch regressions!)
throw error;
});
// Arrange: Mock 500 error response
await page.route('**/api/users', (route) =>
route.fulfill({
status: 500,
contentType: 'application/json',
body: JSON.stringify({
error: 'Internal server error',
code: 'INTERNAL_ERROR',
}),
}),
);
// Act: Navigate to page that fetches users
await page.goto('/dashboard');
// Assert: Error UI displayed
await expect(page.getByTestId('error-message')).toBeVisible();
await expect(page.getByTestId('error-message')).toContainText(/error.*loading|failed.*load/i);
// Assert: Retry button visible
await expect(page.getByTestId('retry-button')).toBeVisible();
// Assert: NetworkError was thrown and caught
expect(consoleErrors).toContainEqual(expect.stringContaining('NetworkError'));
});
  test('should NOT swallow unexpected errors', async ({ page }) => {
    const pageErrors: string[] = [];
    page.on('pageerror', (error) => {
      // Capture for assertion - unexpected errors must surface, never be swallowed
      pageErrors.push(error.message);
    });
    // Arrange: App has JavaScript error (bug)
    await page.addInitScript(() => {
      // Simulate bug in app code
      (window as any).buggyFunction = () => {
        throw new Error('UNEXPECTED BUG: undefined is not a function');
      };
    });
    await page.goto('/dashboard');
    // Trigger the bug asynchronously so it surfaces as an uncaught pageerror
    await page.evaluate(() => setTimeout(() => (window as any).buggyFunction(), 0));
    await page.waitForTimeout(100);
    // Assert: The unexpected error surfaced (was NOT swallowed)
    expect(pageErrors).toContainEqual(expect.stringContaining('UNEXPECTED BUG'));
  });
});
```
**Cypress equivalent**:
```javascript
// cypress/e2e/error-handling.cy.ts
describe('API Error Handling', () => {
it('should display error message when API returns 500', () => {
// Scoped to this test only
cy.on('uncaught:exception', (err) => {
// Only swallow documented NetworkError
if (err.message.includes('NetworkError')) {
return false; // Prevent test failure
}
// All other errors fail the test
return true;
});
// Arrange: Mock 500 error
cy.intercept('GET', '**/api/users', {
statusCode: 500,
body: {
error: 'Internal server error',
code: 'INTERNAL_ERROR',
},
}).as('getUsers');
// Act
cy.visit('/dashboard');
cy.wait('@getUsers');
// Assert: Error UI
cy.get('[data-cy="error-message"]').should('be.visible');
cy.get('[data-cy="error-message"]').should('contain', 'error loading');
cy.get('[data-cy="retry-button"]').should('be.visible');
});
it('should NOT swallow unexpected errors', () => {
// No exception handler - test should fail on unexpected errors
cy.visit('/dashboard');
// Trigger unexpected error
cy.window().then((win) => {
// This should fail the test
win.eval('throw new Error("UNEXPECTED BUG")');
});
// Test fails (as expected) - validates error detection works
});
});
```
**Key Points**:
- **Scoped handling**: page.on() / cy.on() scoped to specific tests
- **Explicit allow-list**: Only ignore documented errors
- **Rethrow unexpected**: Catch regressions by failing on unknown errors
- **Error UI validation**: Assert user sees error message
- **Logging**: Capture errors for debugging, don't swallow silently
---
### Example 2: Retry Validation Pattern (Network Resilience)
**Context**: Test that retry/backoff logic works correctly for transient failures.
**Implementation**:
```typescript
// tests/e2e/retry-resilience.spec.ts
import { test, expect } from '@playwright/test';
/**
* Retry Validation Pattern
* - Force sequential failures (500 → 500 → 200)
* - Validate retry attempts and backoff timing
* - Assert telemetry captures retry events
*/
test.describe('Network Retry Logic', () => {
test('should retry on 500 error and succeed', async ({ page }) => {
let attemptCount = 0;
const attemptTimestamps: number[] = [];
// Mock API: Fail twice, succeed on third attempt
await page.route('**/api/products', (route) => {
attemptCount++;
attemptTimestamps.push(Date.now());
if (attemptCount <= 2) {
// First 2 attempts: 500 error
route.fulfill({
status: 500,
body: JSON.stringify({ error: 'Server error' }),
});
} else {
// 3rd attempt: Success
route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({ products: [{ id: 1, name: 'Product 1' }] }),
});
}
});
// Act: Navigate (should retry automatically)
await page.goto('/products');
// Assert: Data eventually loads after retries
await expect(page.getByTestId('product-list')).toBeVisible();
await expect(page.getByTestId('product-item')).toHaveCount(1);
// Assert: Exactly 3 attempts made
expect(attemptCount).toBe(3);
// Assert: Exponential backoff timing (1s → 2s between attempts)
if (attemptTimestamps.length === 3) {
const delay1 = attemptTimestamps[1] - attemptTimestamps[0];
const delay2 = attemptTimestamps[2] - attemptTimestamps[1];
expect(delay1).toBeGreaterThanOrEqual(900); // ~1 second
expect(delay1).toBeLessThan(1200);
expect(delay2).toBeGreaterThanOrEqual(1900); // ~2 seconds
expect(delay2).toBeLessThan(2200);
}
// Assert: Telemetry logged retry events
const telemetryEvents = await page.evaluate(() => (window as any).__TELEMETRY_EVENTS__ || []);
expect(telemetryEvents).toContainEqual(
expect.objectContaining({
event: 'api_retry',
attempt: 1,
endpoint: '/api/products',
}),
);
expect(telemetryEvents).toContainEqual(
expect.objectContaining({
event: 'api_retry',
attempt: 2,
}),
);
});
test('should give up after max retries and show error', async ({ page }) => {
let attemptCount = 0;
// Mock API: Always fail (test retry limit)
await page.route('**/api/products', (route) => {
attemptCount++;
route.fulfill({
status: 500,
body: JSON.stringify({ error: 'Persistent server error' }),
});
});
    // Act
    await page.goto('/products');
    // Assert: Error UI displayed after exhausting retries (wait for UI before counting attempts)
    await expect(page.getByTestId('error-message')).toBeVisible();
    await expect(page.getByTestId('error-message')).toContainText(/unable.*load|failed.*after.*retries/i);
    // Assert: Max retries reached (3 attempts typical)
    expect(attemptCount).toBe(3);
    // Assert: Data not displayed
    await expect(page.getByTestId('product-list')).not.toBeVisible();
});
test('should NOT retry on 404 (non-retryable error)', async ({ page }) => {
let attemptCount = 0;
// Mock API: 404 error (should NOT retry)
await page.route('**/api/products/999', (route) => {
attemptCount++;
route.fulfill({
status: 404,
body: JSON.stringify({ error: 'Product not found' }),
});
});
    await page.goto('/products/999');
    // Assert: 404 error displayed immediately
    await expect(page.getByTestId('not-found-message')).toBeVisible();
    // Assert: Only 1 attempt (no retries on 404)
    expect(attemptCount).toBe(1);
});
});
```
**Cypress with retry interception**:
```javascript
// cypress/e2e/retry-resilience.cy.ts
describe('Network Retry Logic', () => {
it('should retry on 500 and succeed on 3rd attempt', () => {
let attemptCount = 0;
cy.intercept('GET', '**/api/products', (req) => {
attemptCount++;
if (attemptCount <= 2) {
req.reply({ statusCode: 500, body: { error: 'Server error' } });
} else {
req.reply({ statusCode: 200, body: { products: [{ id: 1, name: 'Product 1' }] } });
}
}).as('getProducts');
cy.visit('/products');
// Wait for final successful request
cy.wait('@getProducts').its('response.statusCode').should('eq', 200);
// Assert: Data loaded
cy.get('[data-cy="product-list"]').should('be.visible');
cy.get('[data-cy="product-item"]').should('have.length', 1);
// Validate retry count
cy.wrap(attemptCount).should('eq', 3);
});
});
```
**Key Points**:
- **Sequential failures**: Test retry logic with 500 → 500 → 200
- **Backoff timing**: Validate exponential backoff delays
- **Retry limits**: Max attempts enforced (typically 3)
- **Non-retryable errors**: 404s don't trigger retries
- **Telemetry**: Log retry attempts for monitoring
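The assertions above assume the app itself implements retry-with-backoff and exposes a telemetry hook. A sketch of what such an app-side wrapper might look like (the `__TELEMETRY_EVENTS__` global mirrors what the tests read; it is an assumption of these examples, not a standard API):
```typescript
// src/utils/fetch-with-retry.ts - illustrative app-side counterpart to the tests above
export async function fetchWithRetry(url: string, maxAttempts = 3): Promise<Response> {
  let response!: Response;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    response = await fetch(url);
    // Success or non-retryable client error (4xx): stop immediately
    if (response.ok || response.status < 500) return response;
    if (attempt < maxAttempts) {
      // Record the retry where tests (and monitoring) can observe it
      const events = ((window as any).__TELEMETRY_EVENTS__ ??= []);
      events.push({ event: 'api_retry', attempt, endpoint: new URL(url, location.origin).pathname });
      await new Promise((resolve) => setTimeout(resolve, 1000 * 2 ** (attempt - 1))); // 1s, then 2s
    }
  }
  return response; // Retries exhausted - caller renders the error UI
}
```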
---
### Example 3: Telemetry Logging with Context (Sentry Integration)
**Context**: Capture errors with full context for production debugging without exposing secrets.
**Implementation**:
```typescript
// tests/e2e/telemetry-logging.spec.ts
import { test, expect } from '@playwright/test';
/**
* Telemetry Logging Pattern
* - Log errors with request context
* - Redact sensitive data (tokens, passwords, PII)
* - Integrate with monitoring (Sentry, Datadog)
* - Validate error logging without exposing secrets
*/
type ErrorLog = {
level: 'error' | 'warn' | 'info';
message: string;
context?: {
endpoint?: string;
method?: string;
statusCode?: number;
userId?: string;
sessionId?: string;
};
timestamp: string;
};
test.describe('Error Telemetry', () => {
test('should log API errors with context', async ({ page }) => {
const errorLogs: ErrorLog[] = [];
// Capture console errors
page.on('console', (msg) => {
if (msg.type() === 'error') {
try {
const log = JSON.parse(msg.text());
errorLogs.push(log);
} catch {
// Not a structured log, ignore
}
}
});
// Mock failing API
await page.route('**/api/orders', (route) =>
route.fulfill({
status: 500,
body: JSON.stringify({ error: 'Payment processor unavailable' }),
}),
);
// Act: Trigger error
await page.goto('/checkout');
await page.getByTestId('place-order').click();
// Wait for error UI
await expect(page.getByTestId('error-message')).toBeVisible();
// Assert: Error logged with context
expect(errorLogs).toContainEqual(
expect.objectContaining({
level: 'error',
message: expect.stringContaining('API request failed'),
context: expect.objectContaining({
endpoint: '/api/orders',
method: 'POST',
statusCode: 500,
userId: expect.any(String),
}),
}),
);
// Assert: Sensitive data NOT logged
const logString = JSON.stringify(errorLogs);
expect(logString).not.toContain('password');
expect(logString).not.toContain('token');
expect(logString).not.toContain('creditCard');
});
  test('should send errors to Sentry with breadcrumbs', async ({ page }) => {
// Mock Sentry SDK
await page.addInitScript(() => {
(window as any).Sentry = {
captureException: (error: Error, context?: any) => {
(window as any).__SENTRY_EVENTS__ = (window as any).__SENTRY_EVENTS__ || [];
(window as any).__SENTRY_EVENTS__.push({
error: error.message,
context,
timestamp: Date.now(),
});
},
addBreadcrumb: (breadcrumb: any) => {
(window as any).__SENTRY_BREADCRUMBS__ = (window as any).__SENTRY_BREADCRUMBS__ || [];
(window as any).__SENTRY_BREADCRUMBS__.push(breadcrumb);
},
};
});
// Mock failing API
    await page.route('**/api/users', (route) => route.fulfill({ status: 403, json: { error: 'Forbidden' } }));
// Act
await page.goto('/users');
// Assert: Sentry captured error
const events = await page.evaluate(() => (window as any).__SENTRY_EVENTS__);
expect(events).toHaveLength(1);
expect(events[0]).toMatchObject({
error: expect.stringContaining('403'),
context: expect.objectContaining({
endpoint: '/api/users',
statusCode: 403,
}),
});
// Assert: Breadcrumbs include user actions
const breadcrumbs = await page.evaluate(() => (window as any).__SENTRY_BREADCRUMBS__);
expect(breadcrumbs).toContainEqual(
expect.objectContaining({
category: 'navigation',
message: '/users',
}),
);
});
});
```
**Cypress with Sentry**:
```javascript
// cypress/e2e/telemetry-logging.cy.ts
describe('Error Telemetry', () => {
it('should log API errors with redacted sensitive data', () => {
const errorLogs = [];
// Capture console errors
cy.on('window:before:load', (win) => {
cy.stub(win.console, 'error').callsFake((msg) => {
errorLogs.push(msg);
});
});
// Mock failing API
cy.intercept('POST', '**/api/orders', {
statusCode: 500,
body: { error: 'Payment failed' },
});
// Act
cy.visit('/checkout');
cy.get('[data-cy="place-order"]').click();
// Assert: Error logged
cy.wrap(errorLogs).should('have.length.greaterThan', 0);
// Assert: Context included
cy.wrap(errorLogs[0]).should('include', '/api/orders');
// Assert: Secrets redacted
cy.wrap(JSON.stringify(errorLogs)).should('not.contain', 'password');
cy.wrap(JSON.stringify(errorLogs)).should('not.contain', 'creditCard');
});
});
```
**Error logger utility with redaction**:
```typescript
// src/utils/error-logger.ts
type ErrorContext = {
endpoint?: string;
method?: string;
statusCode?: number;
userId?: string;
sessionId?: string;
requestPayload?: any;
};
const SENSITIVE_KEYS = ['password', 'token', 'creditCard', 'ssn', 'apiKey'];
/**
* Redact sensitive data from objects
*/
function redactSensitiveData(obj: any): any {
if (typeof obj !== 'object' || obj === null) return obj;
const redacted = { ...obj };
for (const key of Object.keys(redacted)) {
    if (SENSITIVE_KEYS.some((sensitive) => key.toLowerCase().includes(sensitive.toLowerCase()))) {
redacted[key] = '[REDACTED]';
} else if (typeof redacted[key] === 'object') {
redacted[key] = redactSensitiveData(redacted[key]);
}
}
return redacted;
}
/**
* Log error with context (Sentry integration)
*/
export function logError(error: Error, context?: ErrorContext) {
const safeContext = context ? redactSensitiveData(context) : {};
const errorLog = {
level: 'error' as const,
message: error.message,
stack: error.stack,
context: safeContext,
timestamp: new Date().toISOString(),
};
// Console (development)
console.error(JSON.stringify(errorLog));
// Sentry (production)
if (typeof window !== 'undefined' && (window as any).Sentry) {
(window as any).Sentry.captureException(error, {
contexts: { custom: safeContext },
});
}
}
```
**Key Points**:
- **Context-rich logging**: Endpoint, method, status, user ID
- **Secret redaction**: Passwords, tokens, PII removed before logging
- **Sentry integration**: Production monitoring with breadcrumbs
- **Structured logs**: JSON format for easy parsing
- **Test validation**: Assert logs contain context but not secrets
---
### Example 4: Graceful Degradation Tests (Fallback Behavior)
**Context**: Validate application continues functioning when services are unavailable.
**Implementation**:
```typescript
// tests/e2e/graceful-degradation.spec.ts
import { test, expect } from '@playwright/test';
/**
* Graceful Degradation Pattern
* - Simulate service unavailability
* - Validate fallback behavior
* - Ensure user experience degrades gracefully
* - Verify telemetry captures degradation events
*/
test.describe('Service Unavailability', () => {
test('should display cached data when API is down', async ({ page }) => {
// Arrange: Seed localStorage with cached data
await page.addInitScript(() => {
localStorage.setItem(
'products_cache',
JSON.stringify({
data: [
{ id: 1, name: 'Cached Product 1' },
{ id: 2, name: 'Cached Product 2' },
],
timestamp: Date.now(),
}),
);
});
// Mock API unavailable
await page.route(
'**/api/products',
(route) => route.abort('connectionrefused'), // Simulate server down
);
// Act
await page.goto('/products');
// Assert: Cached data displayed
await expect(page.getByTestId('product-list')).toBeVisible();
await expect(page.getByText('Cached Product 1')).toBeVisible();
// Assert: Stale data warning shown
await expect(page.getByTestId('cache-warning')).toBeVisible();
await expect(page.getByTestId('cache-warning')).toContainText(/showing.*cached|offline.*mode/i);
// Assert: Retry button available
await expect(page.getByTestId('refresh-button')).toBeVisible();
});
test('should show fallback UI when analytics service fails', async ({ page }) => {
// Mock analytics service down (non-critical)
await page.route('**/analytics/track', (route) => route.fulfill({ status: 503, body: 'Service unavailable' }));
// Act: Navigate normally
await page.goto('/dashboard');
// Assert: Page loads successfully (analytics failure doesn't block)
await expect(page.getByTestId('dashboard-content')).toBeVisible();
    // Assert: Analytics error logged but not shown to user
    const consoleErrors: string[] = [];
    page.on('console', (msg) => {
      if (msg.type() === 'error') consoleErrors.push(msg.text());
    });
    // Trigger analytics event
    await page.getByTestId('track-action-button').click();
    // Console messages arrive asynchronously - poll rather than asserting immediately
    await expect.poll(() => consoleErrors.join('\n')).toContain('Analytics service unavailable');
// But user doesn't see error
await expect(page.getByTestId('error-message')).not.toBeVisible();
});
test('should fallback to local validation when API is slow', async ({ page }) => {
// Mock slow API (> 5 seconds)
await page.route('**/api/validate-email', async (route) => {
await new Promise((resolve) => setTimeout(resolve, 6000)); // 6 second delay
route.fulfill({
status: 200,
body: JSON.stringify({ valid: true }),
});
});
// Act: Fill form
await page.goto('/signup');
await page.getByTestId('email-input').fill('test@example.com');
await page.getByTestId('email-input').blur();
// Assert: Client-side validation triggers immediately (doesn't wait for API)
await expect(page.getByTestId('email-valid-icon')).toBeVisible({ timeout: 1000 });
// Assert: Eventually API validates too (but doesn't block UX)
await expect(page.getByTestId('email-validated-badge')).toBeVisible({ timeout: 7000 });
});
test('should maintain functionality with third-party script failure', async ({ page }) => {
// Block third-party scripts (Google Analytics, Intercom, etc.)
await page.route('**/*.google-analytics.com/**', (route) => route.abort());
await page.route('**/*.intercom.io/**', (route) => route.abort());
// Act
await page.goto('/');
// Assert: App works without third-party scripts
await expect(page.getByTestId('main-content')).toBeVisible();
await expect(page.getByTestId('nav-menu')).toBeVisible();
// Assert: Core functionality intact
await page.getByTestId('nav-products').click();
await expect(page).toHaveURL(/.*\/products/);
});
});
```
**Key Points**:
- **Cached fallbacks**: Display stale data when API unavailable
- **Non-critical degradation**: Analytics failures don't block app
- **Client-side fallbacks**: Local validation when API slow
- **Third-party resilience**: App works without external scripts
- **User transparency**: Stale data warnings displayed
---
## Error Handling Testing Checklist
Before shipping error handling code, verify:
- [ ] **Scoped exception handling**: Only ignore documented errors (NetworkError, specific codes); see the sketch after this checklist
- [ ] **Rethrow unexpected**: Unknown errors fail tests (catch regressions)
- [ ] **Error UI tested**: User sees error messages for all error states
- [ ] **Retry logic validated**: Sequential failures test backoff and max attempts
- [ ] **Telemetry verified**: Errors logged with context (endpoint, status, user)
- [ ] **Secret redaction**: Logs don't contain passwords, tokens, PII
- [ ] **Graceful degradation**: Critical services down, app shows fallback UI
- [ ] **Non-critical failures**: Analytics/tracking failures don't block app
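For the first two checklist items, a minimal Playwright sketch of scoped exception handling (the fixture name and ignore list are assumptions, not a standard API):

```typescript
import { test as base, expect } from '@playwright/test';

// Documented, expected errors we deliberately ignore - everything else fails the test
const IGNORED_ERRORS = [/NetworkError/, /ResizeObserver loop/];

export const test = base.extend<{ failOnUnexpectedErrors: void }>({
  failOnUnexpectedErrors: [
    async ({ page }, use) => {
      const unexpected: string[] = [];
      page.on('pageerror', (error) => {
        if (!IGNORED_ERRORS.some((pattern) => pattern.test(error.message))) {
          unexpected.push(error.message); // Collected here, asserted after the test body
        }
      });
      await use();
      expect(unexpected, 'Unexpected page errors').toEqual([]);
    },
    { auto: true }, // Applies to every test without opt-in
  ],
});
```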
## Integration Points
- Used in workflows: `*automate` (error handling test generation), `*test-review` (error pattern detection)
- Related fragments: `network-first.md`, `test-quality.md`, `contract-testing.md`
- Monitoring tools: Sentry, Datadog, LogRocket
_Source: Murat error-handling patterns, Pact resilience guidance, SEON production error handling_

---
# Feature Flag Governance
## Principle
Feature flags enable controlled rollouts and A/B testing, but they require disciplined testing governance. Centralize flag definitions in a frozen enum, test both enabled and disabled states, clean up targeting after each spec, and maintain a comprehensive flag lifecycle checklist. For LaunchDarkly-style systems, script API helpers to seed variations programmatically rather than mutating flags manually through the UI.
## Rationale
Poorly managed feature flags become technical debt: untested variations ship broken code, forgotten flags clutter the codebase, and shared environments become unstable from leftover targeting rules. Structured governance ensures flags are testable, traceable, temporary, and safe. Testing both states prevents surprises when flags flip in production.
## Pattern Examples
### Example 1: Feature Flag Enum Pattern with Type Safety
**Context**: Centralized flag management with TypeScript type safety and runtime validation.
**Implementation**:
```typescript
// src/utils/feature-flags.ts
/**
* Centralized feature flag definitions
* - Object.freeze prevents runtime modifications
* - TypeScript ensures compile-time type safety
* - Single source of truth for all flag keys
*/
export const FLAGS = Object.freeze({
// User-facing features
NEW_CHECKOUT_FLOW: 'new-checkout-flow',
DARK_MODE: 'dark-mode',
ENHANCED_SEARCH: 'enhanced-search',
// Experiments
PRICING_EXPERIMENT_A: 'pricing-experiment-a',
HOMEPAGE_VARIANT_B: 'homepage-variant-b',
// Infrastructure
USE_NEW_API_ENDPOINT: 'use-new-api-endpoint',
ENABLE_ANALYTICS_V2: 'enable-analytics-v2',
// Killswitches (emergency disables)
DISABLE_PAYMENT_PROCESSING: 'disable-payment-processing',
DISABLE_EMAIL_NOTIFICATIONS: 'disable-email-notifications',
} as const);
/**
* Type-safe flag keys
* Prevents typos and ensures autocomplete in IDEs
*/
export type FlagKey = (typeof FLAGS)[keyof typeof FLAGS];
/**
* Flag metadata for governance
*/
type FlagMetadata = {
key: FlagKey;
name: string;
owner: string;
createdDate: string;
expiryDate?: string;
defaultState: boolean;
requiresCleanup: boolean;
dependencies?: FlagKey[];
telemetryEvents?: string[];
};
/**
* Flag registry with governance metadata
* Used for flag lifecycle tracking and cleanup alerts
*/
export const FLAG_REGISTRY: Record<FlagKey, FlagMetadata> = {
[FLAGS.NEW_CHECKOUT_FLOW]: {
key: FLAGS.NEW_CHECKOUT_FLOW,
name: 'New Checkout Flow',
owner: 'payments-team',
createdDate: '2025-01-15',
expiryDate: '2025-03-15',
defaultState: false,
requiresCleanup: true,
dependencies: [FLAGS.USE_NEW_API_ENDPOINT],
telemetryEvents: ['checkout_started', 'checkout_completed'],
},
[FLAGS.DARK_MODE]: {
key: FLAGS.DARK_MODE,
name: 'Dark Mode UI',
owner: 'frontend-team',
createdDate: '2025-01-10',
defaultState: false,
requiresCleanup: false, // Permanent feature toggle
},
// ... rest of registry
};
/**
* Validate flag exists in registry
* Throws at runtime if flag is unregistered
*/
export function validateFlag(flag: string): asserts flag is FlagKey {
if (!Object.values(FLAGS).includes(flag as FlagKey)) {
throw new Error(`Unregistered feature flag: ${flag}`);
}
}
/**
* Check if flag is expired (needs removal)
*/
export function isFlagExpired(flag: FlagKey): boolean {
const metadata = FLAG_REGISTRY[flag];
if (!metadata.expiryDate) return false;
const expiry = new Date(metadata.expiryDate);
return Date.now() > expiry.getTime();
}
/**
* Get all expired flags requiring cleanup
*/
export function getExpiredFlags(): FlagMetadata[] {
return Object.values(FLAG_REGISTRY).filter((meta) => isFlagExpired(meta.key));
}
```
**Usage in application code**:
```typescript
// components/Checkout.tsx
import { FLAGS } from '@/utils/feature-flags';
import { useFeatureFlag } from '@/hooks/useFeatureFlag';
export function Checkout() {
const isNewFlow = useFeatureFlag(FLAGS.NEW_CHECKOUT_FLOW);
return isNewFlow ? <NewCheckoutFlow /> : <LegacyCheckoutFlow />;
}
```
**Key Points**:
- **Type safety**: TypeScript catches typos at compile time
- **Runtime validation**: validateFlag ensures only registered flags used
- **Metadata tracking**: Owner, dates, dependencies documented
- **Expiry alerts**: Automated detection of stale flags
- **Single source of truth**: All flags defined in one place
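The component above assumes a `useFeatureFlag` hook. A minimal sketch, reusing the evaluate endpoint and window stub that appear elsewhere in this fragment (a real implementation would wrap your flag SDK):

```typescript
// src/hooks/useFeatureFlag.ts (sketch)
import { useEffect, useState } from 'react';
import { FLAG_REGISTRY, type FlagKey } from '@/utils/feature-flags';

export function useFeatureFlag(flag: FlagKey): boolean {
  // Start from the registered safe default
  const [enabled, setEnabled] = useState(FLAG_REGISTRY[flag].defaultState);
  useEffect(() => {
    // Test/local stub wins (see stubFeatureFlags in Example 3)
    const stubbed = (window as any).__STUBBED_FLAGS__?.[flag];
    if (stubbed !== undefined) {
      setEnabled(Boolean(stubbed));
      return;
    }
    fetch(`/api/feature-flags/evaluate?flag=${encodeURIComponent(flag)}`)
      .then((res) => res.json())
      .then((body) => setEnabled(Boolean(body.enabled)))
      .catch(() => setEnabled(FLAG_REGISTRY[flag].defaultState)); // Safe default on failure
  }, [flag]);
  return enabled;
}
```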
---
### Example 2: Feature Flag Testing Pattern (Both States)
**Context**: Comprehensive testing of feature flag variations with proper cleanup.
**Implementation**:
```typescript
// tests/e2e/checkout-feature-flag.spec.ts
import { test, expect } from '@playwright/test';
import { FLAGS } from '@/utils/feature-flags';
/**
* Feature Flag Testing Strategy:
* 1. Test BOTH enabled and disabled states
* 2. Clean up targeting after each test
* 3. Use dedicated test users (not production data)
* 4. Verify telemetry events fire correctly
*/
test.describe('Checkout Flow - Feature Flag Variations', () => {
let testUserId: string;
test.beforeEach(async () => {
// Generate unique test user ID
testUserId = `test-user-${Date.now()}`;
});
test.afterEach(async ({ request }) => {
// CRITICAL: Clean up flag targeting to prevent shared env pollution
await request.post('/api/feature-flags/cleanup', {
data: {
flagKey: FLAGS.NEW_CHECKOUT_FLOW,
userId: testUserId,
},
});
});
test('should use NEW checkout flow when flag is ENABLED', async ({ page, request }) => {
// Arrange: Enable flag for test user
await request.post('/api/feature-flags/target', {
data: {
flagKey: FLAGS.NEW_CHECKOUT_FLOW,
userId: testUserId,
variation: true, // ENABLED
},
});
    // Act: Navigate as targeted user (custom headers are set at the page level, not as goto options)
    await page.setExtraHTTPHeaders({ 'X-Test-User-ID': testUserId });
    await page.goto('/checkout');
// Assert: New flow UI elements visible
await expect(page.getByTestId('checkout-v2-container')).toBeVisible();
await expect(page.getByTestId('express-payment-options')).toBeVisible();
await expect(page.getByTestId('saved-addresses-dropdown')).toBeVisible();
// Assert: Legacy flow NOT visible
await expect(page.getByTestId('checkout-v1-container')).not.toBeVisible();
// Assert: Telemetry event fired
const analyticsEvents = await page.evaluate(() => (window as any).__ANALYTICS_EVENTS__ || []);
expect(analyticsEvents).toContainEqual(
expect.objectContaining({
event: 'checkout_started',
properties: expect.objectContaining({
variant: 'new_flow',
}),
}),
);
});
test('should use LEGACY checkout flow when flag is DISABLED', async ({ page, request }) => {
// Arrange: Disable flag for test user (or don't target at all)
await request.post('/api/feature-flags/target', {
data: {
flagKey: FLAGS.NEW_CHECKOUT_FLOW,
userId: testUserId,
variation: false, // DISABLED
},
});
    // Act: Navigate as targeted user (custom headers are set at the page level, not as goto options)
    await page.setExtraHTTPHeaders({ 'X-Test-User-ID': testUserId });
    await page.goto('/checkout');
// Assert: Legacy flow UI elements visible
await expect(page.getByTestId('checkout-v1-container')).toBeVisible();
await expect(page.getByTestId('legacy-payment-form')).toBeVisible();
// Assert: New flow NOT visible
await expect(page.getByTestId('checkout-v2-container')).not.toBeVisible();
await expect(page.getByTestId('express-payment-options')).not.toBeVisible();
// Assert: Telemetry event fired with correct variant
const analyticsEvents = await page.evaluate(() => (window as any).__ANALYTICS_EVENTS__ || []);
expect(analyticsEvents).toContainEqual(
expect.objectContaining({
event: 'checkout_started',
properties: expect.objectContaining({
variant: 'legacy_flow',
}),
}),
);
});
  test('should handle flag evaluation errors gracefully', async ({ page }) => {
    // Arrange: Capture console errors BEFORE navigation so none are missed
    const consoleErrors: string[] = [];
    page.on('console', (msg) => {
      if (msg.type() === 'error') consoleErrors.push(msg.text());
    });
    // Arrange: Simulate flag service unavailable
    await page.route('**/api/feature-flags/evaluate', (route) => route.fulfill({ status: 500, body: 'Service Unavailable' }));
    // Act: Navigate (should fallback to default state)
    await page.setExtraHTTPHeaders({ 'X-Test-User-ID': testUserId });
    await page.goto('/checkout');
    // Assert: Fallback to safe default (legacy flow)
    await expect(page.getByTestId('checkout-v1-container')).toBeVisible();
    // Assert: Error logged but no user-facing error message
    expect(consoleErrors.some((e) => e.includes('Feature flag evaluation failed'))).toBeTruthy();
  });
});
```
**Cypress equivalent**:
```javascript
// cypress/e2e/checkout-feature-flag.cy.ts
import { FLAGS } from '@/utils/feature-flags';
describe('Checkout Flow - Feature Flag Variations', () => {
let testUserId;
beforeEach(() => {
testUserId = `test-user-${Date.now()}`;
});
afterEach(() => {
// Clean up targeting
cy.task('removeFeatureFlagTarget', {
flagKey: FLAGS.NEW_CHECKOUT_FLOW,
userId: testUserId,
});
});
it('should use NEW checkout flow when flag is ENABLED', () => {
// Arrange: Enable flag via Cypress task
cy.task('setFeatureFlagVariation', {
flagKey: FLAGS.NEW_CHECKOUT_FLOW,
userId: testUserId,
variation: true,
});
// Act
cy.visit('/checkout', {
headers: { 'X-Test-User-ID': testUserId },
});
// Assert
cy.get('[data-testid="checkout-v2-container"]').should('be.visible');
cy.get('[data-testid="checkout-v1-container"]').should('not.exist');
});
it('should use LEGACY checkout flow when flag is DISABLED', () => {
// Arrange: Disable flag
cy.task('setFeatureFlagVariation', {
flagKey: FLAGS.NEW_CHECKOUT_FLOW,
userId: testUserId,
variation: false,
});
// Act
cy.visit('/checkout', {
headers: { 'X-Test-User-ID': testUserId },
});
// Assert
cy.get('[data-testid="checkout-v1-container"]').should('be.visible');
cy.get('[data-testid="checkout-v2-container"]').should('not.exist');
});
});
```
**Key Points**:
- **Test both states**: Enabled AND disabled variations
- **Automatic cleanup**: afterEach removes targeting (prevent pollution)
- **Unique test users**: Avoid conflicts with real user data
- **Telemetry validation**: Verify analytics events fire correctly
- **Graceful degradation**: Test fallback behavior on errors
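The Cypress spec above calls `cy.task('setFeatureFlagVariation')` and `cy.task('removeFeatureFlagTarget')`; those tasks must be registered in the config. A minimal sketch, reusing the helpers defined in Example 3 below (file paths are assumptions):

```typescript
// cypress.config.ts (sketch)
import { defineConfig } from 'cypress';
import { setFlagForUser, removeFlagTarget } from './tests/support/feature-flag-helpers';

export default defineConfig({
  e2e: {
    setupNodeEvents(on) {
      on('task', {
        async setFeatureFlagVariation({ flagKey, userId, variation }) {
          await setFlagForUser(flagKey, userId, variation);
          return null; // Cypress tasks must return a value (or null)
        },
        async removeFeatureFlagTarget({ flagKey, userId }) {
          await removeFlagTarget(flagKey, userId);
          return null;
        },
      });
    },
  },
});
```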
---
### Example 3: Feature Flag Targeting Helper Pattern
**Context**: Reusable helpers for programmatic flag control via LaunchDarkly/Split.io API.
**Implementation**:
```typescript
// tests/support/feature-flag-helpers.ts
import { request as playwrightRequest, type Page } from '@playwright/test';
import type { FlagKey } from '@/utils/feature-flags';
/**
 * LaunchDarkly REST API configuration
 * Use a test-project API access token (NOT an SDK key, and never production credentials).
 * Endpoint shapes below are simplified for illustration - consult the current
 * LaunchDarkly API docs for exact routes and payloads.
 */
const LD_API_TOKEN = process.env.LD_API_TOKEN_TEST;
const LD_API_BASE = 'https://app.launchdarkly.com/api/v2';
type FlagVariation = boolean | string | number | object;
/**
* Set flag variation for specific user
* Uses LaunchDarkly API to create user target
*/
export async function setFlagForUser(flagKey: FlagKey, userId: string, variation: FlagVariation): Promise<void> {
  const ctx = await playwrightRequest.newContext();
  try {
    const response = await ctx.post(`${LD_API_BASE}/flags/${flagKey}/targeting`, {
      headers: {
        Authorization: LD_API_TOKEN!,
        'Content-Type': 'application/json',
      },
      data: {
        targets: [
          {
            values: [userId],
            variation: variation ? 1 : 0, // 0 = off, 1 = on
          },
        ],
      },
    });
    if (!response.ok()) {
      throw new Error(`Failed to set flag ${flagKey} for user ${userId}: ${response.status()}`);
    }
  } finally {
    await ctx.dispose(); // Avoid leaking request contexts across tests
  }
}
/**
* Remove user from flag targeting
* CRITICAL for test cleanup
*/
export async function removeFlagTarget(flagKey: FlagKey, userId: string): Promise<void> {
  const ctx = await playwrightRequest.newContext();
  try {
    const response = await ctx.delete(`${LD_API_BASE}/flags/${flagKey}/targeting/users/${userId}`, {
      headers: {
        Authorization: LD_API_TOKEN!,
      },
    });
    if (!response.ok() && response.status() !== 404) {
      // 404 is acceptable (user wasn't targeted)
      throw new Error(`Failed to remove flag ${flagKey} target for user ${userId}: ${response.status()}`);
    }
  } finally {
    await ctx.dispose();
  }
}
/**
* Percentage rollout helper
* Enable flag for N% of users
*/
export async function setFlagRolloutPercentage(flagKey: FlagKey, percentage: number): Promise<void> {
  if (percentage < 0 || percentage > 100) {
    throw new Error('Percentage must be between 0 and 100');
  }
  const ctx = await playwrightRequest.newContext();
  try {
    const response = await ctx.patch(`${LD_API_BASE}/flags/${flagKey}`, {
      headers: {
        Authorization: LD_API_TOKEN!,
        'Content-Type': 'application/json',
      },
      data: {
        rollout: {
          variations: [
            { variation: 0, weight: 100 - percentage }, // off
            { variation: 1, weight: percentage }, // on
          ],
        },
      },
    });
    if (!response.ok()) {
      throw new Error(`Failed to set rollout for flag ${flagKey}: ${response.status()}`);
    }
  } finally {
    await ctx.dispose();
  }
}
/**
* Enable flag globally (100% rollout)
*/
export async function enableFlagGlobally(flagKey: FlagKey): Promise<void> {
await setFlagRolloutPercentage(flagKey, 100);
}
/**
* Disable flag globally (0% rollout)
*/
export async function disableFlagGlobally(flagKey: FlagKey): Promise<void> {
await setFlagRolloutPercentage(flagKey, 0);
}
/**
 * Stub feature flags in local/test environments
 * Bypasses LaunchDarkly entirely. This helper runs in Node, so the stub is
 * injected via addInitScript and exists on window before any app code runs.
 */
export async function stubFeatureFlags(page: Page, flags: Partial<Record<FlagKey, FlagVariation>>): Promise<void> {
  await page.addInitScript((stubbed) => {
    (window as any).__STUBBED_FLAGS__ = stubbed;
  }, flags);
}
```
**Usage in Playwright fixture**:
```typescript
// playwright/fixtures/feature-flag-fixture.ts
import { test as base } from '@playwright/test';
import { setFlagForUser, removeFlagTarget } from '../support/feature-flag-helpers';
import { FlagKey } from '@/utils/feature-flags';
type FeatureFlagFixture = {
featureFlags: {
enable: (flag: FlagKey, userId: string) => Promise<void>;
disable: (flag: FlagKey, userId: string) => Promise<void>;
cleanup: (flag: FlagKey, userId: string) => Promise<void>;
};
};
export const test = base.extend<FeatureFlagFixture>({
featureFlags: async ({}, use) => {
const cleanupQueue: Array<{ flag: FlagKey; userId: string }> = [];
await use({
enable: async (flag, userId) => {
await setFlagForUser(flag, userId, true);
cleanupQueue.push({ flag, userId });
},
disable: async (flag, userId) => {
await setFlagForUser(flag, userId, false);
cleanupQueue.push({ flag, userId });
},
cleanup: async (flag, userId) => {
await removeFlagTarget(flag, userId);
},
});
// Auto-cleanup after test
for (const { flag, userId } of cleanupQueue) {
await removeFlagTarget(flag, userId);
}
},
});
```
**Key Points**:
- **API-driven control**: No manual UI clicks required
- **Auto-cleanup**: Fixture tracks and removes targeting
- **Percentage rollouts**: Test gradual feature releases
- **Stubbing option**: Local development without LaunchDarkly
- **Type-safe**: FlagKey prevents typos
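A usage sketch for the fixture (header propagation mirrors Example 2; paths are assumptions):

```typescript
// tests/e2e/new-checkout.spec.ts (sketch)
import { expect } from '@playwright/test';
import { FLAGS } from '@/utils/feature-flags';
import { test } from '../../playwright/fixtures/feature-flag-fixture';

test('targeted user sees the new checkout', async ({ page, featureFlags }) => {
  const userId = `test-user-${Date.now()}`;
  await featureFlags.enable(FLAGS.NEW_CHECKOUT_FLOW, userId);
  await page.setExtraHTTPHeaders({ 'X-Test-User-ID': userId });
  await page.goto('/checkout');
  await expect(page.getByTestId('checkout-v2-container')).toBeVisible();
  // No manual cleanup - the fixture removes targeting in teardown
});
```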
---
### Example 4: Feature Flag Lifecycle Checklist & Cleanup Strategy
**Context**: Governance checklist and automated cleanup detection for stale flags.
**Implementation**:
```typescript
// scripts/feature-flag-audit.ts
/**
* Feature Flag Lifecycle Audit Script
* Run weekly to detect stale flags requiring cleanup
*/
import { FLAG_REGISTRY, FLAGS, getExpiredFlags, FlagKey } from '../src/utils/feature-flags';
import * as fs from 'fs';
import * as path from 'path';
type AuditResult = {
totalFlags: number;
expiredFlags: FlagKey[];
missingOwners: FlagKey[];
missingDates: FlagKey[];
permanentFlags: FlagKey[];
flagsNearingExpiry: FlagKey[];
};
/**
* Audit all feature flags for governance compliance
*/
function auditFeatureFlags(): AuditResult {
const allFlags = Object.keys(FLAG_REGISTRY) as FlagKey[];
const expiredFlags = getExpiredFlags().map((meta) => meta.key);
// Flags expiring in next 30 days
const thirtyDaysFromNow = Date.now() + 30 * 24 * 60 * 60 * 1000;
const flagsNearingExpiry = allFlags.filter((flag) => {
const meta = FLAG_REGISTRY[flag];
if (!meta.expiryDate) return false;
const expiry = new Date(meta.expiryDate).getTime();
return expiry > Date.now() && expiry < thirtyDaysFromNow;
});
// Missing metadata
const missingOwners = allFlags.filter((flag) => !FLAG_REGISTRY[flag].owner);
const missingDates = allFlags.filter((flag) => !FLAG_REGISTRY[flag].createdDate);
// Permanent flags (no expiry, requiresCleanup = false)
const permanentFlags = allFlags.filter((flag) => {
const meta = FLAG_REGISTRY[flag];
return !meta.expiryDate && !meta.requiresCleanup;
});
return {
totalFlags: allFlags.length,
expiredFlags,
missingOwners,
missingDates,
permanentFlags,
flagsNearingExpiry,
};
}
/**
* Generate markdown report
*/
function generateReport(audit: AuditResult): string {
let report = `# Feature Flag Audit Report\n\n`;
report += `**Date**: ${new Date().toISOString()}\n`;
report += `**Total Flags**: ${audit.totalFlags}\n\n`;
if (audit.expiredFlags.length > 0) {
report += `## ⚠️ EXPIRED FLAGS - IMMEDIATE CLEANUP REQUIRED\n\n`;
audit.expiredFlags.forEach((flag) => {
const meta = FLAG_REGISTRY[flag];
report += `- **${meta.name}** (\`${flag}\`)\n`;
report += ` - Owner: ${meta.owner}\n`;
report += ` - Expired: ${meta.expiryDate}\n`;
report += ` - Action: Remove flag code, update tests, deploy\n\n`;
});
}
if (audit.flagsNearingExpiry.length > 0) {
report += `## ⏰ FLAGS EXPIRING SOON (Next 30 Days)\n\n`;
audit.flagsNearingExpiry.forEach((flag) => {
const meta = FLAG_REGISTRY[flag];
report += `- **${meta.name}** (\`${flag}\`)\n`;
report += ` - Owner: ${meta.owner}\n`;
report += ` - Expires: ${meta.expiryDate}\n`;
report += ` - Action: Plan cleanup or extend expiry\n\n`;
});
}
if (audit.permanentFlags.length > 0) {
report += `## 🔄 PERMANENT FLAGS (No Expiry)\n\n`;
audit.permanentFlags.forEach((flag) => {
const meta = FLAG_REGISTRY[flag];
report += `- **${meta.name}** (\`${flag}\`) - Owner: ${meta.owner}\n`;
});
report += `\n`;
}
if (audit.missingOwners.length > 0 || audit.missingDates.length > 0) {
report += `## ❌ GOVERNANCE ISSUES\n\n`;
if (audit.missingOwners.length > 0) {
report += `**Missing Owners**: ${audit.missingOwners.join(', ')}\n`;
}
if (audit.missingDates.length > 0) {
report += `**Missing Created Dates**: ${audit.missingDates.join(', ')}\n`;
}
report += `\n`;
}
return report;
}
/**
* Feature Flag Lifecycle Checklist
*/
const FLAG_LIFECYCLE_CHECKLIST = `
# Feature Flag Lifecycle Checklist
## Before Creating a New Flag
- [ ] **Name**: Follow naming convention (kebab-case, descriptive)
- [ ] **Owner**: Assign team/individual responsible
- [ ] **Default State**: Determine safe default (usually false)
- [ ] **Expiry Date**: Set removal date (30-90 days typical)
- [ ] **Dependencies**: Document related flags
- [ ] **Telemetry**: Plan analytics events to track
- [ ] **Rollback Plan**: Define how to disable quickly
## During Development
- [ ] **Code Paths**: Both enabled/disabled states implemented
- [ ] **Tests**: Both variations tested in CI
- [ ] **Documentation**: Flag purpose documented in code/PR
- [ ] **Telemetry**: Analytics events instrumented
- [ ] **Error Handling**: Graceful degradation on flag service failure
## Before Launch
- [ ] **QA**: Both states tested in staging
- [ ] **Rollout Plan**: Gradual rollout percentage defined
- [ ] **Monitoring**: Dashboards/alerts for flag-related metrics
- [ ] **Stakeholder Communication**: Product/design aligned
## After Launch (Monitoring)
- [ ] **Metrics**: Success criteria tracked
- [ ] **Error Rates**: No increase in errors
- [ ] **Performance**: No degradation
- [ ] **User Feedback**: Qualitative data collected
## Cleanup (Post-Launch)
- [ ] **Remove Flag Code**: Delete if/else branches
- [ ] **Update Tests**: Remove flag-specific tests
- [ ] **Remove Targeting**: Clear all user targets
- [ ] **Delete Flag Config**: Remove from LaunchDarkly/registry
- [ ] **Update Documentation**: Remove references
- [ ] **Deploy**: Ship cleanup changes
`;
// Run audit
const audit = auditFeatureFlags();
const report = generateReport(audit);
// Save report
const outputPath = path.join(__dirname, '../feature-flag-audit-report.md');
fs.writeFileSync(outputPath, report);
fs.writeFileSync(path.join(__dirname, '../FEATURE-FLAG-CHECKLIST.md'), FLAG_LIFECYCLE_CHECKLIST);
console.log(`✅ Audit complete. Report saved to: ${outputPath}`);
console.log(`Total flags: ${audit.totalFlags}`);
console.log(`Expired flags: ${audit.expiredFlags.length}`);
console.log(`Flags expiring soon: ${audit.flagsNearingExpiry.length}`);
// Exit with error if expired flags exist
if (audit.expiredFlags.length > 0) {
console.error(`\n❌ EXPIRED FLAGS DETECTED - CLEANUP REQUIRED`);
process.exit(1);
}
```
**package.json scripts**:
```json
{
"scripts": {
"feature-flags:audit": "ts-node scripts/feature-flag-audit.ts",
"feature-flags:audit:ci": "npm run feature-flags:audit || true"
}
}
```
**Key Points**:
- **Automated detection**: Weekly audit catches stale flags
- **Lifecycle checklist**: Comprehensive governance guide
- **Expiry tracking**: Flags auto-expire after defined date
- **CI integration**: Audit runs in pipeline, warns on expiry
- **Ownership clarity**: Every flag has assigned owner
---
## Feature Flag Testing Checklist
Before merging flag-related code, verify:
- [ ] **Both states tested**: Enabled AND disabled variations covered
- [ ] **Cleanup automated**: afterEach removes targeting (no manual cleanup)
- [ ] **Unique test data**: Test users don't collide with production
- [ ] **Telemetry validated**: Analytics events fire for both variations
- [ ] **Error handling**: Graceful fallback when flag service unavailable
- [ ] **Flag metadata**: Owner, dates, dependencies documented in registry
- [ ] **Rollback plan**: Clear steps to disable flag in production
- [ ] **Expiry date set**: Removal date defined (or marked permanent)
## Integration Points
- Used in workflows: `*automate` (test generation), `*framework` (flag setup)
- Related fragments: `test-quality.md`, `selective-testing.md`
- Flag services: LaunchDarkly, Split.io, Unleash, custom implementations
_Source: LaunchDarkly strategy blog, Murat test architecture notes, SEON feature flag governance_

---
# Fixture Architecture Playbook
## Principle
Build test helpers as pure functions first, then wrap them in framework-specific fixtures. Compose capabilities using `mergeTests` (Playwright) or layered commands (Cypress) instead of inheritance. Each fixture should solve one isolated concern (auth, API, logs, network).
## Rationale
Traditional Page Object Models create tight coupling through inheritance chains (`BasePage → LoginPage → AdminPage`). When base classes change, all descendants break. Pure functions with fixture wrappers provide:
- **Testability**: Pure functions run in unit tests without framework overhead
- **Composability**: Mix capabilities freely via `mergeTests`, no inheritance constraints
- **Reusability**: Export fixtures via package subpaths for cross-project sharing
- **Maintainability**: One concern per fixture = clear responsibility boundaries
## Pattern Examples
### Example 1: Pure Function → Fixture Pattern
**Context**: When building any test helper, always start with a pure function that accepts all dependencies explicitly. Then wrap it in a Playwright fixture or Cypress command.
**Implementation**:
```typescript
// playwright/support/helpers/api-request.ts
// Step 1: Pure function (ALWAYS FIRST!)
import type { APIRequestContext } from '@playwright/test';

export type ApiRequestParams = {
request: APIRequestContext;
method: 'GET' | 'POST' | 'PUT' | 'DELETE';
url: string;
data?: unknown;
headers?: Record<string, string>;
};
export async function apiRequest({
request,
method,
url,
data,
headers = {}
}: ApiRequestParams) {
const response = await request.fetch(url, {
method,
data,
headers: {
'Content-Type': 'application/json',
...headers
}
});
if (!response.ok()) {
throw new Error(`API request failed: ${response.status()} ${await response.text()}`);
}
return response.json();
}
// Step 2: Fixture wrapper
// playwright/support/fixtures/api-request-fixture.ts
import { test as base } from '@playwright/test';
import { apiRequest, type ApiRequestParams } from '../helpers/api-request';

// The fixture injects `request`, so tests supply everything except it
type ApiRequestFixture = (params: Omit<ApiRequestParams, 'request'>) => Promise<unknown>;

export const test = base.extend<{ apiRequest: ApiRequestFixture }>({
  apiRequest: async ({ request }, use) => {
    // Inject framework dependency, expose pure function
    await use((params) => apiRequest({ request, ...params }));
  }
});
// Step 3: Package exports for reusability
// package.json
{
"exports": {
"./api-request": "./playwright/support/helpers/api-request.ts",
"./api-request/fixtures": "./playwright/support/fixtures/api-request-fixture.ts"
}
}
```
**Key Points**:
- Pure function is unit-testable without Playwright running
- Framework dependency (`request`) injected at fixture boundary
- Fixture exposes the pure function to test context
- Package subpath exports enable `import { apiRequest } from 'my-fixtures/api-request'`
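To make the "unit-testable without Playwright" claim concrete, a minimal sketch (Vitest assumed; the mock mimics only the `APIRequestContext` surface `apiRequest` touches):

```typescript
// api-request.unit.test.ts (sketch)
import { it, expect, vi } from 'vitest';
import { apiRequest } from './api-request';

it('throws a descriptive error on non-OK responses', async () => {
  const request = {
    fetch: vi.fn().mockResolvedValue({
      ok: () => false,
      status: () => 500,
      text: async () => 'boom',
    }),
  } as any; // Only the surface apiRequest uses

  await expect(apiRequest({ request, method: 'GET', url: '/api/users' })).rejects.toThrow('API request failed: 500 boom');
});
```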
### Example 2: Composable Fixture System with mergeTests
**Context**: When building comprehensive test capabilities, compose multiple focused fixtures instead of creating monolithic helper classes. Each fixture provides one capability.
**Implementation**:
```typescript
// playwright/support/fixtures/merged-fixtures.ts
import { test as base, mergeTests } from '@playwright/test';
import { test as apiRequestFixture } from './api-request-fixture';
import { test as networkFixture } from './network-fixture';
import { test as authFixture } from './auth-fixture';
import { test as logFixture } from './log-fixture';
// Compose all fixtures for comprehensive capabilities
export const test = mergeTests(base, apiRequestFixture, networkFixture, authFixture, logFixture);
export { expect } from '@playwright/test';
// Example usage in tests:
// import { test, expect } from './support/fixtures/merged-fixtures';
//
// test('user can create order', async ({ page, apiRequest, auth, network }) => {
// await auth.loginAs('customer@example.com');
// await network.interceptRoute('POST', '**/api/orders', { id: 123 });
// await page.goto('/checkout');
// await page.click('[data-testid="submit-order"]');
// await expect(page.getByText('Order #123')).toBeVisible();
// });
```
**Individual Fixture Examples**:
```typescript
// network-fixture.ts
import { test as base } from '@playwright/test';
export const test = base.extend({
network: async ({ page }, use) => {
const interceptedRoutes = new Map();
const interceptRoute = async (method: string, url: string, response: unknown) => {
await page.route(url, (route) => {
if (route.request().method() === method) {
route.fulfill({ body: JSON.stringify(response) });
}
});
interceptedRoutes.set(`${method}:${url}`, response);
};
await use({ interceptRoute });
// Cleanup
interceptedRoutes.clear();
},
});
// auth-fixture.ts
import { test as base } from '@playwright/test';
// getAuthToken: project-specific helper that returns a session token via API
import { getAuthToken } from '../helpers/auth';
export const test = base.extend({
auth: async ({ page, context }, use) => {
const loginAs = async (email: string) => {
// Use API to setup auth (fast!)
const token = await getAuthToken(email);
await context.addCookies([
{
name: 'auth_token',
value: token,
domain: 'localhost',
path: '/',
},
]);
};
await use({ loginAs });
},
});
```
**Key Points**:
- `mergeTests` combines fixtures without inheritance
- Each fixture has single responsibility (network, auth, logs)
- Tests import merged fixture and access all capabilities
- No coupling between fixtures—add/remove freely
### Example 3: Framework-Agnostic HTTP Helper
**Context**: When building HTTP helpers, keep them framework-agnostic. Accept all params explicitly so they work in unit tests, Playwright, Cypress, or any context.
**Implementation**:
```typescript
// shared/helpers/http-helper.ts
// Pure, framework-agnostic function
type HttpHelperParams = {
baseUrl: string;
endpoint: string;
method: 'GET' | 'POST' | 'PUT' | 'DELETE';
body?: unknown;
headers?: Record<string, string>;
token?: string;
};
export async function makeHttpRequest({ baseUrl, endpoint, method, body, headers = {}, token }: HttpHelperParams): Promise<unknown> {
const url = `${baseUrl}${endpoint}`;
const requestHeaders = {
'Content-Type': 'application/json',
...(token && { Authorization: `Bearer ${token}` }),
...headers,
};
const response = await fetch(url, {
method,
headers: requestHeaders,
body: body ? JSON.stringify(body) : undefined,
});
if (!response.ok) {
const errorText = await response.text();
throw new Error(`HTTP ${method} ${url} failed: ${response.status} ${errorText}`);
}
return response.json();
}
// Playwright fixture wrapper
// playwright/support/fixtures/http-fixture.ts
import { test as base } from '@playwright/test';
import { makeHttpRequest } from '../../shared/helpers/http-helper';
export const test = base.extend({
httpHelper: async ({}, use) => {
const baseUrl = process.env.API_BASE_URL || 'http://localhost:3000';
await use((params) => makeHttpRequest({ baseUrl, ...params }));
},
});
// Cypress command wrapper
// cypress/support/commands.ts
import { makeHttpRequest } from '../../shared/helpers/http-helper';
Cypress.Commands.add('apiRequest', (params) => {
const baseUrl = Cypress.env('API_BASE_URL') || 'http://localhost:3000';
return cy.wrap(makeHttpRequest({ baseUrl, ...params }));
});
```
**Key Points**:
- Pure function uses only standard `fetch`, no framework dependencies
- Unit tests call `makeHttpRequest` directly with all params
- Playwright and Cypress wrappers inject framework-specific config
- Same logic runs everywhere—zero duplication
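The same claim holds for `makeHttpRequest`; a sketch that stubs the global `fetch` (Vitest assumed):

```typescript
// http-helper.unit.test.ts (sketch)
import { it, expect, vi } from 'vitest';
import { makeHttpRequest } from './http-helper';

it('attaches a bearer token when provided', async () => {
  const fetchSpy = vi
    .spyOn(globalThis, 'fetch')
    .mockResolvedValue(new Response(JSON.stringify({ ok: true }), { status: 200 }));

  await makeHttpRequest({ baseUrl: 'http://api.test', endpoint: '/users', method: 'GET', token: 'abc' });

  const [, init] = fetchSpy.mock.calls[0];
  expect((init?.headers as Record<string, string>).Authorization).toBe('Bearer abc');
  fetchSpy.mockRestore();
});
```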
### Example 4: Fixture Cleanup Pattern
**Context**: When fixtures create resources (data, files, connections), ensure automatic cleanup in fixture teardown. Tests must not leak state.
**Implementation**:
```typescript
// playwright/support/fixtures/database-fixture.ts
import { test as base } from '@playwright/test';
import { seedDatabase, deleteRecord } from '../helpers/db-helpers';
type DatabaseFixture = {
seedUser: (userData: Partial<User>) => Promise<User>;
seedOrder: (orderData: Partial<Order>) => Promise<Order>;
};
export const test = base.extend<DatabaseFixture>({
seedUser: async ({}, use) => {
const createdUsers: string[] = [];
const seedUser = async (userData: Partial<User>) => {
const user = await seedDatabase('users', userData);
createdUsers.push(user.id);
return user;
};
await use(seedUser);
// Auto-cleanup: Delete all users created during test
for (const userId of createdUsers) {
await deleteRecord('users', userId);
}
createdUsers.length = 0;
},
seedOrder: async ({}, use) => {
const createdOrders: string[] = [];
const seedOrder = async (orderData: Partial<Order>) => {
const order = await seedDatabase('orders', orderData);
createdOrders.push(order.id);
return order;
};
await use(seedOrder);
// Auto-cleanup: Delete all orders
for (const orderId of createdOrders) {
await deleteRecord('orders', orderId);
}
createdOrders.length = 0;
},
});
// Example usage:
// test('user can place order', async ({ seedUser, seedOrder, page }) => {
// const user = await seedUser({ email: 'test@example.com' });
// const order = await seedOrder({ userId: user.id, total: 100 });
//
// await page.goto(`/orders/${order.id}`);
// await expect(page.getByText('Order Total: $100')).toBeVisible();
//
// // No manual cleanup needed—fixture handles it automatically
// });
```
**Key Points**:
- Track all created resources in array during test execution
- Teardown (after `use()`) deletes all tracked resources
- Tests don't manually clean up—happens automatically
- Prevents test pollution and flakiness from shared state
### Anti-Pattern: Inheritance-Based Page Objects
**Problem**:
```typescript
// ❌ BAD: Page Object Model with inheritance
class BasePage {
constructor(public page: Page) {}
async navigate(url: string) {
await this.page.goto(url);
}
async clickButton(selector: string) {
await this.page.click(selector);
}
}
class LoginPage extends BasePage {
async login(email: string, password: string) {
await this.navigate('/login');
await this.page.fill('#email', email);
await this.page.fill('#password', password);
await this.clickButton('#submit');
}
}
class AdminPage extends LoginPage {
async accessAdminPanel() {
await this.login('admin@example.com', 'admin123');
await this.navigate('/admin');
}
}
```
**Why It Fails**:
- Changes to `BasePage` break all descendants (`LoginPage`, `AdminPage`)
- `AdminPage` inherits unnecessary `login` details—tight coupling
- Cannot compose capabilities (e.g., admin + reporting features require multiple inheritance)
- Hard to test `BasePage` methods in isolation
- Hidden state in class instances leads to unpredictable behavior
**Better Approach**: Use pure functions + fixtures
```typescript
// ✅ GOOD: Pure functions with fixture composition
// helpers/navigation.ts
export async function navigate(page: Page, url: string) {
await page.goto(url);
}
// helpers/auth.ts
export async function login(page: Page, email: string, password: string) {
await page.fill('[data-testid="email"]', email);
await page.fill('[data-testid="password"]', password);
await page.click('[data-testid="submit"]');
}
// fixtures/admin-fixture.ts
export const test = base.extend({
adminPage: async ({ page }, use) => {
await login(page, 'admin@example.com', 'admin123');
await navigate(page, '/admin');
await use(page);
},
});
// Tests import exactly what they need—no inheritance
```
## Integration Points
- **Used in workflows**: `*atdd` (test generation), `*automate` (test expansion), `*framework` (initial setup)
- **Related fragments**:
- `data-factories.md` - Factory functions for test data
- `network-first.md` - Network interception patterns
- `test-quality.md` - Deterministic test design principles
## Helper Function Reuse Guidelines
When deciding whether to create a fixture, follow these rules:
- **3+ uses** → Create fixture with subpath export (shared across tests/projects)
- **2-3 uses** → Create utility module (shared within project)
- **1 use** → Keep inline (avoid premature abstraction)
- **Complex logic** → Factory function pattern (dynamic data generation)
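For the last bullet, a minimal factory-function sketch (`@faker-js/faker` assumed; see `data-factories.md` for the full pattern):

```typescript
// factories/user-factory.ts (sketch)
import { faker } from '@faker-js/faker';

type User = { id: string; email: string; role: 'customer' | 'admin' };

export const createUser = (overrides: Partial<User> = {}): User => ({
  id: faker.string.uuid(),
  email: faker.internet.email(),
  role: 'customer',
  ...overrides, // Tests override only the fields they care about
});

// Usage: const admin = createUser({ role: 'admin' });
```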
_Source: Murat Testing Philosophy (lines 74-122), SEON production patterns, Playwright fixture docs._

---
# Network-First Safeguards
## Principle
Register network interceptions **before** any navigation or user action. Store the interception promise and await it immediately after the triggering step. Replace implicit waits with deterministic signals based on network responses, spinner disappearance, or event hooks.
## Rationale
The most common source of flaky E2E tests is **race conditions** between navigation and network interception:
- Navigate then intercept = missed requests (too late)
- No explicit wait = assertion runs before response arrives
- Hard waits (`waitForTimeout(3000)`) = slow, unreliable, brittle
Network-first patterns provide:
- **Zero race conditions**: Intercept is active before triggering action
- **Deterministic waits**: Wait for actual response, not arbitrary timeouts
- **Actionable failures**: Assert on response status/body, not generic "element not found"
- **Speed**: No padding with extra wait time
## Pattern Examples
### Example 1: Intercept Before Navigate Pattern
**Context**: The foundational pattern for all E2E tests. Always register route interception **before** the action that triggers the request (navigation, click, form submit).
**Implementation**:
```typescript
// ✅ CORRECT: Intercept BEFORE navigate
test('user can view dashboard data', async ({ page }) => {
// Step 1: Register interception FIRST
const usersPromise = page.waitForResponse((resp) => resp.url().includes('/api/users') && resp.status() === 200);
// Step 2: THEN trigger the request
await page.goto('/dashboard');
// Step 3: THEN await the response
const usersResponse = await usersPromise;
const users = await usersResponse.json();
// Step 4: Assert on structured data
expect(users).toHaveLength(10);
await expect(page.getByText(users[0].name)).toBeVisible();
});
// Cypress equivalent
describe('Dashboard', () => {
it('should display users', () => {
// Step 1: Register interception FIRST
cy.intercept('GET', '**/api/users').as('getUsers');
// Step 2: THEN trigger
cy.visit('/dashboard');
// Step 3: THEN await
cy.wait('@getUsers').then((interception) => {
// Step 4: Assert on structured data
expect(interception.response.statusCode).to.equal(200);
expect(interception.response.body).to.have.length(10);
cy.contains(interception.response.body[0].name).should('be.visible');
});
});
});
// ❌ WRONG: Navigate BEFORE intercept (race condition!)
test('flaky test example', async ({ page }) => {
await page.goto('/dashboard'); // Request fires immediately
const usersPromise = page.waitForResponse('/api/users'); // TOO LATE - might miss it
const response = await usersPromise; // May timeout randomly
});
```
**Key Points**:
- Playwright: Use `page.waitForResponse()` with URL pattern or predicate **before** `page.goto()` or `page.click()`
- Cypress: Use `cy.intercept().as()` **before** `cy.visit()` or `cy.click()`
- Store promise/alias, trigger action, **then** await response
- This prevents 95% of race-condition flakiness in E2E tests
### Example 2: HAR Capture for Debugging
**Context**: When debugging flaky tests or building deterministic mocks, capture real network traffic with HAR files. Replay them in tests for consistent, offline-capable test runs.
**Implementation**:
```typescript
// playwright.config.ts - Enable HAR recording
export default defineConfig({
use: {
// Record HAR on first run
recordHar: { path: './hars/', mode: 'minimal' },
// Or replay HAR in tests
// serviceWorkers: 'block',
},
});
// Capture HAR for specific test
test('capture network for order flow', async ({ page, context }) => {
// Start recording
await context.routeFromHAR('./hars/order-flow.har', {
url: '**/api/**',
update: true, // Update HAR with new requests
});
await page.goto('/checkout');
await page.fill('[data-testid="credit-card"]', '4111111111111111');
await page.click('[data-testid="submit-order"]');
await expect(page.getByText('Order Confirmed')).toBeVisible();
// HAR saved to ./hars/order-flow.har
});
// Replay HAR for deterministic tests (no real API needed)
test('replay order flow from HAR', async ({ page, context }) => {
// Replay captured HAR
await context.routeFromHAR('./hars/order-flow.har', {
url: '**/api/**',
update: false, // Read-only mode
});
// Test runs with exact recorded responses - fully deterministic
await page.goto('/checkout');
await page.fill('[data-testid="credit-card"]', '4111111111111111');
await page.click('[data-testid="submit-order"]');
await expect(page.getByText('Order Confirmed')).toBeVisible();
});
// Custom mock based on HAR insights
test('mock order response based on HAR', async ({ page }) => {
// After analyzing HAR, create focused mock
await page.route('**/api/orders', (route) =>
route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({
orderId: '12345',
status: 'confirmed',
total: 99.99,
}),
}),
);
await page.goto('/checkout');
await page.click('[data-testid="submit-order"]');
await expect(page.getByText('Order #12345')).toBeVisible();
});
```
**Key Points**:
- HAR files capture real request/response pairs for analysis
- `update: true` records new traffic; `update: false` replays existing
- Replay mode makes tests fully deterministic (no upstream API needed)
- Use HAR to understand API contracts, then create focused mocks
### Example 3: Network Stub with Edge Cases
**Context**: When testing error handling, timeouts, and edge cases, stub network responses to simulate failures. Test both happy path and error scenarios.
**Implementation**:
```typescript
// Test happy path
test('order succeeds with valid data', async ({ page }) => {
await page.route('**/api/orders', (route) =>
route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({ orderId: '123', status: 'confirmed' }),
}),
);
await page.goto('/checkout');
await page.click('[data-testid="submit-order"]');
await expect(page.getByText('Order Confirmed')).toBeVisible();
});
// Test 500 error
test('order fails with server error', async ({ page }) => {
// Listen for console errors (app should log gracefully)
const consoleErrors: string[] = [];
page.on('console', (msg) => {
if (msg.type() === 'error') consoleErrors.push(msg.text());
});
// Stub 500 error
await page.route('**/api/orders', (route) =>
route.fulfill({
status: 500,
contentType: 'application/json',
body: JSON.stringify({ error: 'Internal Server Error' }),
}),
);
await page.goto('/checkout');
await page.click('[data-testid="submit-order"]');
// Assert UI shows error gracefully
await expect(page.getByText('Something went wrong')).toBeVisible();
await expect(page.getByText('Please try again')).toBeVisible();
// Verify error logged (not thrown)
expect(consoleErrors.some((e) => e.includes('Order failed'))).toBeTruthy();
});
// Test network timeout
test('order times out after 10 seconds', async ({ page }) => {
// Stub delayed response (never resolves within timeout)
await page.route(
'**/api/orders',
(route) => new Promise(() => {}), // Never resolves - simulates timeout
);
await page.goto('/checkout');
await page.click('[data-testid="submit-order"]');
// App should show timeout message after configured timeout
await expect(page.getByText('Request timed out')).toBeVisible({ timeout: 15000 });
});
// Test partial data response
test('order handles missing optional fields', async ({ page }) => {
await page.route('**/api/orders', (route) =>
route.fulfill({
status: 200,
contentType: 'application/json',
// Missing optional fields like 'trackingNumber', 'estimatedDelivery'
body: JSON.stringify({ orderId: '123', status: 'confirmed' }),
}),
);
await page.goto('/checkout');
await page.click('[data-testid="submit-order"]');
// App should handle gracefully - no crash, shows what's available
await expect(page.getByText('Order Confirmed')).toBeVisible();
await expect(page.getByText('Tracking information pending')).toBeVisible();
});
// Cypress equivalents
describe('Order Edge Cases', () => {
it('should handle 500 error', () => {
cy.intercept('POST', '**/api/orders', {
statusCode: 500,
body: { error: 'Internal Server Error' },
}).as('orderFailed');
cy.visit('/checkout');
cy.get('[data-testid="submit-order"]').click();
cy.wait('@orderFailed');
cy.contains('Something went wrong').should('be.visible');
});
it('should handle timeout', () => {
cy.intercept('POST', '**/api/orders', (req) => {
req.reply({ delay: 20000 }); // Delay beyond app timeout
}).as('orderTimeout');
cy.visit('/checkout');
cy.get('[data-testid="submit-order"]').click();
cy.contains('Request timed out', { timeout: 15000 }).should('be.visible');
});
});
```
**Key Points**:
- Stub different HTTP status codes (200, 400, 500, 503)
- Simulate timeouts with `delay` or non-resolving promises
- Test partial/incomplete data responses
- Verify app handles errors gracefully (no crashes, user-friendly messages)
### Example 4: Deterministic Waiting
**Context**: Never use hard waits (`waitForTimeout(3000)`). Always wait for explicit signals: network responses, element state changes, or custom events.
**Implementation**:
```typescript
// ✅ GOOD: Wait for response with predicate
test('wait for specific response', async ({ page }) => {
const responsePromise = page.waitForResponse((resp) => resp.url().includes('/api/users') && resp.status() === 200);
await page.goto('/dashboard');
const response = await responsePromise;
expect(response.status()).toBe(200);
await expect(page.getByText('Dashboard')).toBeVisible();
});
// ✅ GOOD: Wait for multiple responses
test('wait for all required data', async ({ page }) => {
const usersPromise = page.waitForResponse('**/api/users');
const productsPromise = page.waitForResponse('**/api/products');
const ordersPromise = page.waitForResponse('**/api/orders');
await page.goto('/dashboard');
// Wait for all in parallel
const [users, products, orders] = await Promise.all([usersPromise, productsPromise, ordersPromise]);
expect(users.status()).toBe(200);
expect(products.status()).toBe(200);
expect(orders.status()).toBe(200);
});
// ✅ GOOD: Wait for spinner to disappear
test('wait for loading indicator', async ({ page }) => {
await page.goto('/dashboard');
// Wait for spinner to disappear (signals data loaded)
await expect(page.getByTestId('loading-spinner')).not.toBeVisible();
await expect(page.getByText('Dashboard')).toBeVisible();
});
// ✅ GOOD: Wait for a custom ready signal (advanced)
test('wait for custom ready event', async ({ page }) => {
  // Register the listener BEFORE navigation so the signal can't be missed.
  // (page.waitForFunction runs in the browser, so it cannot see Node-side variables.)
  const readyPromise = page.waitForEvent('console', (msg) => msg.text() === 'App ready');
  await page.goto('/dashboard');
  await readyPromise;
  await expect(page.getByText('Dashboard')).toBeVisible();
});
// ❌ BAD: Hard wait (arbitrary timeout)
test('flaky hard wait example', async ({ page }) => {
await page.goto('/dashboard');
await page.waitForTimeout(3000); // WHY 3 seconds? What if slower? What if faster?
await expect(page.getByText('Dashboard')).toBeVisible(); // May fail if >3s
});
// Cypress equivalents
describe('Deterministic Waiting', () => {
it('should wait for response', () => {
cy.intercept('GET', '**/api/users').as('getUsers');
cy.visit('/dashboard');
cy.wait('@getUsers').its('response.statusCode').should('eq', 200);
cy.contains('Dashboard').should('be.visible');
});
it('should wait for spinner to disappear', () => {
cy.visit('/dashboard');
cy.get('[data-testid="loading-spinner"]').should('not.exist');
cy.contains('Dashboard').should('be.visible');
});
// ❌ BAD: Hard wait
it('flaky hard wait', () => {
cy.visit('/dashboard');
cy.wait(3000); // NEVER DO THIS
cy.contains('Dashboard').should('be.visible');
});
});
```
**Key Points**:
- `waitForResponse()` with URL pattern or predicate = deterministic
- `waitForLoadState('networkidle')` = waits until there are no network requests for 500ms (use sparingly; prefer explicit response waits)
- Wait for element state changes (spinner disappears, button enabled)
- **NEVER** use `waitForTimeout()` or `cy.wait(ms)` - always non-deterministic
### Example 5: Anti-Pattern - Navigate Then Mock
**Problem**:
```typescript
// ❌ BAD: Race condition - mock registered AFTER navigation starts
test('flaky test - navigate then mock', async ({ page }) => {
// Navigation starts immediately
await page.goto('/dashboard'); // Request to /api/users fires NOW
// Mock registered too late - request already sent
await page.route('**/api/users', (route) =>
route.fulfill({
status: 200,
body: JSON.stringify([{ id: 1, name: 'Test User' }]),
}),
);
// Test randomly passes/fails depending on timing
await expect(page.getByText('Test User')).toBeVisible(); // Flaky!
});
// ❌ BAD: No wait for response
test('flaky test - no explicit wait', async ({ page }) => {
await page.route('**/api/users', (route) => route.fulfill({ status: 200, body: JSON.stringify([]) }));
await page.goto('/dashboard');
// Assertion runs immediately - may fail if response slow
await expect(page.getByText('No users found')).toBeVisible(); // Flaky!
});
// ❌ BAD: Generic timeout
test('flaky test - hard wait', async ({ page }) => {
await page.goto('/dashboard');
await page.waitForTimeout(2000); // Arbitrary wait - brittle
await expect(page.getByText('Dashboard')).toBeVisible();
});
```
**Why It Fails**:
- **Mock after navigate**: Request fires during navigation, mock isn't active yet (race condition)
- **No explicit wait**: Assertion runs before response arrives (timing-dependent)
- **Hard waits**: Slow tests, brittle (fails if < timeout, wastes time if > timeout)
- **Non-deterministic**: Passes locally, fails in CI (different speeds)
**Better Approach**: Always intercept → trigger → await
```typescript
// ✅ GOOD: Intercept BEFORE navigate
test('deterministic test', async ({ page }) => {
// Step 1: Register mock FIRST
await page.route('**/api/users', (route) =>
route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify([{ id: 1, name: 'Test User' }]),
}),
);
// Step 2: Store response promise BEFORE trigger
const responsePromise = page.waitForResponse('**/api/users');
// Step 3: THEN trigger
await page.goto('/dashboard');
// Step 4: THEN await response
await responsePromise;
// Step 5: THEN assert (data is guaranteed loaded)
await expect(page.getByText('Test User')).toBeVisible();
});
```
**Key Points**:
- Order matters: Mock → Promise → Trigger → Await → Assert
- No race conditions: Mock is active before request fires
- Explicit wait: Response promise ensures data loaded
- Deterministic: Always passes if app works correctly
## Integration Points
- **Used in workflows**: `*atdd` (test generation), `*automate` (test expansion), `*framework` (network setup)
- **Related fragments**:
- `fixture-architecture.md` - Network fixture patterns
- `data-factories.md` - API-first setup with network
- `test-quality.md` - Deterministic test principles
## Debugging Network Issues
When network tests fail, check:
1. **Timing**: Is interception registered **before** action?
2. **URL pattern**: Does pattern match actual request URL?
3. **Response format**: Is mocked response valid JSON/format?
4. **Status code**: Is app checking for 200 vs 201 vs 204?
5. **HAR file**: Capture real traffic to understand actual API contract
```typescript
// Debug network issues with logging
test('debug network', async ({ page }) => {
// Log all requests
page.on('request', (req) => console.log('→', req.method(), req.url()));
// Log all responses
page.on('response', (resp) => console.log('←', resp.status(), resp.url()));
await page.goto('/dashboard');
});
```
_Source: Murat Testing Philosophy (lines 94-137), Playwright network patterns, Cypress intercept best practices._

---
# Non-Functional Requirements (NFR) Criteria
## Principle
Non-functional requirements (security, performance, reliability, maintainability) are **validated through automated tests**, not checklists. NFR assessment uses objective pass/fail criteria tied to measurable thresholds. Ambiguous requirements default to CONCERNS until clarified.
## Rationale
**The Problem**: Teams ship features that "work" functionally but fail under load, expose security vulnerabilities, or lack error recovery. NFRs are treated as optional "nice-to-haves" instead of release blockers.
**The Solution**: Define explicit NFR criteria with automated validation. Security tests verify auth/authz and secret handling. Performance tests enforce SLO/SLA thresholds with profiling evidence. Reliability tests validate error handling, retries, and health checks. Maintainability is measured by test coverage, code duplication, and observability.
**Why This Matters**:
- Prevents production incidents (security breaches, performance degradation, cascading failures)
- Provides objective release criteria (no subjective "feels fast enough")
- Automates compliance validation (audit trail for regulated environments)
- Forces clarity on ambiguous requirements (default to CONCERNS)
## Pattern Examples
### Example 1: Security NFR Validation (Auth, Secrets, OWASP)
**Context**: Automated security tests enforcing authentication, authorization, and secret handling
**Implementation**:
```typescript
// tests/nfr/security.spec.ts
import { test, expect, type APIRequestContext } from '@playwright/test';
test.describe('Security NFR: Authentication & Authorization', () => {
test('unauthenticated users cannot access protected routes', async ({ page }) => {
// Attempt to access dashboard without auth
await page.goto('/dashboard');
// Should redirect to login (not expose data)
await expect(page).toHaveURL(/\/login/);
await expect(page.getByText('Please sign in')).toBeVisible();
// Verify no sensitive data leaked in response
const pageContent = await page.content();
expect(pageContent).not.toContain('user_id');
expect(pageContent).not.toContain('api_key');
});
test('JWT tokens expire after 15 minutes', async ({ page, request }) => {
// Login and capture token
await page.goto('/login');
await page.getByLabel('Email').fill('test@example.com');
await page.getByLabel('Password').fill('ValidPass123!');
await page.getByRole('button', { name: 'Sign In' }).click();
const token = await page.evaluate(() => localStorage.getItem('auth_token'));
expect(token).toBeTruthy();
    // Fast-forward 16 minutes with Playwright's mock clock (requires page.clock.install()
    // before the page loads). This exercises CLIENT-side expiry handling; server-side
    // expiry is better tested with a short-lived test token.
    await page.clock.fastForward('00:16:00');
// Token should be expired, API call should fail
const response = await request.get('/api/user/profile', {
headers: { Authorization: `Bearer ${token}` },
});
expect(response.status()).toBe(401);
const body = await response.json();
expect(body.error).toContain('expired');
});
test('passwords are never logged or exposed in errors', async ({ page }) => {
    // Monitor console for password leaks from the very first request
    const consoleLogs: string[] = [];
    page.on('console', (msg) => consoleLogs.push(msg.text()));
    // Trigger login error
    await page.goto('/login');
    await page.getByLabel('Email').fill('test@example.com');
    await page.getByLabel('Password').fill('WrongPassword123!');
await page.getByRole('button', { name: 'Sign In' }).click();
// Error shown to user (generic message)
await expect(page.getByText('Invalid credentials')).toBeVisible();
// Verify password NEVER appears in console, DOM, or network
const pageContent = await page.content();
expect(pageContent).not.toContain('WrongPassword123!');
expect(consoleLogs.join('\n')).not.toContain('WrongPassword123!');
});
test('RBAC: users can only access resources they own', async ({ page, request }) => {
// Login as User A
const userAToken = await login(request, 'userA@example.com', 'password');
// Try to access User B's order
const response = await request.get('/api/orders/user-b-order-id', {
headers: { Authorization: `Bearer ${userAToken}` },
});
expect(response.status()).toBe(403); // Forbidden
const body = await response.json();
expect(body.error).toContain('insufficient permissions');
});
test('SQL injection attempts are blocked', async ({ page }) => {
await page.goto('/search');
// Attempt SQL injection
await page.getByPlaceholder('Search products').fill("'; DROP TABLE users; --");
await page.getByRole('button', { name: 'Search' }).click();
// Should return empty results, NOT crash or expose error
await expect(page.getByText('No results found')).toBeVisible();
// Verify app still works (table not dropped)
await page.goto('/dashboard');
await expect(page.getByText('Welcome')).toBeVisible();
});
test('XSS attempts are sanitized', async ({ page }) => {
await page.goto('/profile/edit');
// Attempt XSS injection
const xssPayload = '<script>alert("XSS")</script>';
await page.getByLabel('Bio').fill(xssPayload);
await page.getByRole('button', { name: 'Save' }).click();
    // Fail fast if the payload ever executes (an alert dialog would open)
    page.on('dialog', () => {
      throw new Error('XSS payload executed');
    });
    // Reload and verify XSS is escaped (not executed)
    await page.reload();
    // textContent returns the payload as inert text when properly escaped...
    const bioText = await page.getByTestId('user-bio').textContent();
    expect(bioText).toContain('<script>');
    // ...while the stored markup keeps it as escaped entities
    const bioHtml = await page.getByTestId('user-bio').innerHTML();
    expect(bioHtml).toContain('&lt;script&gt;');
});
});
// Helper: authenticate via API and return the JWT
async function login(request: APIRequestContext, email: string, password: string): Promise<string> {
const response = await request.post('/api/auth/login', {
data: { email, password },
});
const body = await response.json();
return body.token;
}
```
**Key Points**:
- Authentication: Unauthenticated access redirected (not exposed)
- Authorization: RBAC enforced (403 for insufficient permissions)
- Token expiry: JWT expires after 15 minutes (automated validation)
- Secret handling: Passwords never logged or exposed in errors
- OWASP Top 10: SQL injection and XSS blocked (input sanitization)
**Security NFR Criteria**:
- ✅ PASS: All 6 tests green (auth, authz, token expiry, secret handling, SQL injection, XSS)
- ⚠️ CONCERNS: 1-2 tests failing with mitigation plan and owner assigned
- ❌ FAIL: Critical exposure (unauthenticated access, password leak, SQL injection succeeds)
---
### Example 2: Performance NFR Validation (k6 Load Testing for SLO/SLA)
**Context**: Use k6 for load testing, stress testing, and SLO/SLA enforcement (NOT Playwright)
**Implementation**:
```javascript
// tests/nfr/performance.k6.js
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';
// Custom metrics
const errorRate = new Rate('errors');
const apiDuration = new Trend('api_duration');
// Performance thresholds (SLO/SLA)
export const options = {
stages: [
{ duration: '1m', target: 50 }, // Ramp up to 50 users
{ duration: '3m', target: 50 }, // Stay at 50 users for 3 minutes
{ duration: '1m', target: 100 }, // Spike to 100 users
{ duration: '3m', target: 100 }, // Stay at 100 users
{ duration: '1m', target: 0 }, // Ramp down
],
thresholds: {
// SLO: 95% of requests must complete in <500ms
http_req_duration: ['p(95)<500'],
// SLO: Error rate must be <1%
errors: ['rate<0.01'],
// SLA: API endpoints must respond in <1s (99th percentile)
api_duration: ['p(99)<1000'],
},
};
export default function () {
// Test 1: Homepage load performance
const homepageResponse = http.get(`${__ENV.BASE_URL}/`);
check(homepageResponse, {
'homepage status is 200': (r) => r.status === 200,
'homepage loads in <2s': (r) => r.timings.duration < 2000,
});
errorRate.add(homepageResponse.status !== 200);
// Test 2: API endpoint performance
const apiResponse = http.get(`${__ENV.BASE_URL}/api/products?limit=10`, {
headers: { Authorization: `Bearer ${__ENV.API_TOKEN}` },
});
check(apiResponse, {
'API status is 200': (r) => r.status === 200,
'API responds in <500ms': (r) => r.timings.duration < 500,
});
apiDuration.add(apiResponse.timings.duration);
errorRate.add(apiResponse.status !== 200);
// Test 3: Search endpoint under load
const searchResponse = http.get(`${__ENV.BASE_URL}/api/search?q=laptop&limit=100`);
check(searchResponse, {
'search status is 200': (r) => r.status === 200,
'search responds in <1s': (r) => r.timings.duration < 1000,
'search returns results': (r) => JSON.parse(r.body).results.length > 0,
});
errorRate.add(searchResponse.status !== 200);
sleep(1); // Realistic user think time
}
// Threshold validation (run after test)
export function handleSummary(data) {
const p95Duration = data.metrics.http_req_duration.values['p(95)'];
const p99ApiDuration = data.metrics.api_duration.values['p(99)'];
const errorRateValue = data.metrics.errors.values.rate;
console.log(`P95 request duration: ${p95Duration.toFixed(2)}ms`);
console.log(`P99 API duration: ${p99ApiDuration.toFixed(2)}ms`);
console.log(`Error rate: ${(errorRateValue * 100).toFixed(2)}%`);
return {
'summary.json': JSON.stringify(data),
stdout: `
Performance NFR Results:
- P95 request duration: ${p95Duration < 500 ? '✅ PASS' : '❌ FAIL'} (${p95Duration.toFixed(2)}ms / 500ms threshold)
- P99 API duration: ${p99ApiDuration < 1000 ? '✅ PASS' : '❌ FAIL'} (${p99ApiDuration.toFixed(2)}ms / 1000ms threshold)
- Error rate: ${errorRateValue < 0.01 ? '✅ PASS' : '❌ FAIL'} (${(errorRateValue * 100).toFixed(2)}% / 1% threshold)
`,
};
}
```
**Run k6 tests:**
```bash
# Local smoke test (10 VUs, 30s)
k6 run --vus 10 --duration 30s tests/nfr/performance.k6.js
# Full load test (stages defined in script)
k6 run tests/nfr/performance.k6.js
# CI integration with thresholds
k6 run --out json=performance-results.json tests/nfr/performance.k6.js
```
**Key Points**:
- **k6 is the right tool** for load testing (NOT Playwright)
- SLO/SLA thresholds enforced automatically (`p(95)<500`, `rate<0.01`)
- Realistic load simulation (ramp up, sustained load, spike testing)
- Comprehensive metrics (p50, p95, p99, error rate, throughput)
- CI-friendly (JSON output, exit codes based on thresholds)
**Performance NFR Criteria**:
- ✅ PASS: All SLO/SLA targets met with k6 profiling evidence (p95 < 500ms, error rate < 1%)
- ⚠️ CONCERNS: Trending toward limits (e.g., p95 = 480ms approaching 500ms) or missing baselines
- ❌ FAIL: SLO/SLA breached (e.g., p95 > 500ms) or error rate > 1%
**Performance Testing Levels (from Test Architect course):**
- **Load testing**: System behavior under expected load
- **Stress testing**: System behavior under extreme load (breaking point)
- **Spike testing**: Sudden load increases (traffic spikes)
- **Endurance/Soak testing**: System behavior under sustained load (memory leaks, resource exhaustion)
- **Benchmarking**: Baseline measurements for comparison
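To make the levels concrete, here is a sketch of how the stage shapes differ (durations and VU targets are illustrative assumptions, not measured baselines):
```typescript
// tests/nfr/load-profiles.ts - illustrative stage shapes per testing level (all numbers are assumptions)
type Stage = { duration: string; target: number };

export const profiles: Record<string, Stage[]> = {
  // Load: ramp to expected traffic and hold
  load: [
    { duration: '1m', target: 50 },
    { duration: '5m', target: 50 },
    { duration: '1m', target: 0 },
  ],
  // Stress: step past expected load until the breaking point appears
  stress: [
    { duration: '2m', target: 100 },
    { duration: '2m', target: 200 },
    { duration: '2m', target: 400 },
    { duration: '2m', target: 0 },
  ],
  // Spike: near-instant jump, short hold, instant drop
  spike: [
    { duration: '10s', target: 500 },
    { duration: '1m', target: 500 },
    { duration: '10s', target: 0 },
  ],
  // Endurance/Soak: moderate load held for hours to expose leaks
  soak: [
    { duration: '2m', target: 50 },
    { duration: '2h', target: 50 },
    { duration: '2m', target: 0 },
  ],
};
// In a k6 script: export const options = { stages: profiles.stress };
```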
**Note**: Playwright can validate **perceived performance** (Core Web Vitals via Lighthouse), but k6 validates **system performance** (throughput, latency, resource limits under load)
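For the perceived-performance side, a minimal Playwright sketch can read LCP straight from the browser's PerformanceObserver (shown here instead of a full Lighthouse run; Chromium-only, using the standard 2.5s "good" budget):
```typescript
// tests/nfr/web-vitals.spec.ts - minimal sketch, assuming a Chromium project
import { test, expect } from '@playwright/test';

test('homepage LCP stays within budget (perceived performance)', async ({ page }) => {
  await page.goto('/');
  // Read Largest Contentful Paint from the browser's PerformanceObserver
  const lcp = await page.evaluate(
    () =>
      new Promise<number>((resolve) => {
        new PerformanceObserver((entryList) => {
          const entries = entryList.getEntries();
          resolve(entries[entries.length - 1].startTime);
        }).observe({ type: 'largest-contentful-paint', buffered: true });
      }),
  );
  expect(lcp).toBeLessThan(2500); // Core Web Vitals "good" threshold
});
```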
---
### Example 3: Reliability NFR Validation (Playwright for UI Resilience)
**Context**: Automated reliability tests validating graceful degradation and recovery paths
**Implementation**:
```typescript
// tests/nfr/reliability.spec.ts
import { test, expect } from '@playwright/test';
test.describe('Reliability NFR: Error Handling & Recovery', () => {
test('app remains functional when API returns 500 error', async ({ page, context }) => {
// Mock API failure
await context.route('**/api/products', (route) => {
route.fulfill({ status: 500, body: JSON.stringify({ error: 'Internal Server Error' }) });
});
await page.goto('/products');
// User sees error message (not blank page or crash)
await expect(page.getByText('Unable to load products. Please try again.')).toBeVisible();
await expect(page.getByRole('button', { name: 'Retry' })).toBeVisible();
// App navigation still works (graceful degradation)
await page.getByRole('link', { name: 'Home' }).click();
await expect(page).toHaveURL('/');
});
test('API client retries on transient failures (3 attempts)', async ({ page, context }) => {
let attemptCount = 0;
await context.route('**/api/checkout', (route) => {
attemptCount++;
// Fail first 2 attempts, succeed on 3rd
if (attemptCount < 3) {
route.fulfill({ status: 503, body: JSON.stringify({ error: 'Service Unavailable' }) });
} else {
route.fulfill({ status: 200, body: JSON.stringify({ orderId: '12345' }) });
}
});
await page.goto('/checkout');
await page.getByRole('button', { name: 'Place Order' }).click();
// Should succeed after 3 attempts
await expect(page.getByText('Order placed successfully')).toBeVisible();
expect(attemptCount).toBe(3);
});
test('app handles network disconnection gracefully', async ({ page, context }) => {
await page.goto('/dashboard');
// Simulate offline mode
await context.setOffline(true);
// Trigger action requiring network
await page.getByRole('button', { name: 'Refresh Data' }).click();
// User sees offline indicator (not crash)
await expect(page.getByText('You are offline. Changes will sync when reconnected.')).toBeVisible();
// Reconnect
await context.setOffline(false);
await page.getByRole('button', { name: 'Refresh Data' }).click();
// Data loads successfully
await expect(page.getByText('Data updated')).toBeVisible();
});
test('health check endpoint returns service status', async ({ request }) => {
const response = await request.get('/api/health');
expect(response.status()).toBe(200);
const health = await response.json();
expect(health).toHaveProperty('status', 'healthy');
expect(health).toHaveProperty('timestamp');
expect(health).toHaveProperty('services');
// Verify critical services are monitored
expect(health.services).toHaveProperty('database');
expect(health.services).toHaveProperty('cache');
expect(health.services).toHaveProperty('queue');
// All services should be UP
expect(health.services.database.status).toBe('UP');
expect(health.services.cache.status).toBe('UP');
expect(health.services.queue.status).toBe('UP');
});
test('circuit breaker opens after 5 consecutive failures', async ({ page, context }) => {
let failureCount = 0;
await context.route('**/api/recommendations', (route) => {
failureCount++;
route.fulfill({ status: 500, body: JSON.stringify({ error: 'Service Error' }) });
});
await page.goto('/product/123');
// Wait for circuit breaker to open (fallback UI appears)
await expect(page.getByText('Recommendations temporarily unavailable')).toBeVisible({ timeout: 10000 });
// Verify circuit breaker stopped making requests after threshold (should be ≤5)
expect(failureCount).toBeLessThanOrEqual(5);
});
test('rate limiting gracefully handles 429 responses', async ({ page, context }) => {
let requestCount = 0;
await context.route('**/api/search', (route) => {
requestCount++;
if (requestCount > 10) {
// Rate limit exceeded
route.fulfill({
status: 429,
headers: { 'Retry-After': '5' },
body: JSON.stringify({ error: 'Rate limit exceeded' }),
});
} else {
route.fulfill({ status: 200, body: JSON.stringify({ results: [] }) });
}
});
await page.goto('/search');
// Make 15 search requests rapidly
for (let i = 0; i < 15; i++) {
await page.getByPlaceholder('Search').fill(`query-${i}`);
await page.getByRole('button', { name: 'Search' }).click();
}
// User sees rate limit message (not crash)
await expect(page.getByText('Too many requests. Please wait a moment.')).toBeVisible();
});
});
```
**Key Points**:
- Error handling: Graceful degradation (500 error → user-friendly message + retry button)
- Retries: 3 attempts on transient failures (503 → eventual success)
- Offline handling: Network disconnection detected (sync when reconnected)
- Health checks: `/api/health` monitors database, cache, queue
- Circuit breaker: Opens after 5 failures (fallback UI, stop retries)
- Rate limiting: 429 response handled (Retry-After header respected)
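The retry test above assumes the app's API client retries transient failures; a minimal sketch of such a client (retryable status codes and backoff delays are assumptions):
```typescript
// src/api/retry-client.ts - sketch of the retry behavior the E2E test validates (Node 18+ global fetch)
const RETRYABLE_STATUS = new Set([502, 503, 504]);

export async function fetchWithRetry(url: string, init?: RequestInit, maxAttempts = 3): Promise<Response> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await fetch(url, init);
      if (!RETRYABLE_STATUS.has(response.status)) return response; // success or non-transient error
      lastError = new Error(`Transient HTTP ${response.status}`);
    } catch (error) {
      lastError = error; // network failures are also retryable
    }
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, 200 * 2 ** (attempt - 1))); // 200ms, 400ms, ...
    }
  }
  throw lastError;
}
```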
**Reliability NFR Criteria**:
- ✅ PASS: Error handling, retries, health checks verified (all 6 tests green)
- ⚠️ CONCERNS: Partial coverage (e.g., missing circuit breaker) or no telemetry
- ❌ FAIL: No recovery path (500 error crashes app) or unresolved crash scenarios
---
### Example 4: Maintainability NFR Validation (CI Tools, Not Playwright)
**Context**: Use proper CI tools for code quality validation (coverage, duplication, vulnerabilities)
**Implementation**:
```yaml
# .github/workflows/nfr-maintainability.yml
name: NFR - Maintainability
on: [push, pull_request]
jobs:
test-coverage:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- name: Install dependencies
run: npm ci
- name: Run tests with coverage
run: npm run test:coverage
- name: Check coverage threshold (80% minimum)
run: |
COVERAGE=$(jq '.total.lines.pct' coverage/coverage-summary.json)
echo "Coverage: $COVERAGE%"
if (( $(echo "$COVERAGE < 80" | bc -l) )); then
echo "❌ FAIL: Coverage $COVERAGE% below 80% threshold"
exit 1
else
echo "✅ PASS: Coverage $COVERAGE% meets 80% threshold"
fi
code-duplication:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- name: Check code duplication (<5% allowed)
run: |
npx jscpd src/ --threshold 5 --reporters json --output report/
DUPLICATION=$(jq '.statistics.total.percentage' report/jscpd-report.json)
echo "Duplication: $DUPLICATION%"
if (( $(echo "$DUPLICATION >= 5" | bc -l) )); then
echo "❌ FAIL: Duplication $DUPLICATION% exceeds 5% threshold"
exit 1
else
echo "✅ PASS: Duplication $DUPLICATION% below 5% threshold"
fi
vulnerability-scan:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- name: Install dependencies
run: npm ci
- name: Run npm audit (no critical/high vulnerabilities)
run: |
npm audit --json > audit.json || true
CRITICAL=$(jq '.metadata.vulnerabilities.critical' audit.json)
HIGH=$(jq '.metadata.vulnerabilities.high' audit.json)
echo "Critical: $CRITICAL, High: $HIGH"
if [ "$CRITICAL" -gt 0 ] || [ "$HIGH" -gt 0 ]; then
echo "❌ FAIL: Found $CRITICAL critical and $HIGH high vulnerabilities"
npm audit
exit 1
else
echo "✅ PASS: No critical/high vulnerabilities"
fi
```
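If the jq/bc arithmetic feels brittle, the same gate can run as a small Node script (a sketch; the path mirrors the coverage job above):
```typescript
// scripts/check-coverage.ts - equivalent coverage gate as a Node script (run via ts-node or tsx)
import { readFileSync } from 'fs';

const THRESHOLD = 80;
const summary = JSON.parse(readFileSync('coverage/coverage-summary.json', 'utf-8'));
const coverage: number = summary.total.lines.pct;

console.log(`Coverage: ${coverage}%`);
if (coverage < THRESHOLD) {
  console.error(`❌ FAIL: Coverage ${coverage}% below ${THRESHOLD}% threshold`);
  process.exit(1);
}
console.log(`✅ PASS: Coverage ${coverage}% meets ${THRESHOLD}% threshold`);
```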
**Playwright Tests for Observability (E2E Validation):**
```typescript
// tests/nfr/observability.spec.ts
import { test, expect } from '@playwright/test';
test.describe('Maintainability NFR: Observability Validation', () => {
test('critical errors are reported to monitoring service', async ({ page, context }) => {
const sentryEvents: any[] = [];
// Mock Sentry SDK to verify error tracking
await context.addInitScript(() => {
(window as any).Sentry = {
captureException: (error: Error) => {
console.log('SENTRY_CAPTURE:', JSON.stringify({ message: error.message, stack: error.stack }));
},
};
});
page.on('console', (msg) => {
if (msg.text().includes('SENTRY_CAPTURE:')) {
sentryEvents.push(JSON.parse(msg.text().replace('SENTRY_CAPTURE:', '')));
}
});
// Trigger error by mocking API failure
await context.route('**/api/products', (route) => {
route.fulfill({ status: 500, body: JSON.stringify({ error: 'Database Error' }) });
});
await page.goto('/products');
// Wait for error UI and Sentry capture
await expect(page.getByText('Unable to load products')).toBeVisible();
// Verify error was captured by monitoring
expect(sentryEvents.length).toBeGreaterThan(0);
expect(sentryEvents[0]).toHaveProperty('message');
expect(sentryEvents[0]).toHaveProperty('stack');
});
test('API response times are tracked in telemetry', async ({ request }) => {
const response = await request.get('/api/products?limit=10');
expect(response.ok()).toBeTruthy();
// Verify Server-Timing header for APM (Application Performance Monitoring)
const serverTiming = response.headers()['server-timing'];
expect(serverTiming).toBeTruthy();
expect(serverTiming).toContain('db'); // Database query time
expect(serverTiming).toContain('total'); // Total processing time
});
test('structured logging present in application', async ({ request }) => {
// Make API call that generates logs
const response = await request.post('/api/orders', {
data: { productId: '123', quantity: 2 },
});
expect(response.ok()).toBeTruthy();
// Note: In real scenarios, validate logs in monitoring system (Datadog, CloudWatch)
// This test validates the logging contract exists (Server-Timing, trace IDs in headers)
const traceId = response.headers()['x-trace-id'];
expect(traceId).toBeTruthy(); // Confirms structured logging with correlation IDs
});
});
```
**Key Points**:
- **Coverage/duplication**: CI jobs (GitHub Actions), not Playwright tests
- **Vulnerability scanning**: npm audit in CI, not Playwright tests
- **Observability**: Playwright validates error tracking (Sentry) and telemetry headers
- **Structured logging**: Validate logging contract (trace IDs, Server-Timing headers)
- **Separation of concerns**: Build-time checks (coverage, audit) vs runtime checks (error tracking, telemetry)
**Maintainability NFR Criteria**:
- ✅ PASS: Clean code (80%+ coverage from CI, <5% duplication from CI), observability validated in E2E, no critical vulnerabilities from npm audit
- ⚠️ CONCERNS: Duplication >5%, coverage 60-79%, or unclear ownership
- ❌ FAIL: Absent tests (<60%), tangled implementations (>10% duplication), or no observability
---
## NFR Assessment Checklist
Before release gate:
- [ ] **Security** (Playwright E2E + Security Tools):
- [ ] Auth/authz tests green (unauthenticated redirect, RBAC enforced)
- [ ] Secrets never logged or exposed in errors
- [ ] OWASP Top 10 validated (SQL injection blocked, XSS sanitized)
- [ ] Security audit completed (vulnerability scan, penetration test if applicable)
- [ ] **Performance** (k6 Load Testing):
- [ ] SLO/SLA targets met with k6 evidence (p95 <500ms, error rate <1%)
- [ ] Load testing completed (expected load)
- [ ] Stress testing completed (breaking point identified)
- [ ] Spike testing completed (handles traffic spikes)
- [ ] Endurance testing completed (no memory leaks under sustained load)
- [ ] **Reliability** (Playwright E2E + API Tests):
- [ ] Error handling graceful (500 → user-friendly message + retry)
- [ ] Retries implemented (3 attempts on transient failures)
- [ ] Health checks monitored (/api/health endpoint)
- [ ] Circuit breaker tested (opens after failure threshold)
- [ ] Offline handling validated (network disconnection graceful)
- [ ] **Maintainability** (CI Tools):
- [ ] Test coverage ≥80% (from CI coverage report)
- [ ] Code duplication <5% (from jscpd CI job)
- [ ] No critical/high vulnerabilities (from npm audit CI job)
- [ ] Structured logging validated (Playwright validates telemetry headers)
- [ ] Error tracking configured (Sentry/monitoring integration validated)
- [ ] **Ambiguous requirements**: Default to CONCERNS (force team to clarify thresholds and evidence)
- [ ] **NFR criteria documented**: Measurable thresholds defined (not subjective "fast enough")
- [ ] **Automated validation**: NFR tests run in CI pipeline (not manual checklists)
- [ ] **Tool selection**: Right tool for each NFR (k6 for performance, Playwright for security/reliability E2E, CI tools for maintainability)
## NFR Gate Decision Matrix
| Category | PASS Criteria | CONCERNS Criteria | FAIL Criteria |
| ------------------- | -------------------------------------------- | -------------------------------------------- | ---------------------------------------------- |
| **Security** | Auth/authz, secret handling, OWASP verified | Minor gaps with clear owners | Critical exposure or missing controls |
| **Performance** | Metrics meet SLO/SLA with profiling evidence | Trending toward limits or missing baselines | SLO/SLA breached or resource leaks detected |
| **Reliability** | Error handling, retries, health checks OK | Partial coverage or missing telemetry | No recovery path or unresolved crash scenarios |
| **Maintainability** | Clean code, tests, docs shipped together | Duplication, low coverage, unclear ownership | Absent tests, tangled code, no observability |
**Default**: If targets or evidence are undefined → **CONCERNS** (force team to clarify before sign-off)
## Integration Points
- **Used in workflows**: `*nfr-assess` (automated NFR validation), `*trace` (gate decision Phase 2), `*test-design` (NFR risk assessment via Utility Tree)
- **Related fragments**: `risk-governance.md` (NFR risk scoring), `probability-impact.md` (NFR impact assessment), `test-quality.md` (maintainability standards), `test-levels-framework.md` (system-level testing for NFRs)
- **Tools by NFR Category**:
- **Security**: Playwright (E2E auth/authz), OWASP ZAP, Burp Suite, npm audit, Snyk
- **Performance**: k6 (load/stress/spike/endurance), Lighthouse (Core Web Vitals), Artillery
- **Reliability**: Playwright (E2E error handling), API tests (retries, health checks), Chaos Engineering tools
- **Maintainability**: GitHub Actions (coverage, duplication, audit), jscpd, Playwright (observability validation)
_Source: Test Architect course (NFR testing approaches, Utility Tree, Quality Scenarios), ISO/IEC 25010 Software Quality Characteristics, OWASP Top 10, k6 documentation, SRE practices_

@@ -0,0 +1,730 @@
# Playwright Configuration Guardrails
## Principle
Load environment configs via a central map (`envConfigMap`), standardize timeouts (action 15s, navigation 30s, expect 10s, test 60s), emit HTML + JUnit reporters, and store artifacts under `test-results/` for CI upload. Keep `.env.example`, `.nvmrc`, and browser dependencies versioned so local and CI runs stay aligned.
## Rationale
Environment-specific configuration prevents hardcoded URLs, timeouts, and credentials from leaking into tests. A central config map with fail-fast validation catches missing environments early. Standardized timeouts reduce flakiness while remaining long enough for real-world network conditions. Consistent artifact storage (`test-results/`, `playwright-report/`) enables CI pipelines to upload failure evidence automatically. Versioned dependencies (`.nvmrc`, `package.json` browser versions) eliminate "works on my machine" issues between local and CI environments.
## Pattern Examples
### Example 1: Environment-Based Configuration
**Context**: When testing against multiple environments (local, staging, production), use a central config map that loads environment-specific settings and fails fast if `TEST_ENV` is invalid.
**Implementation**:
```typescript
// playwright.config.ts - Central config loader
import { config as dotenvConfig } from 'dotenv';
import path from 'path';
// Load .env from project root
dotenvConfig({
path: path.resolve(__dirname, '../../.env'),
});
// Central environment config map
const envConfigMap = {
local: require('./playwright/config/local.config').default,
staging: require('./playwright/config/staging.config').default,
production: require('./playwright/config/production.config').default,
};
const environment = process.env.TEST_ENV || 'local';
// Fail fast if environment not supported
if (!Object.keys(envConfigMap).includes(environment)) {
console.error(`❌ No configuration found for environment: ${environment}`);
console.error(` Available environments: ${Object.keys(envConfigMap).join(', ')}`);
process.exit(1);
}
console.log(`✅ Running tests against: ${environment.toUpperCase()}`);
export default envConfigMap[environment as keyof typeof envConfigMap];
```
```typescript
// playwright/config/base.config.ts - Shared base configuration
import { defineConfig } from '@playwright/test';
import path from 'path';
export const baseConfig = defineConfig({
testDir: path.resolve(__dirname, '../tests'),
outputDir: path.resolve(__dirname, '../../test-results'),
fullyParallel: true,
forbidOnly: !!process.env.CI,
retries: process.env.CI ? 2 : 0,
workers: process.env.CI ? 1 : undefined,
reporter: [
['html', { outputFolder: 'playwright-report', open: 'never' }],
['junit', { outputFile: 'test-results/results.xml' }],
['list'],
],
use: {
actionTimeout: 15000,
navigationTimeout: 30000,
trace: 'on-first-retry',
screenshot: 'only-on-failure',
video: 'retain-on-failure',
},
globalSetup: path.resolve(__dirname, '../support/global-setup.ts'),
timeout: 60000,
expect: { timeout: 10000 },
});
```
```typescript
// playwright/config/local.config.ts - Local environment
import { defineConfig } from '@playwright/test';
import { baseConfig } from './base.config';
export default defineConfig({
...baseConfig,
use: {
...baseConfig.use,
baseURL: 'http://localhost:3000',
video: 'off', // No video locally for speed
},
webServer: {
command: 'npm run dev',
url: 'http://localhost:3000',
reuseExistingServer: !process.env.CI,
timeout: 120000,
},
});
```
```typescript
// playwright/config/staging.config.ts - Staging environment
import { defineConfig } from '@playwright/test';
import { baseConfig } from './base.config';
export default defineConfig({
...baseConfig,
use: {
...baseConfig.use,
baseURL: 'https://staging.example.com',
ignoreHTTPSErrors: true, // Allow self-signed certs in staging
},
});
```
```typescript
// playwright/config/production.config.ts - Production environment
import { defineConfig } from '@playwright/test';
import { baseConfig } from './base.config';
export default defineConfig({
...baseConfig,
retries: 3, // More retries in production
use: {
...baseConfig.use,
baseURL: 'https://example.com',
video: 'on', // Always record production failures
},
});
```
```bash
# .env.example - Template for developers
TEST_ENV=local
API_KEY=your_api_key_here
DATABASE_URL=postgresql://localhost:5432/test_db
```
**Key Points**:
- Central `envConfigMap` prevents environment misconfiguration
- Fail-fast validation with clear error message (available envs listed)
- Base config defines shared settings, environment configs override
- `.env.example` provides template for required secrets
- `TEST_ENV=local` as default for local development
- Production config increases retries and enables video recording
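The same fail-fast idea can extend to required secrets; a small sketch (variable names mirror `.env.example` above):
```typescript
// playwright/support/validate-env.ts - sketch of a fail-fast check for required variables
const REQUIRED_VARS = ['TEST_ENV', 'API_KEY', 'DATABASE_URL'] as const;

export function assertRequiredEnv(): void {
  const missing = REQUIRED_VARS.filter((name) => !process.env[name]);
  if (missing.length > 0) {
    console.error(`❌ Missing required environment variables: ${missing.join(', ')}`);
    console.error('   Copy .env.example to .env and fill in the values.');
    process.exit(1);
  }
}
```
Calling it at the top of `playwright.config.ts` surfaces misconfiguration before any test starts.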
### Example 2: Timeout Standards
**Context**: When tests fail due to inconsistent timeout settings, standardize timeouts across all tests: action 15s, navigation 30s, expect 10s, test 60s. Expose overrides through fixtures rather than inline literals.
**Implementation**:
```typescript
// playwright/config/base.config.ts - Standardized timeouts
import { defineConfig } from '@playwright/test';
export default defineConfig({
// Global test timeout: 60 seconds
timeout: 60000,
use: {
// Action timeout: 15 seconds (click, fill, etc.)
actionTimeout: 15000,
// Navigation timeout: 30 seconds (page.goto, page.reload)
navigationTimeout: 30000,
},
// Expect timeout: 10 seconds (all assertions)
expect: {
timeout: 10000,
},
});
```
```typescript
// playwright/support/fixtures/timeout-fixture.ts - Timeout override fixture
import { test as base } from '@playwright/test';
type TimeoutOptions = {
extendedTimeout: (timeoutMs: number) => Promise<void>;
};
export const test = base.extend<TimeoutOptions>({
extendedTimeout: async ({}, use, testInfo) => {
await use(async (timeoutMs: number) => {
testInfo.setTimeout(timeoutMs);
});
// No restore needed: setTimeout only affects the current test's deadline
},
});
export { expect } from '@playwright/test';
```
```typescript
// Usage in tests - Standard timeouts (implicit)
import { test, expect } from '@playwright/test';
test('user can log in', async ({ page }) => {
await page.goto('/login'); // Uses 30s navigation timeout
await page.fill('[data-testid="email"]', 'test@example.com'); // Uses 15s action timeout
await page.click('[data-testid="login-button"]'); // Uses 15s action timeout
await expect(page.getByText('Welcome')).toBeVisible(); // Uses 10s expect timeout
});
```
```typescript
// Usage in tests - Per-test timeout override
import { test, expect } from '../support/fixtures/timeout-fixture';
test('slow data processing operation', async ({ page, extendedTimeout }) => {
// Override default 60s timeout for this slow test
await extendedTimeout(180000); // 3 minutes
await page.goto('/data-processing');
await page.click('[data-testid="process-large-file"]');
// Wait for long-running operation
await expect(page.getByText('Processing complete')).toBeVisible({
timeout: 120000, // 2 minutes for assertion
});
});
```
```typescript
// Per-assertion timeout override (inline)
test('API returns quickly', async ({ page }) => {
await page.goto('/dashboard');
// Tighten expect timeout for a fast API (fail fast instead of waiting the full 10s)
await expect(page.getByTestId('user-name')).toBeVisible({ timeout: 5000 }); // 5s instead of 10s
// Override expect timeout for slow external API
await expect(page.getByTestId('weather-widget')).toBeVisible({ timeout: 20000 }); // 20s instead of 10s
});
```
**Key Points**:
- **Standardized timeouts**: action 15s, navigation 30s, expect 10s, test 60s (global defaults)
- Fixture-based override (`extendedTimeout`) for slow tests (preferred over inline)
- Per-assertion timeout override via `{ timeout: X }` option (use sparingly)
- Avoid hard waits (`page.waitForTimeout(3000)`) - use event-based waits instead
- CI environments may need longer timeouts (handle in environment-specific config)
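To make the hard-wait point concrete, a small contrast sketch (endpoint and test IDs are illustrative):
```typescript
// Sketch: hard wait vs. event-based wait
import { test, expect } from '@playwright/test';

test('refresh loads data (event-based wait)', async ({ page }) => {
  await page.goto('/dashboard');

  // ❌ Hard wait: always burns 3 seconds and can still race the response
  // await page.waitForTimeout(3000);

  // ✅ Event-based wait: proceeds the moment the response lands
  const responsePromise = page.waitForResponse('**/api/data');
  await page.click('[data-testid="refresh"]');
  await responsePromise;
  await expect(page.getByTestId('data-table')).toBeVisible();
});
```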
### Example 3: Artifact Output Configuration
**Context**: When debugging failures in CI, configure artifacts (screenshots, videos, traces, HTML reports) to be captured on failure and stored in consistent locations for upload.
**Implementation**:
```typescript
// playwright.config.ts - Artifact configuration
import { defineConfig } from '@playwright/test';
import path from 'path';
export default defineConfig({
// Output directory for test artifacts
outputDir: path.resolve(__dirname, './test-results'),
use: {
// Screenshot on failure only (saves space)
screenshot: 'only-on-failure',
// Video recording on failure + retry
video: 'retain-on-failure',
// Trace recording on first retry (best debugging data)
trace: 'on-first-retry',
},
reporter: [
// HTML report (visual, interactive)
[
'html',
{
outputFolder: 'playwright-report',
open: 'never', // Don't auto-open in CI
},
],
// JUnit XML (CI integration)
[
'junit',
{
outputFile: 'test-results/results.xml',
},
],
// List reporter (console output)
['list'],
],
});
```
```typescript
// playwright/support/fixtures/artifact-fixture.ts - Custom artifact capture
import { test as base } from '@playwright/test';
import fs from 'fs';
import path from 'path';
export const test = base.extend({
// Auto-capture console logs on failure
page: async ({ page }, use, testInfo) => {
const logs: string[] = [];
page.on('console', (msg) => {
logs.push(`[${msg.type()}] ${msg.text()}`);
});
await use(page);
// Save logs on failure
if (testInfo.status !== testInfo.expectedStatus) {
const logsPath = path.join(testInfo.outputDir, 'console-logs.txt');
fs.writeFileSync(logsPath, logs.join('\n'));
await testInfo.attach('console-logs', { path: logsPath, contentType: 'text/plain' });
}
},
});
```
```yaml
# .github/workflows/e2e.yml - CI artifact upload
name: E2E Tests
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version-file: '.nvmrc'
- name: Install dependencies
run: npm ci
- name: Install Playwright browsers
run: npx playwright install --with-deps
- name: Run tests
run: npm run test
env:
TEST_ENV: staging
# Upload test artifacts on failure
- name: Upload test results
if: failure()
uses: actions/upload-artifact@v4
with:
name: test-results
path: test-results/
retention-days: 30
- name: Upload Playwright report
if: failure()
uses: actions/upload-artifact@v4
with:
name: playwright-report
path: playwright-report/
retention-days: 30
```
```typescript
// Example: Custom screenshot on specific condition
test('capture screenshot on specific error', async ({ page }) => {
await page.goto('/checkout');
try {
await page.click('[data-testid="submit-payment"]');
await expect(page.getByText('Order Confirmed')).toBeVisible();
} catch (error) {
// Capture custom screenshot with timestamp
await page.screenshot({
path: `test-results/payment-error-${Date.now()}.png`,
fullPage: true,
});
throw error;
}
});
```
**Key Points**:
- `screenshot: 'only-on-failure'` saves space (not every test)
- `video: 'retain-on-failure'` captures full flow on failures
- `trace: 'on-first-retry'` provides deep debugging data (network, DOM, console)
- HTML report at `playwright-report/` (visual debugging)
- JUnit XML at `test-results/results.xml` (CI integration)
- CI uploads artifacts on failure with 30-day retention
- Custom fixture can capture console logs, network logs, etc.
### Example 4: Parallelization Configuration
**Context**: When tests run slowly in CI, configure parallelization with worker count, sharding, and fully parallel execution to maximize speed while maintaining stability.
**Implementation**:
```typescript
// playwright.config.ts - Parallelization settings
import { defineConfig } from '@playwright/test';
import os from 'os';
export default defineConfig({
// Run tests in parallel within single file
fullyParallel: true,
// Worker configuration
workers: process.env.CI
? 1 // Serial in CI for stability (or 2 for faster CI)
: os.cpus().length - 1, // Parallel locally (leave 1 CPU for OS)
// Prevent accidentally committed .only() from blocking CI
forbidOnly: !!process.env.CI,
// Retry failed tests in CI
retries: process.env.CI ? 2 : 0,
// Shard configuration (split tests across multiple machines)
shard:
process.env.SHARD_INDEX && process.env.SHARD_TOTAL
? {
current: parseInt(process.env.SHARD_INDEX, 10),
total: parseInt(process.env.SHARD_TOTAL, 10),
}
: undefined,
});
```
```yaml
# .github/workflows/e2e-parallel.yml - Sharded CI execution
name: E2E Tests (Parallel)
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
shard: [1, 2, 3, 4] # Split tests across 4 machines
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version-file: '.nvmrc'
- name: Install dependencies
run: npm ci
- name: Install Playwright browsers
run: npx playwright install --with-deps
- name: Run tests (shard ${{ matrix.shard }})
run: npm run test
env:
SHARD_INDEX: ${{ matrix.shard }}
SHARD_TOTAL: 4
TEST_ENV: staging
- name: Upload test results
if: failure()
uses: actions/upload-artifact@v4
with:
name: test-results-shard-${{ matrix.shard }}
path: test-results/
```
```typescript
// playwright/config/serial.config.ts - Serial execution for flaky tests
import { defineConfig } from '@playwright/test';
import { baseConfig } from './base.config';
export default defineConfig({
...baseConfig,
// Disable parallel execution
fullyParallel: false,
workers: 1,
// Used for: authentication flows, database-dependent tests, feature flag tests
});
```
```typescript
// Usage: Force serial execution for specific tests
import { test } from '@playwright/test';
// Serial execution for auth tests (shared session state)
test.describe.configure({ mode: 'serial' });
test.describe('Authentication Flow', () => {
test('user can log in', async ({ page }) => {
// First test in serial block
});
test('user can access dashboard', async ({ page }) => {
// Depends on previous test (serial)
});
});
```
```typescript
// Usage: Parallel execution for independent tests (default)
import { test } from '@playwright/test';
test.describe('Product Catalog', () => {
test('can view product 1', async ({ page }) => {
// Runs in parallel with other tests
});
test('can view product 2', async ({ page }) => {
// Runs in parallel with other tests
});
});
```
**Key Points**:
- `fullyParallel: true` enables parallel execution within single test file
- Workers: 1 in CI (stability), N-1 CPUs locally (speed)
- Sharding splits tests across multiple CI machines (4x faster with 4 shards)
- `test.describe.configure({ mode: 'serial' })` for dependent tests
- `forbidOnly: true` in CI prevents `.only()` from blocking pipeline
- Matrix strategy in CI runs shards concurrently
### Example 5: Project Configuration
**Context**: When testing across multiple browsers, devices, or configurations, use Playwright projects to run the same tests against different environments (chromium, firefox, webkit, mobile).
**Implementation**:
```typescript
// playwright.config.ts - Multiple browser projects
import { defineConfig, devices } from '@playwright/test';
export default defineConfig({
projects: [
// Desktop browsers
{
name: 'chromium',
use: { ...devices['Desktop Chrome'] },
},
{
name: 'firefox',
use: { ...devices['Desktop Firefox'] },
},
{
name: 'webkit',
use: { ...devices['Desktop Safari'] },
},
// Mobile browsers
{
name: 'mobile-chrome',
use: { ...devices['Pixel 5'] },
},
{
name: 'mobile-safari',
use: { ...devices['iPhone 13'] },
},
// Tablet
{
name: 'tablet',
use: { ...devices['iPad Pro'] },
},
],
});
```
```typescript
// playwright.config.ts - Authenticated vs. unauthenticated projects
import { defineConfig } from '@playwright/test';
import path from 'path';
export default defineConfig({
projects: [
// Setup project (runs first, creates auth state)
{
name: 'setup',
testMatch: /.*\.setup\.ts/,
},
// Authenticated tests (reuse auth state)
{
name: 'authenticated',
dependencies: ['setup'],
use: {
storageState: path.resolve(__dirname, './playwright/.auth/user.json'),
},
testMatch: /.*authenticated\.spec\.ts/,
},
// Unauthenticated tests (public pages)
{
name: 'unauthenticated',
testMatch: /.*unauthenticated\.spec\.ts/,
},
],
});
```
```typescript
// playwright/support/auth.setup.ts - Setup project for authentication state
import { test as setup } from '@playwright/test';
import path from 'path';
const authFile = path.resolve(__dirname, '../.auth/user.json');
setup('authenticate', async ({ page }) => {
// Perform authentication
await page.goto('http://localhost:3000/login');
await page.fill('[data-testid="email"]', 'test@example.com');
await page.fill('[data-testid="password"]', 'password123');
await page.click('[data-testid="login-button"]');
// Wait for authentication to complete
await page.waitForURL('**/dashboard');
// Save authentication state for dependent projects to reuse
await page.context().storageState({ path: authFile });
});
```
```bash
# Run specific project
npx playwright test --project=chromium
npx playwright test --project=mobile-chrome
npx playwright test --project=authenticated
# Run multiple projects
npx playwright test --project=chromium --project=firefox
# Run all projects (default)
npx playwright test
```
```typescript
// Usage: Project-specific test
import { test, expect } from '@playwright/test';
test('mobile navigation works', async ({ page, isMobile }) => {
await page.goto('/');
if (isMobile) {
// Open mobile menu
await page.click('[data-testid="hamburger-menu"]');
}
await page.click('[data-testid="products-link"]');
await expect(page).toHaveURL(/.*products/);
});
```
```yaml
# .github/workflows/e2e-cross-browser.yml - CI cross-browser testing
name: E2E Tests (Cross-Browser)
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
project: [chromium, firefox, webkit, mobile-chrome]
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
- run: npm ci
- run: npx playwright install --with-deps
- name: Run tests (${{ matrix.project }})
run: npx playwright test --project=${{ matrix.project }}
```
**Key Points**:
- Projects enable testing across browsers, devices, and configurations
- `devices` from `@playwright/test` provide preset configurations (Pixel 5, iPhone 13, etc.)
- `dependencies` ensures setup project runs first (auth, data seeding)
- `storageState` shares authentication across tests (no per-test login cost)
- `testMatch` filters which tests run in which project
- CI matrix strategy runs projects in parallel (4x faster with 4 projects)
- `isMobile` context property for conditional logic in tests
## Integration Points
- **Used in workflows**: `*framework` (config setup), `*ci` (parallelization, artifact upload)
- **Related fragments**:
- `fixture-architecture.md` - Fixture-based timeout overrides
- `ci-burn-in.md` - CI pipeline artifact upload
- `test-quality.md` - Timeout standards (no hard waits)
- `data-factories.md` - Per-test isolation (no shared global state)
## Configuration Checklist
**Before deploying tests, verify**:
- [ ] Environment config map with fail-fast validation
- [ ] Standardized timeouts (action 15s, navigation 30s, expect 10s, test 60s)
- [ ] Artifact storage at `test-results/` and `playwright-report/`
- [ ] HTML + JUnit reporters configured
- [ ] `.env.example`, `.nvmrc`, browser versions committed
- [ ] Parallelization configured (workers, sharding)
- [ ] Projects defined for cross-browser/device testing (if needed)
- [ ] CI uploads artifacts on failure with 30-day retention
_Source: Playwright book repo, SEON configuration example, Murat testing philosophy (lines 216-271)._

@@ -0,0 +1,601 @@
# Probability and Impact Scale
## Principle
Risk scoring uses a **probability × impact** matrix (1-9 scale) to prioritize testing efforts. Higher scores (6-9) demand immediate action; lower scores (1-3) require documentation only. This systematic approach ensures testing resources focus on the highest-value risks.
## Rationale
**The Problem**: Without quantifiable risk assessment, teams over-test low-value scenarios while missing critical risks. Gut feeling leads to inconsistent prioritization and missed edge cases.
**The Solution**: Standardize risk evaluation with a 3×3 matrix (probability: 1-3, impact: 1-3). Multiply to derive risk score (1-9). Automate classification (DOCUMENT, MONITOR, MITIGATE, BLOCK) based on thresholds. This approach surfaces hidden risks early and justifies testing decisions to stakeholders.
**Why This Matters**:
- Consistent risk language across product, engineering, and QA
- Objective prioritization of test scenarios (not politics)
- Automatic gate decisions (score=9 → FAIL until resolved)
- Audit trail for compliance and retrospectives
## Pattern Examples
### Example 1: Probability-Impact Matrix Implementation (Automated Classification)
**Context**: Implement a reusable risk scoring system with automatic threshold classification
**Implementation**:
```typescript
// src/testing/risk-matrix.ts
/**
* Probability levels:
* 1 = Unlikely (standard implementation, low uncertainty)
* 2 = Possible (edge cases or partial unknowns)
* 3 = Likely (known issues, new integrations, high ambiguity)
*/
export type Probability = 1 | 2 | 3;
/**
* Impact levels:
* 1 = Minor (cosmetic issues or easy workarounds)
* 2 = Degraded (partial feature loss or manual workaround)
* 3 = Critical (blockers, data/security/regulatory exposure)
*/
export type Impact = 1 | 2 | 3;
/**
* Risk score (probability × impact): 1-9
*/
export type RiskScore = 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9;
/**
* Action categories based on risk score thresholds
*/
export type RiskAction = 'DOCUMENT' | 'MONITOR' | 'MITIGATE' | 'BLOCK';
export type RiskAssessment = {
probability: Probability;
impact: Impact;
score: RiskScore;
action: RiskAction;
reasoning: string;
};
/**
* Calculate risk score: probability × impact
*/
export function calculateRiskScore(probability: Probability, impact: Impact): RiskScore {
return (probability * impact) as RiskScore;
}
/**
* Classify risk action based on score thresholds:
* - 1-3: DOCUMENT (awareness only)
* - 4-5: MONITOR (watch closely, plan mitigations)
* - 6-8: MITIGATE (CONCERNS at gate until mitigated)
* - 9: BLOCK (automatic FAIL until resolved or waived)
*/
export function classifyRiskAction(score: RiskScore): RiskAction {
if (score >= 9) return 'BLOCK';
if (score >= 6) return 'MITIGATE';
if (score >= 4) return 'MONITOR';
return 'DOCUMENT';
}
/**
* Full risk assessment with automatic classification
*/
export function assessRisk(params: { probability: Probability; impact: Impact; reasoning: string }): RiskAssessment {
const { probability, impact, reasoning } = params;
const score = calculateRiskScore(probability, impact);
const action = classifyRiskAction(score);
return { probability, impact, score, action, reasoning };
}
/**
* Generate risk matrix visualization (3x3 grid)
* Returns markdown table with color-coded scores
*/
export function generateRiskMatrix(): string {
const matrix: string[][] = [];
const header = ['Impact \\ Probability', 'Unlikely (1)', 'Possible (2)', 'Likely (3)'];
matrix.push(header);
matrix.push(header.map(() => '---')); // markdown separator row so the table renders
const impactLabels = ['Critical (3)', 'Degraded (2)', 'Minor (1)'];
for (let impact = 3; impact >= 1; impact--) {
const row = [impactLabels[3 - impact]];
for (let probability = 1; probability <= 3; probability++) {
const score = calculateRiskScore(probability as Probability, impact as Impact);
const action = classifyRiskAction(score);
const emoji = action === 'BLOCK' ? '🔴' : action === 'MITIGATE' ? '🟠' : action === 'MONITOR' ? '🟡' : '🟢';
row.push(`${emoji} ${score}`);
}
matrix.push(row);
}
return matrix.map((row) => `| ${row.join(' | ')} |`).join('\n');
}
```
**Key Points**:
- Type-safe probability/impact (1-3 enforced at compile time)
- Automatic action classification (DOCUMENT, MONITOR, MITIGATE, BLOCK)
- Visual matrix generation for documentation
- Risk score formula: `probability * impact` (max = 9)
- Threshold-based decision rules (6-8 = MITIGATE, 9 = BLOCK)
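A quick usage sketch of the module above (values chosen so the arithmetic is visible):
```typescript
import { assessRisk, generateRiskMatrix } from './risk-matrix';

const risk = assessRisk({
  probability: 3, // Likely: custom error-handling path with known regressions
  impact: 2, // Degraded: users see an error but can retry
  reasoning: 'Complex custom logic, history of regressions',
});
console.log(risk.score); // 6 (3 × 2)
console.log(risk.action); // 'MITIGATE' → CONCERNS at the gate
console.log(generateRiskMatrix()); // 3x3 markdown table with emoji-coded scores
```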
---
### Example 2: Risk Assessment Workflow (Test Planning Integration)
**Context**: Apply risk matrix during test design to prioritize scenarios
**Implementation**:
```typescript
// tests/e2e/test-planning/risk-assessment.ts
import { assessRisk, generateRiskMatrix, type RiskAssessment } from '../../../src/testing/risk-matrix';
export type TestScenario = {
id: string;
title: string;
feature: string;
risk: RiskAssessment;
testLevel: 'E2E' | 'API' | 'Unit';
priority: 'P0' | 'P1' | 'P2' | 'P3';
owner: string;
};
/**
* Assess test scenarios and auto-assign priority based on risk score
*/
export function assessTestScenarios(scenarios: Omit<TestScenario, 'priority'>[]): TestScenario[] {
return scenarios.map((scenario) => {
// Auto-assign priority based on risk score
const priority = mapRiskToPriority(scenario.risk.score);
return { ...scenario, priority };
});
}
/**
* Map risk score to test priority (P0-P3)
* P0: Critical (score 9) - blocks release
* P1: High (score 6-8) - must fix before release
* P2: Medium (score 4-5) - fix if time permits
* P3: Low (score 1-3) - document and defer
*/
function mapRiskToPriority(score: number): 'P0' | 'P1' | 'P2' | 'P3' {
if (score === 9) return 'P0';
if (score >= 6) return 'P1';
if (score >= 4) return 'P2';
return 'P3';
}
/**
* Example: Payment flow risk assessment
*/
export const paymentScenarios: Array<Omit<TestScenario, 'priority'>> = [
{
id: 'PAY-001',
title: 'Valid credit card payment completes successfully',
feature: 'Checkout',
risk: assessRisk({
probability: 2, // Possible (standard Stripe integration)
impact: 3, // Critical (revenue loss if broken)
reasoning: 'Core revenue flow, but Stripe is well-tested',
}),
testLevel: 'E2E',
owner: 'qa-team',
},
{
id: 'PAY-002',
title: 'Expired credit card shows user-friendly error',
feature: 'Checkout',
risk: assessRisk({
probability: 3, // Likely (edge case handling often buggy)
impact: 2, // Degraded (users see error, but can retry)
reasoning: 'Error handling logic is custom and complex',
}),
testLevel: 'E2E',
owner: 'qa-team',
},
{
id: 'PAY-003',
title: 'Payment confirmation email formatting is correct',
feature: 'Email',
risk: assessRisk({
probability: 2, // Possible (template changes occasionally break)
impact: 1, // Minor (cosmetic issue, email still sent)
reasoning: 'Non-blocking, users get email regardless',
}),
testLevel: 'Unit',
owner: 'dev-team',
},
{
id: 'PAY-004',
title: 'Payment fails gracefully when Stripe is down',
feature: 'Checkout',
risk: assessRisk({
probability: 1, // Unlikely (Stripe has 99.99% uptime)
impact: 3, // Critical (complete checkout failure)
reasoning: 'Rare but catastrophic, requires retry mechanism',
}),
testLevel: 'API',
owner: 'qa-team',
},
];
/**
* Generate risk assessment report with priority distribution
*/
export function generateRiskReport(scenarios: TestScenario[]): string {
const priorityCounts = scenarios.reduce(
(acc, s) => {
acc[s.priority] = (acc[s.priority] || 0) + 1;
return acc;
},
{} as Record<string, number>,
);
const actionCounts = scenarios.reduce(
(acc, s) => {
acc[s.risk.action] = (acc[s.risk.action] || 0) + 1;
return acc;
},
{} as Record<string, number>,
);
return `
# Risk Assessment Report
## Risk Matrix
${generateRiskMatrix()}
## Priority Distribution
- **P0 (Blocker)**: ${priorityCounts.P0 || 0} scenarios
- **P1 (High)**: ${priorityCounts.P1 || 0} scenarios
- **P2 (Medium)**: ${priorityCounts.P2 || 0} scenarios
- **P3 (Low)**: ${priorityCounts.P3 || 0} scenarios
## Action Required
- **BLOCK**: ${actionCounts.BLOCK || 0} scenarios (auto-fail gate)
- **MITIGATE**: ${actionCounts.MITIGATE || 0} scenarios (concerns at gate)
- **MONITOR**: ${actionCounts.MONITOR || 0} scenarios (watch closely)
- **DOCUMENT**: ${actionCounts.DOCUMENT || 0} scenarios (awareness only)
## Scenarios by Risk Score (Highest First)
${scenarios
.sort((a, b) => b.risk.score - a.risk.score)
.map((s) => `- **[${s.priority}]** ${s.id}: ${s.title} (Score: ${s.risk.score} - ${s.risk.action})`)
.join('\n')}
`.trim();
}
```
**Key Points**:
- Risk score → Priority mapping (P0-P3 automated)
- Report generation with priority/action distribution
- Scenarios sorted by risk score (highest first)
- Visual matrix included in reports
- Reusable across projects (extract to shared library)
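Wiring the pieces together (priority assignments follow from the scores defined above):
```typescript
import { assessTestScenarios, generateRiskReport, paymentScenarios } from './risk-assessment';

const scenarios = assessTestScenarios(paymentScenarios);
// PAY-001: 2×3=6 → P1, PAY-002: 3×2=6 → P1, PAY-003: 2×1=2 → P3, PAY-004: 1×3=3 → P3
console.log(generateRiskReport(scenarios));
```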
---
### Example 3: Dynamic Risk Re-Assessment (Continuous Evaluation)
**Context**: Recalculate risk scores as project evolves (requirements change, mitigations implemented)
**Implementation**:
```typescript
// src/testing/risk-tracking.ts
import { type RiskAssessment, assessRisk, type Probability, type Impact } from './risk-matrix';
export type RiskHistory = {
timestamp: Date;
assessment: RiskAssessment;
changedBy: string;
reason: string;
};
export type TrackedRisk = {
id: string;
title: string;
feature: string;
currentRisk: RiskAssessment;
history: RiskHistory[];
mitigations: string[];
status: 'OPEN' | 'MITIGATED' | 'WAIVED' | 'RESOLVED';
};
export class RiskTracker {
private risks: Map<string, TrackedRisk> = new Map();
/**
* Add new risk to tracker
*/
addRisk(params: {
id: string;
title: string;
feature: string;
probability: Probability;
impact: Impact;
reasoning: string;
changedBy: string;
}): TrackedRisk {
const { id, title, feature, probability, impact, reasoning, changedBy } = params;
const assessment = assessRisk({ probability, impact, reasoning });
const risk: TrackedRisk = {
id,
title,
feature,
currentRisk: assessment,
history: [
{
timestamp: new Date(),
assessment,
changedBy,
reason: 'Initial assessment',
},
],
mitigations: [],
status: 'OPEN',
};
this.risks.set(id, risk);
return risk;
}
/**
* Reassess risk (probability or impact changed)
*/
reassessRisk(params: {
id: string;
probability?: Probability;
impact?: Impact;
reasoning: string;
changedBy: string;
}): TrackedRisk | null {
const { id, probability, impact, reasoning, changedBy } = params;
const risk = this.risks.get(id);
if (!risk) return null;
// Use existing values if not provided
const newProbability = probability ?? risk.currentRisk.probability;
const newImpact = impact ?? risk.currentRisk.impact;
const newAssessment = assessRisk({
probability: newProbability,
impact: newImpact,
reasoning,
});
risk.currentRisk = newAssessment;
risk.history.push({
timestamp: new Date(),
assessment: newAssessment,
changedBy,
reason: reasoning,
});
this.risks.set(id, risk);
return risk;
}
/**
* Mark risk as mitigated (probability reduced)
*/
mitigateRisk(params: { id: string; newProbability: Probability; mitigation: string; changedBy: string }): TrackedRisk | null {
const { id, newProbability, mitigation, changedBy } = params;
const risk = this.reassessRisk({
id,
probability: newProbability,
reasoning: `Mitigation implemented: ${mitigation}`,
changedBy,
});
if (risk) {
risk.mitigations.push(mitigation);
if (risk.currentRisk.action === 'DOCUMENT' || risk.currentRisk.action === 'MONITOR') {
risk.status = 'MITIGATED';
}
}
return risk;
}
/**
* Get risks requiring action (MITIGATE or BLOCK)
*/
getRisksRequiringAction(): TrackedRisk[] {
return Array.from(this.risks.values()).filter(
(r) => r.status === 'OPEN' && (r.currentRisk.action === 'MITIGATE' || r.currentRisk.action === 'BLOCK'),
);
}
/**
* Generate risk trend report (show changes over time)
*/
generateTrendReport(riskId: string): string | null {
const risk = this.risks.get(riskId);
if (!risk) return null;
return `
# Risk Trend Report: ${risk.id}
**Title**: ${risk.title}
**Feature**: ${risk.feature}
**Status**: ${risk.status}
## Current Assessment
- **Probability**: ${risk.currentRisk.probability}
- **Impact**: ${risk.currentRisk.impact}
- **Score**: ${risk.currentRisk.score}
- **Action**: ${risk.currentRisk.action}
- **Reasoning**: ${risk.currentRisk.reasoning}
## Mitigations Applied
${risk.mitigations.length > 0 ? risk.mitigations.map((m) => `- ${m}`).join('\n') : '- None'}
## History (${risk.history.length} changes)
${[...risk.history]
.reverse()
.map((h) => `- **${h.timestamp.toISOString()}** by ${h.changedBy}: Score ${h.assessment.score} (${h.assessment.action}) - ${h.reason}`)
.join('\n')}
`.trim();
}
}
```
**Key Points**:
- Historical tracking (audit trail for risk changes)
- Mitigation impact tracking (probability reduction)
- Status lifecycle (OPEN → MITIGATED → RESOLVED)
- Trend reports (show risk evolution over time)
- Re-assessment triggers (requirements change, new info)
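A lifecycle sketch using the tracker above (IDs, names, and scores are illustrative):
```typescript
import { RiskTracker } from './risk-tracking';

const tracker = new RiskTracker();

tracker.addRisk({
  id: 'RISK-042',
  title: 'Checkout fails when payment webhook is delayed',
  feature: 'Checkout',
  probability: 3,
  impact: 3, // score 9 → BLOCK
  reasoning: 'No retry on webhook timeout',
  changedBy: 'qa-lead',
});

// Retry queue ships → probability drops to 1 (score 3 → DOCUMENT, status MITIGATED)
tracker.mitigateRisk({
  id: 'RISK-042',
  newProbability: 1,
  mitigation: 'Webhook retry queue with dead-letter alerts',
  changedBy: 'dev-lead',
});

console.log(tracker.getRisksRequiringAction().length); // 0
console.log(tracker.generateTrendReport('RISK-042')); // full audit trail
```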
---
### Example 4: Risk Matrix in Gate Decision (Integration with Trace Workflow)
**Context**: Use probability-impact scores to drive gate decisions (PASS/CONCERNS/FAIL/WAIVED)
**Implementation**:
```typescript
// src/testing/gate-decision.ts
import { type RiskScore, classifyRiskAction, type RiskAction } from './risk-matrix';
import { type TrackedRisk } from './risk-tracking';
export type GateDecision = 'PASS' | 'CONCERNS' | 'FAIL' | 'WAIVED';
export type GateResult = {
decision: GateDecision;
blockers: TrackedRisk[]; // Score=9, action=BLOCK
concerns: TrackedRisk[]; // Score 6-8, action=MITIGATE
monitored: TrackedRisk[]; // Score 4-5, action=MONITOR
documented: TrackedRisk[]; // Score 1-3, action=DOCUMENT
summary: string;
};
/**
* Evaluate gate based on risk assessments
*/
export function evaluateGateFromRisks(risks: TrackedRisk[]): GateResult {
const blockers = risks.filter((r) => r.currentRisk.action === 'BLOCK' && r.status === 'OPEN');
const concerns = risks.filter((r) => r.currentRisk.action === 'MITIGATE' && r.status === 'OPEN');
const monitored = risks.filter((r) => r.currentRisk.action === 'MONITOR');
const documented = risks.filter((r) => r.currentRisk.action === 'DOCUMENT');
let decision: GateDecision;
if (blockers.length > 0) {
decision = 'FAIL';
} else if (concerns.length > 0) {
decision = 'CONCERNS';
} else {
decision = 'PASS';
}
const summary = generateGateSummary({ decision, blockers, concerns, monitored, documented });
return { decision, blockers, concerns, monitored, documented, summary };
}
/**
* Generate gate decision summary
*/
function generateGateSummary(result: Omit<GateResult, 'summary'>): string {
const { decision, blockers, concerns, monitored, documented } = result;
const lines: string[] = [`## Gate Decision: ${decision}`];
if (decision === 'FAIL') {
lines.push(`\n**Blockers** (${blockers.length}): Automatic FAIL until resolved or waived`);
blockers.forEach((r) => {
lines.push(`- **${r.id}**: ${r.title} (Score: ${r.currentRisk.score})`);
lines.push(` - Probability: ${r.currentRisk.probability}, Impact: ${r.currentRisk.impact}`);
lines.push(` - Reasoning: ${r.currentRisk.reasoning}`);
});
}
if (concerns.length > 0) {
lines.push(`\n**Concerns** (${concerns.length}): Address before release`);
concerns.forEach((r) => {
lines.push(`- **${r.id}**: ${r.title} (Score: ${r.currentRisk.score})`);
lines.push(` - Mitigations: ${r.mitigations.join(', ') || 'None'}`);
});
}
if (monitored.length > 0) {
lines.push(`\n**Monitored** (${monitored.length}): Watch closely`);
monitored.forEach((r) => lines.push(`- **${r.id}**: ${r.title} (Score: ${r.currentRisk.score})`));
}
if (documented.length > 0) {
lines.push(`\n**Documented** (${documented.length}): Awareness only`);
}
lines.push(`\n---\n`);
lines.push(`**Next Steps**:`);
if (decision === 'FAIL') {
lines.push(`- Resolve blockers or request formal waiver`);
} else if (decision === 'CONCERNS') {
lines.push(`- Implement mitigations for high-risk scenarios (score 6-8)`);
lines.push(`- Re-run gate after mitigations`);
} else {
lines.push(`- Proceed with release`);
}
return lines.join('\n');
}
```
**Key Points**:
- Gate decision driven by risk scores (not gut feeling)
- Automatic FAIL for score=9 (blockers)
- CONCERNS for score 6-8 (requires mitigation)
- PASS only when no blockers/concerns
- Actionable summary with next steps
- Integration with trace workflow (Phase 2)
---
## Probability-Impact Threshold Summary
| Score | Action | Gate Impact | Typical Use Case |
| ----- | -------- | -------------------- | -------------------------------------- |
| 1-3 | DOCUMENT | None | Cosmetic issues, low-priority bugs |
| 4-5 | MONITOR | None (watch closely) | Edge cases, partial unknowns |
| 6-8 | MITIGATE | CONCERNS at gate | High-impact scenarios needing coverage |
| 9 | BLOCK | Automatic FAIL | Critical blockers, must resolve |
## Risk Assessment Checklist
Before deploying risk matrix:
- [ ] **Probability scale defined**: 1 (unlikely), 2 (possible), 3 (likely) with clear examples
- [ ] **Impact scale defined**: 1 (minor), 2 (degraded), 3 (critical) with concrete criteria
- [ ] **Threshold rules documented**: Score → Action mapping (1-3 = DOCUMENT, 4-5 = MONITOR, 6-8 = MITIGATE, 9 = BLOCK)
- [ ] **Gate integration**: Risk scores drive gate decisions (PASS/CONCERNS/FAIL/WAIVED)
- [ ] **Re-assessment process**: Risks re-evaluated as project evolves (requirements change, mitigations applied)
- [ ] **Audit trail**: Historical tracking for risk changes (who, when, why)
- [ ] **Mitigation tracking**: Link mitigations to probability reduction (quantify impact)
- [ ] **Reporting**: Risk matrix visualization, trend reports, gate summaries
## Integration Points
- **Used in workflows**: `*test-design` (initial risk assessment), `*trace` (gate decision Phase 2), `*nfr-assess` (security/performance risks)
- **Related fragments**: `risk-governance.md` (risk scoring matrix, gate decision engine), `test-priorities-matrix.md` (P0-P3 mapping), `nfr-criteria.md` (impact assessment for NFRs)
- **Tools**: TypeScript for type safety, markdown for reports, version control for audit trail
_Source: Murat risk model summary, gate decision patterns from production systems, probability-impact matrix from risk governance practices_

@@ -0,0 +1,615 @@
# Risk Governance and Gatekeeping
## Principle
Risk governance transforms subjective "should we ship?" debates into objective, data-driven decisions. By scoring risk (probability × impact), classifying by category (TECH, SEC, PERF, etc.), and tracking mitigation ownership, teams create transparent quality gates that balance speed with safety.
## Rationale
**The Problem**: Without formal risk governance, releases become political—loud voices win, quiet risks hide, and teams discover critical issues in production. "We thought it was fine" isn't a release strategy.
**The Solution**: Risk scoring (1-3 scale for probability and impact, total 1-9) creates shared language. Scores ≥6 demand documented mitigation. Scores = 9 mandate gate failure. Every acceptance criterion maps to a test, and gaps require explicit waivers with owners and expiry dates.
**Why This Matters**:
- Removes ambiguity from release decisions (objective scores vs subjective opinions)
- Creates audit trail for compliance (FDA, SOC2, ISO require documented risk management)
- Identifies true blockers early (prevents last-minute production fires)
- Distributes responsibility (owners, mitigation plans, deadlines for every risk >4)
## Pattern Examples
### Example 1: Risk Scoring Matrix with Automated Classification (TypeScript)
**Context**: Calculate risk scores automatically from test results and categorize by risk type
**Implementation**:
```typescript
// risk-scoring.ts - Risk classification and scoring system
export const RISK_CATEGORIES = {
TECH: 'TECH', // Technical debt, architecture fragility
SEC: 'SEC', // Security vulnerabilities
PERF: 'PERF', // Performance degradation
DATA: 'DATA', // Data integrity, corruption
BUS: 'BUS', // Business logic errors
OPS: 'OPS', // Operational issues (deployment, monitoring)
} as const;
export type RiskCategory = keyof typeof RISK_CATEGORIES;
export type RiskScore = {
id: string;
category: RiskCategory;
title: string;
description: string;
probability: 1 | 2 | 3; // 1=Low, 2=Medium, 3=High
impact: 1 | 2 | 3; // 1=Low, 2=Medium, 3=High
score: number; // probability × impact (1-9)
owner: string;
mitigationPlan?: string;
deadline?: Date;
status: 'OPEN' | 'MITIGATED' | 'WAIVED' | 'ACCEPTED';
waiverReason?: string;
waiverApprover?: string;
waiverExpiry?: Date;
};
// Risk scoring rules
export function calculateRiskScore(probability: 1 | 2 | 3, impact: 1 | 2 | 3): number {
return probability * impact;
}
export function requiresMitigation(score: number): boolean {
return score >= 6; // Scores 6-9 demand action
}
export function isCriticalBlocker(score: number): boolean {
return score === 9; // Probability=3 AND Impact=3 → FAIL gate
}
export function classifyRiskLevel(score: number): 'LOW' | 'MEDIUM' | 'HIGH' | 'CRITICAL' {
if (score === 9) return 'CRITICAL';
if (score >= 6) return 'HIGH';
if (score >= 4) return 'MEDIUM';
return 'LOW';
}
// Example: Risk assessment from test failures
export function assessTestFailureRisk(failure: {
test: string;
category: RiskCategory;
affectedUsers: number;
revenueImpact: number;
securityVulnerability: boolean;
}): RiskScore {
// Probability based on test failure frequency (simplified)
const probability: 1 | 2 | 3 = 3; // Test failed = High probability
// Impact based on business context
let impact: 1 | 2 | 3 = 1;
if (failure.securityVulnerability) impact = 3;
else if (failure.revenueImpact > 10000) impact = 3;
else if (failure.affectedUsers > 1000) impact = 2;
else impact = 1;
const score = calculateRiskScore(probability, impact);
return {
id: `risk-${Date.now()}`,
category: failure.category,
title: `Test failure: ${failure.test}`,
description: `Affects ${failure.affectedUsers} users, $${failure.revenueImpact} revenue`,
probability,
impact,
score,
owner: 'unassigned',
    status: 'OPEN', // All new risks start OPEN; score 9 additionally blocks the gate
};
}
```
**Key Points**:
- **Objective scoring**: Probability (1-3) × Impact (1-3) = Score (1-9)
- **Clear thresholds**: Score ≥6 requires mitigation, score = 9 blocks release
- **Business context**: Revenue, users, security drive impact calculation
- **Status tracking**: OPEN → MITIGATED → WAIVED → ACCEPTED lifecycle
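A short usage sketch of the functions above (input values are illustrative):

```typescript
const risk = assessTestFailureRisk({
  test: 'Checkout total rounds incorrectly',
  category: 'BUS',
  affectedUsers: 1200, // >1000 users → impact 2
  revenueImpact: 500,
  securityVulnerability: false,
});
console.log(risk.score); // 6 (probability 3 × impact 2)
console.log(classifyRiskLevel(risk.score)); // 'HIGH'
console.log(requiresMitigation(risk.score)); // true
console.log(isCriticalBlocker(risk.score)); // false
```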
---
### Example 2: Gate Decision Engine with Traceability Validation
**Context**: Automated gate decision based on risk scores and test coverage
**Implementation**:
```typescript
// gate-decision-engine.ts
export type GateDecision = 'PASS' | 'CONCERNS' | 'FAIL' | 'WAIVED';
export type CoverageGap = {
acceptanceCriteria: string;
testMissing: string;
reason: string;
};
export type GateResult = {
decision: GateDecision;
timestamp: Date;
criticalRisks: RiskScore[];
highRisks: RiskScore[];
coverageGaps: CoverageGap[];
summary: string;
recommendations: string[];
};
export function evaluateGate(params: { risks: RiskScore[]; coverageGaps: CoverageGap[]; waiverApprover?: string }): GateResult {
const { risks, coverageGaps, waiverApprover } = params;
// Categorize risks
const criticalRisks = risks.filter((r) => r.score === 9 && r.status === 'OPEN');
const highRisks = risks.filter((r) => r.score >= 6 && r.score < 9 && r.status === 'OPEN');
const unresolvedGaps = coverageGaps.filter((g) => !g.reason);
// Decision logic
let decision: GateDecision;
// FAIL: Critical blockers (score=9) or missing coverage
if (criticalRisks.length > 0 || unresolvedGaps.length > 0) {
decision = 'FAIL';
}
// WAIVED: All risks waived by authorized approver
else if (risks.every((r) => r.status === 'WAIVED') && waiverApprover) {
decision = 'WAIVED';
}
// CONCERNS: High risks (score 6-8) with mitigation plans
else if (highRisks.length > 0 && highRisks.every((r) => r.mitigationPlan && r.owner !== 'unassigned')) {
decision = 'CONCERNS';
}
// PASS: No critical issues, all risks mitigated or low
else {
decision = 'PASS';
}
// Generate recommendations
const recommendations: string[] = [];
if (criticalRisks.length > 0) {
recommendations.push(`🚨 ${criticalRisks.length} CRITICAL risk(s) must be mitigated before release`);
}
if (unresolvedGaps.length > 0) {
recommendations.push(`📋 ${unresolvedGaps.length} acceptance criteria lack test coverage`);
}
if (highRisks.some((r) => !r.mitigationPlan)) {
recommendations.push(`⚠️ High risks without mitigation plans: assign owners and deadlines`);
}
if (decision === 'PASS') {
recommendations.push(`✅ All risks mitigated or acceptable. Ready for release.`);
}
return {
decision,
timestamp: new Date(),
criticalRisks,
highRisks,
coverageGaps: unresolvedGaps,
summary: generateSummary(decision, risks, unresolvedGaps),
recommendations,
};
}
function generateSummary(decision: GateDecision, risks: RiskScore[], gaps: CoverageGap[]): string {
const total = risks.length;
const critical = risks.filter((r) => r.score === 9).length;
const high = risks.filter((r) => r.score >= 6 && r.score < 9).length;
return `Gate Decision: ${decision}. Total Risks: ${total} (${critical} critical, ${high} high). Coverage Gaps: ${gaps.length}.`;
}
```
**Usage Example**:
```typescript
// Example: Running gate check before deployment
import { assessTestFailureRisk, evaluateGate } from './gate-decision-engine';
// Collect risks from test results
const risks: RiskScore[] = [
assessTestFailureRisk({
test: 'Payment processing with expired card',
category: 'BUS',
affectedUsers: 5000,
revenueImpact: 50000,
securityVulnerability: false,
}),
assessTestFailureRisk({
test: 'SQL injection in search endpoint',
category: 'SEC',
affectedUsers: 10000,
revenueImpact: 0,
securityVulnerability: true,
}),
];
// Identify coverage gaps
const coverageGaps: CoverageGap[] = [
{
acceptanceCriteria: 'User can reset password via email',
testMissing: 'e2e/auth/password-reset.spec.ts',
reason: '', // Empty = unresolved
},
];
// Evaluate gate
const gateResult = evaluateGate({ risks, coverageGaps });
console.log(gateResult.decision); // 'FAIL'
console.log(gateResult.summary);
// "Gate Decision: FAIL. Total Risks: 2 (1 critical, 1 high). Coverage Gaps: 1."
console.log(gateResult.recommendations);
// [
// "🚨 1 CRITICAL risk(s) must be mitigated before release",
// "📋 1 acceptance criteria lack test coverage"
// ]
```
**Key Points**:
- **Automated decision**: No human interpretation required
- **Clear criteria**: FAIL = critical risks or gaps, CONCERNS = high risks with plans, PASS = low risks
- **Actionable output**: Recommendations drive next steps
- **Audit trail**: Timestamp, decision, and context for compliance
---
### Example 3: Risk Mitigation Workflow with Owner Tracking
**Context**: Track risk mitigation from identification to resolution
**Implementation**:
```typescript
// risk-mitigation.ts
export type MitigationAction = {
riskId: string;
action: string;
owner: string;
deadline: Date;
status: 'PENDING' | 'IN_PROGRESS' | 'COMPLETED' | 'BLOCKED';
completedAt?: Date;
blockedReason?: string;
};
export class RiskMitigationTracker {
private risks: Map<string, RiskScore> = new Map();
private actions: Map<string, MitigationAction[]> = new Map();
private history: Array<{ riskId: string; event: string; timestamp: Date }> = [];
// Register a new risk
addRisk(risk: RiskScore): void {
this.risks.set(risk.id, risk);
this.logHistory(risk.id, `Risk registered: ${risk.title} (Score: ${risk.score})`);
// Auto-assign mitigation requirements for score ≥6
if (requiresMitigation(risk.score) && !risk.mitigationPlan) {
this.logHistory(risk.id, `⚠️ Mitigation required (score ${risk.score}). Assign owner and plan.`);
}
}
// Add mitigation action
addMitigationAction(action: MitigationAction): void {
const risk = this.risks.get(action.riskId);
if (!risk) throw new Error(`Risk ${action.riskId} not found`);
const existingActions = this.actions.get(action.riskId) || [];
existingActions.push(action);
this.actions.set(action.riskId, existingActions);
this.logHistory(action.riskId, `Mitigation action added: ${action.action} (Owner: ${action.owner})`);
}
// Complete mitigation action
completeMitigation(riskId: string, actionIndex: number): void {
const actions = this.actions.get(riskId);
if (!actions || !actions[actionIndex]) throw new Error('Action not found');
actions[actionIndex].status = 'COMPLETED';
actions[actionIndex].completedAt = new Date();
this.logHistory(riskId, `Mitigation completed: ${actions[actionIndex].action}`);
// If all actions completed, mark risk as MITIGATED
if (actions.every((a) => a.status === 'COMPLETED')) {
const risk = this.risks.get(riskId)!;
risk.status = 'MITIGATED';
this.logHistory(riskId, `✅ Risk mitigated. All actions complete.`);
}
}
// Request waiver for a risk
requestWaiver(riskId: string, reason: string, approver: string, expiryDays: number): void {
const risk = this.risks.get(riskId);
if (!risk) throw new Error(`Risk ${riskId} not found`);
risk.status = 'WAIVED';
risk.waiverReason = reason;
risk.waiverApprover = approver;
risk.waiverExpiry = new Date(Date.now() + expiryDays * 24 * 60 * 60 * 1000);
this.logHistory(riskId, `⚠️ Waiver granted by ${approver}. Expires: ${risk.waiverExpiry}`);
}
// Generate risk report
generateReport(): string {
const allRisks = Array.from(this.risks.values());
const critical = allRisks.filter((r) => r.score === 9 && r.status === 'OPEN');
const high = allRisks.filter((r) => r.score >= 6 && r.score < 9 && r.status === 'OPEN');
const mitigated = allRisks.filter((r) => r.status === 'MITIGATED');
const waived = allRisks.filter((r) => r.status === 'WAIVED');
let report = `# Risk Mitigation Report\n\n`;
report += `**Generated**: ${new Date().toISOString()}\n\n`;
report += `## Summary\n`;
report += `- Total Risks: ${allRisks.length}\n`;
report += `- Critical (Score=9, OPEN): ${critical.length}\n`;
report += `- High (Score 6-8, OPEN): ${high.length}\n`;
report += `- Mitigated: ${mitigated.length}\n`;
report += `- Waived: ${waived.length}\n\n`;
if (critical.length > 0) {
report += `## 🚨 Critical Risks (BLOCKERS)\n\n`;
critical.forEach((r) => {
report += `- **${r.title}** (${r.category})\n`;
report += ` - Score: ${r.score} (Probability: ${r.probability}, Impact: ${r.impact})\n`;
report += ` - Owner: ${r.owner}\n`;
report += ` - Mitigation: ${r.mitigationPlan || 'NOT ASSIGNED'}\n\n`;
});
}
if (high.length > 0) {
report += `## ⚠️ High Risks\n\n`;
high.forEach((r) => {
report += `- **${r.title}** (${r.category})\n`;
report += ` - Score: ${r.score}\n`;
report += ` - Owner: ${r.owner}\n`;
report += ` - Deadline: ${r.deadline?.toISOString().split('T')[0] || 'NOT SET'}\n\n`;
});
}
return report;
}
private logHistory(riskId: string, event: string): void {
this.history.push({ riskId, event, timestamp: new Date() });
}
getHistory(riskId: string): Array<{ event: string; timestamp: Date }> {
return this.history.filter((h) => h.riskId === riskId).map((h) => ({ event: h.event, timestamp: h.timestamp }));
}
}
```
**Usage Example**:
```typescript
const tracker = new RiskMitigationTracker();
// Register critical security risk
tracker.addRisk({
id: 'risk-001',
category: 'SEC',
title: 'SQL injection vulnerability in user search',
description: 'Unsanitized input allows arbitrary SQL execution',
probability: 3,
impact: 3,
score: 9,
owner: 'security-team',
status: 'OPEN',
});
// Add mitigation actions
tracker.addMitigationAction({
riskId: 'risk-001',
action: 'Add parameterized queries to user-search endpoint',
owner: 'alice@example.com',
deadline: new Date('2025-10-20'),
status: 'IN_PROGRESS',
});
tracker.addMitigationAction({
riskId: 'risk-001',
action: 'Add WAF rule to block SQL injection patterns',
owner: 'bob@example.com',
deadline: new Date('2025-10-22'),
status: 'PENDING',
});
// Complete first action
tracker.completeMitigation('risk-001', 0);
// Generate report
console.log(tracker.generateReport());
// Markdown report with critical risks, owners, deadlines
// View history
console.log(tracker.getHistory('risk-001'));
// [
// { event: 'Risk registered: SQL injection...', timestamp: ... },
// { event: 'Mitigation action added: Add parameterized queries...', timestamp: ... },
// { event: 'Mitigation completed: Add parameterized queries...', timestamp: ... }
// ]
```
**Key Points**:
- **Ownership enforcement**: Every risk >4 requires owner assignment
- **Deadline tracking**: Mitigation actions have explicit deadlines
- **Audit trail**: Complete history of risk lifecycle (registered → mitigated)
- **Automated reports**: Markdown output for Confluence/GitHub wikis
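The tracker stores `waiverExpiry`, but nothing above re-checks it. A hypothetical sweep (a standalone helper, not part of the class) could reopen lapsed waivers:

```typescript
// waiver-expiry.ts - hypothetical sweep for lapsed waivers
export function findExpiredWaivers(risks: RiskScore[], now: Date = new Date()): RiskScore[] {
  return risks.filter((r) => r.status === 'WAIVED' && r.waiverExpiry !== undefined && r.waiverExpiry < now);
}

// e.g. run nightly in CI: expired waivers become OPEN risks again
// findExpiredWaivers(allRisks).forEach((r) => { r.status = 'OPEN'; });
```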
---
### Example 4: Coverage Traceability Matrix (Test-to-Requirement Mapping)
**Context**: Validate that every acceptance criterion maps to at least one test
**Implementation**:
```typescript
// coverage-traceability.ts
export type AcceptanceCriterion = {
id: string;
story: string;
criterion: string;
priority: 'P0' | 'P1' | 'P2' | 'P3';
};
export type TestCase = {
file: string;
name: string;
criteriaIds: string[]; // Links to acceptance criteria
};
export type CoverageMatrix = {
criterion: AcceptanceCriterion;
tests: TestCase[];
covered: boolean;
waiverReason?: string;
};
export function buildCoverageMatrix(criteria: AcceptanceCriterion[], tests: TestCase[]): CoverageMatrix[] {
return criteria.map((criterion) => {
const matchingTests = tests.filter((t) => t.criteriaIds.includes(criterion.id));
return {
criterion,
tests: matchingTests,
covered: matchingTests.length > 0,
};
});
}
export function validateCoverage(matrix: CoverageMatrix[]): {
gaps: CoverageMatrix[];
passRate: number;
} {
const gaps = matrix.filter((m) => !m.covered && !m.waiverReason);
const passRate = ((matrix.length - gaps.length) / matrix.length) * 100;
return { gaps, passRate };
}
// Example: Extract criteria IDs from test names
export function extractCriteriaFromTests(testFiles: string[]): TestCase[] {
// Simplified: In real implementation, parse test files with AST
// Here we simulate extraction from test names
return [
{
file: 'tests/e2e/auth/login.spec.ts',
name: 'should allow user to login with valid credentials',
criteriaIds: ['AC-001', 'AC-002'], // Linked to acceptance criteria
},
{
file: 'tests/e2e/auth/password-reset.spec.ts',
name: 'should send password reset email',
criteriaIds: ['AC-003'],
},
];
}
// Generate Markdown traceability report
export function generateTraceabilityReport(matrix: CoverageMatrix[]): string {
let report = `# Requirements-to-Tests Traceability Matrix\n\n`;
report += `**Generated**: ${new Date().toISOString()}\n\n`;
const { gaps, passRate } = validateCoverage(matrix);
report += `## Summary\n`;
report += `- Total Criteria: ${matrix.length}\n`;
report += `- Covered: ${matrix.filter((m) => m.covered).length}\n`;
report += `- Gaps: ${gaps.length}\n`;
report += `- Waived: ${matrix.filter((m) => m.waiverReason).length}\n`;
report += `- Coverage Rate: ${passRate.toFixed(1)}%\n\n`;
if (gaps.length > 0) {
report += `## ❌ Coverage Gaps (MUST RESOLVE)\n\n`;
report += `| Story | Criterion | Priority | Tests |\n`;
report += `|-------|-----------|----------|-------|\n`;
gaps.forEach((m) => {
report += `| ${m.criterion.story} | ${m.criterion.criterion} | ${m.criterion.priority} | None |\n`;
});
report += `\n`;
}
report += `## ✅ Covered Criteria\n\n`;
report += `| Story | Criterion | Tests |\n`;
report += `|-------|-----------|-------|\n`;
matrix
.filter((m) => m.covered)
.forEach((m) => {
const testList = m.tests.map((t) => `\`${t.file}\``).join(', ');
report += `| ${m.criterion.story} | ${m.criterion.criterion} | ${testList} |\n`;
});
return report;
}
```
**Usage Example**:
```typescript
// Define acceptance criteria
const criteria: AcceptanceCriterion[] = [
{ id: 'AC-001', story: 'US-123', criterion: 'User can login with email', priority: 'P0' },
{ id: 'AC-002', story: 'US-123', criterion: 'User sees error on invalid password', priority: 'P0' },
{ id: 'AC-003', story: 'US-124', criterion: 'User receives password reset email', priority: 'P1' },
{ id: 'AC-004', story: 'US-125', criterion: 'User can update profile', priority: 'P2' }, // NO TEST
];
// Extract tests
const tests: TestCase[] = extractCriteriaFromTests(['tests/e2e/auth/login.spec.ts', 'tests/e2e/auth/password-reset.spec.ts']);
// Build matrix
const matrix = buildCoverageMatrix(criteria, tests);
// Validate
const { gaps, passRate } = validateCoverage(matrix);
console.log(`Coverage: ${passRate.toFixed(1)}%`); // "Coverage: 75.0%"
console.log(`Gaps: ${gaps.length}`); // "Gaps: 1" (AC-004 has no test)
// Generate report
const report = generateTraceabilityReport(matrix);
console.log(report);
// Markdown table showing coverage gaps
```
**Key Points**:
- **Bidirectional traceability**: Criteria → Tests and Tests → Criteria
- **Gap detection**: Automatically identifies missing coverage
- **Priority awareness**: P0 gaps are critical blockers
- **Waiver support**: Allow explicit waivers for low-priority gaps
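The matrix carries `waiverReason`, though no function above sets it; a hedged sketch of how a waiver might be applied (the helper name is illustrative):

```typescript
// coverage-waiver.ts - hypothetical helper for waiving low-priority gaps
export function waiveCoverageGap(matrix: CoverageMatrix[], criterionId: string, reason: string): void {
  const entry = matrix.find((m) => m.criterion.id === criterionId);
  if (!entry) throw new Error(`Criterion ${criterionId} not found`);
  if (entry.criterion.priority === 'P0') {
    throw new Error('P0 criteria cannot be waived - coverage is mandatory');
  }
  entry.waiverReason = reason; // validateCoverage() no longer counts this entry as a gap
}
```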
---
## Risk Governance Checklist
Before deploying to production, ensure:
- [ ] **Risk scoring complete**: All identified risks scored (Probability × Impact)
- [ ] **Ownership assigned**: Every risk >4 has owner, mitigation plan, deadline
- [ ] **Coverage validated**: Every acceptance criterion maps to at least one test
- [ ] **Gate decision documented**: PASS/CONCERNS/FAIL/WAIVED with rationale
- [ ] **Waivers approved**: All waivers have approver, reason, expiry date
- [ ] **Audit trail captured**: Risk history log available for compliance review
- [ ] **Traceability matrix**: Requirements-to-tests mapping up to date
- [ ] **Critical risks resolved**: No score=9 risks in OPEN status
## Integration Points
- **Used in workflows**: `*trace` (Phase 2: gate decision), `*nfr-assess` (risk scoring), `*test-design` (risk identification)
- **Related fragments**: `probability-impact.md` (scoring definitions), `test-priorities-matrix.md` (P0-P3 classification), `nfr-criteria.md` (non-functional risks)
- **Tools**: Risk tracking dashboards (Jira, Linear), gate automation (CI/CD), traceability reports (Markdown, Confluence)
_Source: Murat risk governance notes, gate schema guidance, SEON production gate workflows, ISO 31000 risk management standards_

# Selective and Targeted Test Execution
## Principle
Run only the tests you need, when you need them. Use tags/grep to slice suites by risk priority (not directory structure), filter by spec patterns or git diff to focus on impacted areas, and combine priority metadata (P0-P3) with change detection to optimize pre-commit vs. CI execution. Document the selection strategy clearly so teams understand when full regression is mandatory.
## Rationale
Running the entire test suite on every commit wastes time and resources. Smart test selection provides fast feedback (smoke tests in minutes, full regression in hours) while maintaining confidence. The "32+ ways of selective testing" philosophy balances speed with coverage: quick loops for developers, comprehensive validation before deployment. Poorly documented selection leads to confusion about when tests run and why.
## Pattern Examples
### Example 1: Tag-Based Execution with Priority Levels
**Context**: Organize tests by risk priority and execution stage using grep/tag patterns.
**Implementation**:
```typescript
// tests/e2e/checkout.spec.ts
import { test, expect } from '@playwright/test';
/**
* Tag-based test organization
* - @smoke: Critical path tests (run on every commit, < 5 min)
* - @regression: Full test suite (run pre-merge, < 30 min)
* - @p0: Critical business functions (payment, auth, data integrity)
* - @p1: Core features (primary user journeys)
* - @p2: Secondary features (supporting functionality)
* - @p3: Nice-to-have (cosmetic, non-critical)
*/
test.describe('Checkout Flow', () => {
// P0 + Smoke: Must run on every commit
test('@smoke @p0 should complete purchase with valid payment', async ({ page }) => {
await page.goto('/checkout');
await page.getByTestId('card-number').fill('4242424242424242');
await page.getByTestId('submit-payment').click();
await expect(page.getByTestId('order-confirmation')).toBeVisible();
});
// P0 but not smoke: Run pre-merge
test('@regression @p0 should handle payment decline gracefully', async ({ page }) => {
await page.goto('/checkout');
await page.getByTestId('card-number').fill('4000000000000002'); // Decline card
await page.getByTestId('submit-payment').click();
await expect(page.getByTestId('payment-error')).toBeVisible();
await expect(page.getByTestId('payment-error')).toContainText('declined');
});
// P1 + Smoke: Important but not critical
test('@smoke @p1 should apply discount code', async ({ page }) => {
await page.goto('/checkout');
await page.getByTestId('promo-code').fill('SAVE10');
await page.getByTestId('apply-promo').click();
await expect(page.getByTestId('discount-applied')).toBeVisible();
});
// P2: Run in full regression only
test('@regression @p2 should remember saved payment methods', async ({ page }) => {
await page.goto('/checkout');
await expect(page.getByTestId('saved-cards')).toBeVisible();
});
// P3: Low priority, run nightly or weekly
test('@nightly @p3 should display checkout page analytics', async ({ page }) => {
await page.goto('/checkout');
const analyticsEvents = await page.evaluate(() => (window as any).__ANALYTICS__);
expect(analyticsEvents).toBeDefined();
});
});
```
**package.json scripts**:
```json
{
"scripts": {
"test": "playwright test",
"test:smoke": "playwright test --grep '@smoke'",
"test:p0": "playwright test --grep '@p0'",
"test:p0-p1": "playwright test --grep '@p0|@p1'",
"test:regression": "playwright test --grep '@regression'",
"test:nightly": "playwright test --grep '@nightly'",
"test:not-slow": "playwright test --grep-invert '@slow'",
"test:critical-smoke": "playwright test --grep '@smoke.*@p0'"
}
}
```
**Cypress equivalent**:
```javascript
// cypress/e2e/checkout.cy.ts
describe('Checkout Flow', { tags: ['@checkout'] }, () => {
it('should complete purchase', { tags: ['@smoke', '@p0'] }, () => {
cy.visit('/checkout');
cy.get('[data-cy="card-number"]').type('4242424242424242');
cy.get('[data-cy="submit-payment"]').click();
cy.get('[data-cy="order-confirmation"]').should('be.visible');
});
it('should handle decline', { tags: ['@regression', '@p0'] }, () => {
cy.visit('/checkout');
cy.get('[data-cy="card-number"]').type('4000000000000002');
cy.get('[data-cy="submit-payment"]').click();
cy.get('[data-cy="payment-error"]').should('be.visible');
});
});
// cypress.config.ts
export default defineConfig({
e2e: {
env: {
grepTags: process.env.GREP_TAGS || '',
grepFilterSpecs: true,
},
setupNodeEvents(on, config) {
require('@cypress/grep/src/plugin')(config);
return config;
},
},
});
```
**Usage**:
```bash
# Playwright
npm run test:smoke # Run all @smoke tests
npm run test:p0 # Run all P0 tests
npm run test -- --grep "@smoke.*@p0" # Run tests with BOTH tags (regex is order-sensitive)
# Cypress (with @cypress/grep plugin)
npx cypress run --env grepTags="@smoke"
npx cypress run --env grepTags="@p0+@smoke" # AND logic
npx cypress run --env grepTags="@p0 @p1" # OR logic
```
**Key Points**:
- **Multiple tags per test**: Combine priority (@p0) with stage (@smoke)
- **AND/OR logic**: Grep supports complex filtering
- **Clear naming**: Tags document test importance
- **Fast feedback**: @smoke runs < 5 min, full suite < 30 min
- **CI integration**: Different jobs run different tag combinations
---
### Example 2: Spec Filter Pattern (File-Based Selection)
**Context**: Run tests by file path pattern or directory for targeted execution.
**Implementation**:
```bash
#!/bin/bash
# scripts/selective-spec-runner.sh
# Run tests based on spec file patterns
set -e
PATTERN=${1:-"**/*.spec.ts"}
TEST_ENV=${TEST_ENV:-local}
echo "🎯 Selective Spec Runner"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Pattern: $PATTERN"
echo "Environment: $TEST_ENV"
echo ""
# Pattern examples and their use cases
case "$PATTERN" in
"**/checkout*")
echo "📦 Running checkout-related tests"
    npx playwright test checkout # Playwright treats positional args as file-path filters
;;
"**/auth*"|"**/login*"|"**/signup*")
echo "🔐 Running authentication tests"
    npx playwright test auth login signup
;;
"tests/e2e/**")
echo "🌐 Running all E2E tests"
npx playwright test tests/e2e/
;;
"tests/integration/**")
echo "🔌 Running all integration tests"
npx playwright test tests/integration/
;;
"tests/component/**")
echo "🧩 Running all component tests"
npx playwright test tests/component/
;;
*)
echo "🔍 Running tests matching pattern: $PATTERN"
npx playwright test "$PATTERN"
;;
esac
```
**Playwright config for file filtering**:
```typescript
// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';
export default defineConfig({
// ... other config
// Project-based organization
projects: [
{
name: 'smoke',
testMatch: /.*smoke.*\.spec\.ts/,
retries: 0,
},
{
name: 'e2e',
testMatch: /tests\/e2e\/.*\.spec\.ts/,
retries: 2,
},
{
name: 'integration',
testMatch: /tests\/integration\/.*\.spec\.ts/,
retries: 1,
},
{
name: 'component',
testMatch: /tests\/component\/.*\.spec\.ts/,
use: { ...devices['Desktop Chrome'] },
},
],
});
```
**Advanced pattern matching**:
```typescript
// scripts/run-by-component.ts
/**
* Run tests related to specific component(s)
* Usage: npm run test:component UserProfile,Settings
*/
import { execSync } from 'child_process';
const components = process.argv[2]?.split(',') || [];
if (components.length === 0) {
console.error('❌ No components specified');
console.log('Usage: npm run test:component UserProfile,Settings');
process.exit(1);
}
// Convert component names to glob patterns
const patterns = components.map((comp) => `**/*${comp}*.spec.ts`).join(' ');
console.log(`🧩 Running tests for components: ${components.join(', ')}`);
console.log(`Patterns: ${patterns}`);
try {
execSync(`npx playwright test ${patterns}`, {
stdio: 'inherit',
env: { ...process.env, CI: 'false' },
});
} catch (error) {
process.exit(1);
}
```
**package.json scripts**:
```json
{
"scripts": {
"test:checkout": "playwright test **/checkout*.spec.ts",
"test:auth": "playwright test **/auth*.spec.ts **/login*.spec.ts",
"test:e2e": "playwright test tests/e2e/",
"test:integration": "playwright test tests/integration/",
"test:component": "ts-node scripts/run-by-component.ts",
"test:project": "playwright test --project",
"test:smoke-project": "playwright test --project smoke"
}
}
```
**Key Points**:
- **Glob patterns**: Wildcards match file paths flexibly
- **Project isolation**: Separate projects have different configs
- **Component targeting**: Run tests for specific features
- **Directory-based**: Organize tests by type (e2e, integration, component)
- **CI optimization**: Run subsets in parallel CI jobs
---
### Example 3: Diff-Based Test Selection (Changed Files Only)
**Context**: Run only tests affected by code changes for maximum speed.
**Implementation**:
```bash
#!/bin/bash
# scripts/test-changed-files.sh
# Intelligent test selection based on git diff
set -e
BASE_BRANCH=${BASE_BRANCH:-main}
TEST_ENV=${TEST_ENV:-local}
echo "🔍 Changed File Test Selector"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Base branch: $BASE_BRANCH"
echo "Environment: $TEST_ENV"
echo ""
# Get changed files
CHANGED_FILES=$(git diff --name-only $BASE_BRANCH...HEAD)
if [ -z "$CHANGED_FILES" ]; then
echo "✅ No files changed. Skipping tests."
exit 0
fi
echo "Changed files:"
echo "$CHANGED_FILES" | sed 's/^/ - /'
echo ""
# Arrays to collect test specs
DIRECT_TEST_FILES=()
RELATED_TEST_FILES=()
RUN_ALL_TESTS=false
# Process each changed file
while IFS= read -r file; do
case "$file" in
# Changed test files: run them directly
*.spec.ts|*.spec.js|*.test.ts|*.test.js|*.cy.ts|*.cy.js)
DIRECT_TEST_FILES+=("$file")
;;
# Critical config changes: run ALL tests
package.json|package-lock.json|playwright.config.ts|cypress.config.ts|tsconfig.json|.github/workflows/*)
echo "⚠️ Critical file changed: $file"
RUN_ALL_TESTS=true
break
;;
# Component changes: find related tests
src/components/*.tsx|src/components/*.jsx)
COMPONENT_NAME=$(basename "$file" | sed 's/\.[^.]*$//')
echo "🧩 Component changed: $COMPONENT_NAME"
# Find tests matching component name
FOUND_TESTS=$(find tests -name "*${COMPONENT_NAME}*.spec.ts" -o -name "*${COMPONENT_NAME}*.cy.ts" 2>/dev/null || true)
if [ -n "$FOUND_TESTS" ]; then
while IFS= read -r test_file; do
RELATED_TEST_FILES+=("$test_file")
done <<< "$FOUND_TESTS"
fi
;;
# Utility/lib changes: run integration + unit tests
src/utils/*|src/lib/*|src/helpers/*)
echo "⚙️ Utility file changed: $file"
RELATED_TEST_FILES+=($(find tests/unit tests/integration -name "*.spec.ts" 2>/dev/null || true))
;;
# API changes: run integration + e2e tests
src/api/*|src/services/*|src/controllers/*)
echo "🔌 API file changed: $file"
RELATED_TEST_FILES+=($(find tests/integration tests/e2e -name "*.spec.ts" 2>/dev/null || true))
;;
# Type changes: run all TypeScript tests
*.d.ts|src/types/*)
echo "📝 Type definition changed: $file"
RUN_ALL_TESTS=true
break
;;
# Documentation only: skip tests
*.md|docs/*|README*)
echo "📄 Documentation changed: $file (no tests needed)"
;;
*)
echo "❓ Unclassified change: $file (running smoke tests)"
RELATED_TEST_FILES+=($(find tests -name "*smoke*.spec.ts" 2>/dev/null || true))
;;
esac
done <<< "$CHANGED_FILES"
# Execute tests based on analysis
if [ "$RUN_ALL_TESTS" = true ]; then
echo ""
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "🚨 Running FULL test suite (critical changes detected)"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
npm run test
exit $?
fi
# Combine and deduplicate test files
ALL_TEST_FILES=("${DIRECT_TEST_FILES[@]}" "${RELATED_TEST_FILES[@]}")
UNIQUE_TEST_FILES=($(echo "${ALL_TEST_FILES[@]}" | tr ' ' '\n' | sort -u))
if [ ${#UNIQUE_TEST_FILES[@]} -eq 0 ]; then
echo ""
echo "✅ No tests found for changed files. Running smoke tests."
npm run test:smoke
exit $?
fi
echo ""
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "🎯 Running ${#UNIQUE_TEST_FILES[@]} test file(s)"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
for test_file in "${UNIQUE_TEST_FILES[@]}"; do
echo " - $test_file"
done
echo ""
npm run test -- "${UNIQUE_TEST_FILES[@]}"
```
**GitHub Actions integration**:
```yaml
# .github/workflows/test-changed.yml
name: Test Changed Files
on:
pull_request:
types: [opened, synchronize, reopened]
jobs:
detect-and-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0 # Full history for accurate diff
- name: Get changed files
id: changed-files
uses: tj-actions/changed-files@v40
with:
files: |
src/**
tests/**
*.config.ts
files_ignore: |
**/*.md
docs/**
- name: Run tests for changed files
if: steps.changed-files.outputs.any_changed == 'true'
run: |
echo "Changed files: ${{ steps.changed-files.outputs.all_changed_files }}"
bash scripts/test-changed-files.sh
env:
BASE_BRANCH: ${{ github.base_ref }}
TEST_ENV: staging
```
**Key Points**:
- **Intelligent mapping**: Code changes → related tests
- **Critical file detection**: Config changes → full suite
- **Component mapping**: UI changes → component + E2E tests
- **Fast feedback**: Run only what's needed (< 2 min typical)
- **Safety net**: Unrecognized changes run smoke tests
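Recent Playwright versions also ship a built-in `--only-changed[=ref]` flag (added around v1.46 — verify against your version) that covers the simple cases of the script above by tracing changed files through spec imports. A thin wrapper, in the same style as `run-by-component.ts`:

```typescript
// scripts/test-only-changed.ts - assumes Playwright >= 1.46 (--only-changed flag)
import { execSync } from 'child_process';

const baseRef = process.env.BASE_BRANCH || 'main';

try {
  // Runs only specs whose transitive imports changed relative to baseRef
  execSync(`npx playwright test --only-changed=${baseRef}`, { stdio: 'inherit' });
} catch {
  process.exit(1);
}
```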
---
### Example 4: Promotion Rules (Pre-Commit → CI → Staging → Production)
**Context**: Progressive test execution strategy across deployment stages.
**Implementation**:
```typescript
// scripts/test-promotion-strategy.ts
/**
* Test Promotion Strategy
* Defines which tests run at each stage of the development lifecycle
*/
export type TestStage = 'pre-commit' | 'ci-pr' | 'ci-merge' | 'staging' | 'production';
export type TestPromotion = {
stage: TestStage;
description: string;
testCommand: string;
timebudget: string; // minutes
required: boolean;
failureAction: 'block' | 'warn' | 'alert';
};
export const TEST_PROMOTION_RULES: Record<TestStage, TestPromotion> = {
'pre-commit': {
stage: 'pre-commit',
description: 'Local developer checks before git commit',
testCommand: 'npm run test:smoke',
timebudget: '2',
required: true,
failureAction: 'block',
},
'ci-pr': {
stage: 'ci-pr',
description: 'CI checks on pull request creation/update',
testCommand: 'npm run test:changed && npm run test:p0-p1',
timebudget: '10',
required: true,
failureAction: 'block',
},
'ci-merge': {
stage: 'ci-merge',
description: 'Full regression before merge to main',
testCommand: 'npm run test:regression',
timebudget: '30',
required: true,
failureAction: 'block',
},
staging: {
stage: 'staging',
description: 'Post-deployment validation in staging environment',
testCommand: 'npm run test:e2e -- --grep "@smoke"',
timebudget: '15',
required: true,
failureAction: 'block',
},
production: {
stage: 'production',
description: 'Production smoke tests post-deployment',
testCommand: 'npm run test:e2e:prod -- --grep "@smoke.*@p0"',
timebudget: '5',
required: false,
failureAction: 'alert',
},
};
/**
* Get tests to run for a specific stage
*/
export function getTestsForStage(stage: TestStage): TestPromotion {
return TEST_PROMOTION_RULES[stage];
}
/**
* Validate if tests can be promoted to next stage
*/
export function canPromote(currentStage: TestStage, testsPassed: boolean): boolean {
const promotion = TEST_PROMOTION_RULES[currentStage];
if (!promotion.required) {
return true; // Non-required tests don't block promotion
}
return testsPassed;
}
```
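A short usage sketch wiring these rules into a CI step (the stage value would come from the pipeline environment):

```typescript
import { getTestsForStage, canPromote, type TestStage } from './test-promotion-strategy';

const stage = (process.env.TEST_STAGE as TestStage) || 'ci-pr';
const promotion = getTestsForStage(stage);
console.log(`Stage: ${promotion.stage} (budget: ${promotion.timebudget} min)`);
console.log(`Command: ${promotion.testCommand}`);

const testsPassed = true; // in practice, derived from the runner's exit code
if (!canPromote(stage, testsPassed)) {
  console.error(`❌ ${promotion.failureAction}: cannot promote past ${stage}`);
  process.exit(1);
}
```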
**Husky pre-commit hook**:
```bash
#!/bin/bash
# .husky/pre-commit
# Run smoke tests before allowing commit
echo "🔍 Running pre-commit tests..."
npm run test:smoke
if [ $? -ne 0 ]; then
echo ""
echo "❌ Pre-commit tests failed!"
echo "Please fix failures before committing."
echo ""
echo "To skip (NOT recommended): git commit --no-verify"
exit 1
fi
echo "✅ Pre-commit tests passed"
```
**GitHub Actions workflow**:
```yaml
# .github/workflows/test-promotion.yml
name: Test Promotion Strategy
on:
pull_request:
push:
branches: [main]
workflow_dispatch:
jobs:
# Stage 1: PR tests (changed + P0-P1)
pr-tests:
if: github.event_name == 'pull_request'
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
- name: Run PR-level tests
run: |
npm run test:changed
npm run test:p0-p1
# Stage 2: Full regression (pre-merge)
regression-tests:
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- uses: actions/checkout@v4
- name: Run full regression
run: npm run test:regression
# Stage 3: Staging validation (post-deploy)
staging-smoke:
if: github.event_name == 'workflow_dispatch'
runs-on: ubuntu-latest
timeout-minutes: 15
steps:
- uses: actions/checkout@v4
- name: Run staging smoke tests
run: npm run test:e2e -- --grep "@smoke"
env:
TEST_ENV: staging
# Stage 4: Production smoke (post-deploy, non-blocking)
production-smoke:
if: github.event_name == 'workflow_dispatch'
runs-on: ubuntu-latest
timeout-minutes: 5
continue-on-error: true # Don't fail deployment if smoke tests fail
steps:
- uses: actions/checkout@v4
- name: Run production smoke tests
run: npm run test:e2e:prod -- --grep "@smoke.*@p0"
env:
TEST_ENV: production
- name: Alert on failure
if: failure()
uses: 8398a7/action-slack@v3
with:
status: ${{ job.status }}
text: '🚨 Production smoke tests failed!'
webhook_url: ${{ secrets.SLACK_WEBHOOK }}
```
**Selection strategy documentation**:
````markdown
# Test Selection Strategy
## Test Promotion Stages
| Stage | Tests Run | Time Budget | Blocks Deploy | Failure Action |
| ---------- | ------------------- | ----------- | ------------- | -------------- |
| Pre-Commit | Smoke (@smoke) | 2 min | ✅ Yes | Block commit |
| CI PR | Changed + P0-P1 | 10 min | ✅ Yes | Block merge |
| CI Merge | Full regression | 30 min | ✅ Yes | Block deploy |
| Staging | E2E smoke | 15 min | ✅ Yes | Rollback |
| Production | Critical smoke only | 5 min | ❌ No | Alert team |
## When Full Regression Runs
Full regression suite (`npm run test:regression`) runs in these scenarios:
- ✅ Before merging to `main` (CI Merge stage)
- ✅ Nightly builds (scheduled workflow)
- ✅ Manual trigger (workflow_dispatch)
- ✅ Release candidate testing
Full regression does NOT run on:
- ❌ Every PR commit (too slow)
- ❌ Pre-commit hooks (too slow)
- ❌ Production deployments (deploy-blocking)
## Override Scenarios
Skip tests (emergency only):
```bash
git commit --no-verify # Skip pre-commit hook
gh pr merge --admin # Force merge (requires admin)
```
````
**Key Points**:
- **Progressive validation**: More tests at each stage
- **Time budgets**: Clear expectations per stage
- **Blocking vs. alerting**: Production tests don't block deploy
- **Documentation**: Team knows when full regression runs
- **Emergency overrides**: Documented but discouraged
---
## Test Selection Strategy Checklist
Before implementing selective testing, verify:
- [ ] **Tag strategy defined**: @smoke, @p0-p3, @regression documented
- [ ] **Time budgets set**: Each stage has clear timeout (smoke < 5 min, full < 30 min)
- [ ] **Changed file mapping**: Code changes → test selection logic implemented
- [ ] **Promotion rules documented**: README explains when full regression runs
- [ ] **CI integration**: GitHub Actions uses selective strategy
- [ ] **Local parity**: Developers can run same selections locally
- [ ] **Emergency overrides**: Skip mechanisms documented (--no-verify, admin merge)
- [ ] **Metrics tracked**: Monitor test execution time and selection accuracy
## Integration Points
- Used in workflows: `*ci` (CI/CD setup), `*automate` (test generation with tags)
- Related fragments: `ci-burn-in.md`, `test-priorities-matrix.md`, `test-quality.md`
- Selection tools: Playwright --grep, Cypress @cypress/grep, git diff
_Source: 32+ selective testing strategies blog, Murat testing philosophy, SEON CI optimization_

# Selector Resilience
## Principle
Robust selectors follow a strict hierarchy: **data-testid > ARIA roles > text content > CSS/IDs** (last resort). Selectors must be resilient to UI changes (styling, layout, content updates) and remain human-readable for maintenance.
## Rationale
**The Problem**: Brittle selectors (CSS classes, nth-child, complex XPath) break when UI styling changes, elements are reordered, or design updates occur. This causes test maintenance burden and false negatives.
**The Solution**: Prioritize semantic selectors that reflect user intent (ARIA roles, accessible names, test IDs). Use dynamic filtering for lists instead of nth() indexes. Validate selectors during code review and refactor proactively.
**Why This Matters**:
- Prevents false test failures (UI refactoring doesn't break tests)
- Improves accessibility (ARIA roles benefit both tests and screen readers)
- Enhances readability (semantic selectors document user intent)
- Reduces maintenance burden (robust selectors survive design changes)
## Pattern Examples
### Example 1: Selector Hierarchy (Priority Order with Examples)
**Context**: Choose the most resilient selector for each element type
**Implementation**:
```typescript
// tests/selectors/hierarchy-examples.spec.ts
import { test, expect } from '@playwright/test';
test.describe('Selector Hierarchy Best Practices', () => {
test('Level 1: data-testid (BEST - most resilient)', async ({ page }) => {
await page.goto('/login');
// ✅ Best: Dedicated test attribute (survives all UI changes)
await page.getByTestId('email-input').fill('user@example.com');
await page.getByTestId('password-input').fill('password123');
await page.getByTestId('login-button').click();
await expect(page.getByTestId('welcome-message')).toBeVisible();
// Why it's best:
// - Survives CSS refactoring (class name changes)
// - Survives layout changes (element reordering)
// - Survives content changes (button text updates)
// - Explicit test contract (developer knows it's for testing)
});
test('Level 2: ARIA roles and accessible names (GOOD - future-proof)', async ({ page }) => {
await page.goto('/login');
// ✅ Good: Semantic HTML roles (benefits accessibility + tests)
await page.getByRole('textbox', { name: 'Email' }).fill('user@example.com');
await page.getByRole('textbox', { name: 'Password' }).fill('password123');
await page.getByRole('button', { name: 'Sign In' }).click();
await expect(page.getByRole('heading', { name: 'Welcome' })).toBeVisible();
// Why it's good:
// - Survives CSS refactoring
// - Survives layout changes
// - Enforces accessibility (screen reader compatible)
// - Self-documenting (role + name = clear intent)
});
test('Level 3: Text content (ACCEPTABLE - user-centric)', async ({ page }) => {
await page.goto('/dashboard');
// ✅ Acceptable: Text content (matches user perception)
await page.getByText('Create New Order').click();
await expect(page.getByText('Order Details')).toBeVisible();
// Why it's acceptable:
// - User-centric (what user sees)
// - Survives CSS/layout changes
// - Breaks when copy changes (forces test update with content)
// ⚠️ Use with caution for dynamic/localized content:
// - Avoid for content with variables: "User 123" (use regex instead)
// - Avoid for i18n content (use data-testid or ARIA)
});
test('Level 4: CSS classes/IDs (LAST RESORT - brittle)', async ({ page }) => {
await page.goto('/login');
// ❌ Last resort: CSS class (breaks with styling updates)
// await page.locator('.btn-primary').click()
// ❌ Last resort: ID (breaks if ID changes)
// await page.locator('#login-form').fill(...)
// ✅ Better: Use data-testid or ARIA instead
await page.getByTestId('login-button').click();
// Why CSS/ID is last resort:
// - Breaks with CSS refactoring (class name changes)
// - Breaks with HTML restructuring (ID changes)
// - Not semantic (unclear what element does)
// - Tight coupling between tests and styling
});
});
```
**Key Points**:
- Hierarchy: data-testid (best) > ARIA (good) > text (acceptable) > CSS/ID (last resort)
- data-testid survives ALL UI changes (explicit test contract)
- ARIA roles enforce accessibility (screen reader compatible)
- Text content is user-centric (but breaks with copy changes)
- CSS/ID are brittle (break with styling refactoring)
---
### Example 2: Dynamic Selector Patterns (Lists, Filters, Regex)
**Context**: Handle dynamic content, lists, and variable data with resilient selectors
**Implementation**:
```typescript
// tests/selectors/dynamic-selectors.spec.ts
import { test, expect } from '@playwright/test';
test.describe('Dynamic Selector Patterns', () => {
test('regex for variable content (user IDs, timestamps)', async ({ page }) => {
await page.goto('/users');
// ✅ Good: Regex pattern for dynamic user IDs
await expect(page.getByText(/User \d+/)).toBeVisible();
// ✅ Good: Regex for timestamps
await expect(page.getByText(/Last login: \d{4}-\d{2}-\d{2}/)).toBeVisible();
// ✅ Good: Regex for dynamic counts
await expect(page.getByText(/\d+ items in cart/)).toBeVisible();
});
test('partial text matching (case-insensitive, substring)', async ({ page }) => {
await page.goto('/products');
// ✅ Good: Partial match (survives minor text changes)
await page.getByText('Product', { exact: false }).first().click();
// ✅ Good: Case-insensitive (survives capitalization changes)
await expect(page.getByText(/sign in/i)).toBeVisible();
});
test('filter locators for lists (avoid brittle nth)', async ({ page }) => {
await page.goto('/products');
// ❌ Bad: Index-based (breaks when order changes)
// await page.locator('.product-card').nth(2).click()
// ✅ Good: Filter by content (resilient to reordering)
await page.locator('[data-testid="product-card"]').filter({ hasText: 'Premium Plan' }).click();
// ✅ Good: Filter by attribute
await page
.locator('[data-testid="product-card"]')
.filter({ has: page.locator('[data-status="active"]') })
.first()
.click();
});
test('nth() only when absolutely necessary', async ({ page }) => {
await page.goto('/dashboard');
// ⚠️ Acceptable: nth(0) for first item (common pattern)
const firstNotification = page.getByTestId('notification').nth(0);
await expect(firstNotification).toContainText('Welcome');
// ❌ Bad: nth(5) for arbitrary index (fragile)
// await page.getByTestId('notification').nth(5).click()
// ✅ Better: Use filter() with specific criteria
await page.getByTestId('notification').filter({ hasText: 'Critical Alert' }).click();
});
test('combine multiple locators for specificity', async ({ page }) => {
await page.goto('/checkout');
// ✅ Good: Narrow scope with combined locators
const shippingSection = page.getByTestId('shipping-section');
await shippingSection.getByLabel('Address Line 1').fill('123 Main St');
await shippingSection.getByLabel('City').fill('New York');
// Scoping prevents ambiguity (multiple "City" fields on page)
});
});
```
**Key Points**:
- Regex patterns handle variable content (IDs, timestamps, counts)
- Partial matching survives minor text changes (`exact: false`)
- `filter()` is more resilient than `nth()` (content-based vs index-based)
- `nth(0)` acceptable for "first item", avoid arbitrary indexes
- Combine locators to narrow scope (prevent ambiguity)
---
### Example 3: Selector Anti-Patterns (What NOT to Do)
**Context**: Common selector mistakes that cause brittle tests
**Problem Examples**:
```typescript
// tests/selectors/anti-patterns.spec.ts
import { test, expect } from '@playwright/test';
test.describe('Selector Anti-Patterns to Avoid', () => {
test('❌ Anti-Pattern 1: CSS classes (brittle)', async ({ page }) => {
await page.goto('/login');
// ❌ Bad: CSS class (breaks with design system updates)
// await page.locator('.btn-primary').click()
// await page.locator('.form-input-lg').fill('test@example.com')
// ✅ Good: Use data-testid or ARIA role
await page.getByTestId('login-button').click();
await page.getByRole('textbox', { name: 'Email' }).fill('test@example.com');
});
test('❌ Anti-Pattern 2: Index-based nth() (fragile)', async ({ page }) => {
await page.goto('/products');
// ❌ Bad: Index-based (breaks when product order changes)
// await page.locator('.product-card').nth(3).click()
// ✅ Good: Content-based filter
await page.locator('[data-testid="product-card"]').filter({ hasText: 'Laptop' }).click();
});
test('❌ Anti-Pattern 3: Complex XPath (hard to maintain)', async ({ page }) => {
await page.goto('/dashboard');
// ❌ Bad: Complex XPath (unreadable, breaks with structure changes)
// await page.locator('xpath=//div[@class="container"]//section[2]//button[contains(@class, "primary")]').click()
// ✅ Good: Semantic selector
await page.getByRole('button', { name: 'Create Order' }).click();
});
test('❌ Anti-Pattern 4: ID selectors (coupled to implementation)', async ({ page }) => {
await page.goto('/settings');
// ❌ Bad: HTML ID (breaks if ID changes for accessibility/SEO)
// await page.locator('#user-settings-form').fill(...)
// ✅ Good: data-testid or ARIA landmark
await page.getByTestId('user-settings-form').getByLabel('Display Name').fill('John Doe');
});
test('✅ Refactoring: Bad → Good Selector', async ({ page }) => {
await page.goto('/checkout');
// Before (brittle):
// await page.locator('.checkout-form > .payment-section > .btn-submit').click()
// After (resilient):
await page.getByTestId('checkout-form').getByRole('button', { name: 'Complete Payment' }).click();
await expect(page.getByText('Payment successful')).toBeVisible();
});
});
```
**Why These Fail**:
- **CSS classes**: Change frequently with design updates (Tailwind, CSS modules)
- **nth() indexes**: Fragile to element reordering (new features, A/B tests)
- **Complex XPath**: Unreadable, breaks with HTML structure changes
- **HTML IDs**: Not stable (accessibility improvements change IDs)
**Better Approach**: Use selector hierarchy (testid > ARIA > text)
---
### Example 4: Selector Debugging Techniques (Inspector, DevTools, MCP)
**Context**: Debug selector failures interactively to find better alternatives
**Implementation**:
```typescript
// tests/selectors/debugging-techniques.spec.ts
import { test, expect } from '@playwright/test';
test.describe('Selector Debugging Techniques', () => {
test('use Playwright Inspector to test selectors', async ({ page }) => {
await page.goto('/dashboard');
// Pause test to open Inspector
await page.pause();
// In Inspector console, test selectors:
// page.getByTestId('user-menu') ✅ Works
// page.getByRole('button', { name: 'Profile' }) ✅ Works
// page.locator('.btn-primary') ❌ Brittle
// Use "Pick Locator" feature to generate selectors
// Use "Record" mode to capture user interactions
await page.getByTestId('user-menu').click();
await expect(page.getByRole('menu')).toBeVisible();
});
test('use locator.all() to debug lists', async ({ page }) => {
await page.goto('/products');
// Debug: How many products are visible?
const products = await page.getByTestId('product-card').all();
console.log(`Found ${products.length} products`);
// Debug: What text is in each product?
for (const product of products) {
const text = await product.textContent();
console.log(`Product text: ${text}`);
}
// Use findings to build better selector
await page.getByTestId('product-card').filter({ hasText: 'Laptop' }).click();
});
test('use DevTools console to test selectors', async ({ page }) => {
await page.goto('/checkout');
// Open DevTools (manually or via page.pause())
// Test selectors in console:
// document.querySelectorAll('[data-testid="payment-method"]')
// document.querySelector('#credit-card-input')
// Find robust selector through trial and error
await page.getByTestId('payment-method').selectOption('credit-card');
});
test('MCP browser_generate_locator (if available)', async ({ page }) => {
await page.goto('/products');
// If Playwright MCP available, use browser_generate_locator:
// 1. Click element in browser
// 2. MCP generates optimal selector
// 3. Copy into test
// Example output from MCP:
// page.getByRole('link', { name: 'Product A' })
// Use generated selector
await page.getByRole('link', { name: 'Product A' }).click();
await expect(page).toHaveURL(/\/products\/\d+/);
});
});
```
**Key Points**:
- Playwright Inspector: Interactive selector testing with "Pick Locator" feature
- `locator.all()`: Debug lists to understand structure and content
- DevTools console: Test CSS selectors before adding to tests
- MCP browser_generate_locator: Auto-generate optimal selectors (if MCP available)
- Always validate selectors work before committing
---
### Example 5: Selector Refactoring Guide (Before/After Patterns)
**Context**: Systematically improve brittle selectors to resilient alternatives
**Implementation**:
```typescript
// tests/selectors/refactoring-guide.spec.ts
import { test, expect } from '@playwright/test';
test.describe('Selector Refactoring Patterns', () => {
test('refactor: CSS class → data-testid', async ({ page }) => {
await page.goto('/products');
// ❌ Before: CSS class (breaks with Tailwind updates)
// await page.locator('.bg-blue-500.px-4.py-2.rounded').click()
// ✅ After: data-testid
await page.getByTestId('add-to-cart-button').click();
// Implementation: Add data-testid to button component
// <button className="bg-blue-500 px-4 py-2 rounded" data-testid="add-to-cart-button">
});
test('refactor: nth() index → filter()', async ({ page }) => {
await page.goto('/users');
// ❌ Before: Index-based (breaks when users reorder)
// await page.locator('.user-row').nth(2).click()
// ✅ After: Content-based filter
await page.locator('[data-testid="user-row"]').filter({ hasText: 'john@example.com' }).click();
});
test('refactor: Complex XPath → ARIA role', async ({ page }) => {
await page.goto('/checkout');
// ❌ Before: Complex XPath (unreadable, brittle)
// await page.locator('xpath=//div[@id="payment"]//form//button[contains(@class, "submit")]').click()
// ✅ After: ARIA role
await page.getByRole('button', { name: 'Complete Payment' }).click();
});
test('refactor: ID selector → data-testid', async ({ page }) => {
await page.goto('/settings');
// ❌ Before: HTML ID (changes with accessibility improvements)
// await page.locator('#user-profile-section').getByLabel('Name').fill('John')
// ✅ After: data-testid + semantic label
await page.getByTestId('user-profile-section').getByLabel('Display Name').fill('John Doe');
});
test('refactor: Deeply nested CSS → scoped data-testid', async ({ page }) => {
await page.goto('/dashboard');
// ❌ Before: Deep nesting (breaks with structure changes)
// await page.locator('.container .sidebar .menu .item:nth-child(3) a').click()
// ✅ After: Scoped data-testid
const sidebar = page.getByTestId('sidebar');
await sidebar.getByRole('link', { name: 'Settings' }).click();
});
});
```
**Key Points**:
- CSS class → data-testid (survives design system updates)
- nth() → filter() (content-based vs index-based)
- Complex XPath → ARIA role (readable, semantic)
- ID → data-testid (decouples from HTML structure)
- Deep nesting → scoped locators (modular, maintainable)
---
### Example 6: Selector Best Practices Checklist
```typescript
// tests/selectors/validation-checklist.spec.ts
import { test, expect } from '@playwright/test';
/**
* Selector Validation Checklist
*
* Before committing test, verify selectors meet these criteria:
*/
test.describe('Selector Best Practices Validation', () => {
test('✅ 1. Prefer data-testid for interactive elements', async ({ page }) => {
await page.goto('/login');
// Interactive elements (buttons, inputs, links) should use data-testid
await page.getByTestId('email-input').fill('test@example.com');
await page.getByTestId('login-button').click();
});
test('✅ 2. Use ARIA roles for semantic elements', async ({ page }) => {
await page.goto('/dashboard');
// Semantic elements (headings, navigation, forms) use ARIA
await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
await page.getByRole('navigation').getByRole('link', { name: 'Settings' }).click();
});
test('✅ 3. Avoid CSS classes (except when testing styles)', async ({ page }) => {
await page.goto('/products');
// ❌ Never for interaction: page.locator('.btn-primary')
// ✅ Only for visual regression: await expect(page.locator('.error-banner')).toHaveCSS('color', 'rgb(255, 0, 0)')
});
test('✅ 4. Use filter() instead of nth() for lists', async ({ page }) => {
await page.goto('/orders');
// List selection should be content-based
await page.getByTestId('order-row').filter({ hasText: 'Order #12345' }).click();
});
test('✅ 5. Selectors are human-readable', async ({ page }) => {
await page.goto('/checkout');
// ✅ Good: Clear intent
await page.getByTestId('shipping-address-form').getByLabel('Street Address').fill('123 Main St');
// ❌ Bad: Cryptic
// await page.locator('div > div:nth-child(2) > input[type="text"]').fill('123 Main St')
});
});
```
**Validation Rules**:
1. **Interactive elements** (buttons, inputs) → data-testid
2. **Semantic elements** (headings, nav, forms) → ARIA roles
3. **CSS classes** → Avoid (except visual regression tests)
4. **Lists** → filter() over nth() (content-based selection)
5. **Readability** → Selectors document user intent (clear, semantic)
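These rules can be spot-checked mechanically during review. A rough scanner sketch (the regexes are heuristics; file handling is illustrative):

```typescript
// scripts/scan-brittle-selectors.ts - heuristic lint for selector anti-patterns
import { readFileSync } from 'fs';

const BRITTLE_PATTERNS: Array<{ pattern: RegExp; hint: string }> = [
  { pattern: /locator\(\s*['"]\./, hint: 'CSS class selector - prefer getByTestId' },
  { pattern: /locator\(\s*['"]#/, hint: 'ID selector - prefer getByTestId' },
  { pattern: /\.nth\((?!0\))/, hint: 'arbitrary nth() index - prefer filter()' },
  { pattern: /xpath=/, hint: 'XPath - prefer getByRole' },
];

export function scanFile(path: string): string[] {
  const findings: string[] = [];
  readFileSync(path, 'utf8')
    .split('\n')
    .forEach((line, i) => {
      for (const { pattern, hint } of BRITTLE_PATTERNS) {
        if (pattern.test(line)) findings.push(`${path}:${i + 1} ${hint}`);
      }
    });
  return findings;
}
```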
---
## Selector Resilience Checklist
Before deploying selectors:
- [ ] **Hierarchy followed**: data-testid (1st choice) > ARIA (2nd) > text (3rd) > CSS/ID (last resort)
- [ ] **Interactive elements use data-testid**: Buttons, inputs, links have dedicated test attributes
- [ ] **Semantic elements use ARIA**: Headings, navigation, forms use roles and accessible names
- [ ] **No brittle patterns**: No CSS classes (except visual tests), no arbitrary nth(), no complex XPath
- [ ] **Dynamic content handled**: Regex for IDs/timestamps, filter() for lists, partial matching for text
- [ ] **Selectors are scoped**: Use container locators to narrow scope (prevent ambiguity)
- [ ] **Human-readable**: Selectors document user intent (clear, semantic, maintainable)
- [ ] **Validated in Inspector**: Test selectors interactively before committing (page.pause())
## Integration Points
- **Used in workflows**: `*atdd` (generate tests with robust selectors), `*automate` (healing selector failures), `*test-review` (validate selector quality)
- **Related fragments**: `test-healing-patterns.md` (selector failure diagnosis), `fixture-architecture.md` (page object alternatives), `test-quality.md` (maintainability standards)
- **Tools**: Playwright Inspector (Pick Locator), DevTools console, Playwright MCP browser_generate_locator (optional)
_Source: Playwright selector best practices, accessibility guidelines (ARIA), production test maintenance patterns_

# Test Healing Patterns
## Principle
Common test failures follow predictable patterns (stale selectors, race conditions, dynamic data assertions, network errors, hard waits). **Automated healing** identifies failure signatures and applies pattern-based fixes. Manual healing captures these patterns for future automation.
## Rationale
**The Problem**: Test failures waste developer time on repetitive debugging. Teams manually fix the same selector issues, timing bugs, and data mismatches repeatedly across test suites.
**The Solution**: Catalog common failure patterns with diagnostic signatures and automated fixes. When a test fails, match the error message/stack trace against known patterns and apply the corresponding fix. This transforms test maintenance from reactive debugging to proactive pattern application.
**Why This Matters**:
- Reduces test maintenance time by 60-80% (pattern-based fixes vs manual debugging)
- Prevents flakiness regression (same bug fixed once, applied everywhere)
- Builds institutional knowledge (failure catalog grows over time)
- Enables self-healing test suites (automate workflow validates and heals)
## Pattern Examples
### Example 1: Common Failure Pattern - Stale Selectors (Element Not Found)
**Context**: Test fails with "Element not found" or "Locator resolved to 0 elements" errors
**Diagnostic Signature**:
```typescript
// src/testing/healing/selector-healing.ts
export type SelectorFailure = {
errorMessage: string;
stackTrace: string;
selector: string;
testFile: string;
lineNumber: number;
};
/**
* Detect stale selector failures
*/
export function isSelectorFailure(error: Error): boolean {
const patterns = [
/locator.*resolved to 0 elements/i,
/element not found/i,
/waiting for locator.*to be visible/i,
/selector.*did not match any elements/i,
/unable to find element/i,
];
return patterns.some((pattern) => pattern.test(error.message));
}
/**
* Extract selector from error message
*/
export function extractSelector(errorMessage: string): string | null {
// Playwright: "locator('button[type=\"submit\"]') resolved to 0 elements"
const playwrightMatch = errorMessage.match(/locator\('([^']+)'\)/);
if (playwrightMatch) return playwrightMatch[1];
// Cypress: "Timed out retrying: Expected to find element: '.submit-button'"
const cypressMatch = errorMessage.match(/Expected to find element: ['"]([^'"]+)['"]/i);
if (cypressMatch) return cypressMatch[1];
return null;
}
/**
* Suggest better selector based on hierarchy
*/
export function suggestBetterSelector(badSelector: string): string {
// If using CSS class → suggest data-testid
if (badSelector.startsWith('.') || badSelector.includes('class=')) {
const elementName = badSelector.match(/class=["']([^"']+)["']/)?.[1] || badSelector.slice(1);
return `page.getByTestId('${elementName}') // Prefer data-testid over CSS class`;
}
// If using ID → suggest data-testid
if (badSelector.startsWith('#')) {
return `page.getByTestId('${badSelector.slice(1)}') // Prefer data-testid over ID`;
}
// If using nth() → suggest filter() or more specific selector
if (badSelector.includes('.nth(')) {
return `page.locator('${badSelector.split('.nth(')[0]}').filter({ hasText: 'specific text' }) // Avoid brittle nth(), use filter()`;
}
// If using complex CSS → suggest ARIA role
if (badSelector.includes('>') || badSelector.includes('+')) {
return `page.getByRole('button', { name: 'Submit' }) // Prefer ARIA roles over complex CSS`;
}
return `page.getByTestId('...') // Add data-testid attribute to element`;
}
```
**Healing Implementation**:
```typescript
// tests/healing/selector-healing.spec.ts
import { test, expect } from '@playwright/test';
import { isSelectorFailure, extractSelector, suggestBetterSelector } from '../../src/testing/healing/selector-healing';
test('heal stale selector failures automatically', async ({ page }) => {
await page.goto('/dashboard');
try {
// Original test with brittle CSS selector
await page.locator('.btn-primary').click();
} catch (error: any) {
if (isSelectorFailure(error)) {
const badSelector = extractSelector(error.message);
const suggestion = badSelector ? suggestBetterSelector(badSelector) : null;
console.log('HEALING SUGGESTION:', suggestion);
// Apply healed selector
await page.getByTestId('submit-button').click(); // Fixed!
} else {
throw error; // Not a selector issue, rethrow
}
}
await expect(page.getByText('Success')).toBeVisible();
});
```
**Key Points**:
- Diagnosis: Error message contains "locator resolved to 0 elements" or "element not found"
- Fix: Replace brittle selector (CSS class, ID, nth) with robust alternative (data-testid, ARIA role)
- Prevention: Follow selector hierarchy (data-testid > ARIA > text > CSS)
- Automation: Pattern matching on error message + stack trace
---
### Example 2: Common Failure Pattern - Race Conditions (Timing Errors)
**Context**: Test fails with "timeout waiting for element" or "element not visible" errors
**Diagnostic Signature**:
```typescript
// src/testing/healing/timing-healing.ts
export type TimingFailure = {
errorMessage: string;
testFile: string;
lineNumber: number;
actionType: 'click' | 'fill' | 'waitFor' | 'expect';
};
/**
* Detect race condition failures
*/
export function isTimingFailure(error: Error): boolean {
const patterns = [
/timeout.*waiting for/i,
/element is not visible/i,
/element is not attached to the dom/i,
/waiting for element to be visible.*exceeded/i,
/timed out retrying/i,
/waitForLoadState.*timeout/i,
];
return patterns.some((pattern) => pattern.test(error.message));
}
/**
* Detect hard wait anti-pattern
*/
export function hasHardWait(testCode: string): boolean {
const hardWaitPatterns = [/page\.waitForTimeout\(/, /cy\.wait\(\d+\)/, /await.*sleep\(/, /setTimeout\(/];
return hardWaitPatterns.some((pattern) => pattern.test(testCode));
}
/**
* Suggest deterministic wait replacement
*/
export function suggestDeterministicWait(testCode: string): string {
if (testCode.includes('page.waitForTimeout')) {
return `
// ❌ Bad: Hard wait (flaky)
// await page.waitForTimeout(3000)
// ✅ Good: Wait for network response
await page.waitForResponse(resp => resp.url().includes('/api/data') && resp.status() === 200)
// OR wait for element state
await page.getByTestId('loading-spinner').waitFor({ state: 'detached' })
`.trim();
}
if (testCode.includes('cy.wait(') && /cy\.wait\(\d+\)/.test(testCode)) {
return `
// ❌ Bad: Hard wait (flaky)
// cy.wait(3000)
// ✅ Good: Wait for aliased network request
cy.intercept('GET', '/api/data').as('getData')
cy.visit('/page')
cy.wait('@getData')
`.trim();
}
return `
// Add network-first interception BEFORE navigation:
await page.route('**/api/**', route => route.continue())
const responsePromise = page.waitForResponse('**/api/data')
await page.goto('/page')
await responsePromise
`.trim();
}
```
**Healing Implementation**:
```typescript
// tests/healing/timing-healing.spec.ts
import { test, expect } from '@playwright/test';
import { isTimingFailure, hasHardWait, suggestDeterministicWait } from '../../src/testing/healing/timing-healing';
test('heal race condition with network-first pattern', async ({ page, context }) => {
// Setup interception BEFORE navigation (prevent race)
await context.route('**/api/products', (route) => {
route.fulfill({
status: 200,
body: JSON.stringify({ products: [{ id: 1, name: 'Product A' }] }),
});
});
const responsePromise = page.waitForResponse('**/api/products');
await page.goto('/products');
await responsePromise; // Deterministic wait
// Element now reliably visible (no race condition)
await expect(page.getByText('Product A')).toBeVisible();
});
test('heal hard wait with event-based wait', async ({ page }) => {
await page.goto('/dashboard');
// ❌ Original (flaky): await page.waitForTimeout(3000)
// ✅ Healed: Wait for spinner to disappear
await page.getByTestId('loading-spinner').waitFor({ state: 'detached' });
// Element now reliably visible
await expect(page.getByText('Dashboard loaded')).toBeVisible();
});
```
**Key Points**:
- Diagnosis: Error contains "timeout" or "not visible", often after navigation
- Fix: Replace hard waits with network-first pattern or element state waits
- Prevention: ALWAYS intercept before navigate, use waitForResponse()
- Automation: Detect `page.waitForTimeout()` or `cy.wait(number)` in test code
---
### Example 3: Common Failure Pattern - Dynamic Data Assertions (Non-Deterministic IDs)
**Context**: Test fails with "Expected 'User 123' but received 'User 456'" or timestamp mismatches
**Diagnostic Signature**:
```typescript
// src/testing/healing/data-healing.ts
export type DataFailure = {
errorMessage: string;
expectedValue: string;
actualValue: string;
testFile: string;
lineNumber: number;
};
/**
* Detect dynamic data assertion failures
*/
export function isDynamicDataFailure(error: Error): boolean {
const patterns = [
/expected.*\d+.*received.*\d+/i, // ID mismatches
/expected.*\d{4}-\d{2}-\d{2}.*received/i, // Date mismatches
/expected.*user.*\d+/i, // Dynamic user IDs
/expected.*order.*\d+/i, // Dynamic order IDs
/expected.*to.*contain.*\d+/i, // Numeric assertions
];
return patterns.some((pattern) => pattern.test(error.message));
}
/**
* Suggest flexible assertion pattern
*/
export function suggestFlexibleAssertion(errorMessage: string): string {
if (/expected.*user.*\d+/i.test(errorMessage)) {
return `
// ❌ Bad: Hardcoded ID
// await expect(page.getByText('User 123')).toBeVisible()
// ✅ Good: Regex pattern for any user ID
await expect(page.getByText(/User \\d+/)).toBeVisible()
// OR use partial match
await expect(page.locator('[data-testid="user-name"]')).toContainText('User')
`.trim();
}
if (/expected.*\d{4}-\d{2}-\d{2}/i.test(errorMessage)) {
return `
// ❌ Bad: Hardcoded date
// await expect(page.getByText('2024-01-15')).toBeVisible()
// ✅ Good: Dynamic date validation
const today = new Date().toISOString().split('T')[0]
await expect(page.getByTestId('created-date')).toHaveText(today)
// OR use date format regex
await expect(page.getByTestId('created-date')).toHaveText(/\\d{4}-\\d{2}-\\d{2}/)
`.trim();
}
if (/expected.*order.*\d+/i.test(errorMessage)) {
return `
// ❌ Bad: Hardcoded order ID
// const orderId = '12345'
// ✅ Good: Capture dynamic order ID
const orderText = await page.getByTestId('order-id').textContent()
const orderId = orderText?.match(/Order #(\\d+)/)?.[1]
expect(orderId).toBeTruthy()
// Use captured ID in later assertions
await expect(page.getByText(\`Order #\${orderId} confirmed\`)).toBeVisible()
`.trim();
}
return `Use regex patterns, partial matching, or capture dynamic values instead of hardcoding`;
}
```
**Healing Implementation**:
```typescript
// tests/healing/data-healing.spec.ts
import { test, expect } from '@playwright/test';
test('heal dynamic ID assertion with regex', async ({ page }) => {
await page.goto('/users');
// ❌ Original (fails with random IDs): await expect(page.getByText('User 123')).toBeVisible()
// ✅ Healed: Regex pattern matches any user ID
await expect(page.getByText(/User \d+/)).toBeVisible();
});
test('heal timestamp assertion with dynamic generation', async ({ page }) => {
await page.goto('/dashboard');
// ❌ Original (fails daily): await expect(page.getByText('2024-01-15')).toBeVisible()
// ✅ Healed: Generate expected date dynamically
const today = new Date().toISOString().split('T')[0];
await expect(page.getByTestId('last-updated')).toContainText(today);
});
test('heal order ID assertion with capture', async ({ page, request }) => {
// Create order via API (dynamic ID)
const response = await request.post('/api/orders', {
data: { productId: '123', quantity: 1 },
});
const { orderId } = await response.json();
// ✅ Healed: Use captured dynamic ID
await page.goto(`/orders/${orderId}`);
await expect(page.getByText(`Order #${orderId}`)).toBeVisible();
});
```
**Key Points**:
- Diagnosis: Error message shows expected vs actual value mismatch with IDs/timestamps
- Fix: Use regex patterns (`/User \d+/`), partial matching, or capture dynamic values
- Prevention: Never hardcode IDs, timestamps, or random data in assertions
- Automation: Parse error message for expected/actual values, suggest regex patterns
---
### Example 4: Common Failure Pattern - Network Errors (Missing Route Interception)
**Context**: Test fails with "API call failed" or "500 error" during test execution
**Diagnostic Signature**:
```typescript
// src/testing/healing/network-healing.ts
export type NetworkFailure = {
errorMessage: string;
url: string;
statusCode: number;
method: string;
};
/**
* Detect network failure
*/
export function isNetworkFailure(error: Error): boolean {
const patterns = [
/api.*call.*failed/i,
/request.*failed/i,
/network.*error/i,
/500.*internal server error/i,
/503.*service unavailable/i,
/fetch.*failed/i,
];
return patterns.some((pattern) => pattern.test(error.message));
}
/**
* Suggest route interception
*/
export function suggestRouteInterception(url: string, method: string): string {
return `
// ❌ Bad: Real API call (unreliable, slow, external dependency)
// ✅ Good: Mock API response with route interception
await page.route('${url}', route => {
route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({
// Mock response data
id: 1,
name: 'Test User',
email: 'test@example.com'
})
})
})
// Then perform action
await page.goto('/page')
`.trim();
}
```
**Healing Implementation**:
```typescript
// tests/healing/network-healing.spec.ts
import { test, expect } from '@playwright/test';
test('heal network failure with route mocking', async ({ page, context }) => {
// ✅ Healed: Mock API to prevent real network calls
await context.route('**/api/products', (route) => {
route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({
products: [
{ id: 1, name: 'Product A', price: 29.99 },
{ id: 2, name: 'Product B', price: 49.99 },
],
}),
});
});
await page.goto('/products');
// Test now reliable (no external API dependency)
await expect(page.getByText('Product A')).toBeVisible();
await expect(page.getByText('$29.99')).toBeVisible();
});
test('heal 500 error with error state mocking', async ({ page, context }) => {
// Mock API failure scenario
await context.route('**/api/products', (route) => {
route.fulfill({ status: 500, body: JSON.stringify({ error: 'Internal Server Error' }) });
});
await page.goto('/products');
// Verify error handling (not crash)
await expect(page.getByText('Unable to load products')).toBeVisible();
await expect(page.getByRole('button', { name: 'Retry' })).toBeVisible();
});
```
**Key Points**:
- Diagnosis: Error message contains "API call failed", "500 error", or network-related failures
- Fix: Add `page.route()` or `cy.intercept()` to mock API responses
- Prevention: Mock ALL external dependencies (APIs, third-party services)
- Automation: Extract URL from error message, generate route interception code
---
### Example 5: Common Failure Pattern - Hard Waits (Unreliable Timing)
**Context**: Test fails intermittently with "timeout exceeded" or passes/fails randomly
**Diagnostic Signature**:
```typescript
// src/testing/healing/hard-wait-healing.ts
/**
* Detect hard wait anti-pattern in test code
*/
export function detectHardWaits(testCode: string): Array<{ line: number; code: string }> {
const lines = testCode.split('\n');
const violations: Array<{ line: number; code: string }> = [];
lines.forEach((line, index) => {
if (line.includes('page.waitForTimeout(') || /cy\.wait\(\d+\)/.test(line) || line.includes('sleep(') || line.includes('setTimeout(')) {
violations.push({ line: index + 1, code: line.trim() });
}
});
return violations;
}
/**
* Suggest event-based wait replacement
*/
export function suggestEventBasedWait(hardWaitLine: string): string {
if (hardWaitLine.includes('page.waitForTimeout')) {
return `
// ❌ Bad: Hard wait (flaky)
${hardWaitLine}
// ✅ Good: Wait for network response
await page.waitForResponse(resp => resp.url().includes('/api/') && resp.ok())
// OR wait for element state change
await page.getByTestId('loading-spinner').waitFor({ state: 'detached' })
await page.getByTestId('content').waitFor({ state: 'visible' })
`.trim();
}
if (/cy\.wait\(\d+\)/.test(hardWaitLine)) {
return `
// ❌ Bad: Hard wait (flaky)
${hardWaitLine}
// ✅ Good: Wait for aliased request
cy.intercept('GET', '/api/data').as('getData')
cy.visit('/page')
cy.wait('@getData') // Deterministic
`.trim();
}
return 'Replace hard waits with event-based waits (waitForResponse, waitFor state changes)';
}
```
**Healing Implementation**:
```typescript
// tests/healing/hard-wait-healing.spec.ts
import { test, expect } from '@playwright/test';
test('heal hard wait with deterministic wait', async ({ page }) => {
await page.goto('/dashboard');
// ❌ Original (flaky): await page.waitForTimeout(3000)
// ✅ Healed: Wait for loading spinner to disappear
await page.getByTestId('loading-spinner').waitFor({ state: 'detached' });
  // OR wait for a specific network response - register the promise BEFORE the
  // triggering action (goto/click); awaiting it only afterwards can hang if the
  // response has already fired:
  // const dashboardPromise = page.waitForResponse((resp) => resp.url().includes('/api/dashboard') && resp.ok());
  // await page.goto('/dashboard'); await dashboardPromise;
await expect(page.getByText('Dashboard ready')).toBeVisible();
});
test('heal implicit wait with explicit network wait', async ({ page }) => {
const responsePromise = page.waitForResponse('**/api/products');
await page.goto('/products');
// ❌ Original (race condition): await page.getByText('Product A').click()
// ✅ Healed: Wait for network first
await responsePromise;
await page.getByText('Product A').click();
await expect(page).toHaveURL(/\/products\/\d+/);
});
```
**Key Points**:
- Diagnosis: Test code contains `page.waitForTimeout()` or `cy.wait(number)`
- Fix: Replace with `waitForResponse()`, `waitFor({ state })`, or aliased intercepts
- Prevention: NEVER use hard waits, always use event-based/response-based waits
- Automation: Scan test code for hard wait patterns, suggest deterministic replacements
---
## Healing Pattern Catalog
| Failure Type | Diagnostic Signature | Healing Strategy | Prevention Pattern |
| -------------- | --------------------------------------------- | ------------------------------------- | ----------------------------------------- |
| Stale Selector | "locator resolved to 0 elements" | Replace with data-testid or ARIA role | Selector hierarchy (testid > ARIA > text) |
| Race Condition | "timeout waiting for element" | Add network-first interception | Intercept before navigate |
| Dynamic Data | "Expected 'User 123' but got 'User 456'" | Use regex or capture dynamic values | Never hardcode IDs/timestamps |
| Network Error | "API call failed", "500 error" | Add route mocking | Mock all external dependencies |
| Hard Wait | Test contains `waitForTimeout()` or `wait(n)` | Replace with event-based waits | Always use deterministic waits |
## Healing Workflow
1. **Run test** → Capture failure
2. **Identify pattern** → Match error against diagnostic signatures
3. **Apply fix** → Use pattern-based healing strategy
4. **Re-run test** → Validate fix (max 3 iterations)
5. **Mark unfixable** → Use `test.fixme()` if healing fails after 3 attempts
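A minimal sketch of this loop, assuming hypothetical `runTest`/`matchPattern`/`readCode`/`writeCode` hooks into your runner and codebase; only the control flow (max 3 attempts, then `test.fixme()`) comes from the workflow above:
```typescript
// src/testing/healing/heal-loop.ts (sketch)
type TestResult = { passed: boolean; error?: Error };
type Fix = (testCode: string) => string;

export async function healLoop(
  runTest: () => Promise<TestResult>, // hypothetical: executes the failing test
  matchPattern: (error: Error) => Fix | null, // hypothetical: catalog lookup by diagnostic signature
  readCode: () => Promise<string>, // hypothetical: read the spec file
  writeCode: (code: string) => Promise<void>, // hypothetical: write the healed spec
  maxAttempts = 3,
): Promise<'passed' | 'fixme'> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await runTest(); // 1. Run test, capture failure
    if (result.passed) return 'passed'; // 4. Re-run validated the fix
    const fix = result.error ? matchPattern(result.error) : null; // 2. Identify pattern
    if (!fix) break; // no known diagnostic signature - stop early
    await writeCode(fix(await readCode())); // 3. Apply pattern-based fix, loop to re-run
  }
  return 'fixme'; // 5. Mark unfixable - annotate the test with test.fixme()
}
```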
## Healing Checklist
Before enabling auto-healing in workflows:
- [ ] **Failure catalog documented**: Common patterns identified (selectors, timing, data, network, hard waits)
- [ ] **Diagnostic signatures defined**: Error message patterns for each failure type
- [ ] **Healing strategies documented**: Fix patterns for each failure type
- [ ] **Prevention patterns documented**: Best practices to avoid recurrence
- [ ] **Healing iteration limit set**: Max 3 attempts before marking test.fixme()
- [ ] **MCP integration optional**: Graceful degradation without Playwright MCP
- [ ] **Pattern-based fallback**: Use knowledge base patterns when MCP unavailable
- [ ] **Healing report generated**: Document what was healed and how
## Integration Points
- **Used in workflows**: `*automate` (auto-healing after test generation), `*atdd` (optional healing for acceptance tests)
- **Related fragments**: `selector-resilience.md` (selector debugging), `timing-debugging.md` (race condition fixes), `network-first.md` (interception patterns), `data-factories.md` (dynamic data handling)
- **Tools**: Error message parsing, AST analysis for code patterns, Playwright MCP (optional), pattern matching
_Source: Playwright test-healer patterns, production test failure analysis, common anti-patterns from test-resources-for-ai_

View File

@@ -0,0 +1,473 @@
<!-- Powered by BMAD-CORE™ -->
# Test Levels Framework
Comprehensive guide for determining appropriate test levels (unit, integration, E2E) for different scenarios.
## Test Level Decision Matrix
### Unit Tests
**When to use:**
- Testing pure functions and business logic
- Algorithm correctness
- Input validation and data transformation
- Error handling in isolated components
- Complex calculations or state machines
**Characteristics:**
- Fast execution (immediate feedback)
- No external dependencies (DB, API, file system)
- Highly maintainable and stable
- Easy to debug failures
**Example scenarios:**
```yaml
unit_test:
component: 'PriceCalculator'
scenario: 'Calculate discount with multiple rules'
justification: 'Complex business logic with multiple branches'
mock_requirements: 'None - pure function'
```
### Integration Tests
**When to use:**
- Component interaction verification
- Database operations and transactions
- API endpoint contracts
- Service-to-service communication
- Middleware and interceptor behavior
**Characteristics:**
- Moderate execution time
- Tests component boundaries
- May use test databases or containers
- Validates system integration points
**Example scenarios:**
```yaml
integration_test:
components: ['UserService', 'AuthRepository']
scenario: 'Create user with role assignment'
justification: 'Critical data flow between service and persistence'
test_environment: 'In-memory database'
```
### End-to-End Tests
**When to use:**
- Critical user journeys
- Cross-system workflows
- Visual regression testing
- Compliance and regulatory requirements
- Final validation before release
**Characteristics:**
- Slower execution
- Tests complete workflows
- Requires full environment setup
- Most realistic but most brittle
**Example scenarios:**
```yaml
e2e_test:
journey: 'Complete checkout process'
scenario: 'User purchases with saved payment method'
justification: 'Revenue-critical path requiring full validation'
environment: 'Staging with test payment gateway'
```
## Test Level Selection Rules
### Favor Unit Tests When:
- Logic can be isolated
- No side effects involved
- Fast feedback needed
- High cyclomatic complexity
### Favor Integration Tests When:
- Testing persistence layer
- Validating service contracts
- Testing middleware/interceptors
- Component boundaries critical
### Favor E2E Tests When:
- User-facing critical paths
- Multi-system interactions
- Regulatory compliance scenarios
- Visual regression important
## Anti-patterns to Avoid
- E2E testing for business logic validation
- Unit testing framework behavior
- Integration testing third-party libraries
- Duplicate coverage across levels
## Duplicate Coverage Guard
**Before adding any test, check:**
1. Is this already tested at a lower level?
2. Can a unit test cover this instead of integration?
3. Can an integration test cover this instead of E2E?
**Coverage overlap is only acceptable when:**
- Testing different aspects (unit: logic, integration: interaction, e2e: user experience)
- Critical paths requiring defense in depth
- Regression prevention for previously broken functionality
## Test Naming Conventions
- Unit: `test_{component}_{scenario}`
- Integration: `test_{flow}_{interaction}`
- E2E: `test_{journey}_{outcome}`
## Test ID Format
`{EPIC}.{STORY}-{LEVEL}-{SEQ}`
Examples:
- `1.3-UNIT-001`
- `1.3-INT-002`
- `1.3-E2E-001`
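A small validator makes the format checkable in tooling; the regex below is inferred from these examples (epic.story, uppercase level, three-digit sequence) and should be treated as an assumption:
```typescript
const TEST_ID_PATTERN = /^\d+\.\d+-(UNIT|INT|E2E)-\d{3}$/;

export function isValidTestId(id: string): boolean {
  return TEST_ID_PATTERN.test(id);
}

isValidTestId('1.3-UNIT-001'); // true
isValidTestId('1.3-unit-1'); // false - level must be uppercase, sequence zero-padded
```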
## Real Code Examples
### Example 1: E2E Test (Full User Journey)
**Scenario**: User logs in, navigates to dashboard, and places an order.
```typescript
// tests/e2e/checkout-flow.spec.ts
import { test, expect } from '@playwright/test';
import { createUser, createProduct } from '../test-utils/factories';
// NOTE: apiRequest below is a project-specific fixture (see fixture-architecture.md);
// with vanilla @playwright/test, use the built-in `request` fixture instead.
test.describe('Checkout Flow', () => {
test('user can complete purchase with saved payment method', async ({ page, apiRequest }) => {
// Setup: Seed data via API (fast!)
const user = createUser({ email: 'buyer@example.com', hasSavedCard: true });
const product = createProduct({ name: 'Widget', price: 29.99, stock: 10 });
await apiRequest.post('/api/users', { data: user });
await apiRequest.post('/api/products', { data: product });
    // Network-first: register each waitForResponse BEFORE its triggering action
    // (registering all of them up front risks waitForResponse timeouts on long journeys)
    const loginPromise = page.waitForResponse('**/api/auth/login');
// Step 1: Login
await page.goto('/login');
await page.fill('[data-testid="email"]', user.email);
await page.fill('[data-testid="password"]', 'password123');
await page.click('[data-testid="login-button"]');
await loginPromise;
// Assert: Dashboard visible
await expect(page).toHaveURL('/dashboard');
await expect(page.getByText(`Welcome, ${user.name}`)).toBeVisible();
// Step 2: Add product to cart
await page.goto(`/products/${product.id}`);
await page.click('[data-testid="add-to-cart"]');
await cartPromise;
await expect(page.getByText('Added to cart')).toBeVisible();
// Step 3: Checkout with saved payment
await page.goto('/checkout');
await expect(page.getByText('Visa ending in 1234')).toBeVisible(); // Saved card
await page.click('[data-testid="use-saved-card"]');
await page.click('[data-testid="place-order"]');
await orderPromise;
// Assert: Order confirmation
await expect(page.getByText('Order Confirmed')).toBeVisible();
await expect(page.getByText(/Order #\d+/)).toBeVisible();
await expect(page.getByText('$29.99')).toBeVisible();
});
});
```
**Key Points (E2E)**:
- Tests complete user journey across multiple pages
- API setup for data (fast), UI for assertions (user-centric)
- Network-first interception to prevent flakiness
- Validates critical revenue path end-to-end
### Example 2: Integration Test (API/Service Layer)
**Scenario**: UserService creates user and assigns role via AuthRepository.
```typescript
// tests/integration/user-service.spec.ts
import { test, expect } from '@playwright/test';
import { createUser } from '../test-utils/factories';
test.describe('UserService Integration', () => {
test('should create user with admin role via API', async ({ request }) => {
const userData = createUser({ role: 'admin' });
// Direct API call (no UI)
const response = await request.post('/api/users', {
data: userData,
});
expect(response.status()).toBe(201);
const createdUser = await response.json();
expect(createdUser.id).toBeTruthy();
expect(createdUser.email).toBe(userData.email);
expect(createdUser.role).toBe('admin');
// Verify database state
const getResponse = await request.get(`/api/users/${createdUser.id}`);
expect(getResponse.status()).toBe(200);
const fetchedUser = await getResponse.json();
expect(fetchedUser.role).toBe('admin');
expect(fetchedUser.permissions).toContain('user:delete');
expect(fetchedUser.permissions).toContain('user:update');
// Cleanup
await request.delete(`/api/users/${createdUser.id}`);
});
test('should validate email uniqueness constraint', async ({ request }) => {
const userData = createUser({ email: 'duplicate@example.com' });
// Create first user
const response1 = await request.post('/api/users', { data: userData });
expect(response1.status()).toBe(201);
const user1 = await response1.json();
// Attempt duplicate email
const response2 = await request.post('/api/users', { data: userData });
expect(response2.status()).toBe(409); // Conflict
const error = await response2.json();
expect(error.message).toContain('Email already exists');
// Cleanup
await request.delete(`/api/users/${user1.id}`);
});
});
```
**Key Points (Integration)**:
- Tests service layer + database interaction
- No UI involved—pure API validation
- Business logic focus (role assignment, constraints)
- Faster than E2E, more realistic than unit tests
### Example 3: Component Test (Isolated UI Component)
**Scenario**: Test button component in isolation with props and user interactions.
```typescript
// src/components/Button.cy.tsx (Cypress Component Test)
import { Button } from './Button';
describe('Button Component', () => {
it('should render with correct label', () => {
cy.mount(<Button label="Click Me" />);
cy.contains('Click Me').should('be.visible');
});
it('should call onClick handler when clicked', () => {
const onClickSpy = cy.stub().as('onClick');
cy.mount(<Button label="Submit" onClick={onClickSpy} />);
cy.get('button').click();
cy.get('@onClick').should('have.been.calledOnce');
});
it('should be disabled when disabled prop is true', () => {
cy.mount(<Button label="Disabled" disabled={true} />);
cy.get('button').should('be.disabled');
cy.get('button').should('have.attr', 'aria-disabled', 'true');
});
it('should show loading spinner when loading', () => {
cy.mount(<Button label="Loading" loading={true} />);
cy.get('[data-testid="spinner"]').should('be.visible');
cy.get('button').should('be.disabled');
});
it('should apply variant styles correctly', () => {
cy.mount(<Button label="Primary" variant="primary" />);
cy.get('button').should('have.class', 'btn-primary');
cy.mount(<Button label="Secondary" variant="secondary" />);
cy.get('button').should('have.class', 'btn-secondary');
});
});
// Playwright Component Test equivalent
import { test, expect } from '@playwright/experimental-ct-react';
import { Button } from './Button';
test.describe('Button Component', () => {
test('should call onClick handler when clicked', async ({ mount }) => {
let clicked = false;
const component = await mount(
<Button label="Submit" onClick={() => { clicked = true; }} />
);
await component.getByRole('button').click();
expect(clicked).toBe(true);
});
test('should be disabled when loading', async ({ mount }) => {
const component = await mount(<Button label="Loading" loading={true} />);
await expect(component.getByRole('button')).toBeDisabled();
await expect(component.getByTestId('spinner')).toBeVisible();
});
});
```
**Key Points (Component)**:
- Tests UI component in isolation (no full app)
- Props + user interactions + visual states
- Faster than E2E, more realistic than unit tests for UI
- Great for design system components
### Example 4: Unit Test (Pure Function)
**Scenario**: Test pure business logic function without framework dependencies.
```typescript
// src/utils/price-calculator.test.ts (Jest/Vitest)
import { calculateDiscount, applyTaxes, calculateTotal } from './price-calculator';
describe('PriceCalculator', () => {
describe('calculateDiscount', () => {
it('should apply percentage discount correctly', () => {
const result = calculateDiscount(100, { type: 'percentage', value: 20 });
expect(result).toBe(80);
});
it('should apply fixed amount discount correctly', () => {
const result = calculateDiscount(100, { type: 'fixed', value: 15 });
expect(result).toBe(85);
});
it('should not apply discount below zero', () => {
const result = calculateDiscount(10, { type: 'fixed', value: 20 });
expect(result).toBe(0);
});
it('should handle no discount', () => {
const result = calculateDiscount(100, { type: 'none', value: 0 });
expect(result).toBe(100);
});
});
describe('applyTaxes', () => {
it('should calculate tax correctly for US', () => {
const result = applyTaxes(100, { country: 'US', rate: 0.08 });
expect(result).toBe(108);
});
it('should calculate tax correctly for EU (VAT)', () => {
const result = applyTaxes(100, { country: 'DE', rate: 0.19 });
expect(result).toBe(119);
});
it('should handle zero tax rate', () => {
const result = applyTaxes(100, { country: 'US', rate: 0 });
expect(result).toBe(100);
});
});
describe('calculateTotal', () => {
it('should calculate total with discount and taxes', () => {
const items = [
{ price: 50, quantity: 2 }, // 100
{ price: 30, quantity: 1 }, // 30
];
const discount = { type: 'percentage', value: 10 }; // -13
const tax = { country: 'US', rate: 0.08 }; // +9.36
const result = calculateTotal(items, discount, tax);
expect(result).toBeCloseTo(126.36, 2);
});
it('should handle empty items array', () => {
const result = calculateTotal([], { type: 'none', value: 0 }, { country: 'US', rate: 0 });
expect(result).toBe(0);
});
it('should calculate correctly without discount or tax', () => {
const items = [{ price: 25, quantity: 4 }];
const result = calculateTotal(items, { type: 'none', value: 0 }, { country: 'US', rate: 0 });
expect(result).toBe(100);
});
});
});
```
**Key Points (Unit)**:
- Pure function testing—no framework dependencies
- Fast execution (milliseconds)
- Edge case coverage (zero, negative, empty inputs)
- High cyclomatic complexity handled at unit level
## When to Use Which Level
| Scenario | Unit | Integration | E2E |
| ---------------------- | ------------- | ----------------- | ------------- |
| Pure business logic | ✅ Primary | ❌ Overkill | ❌ Overkill |
| Database operations | ❌ Can't test | ✅ Primary | ❌ Overkill |
| API contracts | ❌ Can't test | ✅ Primary | ⚠️ Supplement |
| User journeys | ❌ Can't test | ❌ Can't test | ✅ Primary |
| Component props/events | ✅ Partial | ⚠️ Component test | ❌ Overkill |
| Visual regression | ❌ Can't test | ⚠️ Component test | ✅ Primary |
| Error handling (logic) | ✅ Primary | ⚠️ Integration | ❌ Overkill |
| Error handling (UI) | ⚠️ Partial | ⚠️ Component test | ✅ Primary |
## Anti-Pattern Examples
**❌ BAD: E2E test for business logic**
```typescript
// DON'T DO THIS
test('calculate discount via UI', async ({ page }) => {
await page.goto('/calculator');
await page.fill('[data-testid="price"]', '100');
await page.fill('[data-testid="discount"]', '20');
await page.click('[data-testid="calculate"]');
await expect(page.getByText('$80')).toBeVisible();
});
// Problem: Slow, brittle, tests logic that should be unit tested
```
**✅ GOOD: Unit test for business logic**
```typescript
test('calculate discount', () => {
expect(calculateDiscount(100, 20)).toBe(80);
});
// Fast, reliable, isolated
```
_Source: Murat Testing Philosophy (test pyramid), existing test-levels-framework.md structure._

View File

@@ -0,0 +1,373 @@
<!-- Powered by BMAD-CORE™ -->
# Test Priorities Matrix
Guide for prioritizing test scenarios based on risk, criticality, and business impact.
## Priority Levels
### P0 - Critical (Must Test)
**Criteria:**
- Revenue-impacting functionality
- Security-critical paths
- Data integrity operations
- Regulatory compliance requirements
- Previously broken functionality (regression prevention)
**Examples:**
- Payment processing
- Authentication/authorization
- User data creation/deletion
- Financial calculations
- GDPR/privacy compliance
**Testing Requirements:**
- Comprehensive coverage at all levels
- Both happy and unhappy paths
- Edge cases and error scenarios
- Performance under load
### P1 - High (Should Test)
**Criteria:**
- Core user journeys
- Frequently used features
- Features with complex logic
- Integration points between systems
- Features affecting user experience
**Examples:**
- User registration flow
- Search functionality
- Data import/export
- Notification systems
- Dashboard displays
**Testing Requirements:**
- Primary happy paths required
- Key error scenarios
- Critical edge cases
- Basic performance validation
### P2 - Medium (Nice to Test)
**Criteria:**
- Secondary features
- Admin functionality
- Reporting features
- Configuration options
- UI polish and aesthetics
**Examples:**
- Admin settings panels
- Report generation
- Theme customization
- Help documentation
- Analytics tracking
**Testing Requirements:**
- Happy path coverage
- Basic error handling
- Can defer edge cases
### P3 - Low (Test if Time Permits)
**Criteria:**
- Rarely used features
- Nice-to-have functionality
- Cosmetic issues
- Non-critical optimizations
**Examples:**
- Advanced preferences
- Legacy feature support
- Experimental features
- Debug utilities
**Testing Requirements:**
- Smoke tests only
- Can rely on manual testing
- Document known limitations
## Risk-Based Priority Adjustments
### Increase Priority When:
- High user impact (affects >50% of users)
- High financial impact (>$10K potential loss)
- Security vulnerability potential
- Compliance/legal requirements
- Customer-reported issues
- Complex implementation (>500 LOC)
- Multiple system dependencies
### Decrease Priority When:
- Feature flag protected
- Gradual rollout planned
- Strong monitoring in place
- Easy rollback capability
- Low usage metrics
- Simple implementation
- Well-isolated component
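These adjustments can be modeled as a shift up or down from a base priority. A sketch, assuming illustrative factor names and the rule that all three safety nets must hold before decreasing:
```typescript
type Priority = 'P0' | 'P1' | 'P2' | 'P3';
const LEVELS: Priority[] = ['P0', 'P1', 'P2', 'P3'];

type Adjustments = {
  customerReported?: boolean; // increase trigger
  securityRisk?: boolean; // increase trigger
  featureFlagProtected?: boolean; // decrease candidate
  strongMonitoring?: boolean; // decrease candidate
  easyRollback?: boolean; // decrease candidate
};

export function adjustPriority(base: Priority, adj: Adjustments): Priority {
  let idx = LEVELS.indexOf(base);
  if (adj.customerReported || adj.securityRisk) idx = Math.max(0, idx - 1); // bump up
  if (adj.featureFlagProtected && adj.strongMonitoring && adj.easyRollback) {
    idx = Math.min(LEVELS.length - 1, idx + 1); // only relax with all safety nets in place
  }
  return LEVELS[idx];
}
```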
## Test Coverage by Priority
| Priority | Unit Coverage | Integration Coverage | E2E Coverage |
| -------- | ------------- | -------------------- | ------------------ |
| P0 | >90% | >80% | All critical paths |
| P1 | >80% | >60% | Main happy paths |
| P2 | >60% | >40% | Smoke tests |
| P3 | Best effort | Best effort | Manual only |
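These bars can be enforced mechanically in CI. A sketch for a Vitest setup using the P0 row's 90% unit threshold (the tooling and values are assumptions about your stack):
```typescript
// vitest.config.ts (sketch)
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    coverage: {
      // P0 bar from the table above; relax per-suite for P1/P2 code paths
      thresholds: { statements: 90, branches: 90, functions: 90, lines: 90 },
    },
  },
});
```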
## Priority Assignment Rules
1. **Start with business impact** - What happens if this fails?
2. **Consider probability** - How likely is failure?
3. **Factor in detectability** - Would we know if it failed?
4. **Account for recoverability** - Can we fix it quickly?
## Priority Decision Tree
```
Is it revenue-critical?
├─ YES → P0
└─ NO → Does it affect core user journey?
├─ YES → Is it high-risk?
│ ├─ YES → P0
│ └─ NO → P1
└─ NO → Is it frequently used?
├─ YES → P1
└─ NO → Is it customer-facing?
├─ YES → P2
└─ NO → P3
```
## Test Execution Order
1. Execute P0 tests first (fail fast on critical issues)
2. Execute P1 tests second (core functionality)
3. Execute P2 tests if time permits
4. P3 tests only in full regression cycles
## Continuous Adjustment
Review and adjust priorities based on:
- Production incident patterns
- User feedback and complaints
- Usage analytics
- Test failure history
- Business priority changes
---
## Automated Priority Classification
### Example: Priority Calculator (Risk-Based Automation)
```typescript
// src/testing/priority-calculator.ts
export type Priority = 'P0' | 'P1' | 'P2' | 'P3';
export type PriorityFactors = {
revenueImpact: 'critical' | 'high' | 'medium' | 'low' | 'none';
userImpact: 'all' | 'majority' | 'some' | 'few' | 'minimal';
securityRisk: boolean;
complianceRequired: boolean;
previousFailure: boolean;
complexity: 'high' | 'medium' | 'low';
usage: 'frequent' | 'regular' | 'occasional' | 'rare';
};
/**
* Calculate test priority based on multiple factors
* Mirrors the priority decision tree with objective criteria
*/
export function calculatePriority(factors: PriorityFactors): Priority {
const { revenueImpact, userImpact, securityRisk, complianceRequired, previousFailure, complexity, usage } = factors;
// P0: Revenue-critical, security, or compliance
if (revenueImpact === 'critical' || securityRisk || complianceRequired || (previousFailure && revenueImpact === 'high')) {
return 'P0';
}
// P0: High revenue + high complexity + frequent usage
if (revenueImpact === 'high' && complexity === 'high' && usage === 'frequent') {
return 'P0';
}
// P1: Core user journey (majority impacted + frequent usage)
if (userImpact === 'all' || userImpact === 'majority') {
if (usage === 'frequent' || complexity === 'high') {
return 'P1';
}
}
// P1: High revenue OR high complexity with regular usage
if ((revenueImpact === 'high' && usage === 'regular') || (complexity === 'high' && usage === 'frequent')) {
return 'P1';
}
// P2: Secondary features (some impact, occasional usage)
if (userImpact === 'some' || usage === 'occasional') {
return 'P2';
}
// P3: Rarely used, low impact
return 'P3';
}
/**
* Generate priority justification (for audit trail)
*/
export function justifyPriority(factors: PriorityFactors): string {
const priority = calculatePriority(factors);
const reasons: string[] = [];
if (factors.revenueImpact === 'critical') reasons.push('critical revenue impact');
if (factors.securityRisk) reasons.push('security-critical');
if (factors.complianceRequired) reasons.push('compliance requirement');
if (factors.previousFailure) reasons.push('regression prevention');
if (factors.userImpact === 'all' || factors.userImpact === 'majority') {
reasons.push(`impacts ${factors.userImpact} users`);
}
if (factors.complexity === 'high') reasons.push('high complexity');
if (factors.usage === 'frequent') reasons.push('frequently used');
return `${priority}: ${reasons.join(', ')}`;
}
/**
* Example: Payment scenario priority calculation
*/
const paymentScenario: PriorityFactors = {
revenueImpact: 'critical',
userImpact: 'all',
securityRisk: true,
complianceRequired: true,
previousFailure: false,
complexity: 'high',
usage: 'frequent',
};
console.log(calculatePriority(paymentScenario)); // 'P0'
console.log(justifyPriority(paymentScenario));
// 'P0: critical revenue impact, security-critical, compliance requirement, impacts all users, high complexity, frequently used'
```
### Example: Test Suite Tagging Strategy
```typescript
// tests/e2e/checkout.spec.ts
import { test, expect } from '@playwright/test';
// Tag tests with priority for selective execution
test.describe('Checkout Flow', () => {
test('valid payment completes successfully @p0 @smoke @revenue', async ({ page }) => {
// P0: Revenue-critical happy path
await page.goto('/checkout');
await page.getByTestId('payment-method').selectOption('credit-card');
await page.getByTestId('card-number').fill('4242424242424242');
await page.getByRole('button', { name: 'Place Order' }).click();
await expect(page.getByText('Order confirmed')).toBeVisible();
});
test('expired card shows user-friendly error @p1 @error-handling', async ({ page }) => {
// P1: Core error scenario (frequent user impact)
await page.goto('/checkout');
await page.getByTestId('payment-method').selectOption('credit-card');
await page.getByTestId('card-number').fill('4000000000000069'); // Test card: expired
await page.getByRole('button', { name: 'Place Order' }).click();
await expect(page.getByText('Card expired. Please use a different card.')).toBeVisible();
});
test('coupon code applies discount correctly @p2', async ({ page }) => {
// P2: Secondary feature (nice-to-have)
await page.goto('/checkout');
await page.getByTestId('coupon-code').fill('SAVE10');
await page.getByRole('button', { name: 'Apply' }).click();
await expect(page.getByText('10% discount applied')).toBeVisible();
});
test('gift message formatting preserved @p3', async ({ page }) => {
// P3: Cosmetic feature (rarely used)
await page.goto('/checkout');
await page.getByTestId('gift-message').fill('Happy Birthday!\n\nWith love.');
await page.getByRole('button', { name: 'Place Order' }).click();
    // Message content preserved (asserting the linebreaks themselves would need a stricter check)
await expect(page.getByTestId('order-summary')).toContainText('Happy Birthday!');
});
});
```
**Run tests by priority:**
```bash
# P0 only (smoke tests, 2-5 min)
npx playwright test --grep @p0
# P0 + P1 (core functionality, 10-15 min)
npx playwright test --grep "@p0|@p1"
# Full regression (all priorities, 30+ min)
npx playwright test
```
---
## Integration with Risk Scoring
Priority should align with risk score from `probability-impact.md`:
| Risk Score | Typical Priority | Rationale |
| ---------- | ---------------- | ------------------------------------------ |
| 9 | P0 | Critical blocker (probability=3, impact=3) |
| 6-8 | P0 or P1 | High risk (requires mitigation) |
| 4-5 | P1 or P2 | Medium risk (monitor closely) |
| 1-3 | P2 or P3 | Low risk (document and defer) |
**Example**: Risk score 9 (checkout API failure) → P0 priority → comprehensive coverage required.
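Expressed as code, with `Priority` reused from the calculator above (the tie-breaks inside the "P0 or P1" and "P1 or P2" bands are illustrative assumptions):
```typescript
import type { Priority } from './priority-calculator'; // type defined earlier in this fragment

export function priorityFromRisk(probability: 1 | 2 | 3, impact: 1 | 2 | 3): Priority {
  const score = probability * impact; // risk score per probability-impact.md
  if (score === 9) return 'P0'; // critical blocker
  if (score >= 6) return impact === 3 ? 'P0' : 'P1'; // high risk - requires mitigation
  if (score >= 4) return 'P1'; // medium risk - monitor closely
  return 'P2'; // low risk (1-3) - document and defer (P3 if rarely used)
}
```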
---
## Priority Checklist
Before finalizing test priorities:
- [ ] **Revenue impact assessed**: Payment, subscription, billing features → P0
- [ ] **Security risks identified**: Auth, data exposure, injection attacks → P0
- [ ] **Compliance requirements documented**: GDPR, PCI-DSS, SOC2 → P0
- [ ] **User impact quantified**: >50% of users → P0/P1, <10% → P2/P3
- [ ] **Previous failures reviewed**: Regression prevention increases priority
- [ ] **Complexity evaluated**: >500 LOC or multiple dependencies → increase priority
- [ ] **Usage metrics consulted**: Frequent use → P0/P1, rare use → P2/P3
- [ ] **Monitoring coverage confirmed**: Strong monitoring → can decrease priority
- [ ] **Rollback capability verified**: Easy rollback → can decrease priority
- [ ] **Priorities tagged in tests**: @p0, @p1, @p2, @p3 for selective execution
## Integration Points
- **Used in workflows**: `*automate` (priority-based test generation), `*test-design` (scenario prioritization), `*trace` (coverage validation by priority)
- **Related fragments**: `risk-governance.md` (risk scoring), `probability-impact.md` (impact assessment), `selective-testing.md` (tag-based execution)
- **Tools**: Playwright/Cypress grep for tag filtering, CI scripts for priority-based execution
_Source: Risk-based testing practices, test prioritization strategies, production incident analysis_

View File

@@ -0,0 +1,664 @@
# Test Quality Definition of Done
## Principle
Tests must be deterministic, isolated, explicit, focused, and fast. Every test should execute in under 1.5 minutes, contain fewer than 300 lines, avoid hard waits and conditionals, keep assertions visible in test bodies, and clean up after itself for parallel execution.
## Rationale
Quality tests provide reliable signal about application health. Flaky tests erode confidence and waste engineering time. Tests that use hard waits (`waitForTimeout(3000)`) are non-deterministic and slow. Tests with hidden assertions or conditional logic become unmaintainable. Large tests (>300 lines) are hard to understand and debug. Slow tests (>1.5 min) block CI pipelines. Self-cleaning tests prevent state pollution in parallel runs.
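The timing budget can be enforced at the runner level. A minimal Playwright config sketch, where the 90-second timeout mirrors the 1.5-minute ceiling (the other values are illustrative, not mandated):
```typescript
// playwright.config.ts (sketch)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  timeout: 90_000, // hard-fail any test exceeding the 1.5-minute DoD budget
  expect: { timeout: 10_000 }, // keep individual assertions from masking slowness
  fullyParallel: true, // isolation + self-cleanup make parallel execution safe
});
```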
## Pattern Examples
### Example 1: Deterministic Test Pattern
**Context**: When writing tests, eliminate all sources of non-determinism: hard waits, conditionals controlling flow, try-catch for flow control, and random data without seeds.
**Implementation**:
```typescript
// ❌ BAD: Non-deterministic test with conditionals and hard waits
test('user can view dashboard - FLAKY', async ({ page }) => {
await page.goto('/dashboard');
await page.waitForTimeout(3000); // NEVER - arbitrary wait
// Conditional flow control - test behavior varies
if (await page.locator('[data-testid="welcome-banner"]').isVisible()) {
await page.click('[data-testid="dismiss-banner"]');
await page.waitForTimeout(500);
}
// Try-catch for flow control - hides real issues
try {
await page.click('[data-testid="load-more"]');
} catch (e) {
// Silently continue - test passes even if button missing
}
// Random data without control
const randomEmail = `user${Math.random()}@example.com`;
await expect(page.getByText(randomEmail)).toBeVisible(); // Will fail randomly
});
// ✅ GOOD: Deterministic test with explicit waits
test('user can view dashboard', async ({ page, apiRequest }) => {
  // apiRequest is a project fixture (see fixture-architecture.md), not built into @playwright/test
const user = createUser({ email: 'test@example.com', hasSeenWelcome: true });
// Setup via API (fast, controlled)
await apiRequest.post('/api/users', { data: user });
// Network-first: Intercept BEFORE navigate
const dashboardPromise = page.waitForResponse((resp) => resp.url().includes('/api/dashboard') && resp.status() === 200);
await page.goto('/dashboard');
// Wait for actual response, not arbitrary time
const dashboardResponse = await dashboardPromise;
const dashboard = await dashboardResponse.json();
// Explicit assertions with controlled data
await expect(page.getByText(`Welcome, ${user.name}`)).toBeVisible();
await expect(page.getByTestId('dashboard-items')).toHaveCount(dashboard.items.length);
// No conditionals - test always executes same path
// No try-catch - failures bubble up clearly
});
// Cypress equivalent
describe('Dashboard', () => {
it('should display user dashboard', () => {
const user = createUser({ email: 'test@example.com', hasSeenWelcome: true });
// Setup via task (fast, controlled)
cy.task('db:seed', { users: [user] });
// Network-first interception
cy.intercept('GET', '**/api/dashboard').as('getDashboard');
cy.visit('/dashboard');
// Deterministic wait for response
cy.wait('@getDashboard').then((interception) => {
const dashboard = interception.response.body;
// Explicit assertions
cy.contains(`Welcome, ${user.name}`).should('be.visible');
cy.get('[data-cy="dashboard-items"]').should('have.length', dashboard.items.length);
});
});
});
```
**Key Points**:
- Replace `waitForTimeout()` with `waitForResponse()` or element state checks
- Never use if/else to control test flow - tests should be deterministic
- Avoid try-catch for flow control - let failures bubble up clearly
- Use factory functions with controlled data, not `Math.random()`
- Network-first pattern prevents race conditions
### Example 2: Isolated Test with Cleanup
**Context**: When tests create data, they must clean up after themselves to prevent state pollution in parallel runs. Use fixture auto-cleanup or explicit teardown.
**Implementation**:
```typescript
// ❌ BAD: Test leaves data behind, pollutes other tests
test('admin can create user - POLLUTES STATE', async ({ page, apiRequest }) => {
await page.goto('/admin/users');
// Hardcoded email - collides in parallel runs
await page.fill('[data-testid="email"]', 'newuser@example.com');
await page.fill('[data-testid="name"]', 'New User');
await page.click('[data-testid="create-user"]');
await expect(page.getByText('User created')).toBeVisible();
// NO CLEANUP - user remains in database
// Next test run fails: "Email already exists"
});
// ✅ GOOD: Test cleans up with fixture auto-cleanup
// playwright/support/fixtures/database-fixture.ts
import { test as base } from '@playwright/test';
import { deleteRecord, seedDatabase } from '../helpers/db-helpers';
type DatabaseFixture = {
seedUser: (userData: Partial<User>) => Promise<User>;
};
export const test = base.extend<DatabaseFixture>({
seedUser: async ({}, use) => {
const createdUsers: string[] = [];
const seedUser = async (userData: Partial<User>) => {
const user = await seedDatabase('users', userData);
createdUsers.push(user.id); // Track for cleanup
return user;
};
await use(seedUser);
// Auto-cleanup: Delete all users created during test
for (const userId of createdUsers) {
await deleteRecord('users', userId);
}
createdUsers.length = 0;
},
});
// Use the fixture (assumes `import { faker } from '@faker-js/faker'` is in scope)
test('admin can create user', async ({ page, seedUser }) => {
// Create admin with unique data
const admin = await seedUser({
email: faker.internet.email(), // Unique each run
role: 'admin',
});
await page.goto('/admin/users');
const newUserEmail = faker.internet.email(); // Unique
await page.fill('[data-testid="email"]', newUserEmail);
await page.fill('[data-testid="name"]', 'New User');
await page.click('[data-testid="create-user"]');
await expect(page.getByText('User created')).toBeVisible();
  // Verify persistence via API - note seedUser would CREATE another user, not look one up
  // (the query-by-email endpoint below is an assumed convenience of this API)
  const listResponse = await page.request.get(`/api/users?email=${newUserEmail}`);
  const [createdUser] = await listResponse.json();
  expect(createdUser.email).toBe(newUserEmail);
// Auto-cleanup happens via fixture teardown
});
// Cypress equivalent with explicit cleanup
describe('Admin User Management', () => {
const createdUserIds: string[] = [];
afterEach(() => {
// Cleanup: Delete all users created during test
createdUserIds.forEach((userId) => {
cy.task('db:delete', { table: 'users', id: userId });
});
createdUserIds.length = 0;
});
it('should create user', () => {
const admin = createUser({ role: 'admin' });
const newUser = createUser(); // Unique data via faker
cy.task('db:seed', { users: [admin] }).then((result: any) => {
createdUserIds.push(result.users[0].id);
});
cy.visit('/admin/users');
cy.get('[data-cy="email"]').type(newUser.email);
cy.get('[data-cy="name"]').type(newUser.name);
cy.get('[data-cy="create-user"]').click();
cy.contains('User created').should('be.visible');
// Track for cleanup
cy.task('db:findByEmail', newUser.email).then((user: any) => {
createdUserIds.push(user.id);
});
});
});
```
**Key Points**:
- Use fixtures with auto-cleanup via teardown (after `use()`)
- Track all created resources in array during test execution
- Use `faker` for unique data - prevents parallel collisions
- Cypress: Use `afterEach()` with explicit cleanup
- Never hardcode IDs or emails - always generate unique values
### Example 3: Explicit Assertions in Tests
**Context**: When validating test results, keep assertions visible in test bodies. Never hide assertions in helper functions - this obscures test intent and makes failures harder to diagnose.
**Implementation**:
```typescript
// ❌ BAD: Assertions hidden in helper functions
// helpers/api-validators.ts
export async function validateUserCreation(response: Response, expectedEmail: string) {
const user = await response.json();
expect(response.status()).toBe(201);
expect(user.email).toBe(expectedEmail);
expect(user.id).toBeTruthy();
expect(user.createdAt).toBeTruthy();
// Hidden assertions - not visible in test
}
test('create user via API - OPAQUE', async ({ request }) => {
const userData = createUser({ email: 'test@example.com' });
const response = await request.post('/api/users', { data: userData });
// What assertions are running? Have to check helper.
await validateUserCreation(response, userData.email);
// When this fails, error is: "validateUserCreation failed" - NOT helpful
});
// ✅ GOOD: Assertions explicit in test
test('create user via API', async ({ request }) => {
const userData = createUser({ email: 'test@example.com' });
const response = await request.post('/api/users', { data: userData });
// All assertions visible - clear test intent
expect(response.status()).toBe(201);
const createdUser = await response.json();
expect(createdUser.id).toBeTruthy();
expect(createdUser.email).toBe(userData.email);
expect(createdUser.name).toBe(userData.name);
expect(createdUser.role).toBe('user');
expect(createdUser.createdAt).toBeTruthy();
expect(createdUser.isActive).toBe(true);
// When this fails, error is: "Expected role to be 'user', got 'admin'" - HELPFUL
});
// ✅ ACCEPTABLE: Helper for data extraction, NOT assertions
// helpers/api-extractors.ts
export async function extractUserFromResponse(response: Response): Promise<User> {
const user = await response.json();
return user; // Just extracts, no assertions
}
test('create user with extraction helper', async ({ request }) => {
const userData = createUser({ email: 'test@example.com' });
const response = await request.post('/api/users', { data: userData });
// Extract data with helper (OK)
const createdUser = await extractUserFromResponse(response);
// But keep assertions in test (REQUIRED)
expect(response.status()).toBe(201);
expect(createdUser.email).toBe(userData.email);
expect(createdUser.role).toBe('user');
});
// Cypress equivalent
describe('User API', () => {
it('should create user with explicit assertions', () => {
const userData = createUser({ email: 'test@example.com' });
cy.request('POST', '/api/users', userData).then((response) => {
// All assertions visible in test
expect(response.status).to.equal(201);
expect(response.body.id).to.exist;
expect(response.body.email).to.equal(userData.email);
expect(response.body.name).to.equal(userData.name);
expect(response.body.role).to.equal('user');
expect(response.body.createdAt).to.exist;
expect(response.body.isActive).to.be.true;
});
});
});
// ✅ GOOD: Parametrized tests for bulk validation (assertions stay explicit)
test.describe('User creation validation', () => {
const testCases = [
{ field: 'email', value: 'test@example.com', expected: 'test@example.com' },
{ field: 'name', value: 'Test User', expected: 'Test User' },
{ field: 'role', value: 'admin', expected: 'admin' },
{ field: 'isActive', value: true, expected: true },
];
for (const { field, value, expected } of testCases) {
test(`should set ${field} correctly`, async ({ request }) => {
const userData = createUser({ [field]: value });
const response = await request.post('/api/users', { data: userData });
const user = await response.json();
// Parametrized assertion - still explicit
expect(user[field]).toBe(expected);
});
}
});
```
**Key Points**:
- Never hide `expect()` calls in helper functions
- Helpers can extract/transform data, but assertions stay in tests
- Parametrized tests are acceptable for bulk validation (still explicit)
- Explicit assertions make failures actionable: "Expected X, got Y"
- Hidden assertions produce vague failures: "Helper function failed"
### Example 4: Test Length Limits
**Context**: When tests grow beyond 300 lines, they become hard to understand, debug, and maintain. Refactor long tests by extracting setup helpers, splitting scenarios, or using fixtures.
**Implementation**:
```typescript
// ❌ BAD: 400-line monolithic test (truncated for example)
test('complete user journey - TOO LONG', async ({ page, request }) => {
// 50 lines of setup
const admin = createUser({ role: 'admin' });
await request.post('/api/users', { data: admin });
await page.goto('/login');
await page.fill('[data-testid="email"]', admin.email);
await page.fill('[data-testid="password"]', 'password123');
await page.click('[data-testid="login"]');
await expect(page).toHaveURL('/dashboard');
// 100 lines of user creation
await page.goto('/admin/users');
const newUser = createUser();
await page.fill('[data-testid="email"]', newUser.email);
// ... 95 more lines of form filling, validation, etc.
// 100 lines of permissions assignment
await page.click('[data-testid="assign-permissions"]');
// ... 95 more lines
// 100 lines of notification preferences
await page.click('[data-testid="notification-settings"]');
// ... 95 more lines
// 50 lines of cleanup
await request.delete(`/api/users/${newUser.id}`);
// ... 45 more lines
// TOTAL: 400 lines - impossible to understand or debug
});
// ✅ GOOD: Split into focused tests with shared fixture
// playwright/support/fixtures/admin-fixture.ts
export const test = base.extend({
adminPage: async ({ page, request }, use) => {
// Shared setup: Login as admin
const admin = createUser({ role: 'admin' });
await request.post('/api/users', { data: admin });
await page.goto('/login');
await page.fill('[data-testid="email"]', admin.email);
await page.fill('[data-testid="password"]', 'password123');
await page.click('[data-testid="login"]');
await expect(page).toHaveURL('/dashboard');
await use(page); // Provide logged-in page
    // Teardown runs here after use() (e.g., delete the admin via API)
},
});
// Test 1: User creation (50 lines)
test('admin can create user', async ({ adminPage }) => {
await adminPage.goto('/admin/users');
const newUser = createUser();
await adminPage.fill('[data-testid="email"]', newUser.email);
await adminPage.fill('[data-testid="name"]', newUser.name);
await adminPage.click('[data-testid="role-dropdown"]');
await adminPage.click('[data-testid="role-user"]');
await adminPage.click('[data-testid="create-user"]');
await expect(adminPage.getByText('User created')).toBeVisible();
await expect(adminPage.getByText(newUser.email)).toBeVisible();
  // Verify via API (lookup by email; query-param endpoint assumed for this example)
  const lookup = await adminPage.request.get(`/api/users?email=${newUser.email}`);
  const [created] = await lookup.json();
  expect(created.role).toBe('user');
});
// Test 2: Permission assignment (60 lines)
test('admin can assign permissions', async ({ adminPage, seedUser }) => {
const user = await seedUser({ email: faker.internet.email() });
await adminPage.goto(`/admin/users/${user.id}`);
await adminPage.click('[data-testid="assign-permissions"]');
await adminPage.check('[data-testid="permission-read"]');
await adminPage.check('[data-testid="permission-write"]');
await adminPage.click('[data-testid="save-permissions"]');
await expect(adminPage.getByText('Permissions updated')).toBeVisible();
// Verify permissions assigned
const response = await adminPage.request.get(`/api/users/${user.id}`);
const updated = await response.json();
expect(updated.permissions).toContain('read');
expect(updated.permissions).toContain('write');
});
// Test 3: Notification preferences (70 lines)
test('admin can update notification preferences', async ({ adminPage, seedUser }) => {
const user = await seedUser({ email: faker.internet.email() });
await adminPage.goto(`/admin/users/${user.id}/notifications`);
await adminPage.check('[data-testid="email-notifications"]');
await adminPage.uncheck('[data-testid="sms-notifications"]');
await adminPage.selectOption('[data-testid="frequency"]', 'daily');
await adminPage.click('[data-testid="save-preferences"]');
await expect(adminPage.getByText('Preferences saved')).toBeVisible();
// Verify preferences
const response = await adminPage.request.get(`/api/users/${user.id}/preferences`);
const prefs = await response.json();
expect(prefs.emailEnabled).toBe(true);
expect(prefs.smsEnabled).toBe(false);
expect(prefs.frequency).toBe('daily');
});
// TOTAL: 3 tests × 60 lines avg = 180 lines
// Each test is focused, debuggable, and under 300 lines
```
**Key Points**:
- Split monolithic tests into focused scenarios (<300 lines each)
- Extract common setup into fixtures (auto-runs for each test)
- Each test validates one concern (user creation, permissions, preferences)
- Failures are easier to diagnose: "Permission assignment failed" vs "Complete journey failed"
- Tests can run in parallel (isolated concerns)
### Example 5: Execution Time Optimization
**Context**: When tests take longer than 1.5 minutes, they slow CI pipelines and feedback loops. Optimize by using API setup instead of UI navigation, parallelizing independent operations, and avoiding unnecessary waits.
**Implementation**:
```typescript
// ❌ BAD: 4-minute test (slow setup, sequential operations)
test('user completes order - SLOW (4 min)', async ({ page }) => {
// Step 1: Manual signup via UI (90 seconds)
await page.goto('/signup');
await page.fill('[data-testid="email"]', 'buyer@example.com');
await page.fill('[data-testid="password"]', 'password123');
await page.fill('[data-testid="confirm-password"]', 'password123');
await page.fill('[data-testid="name"]', 'Buyer User');
await page.click('[data-testid="signup"]');
await page.waitForURL('/verify-email'); // Wait for email verification
// ... manual email verification flow
// Step 2: Manual product creation via UI (60 seconds)
await page.goto('/admin/products');
await page.fill('[data-testid="product-name"]', 'Widget');
// ... 20 more fields
await page.click('[data-testid="create-product"]');
// Step 3: Navigate to checkout (30 seconds)
await page.goto('/products');
await page.waitForTimeout(5000); // Unnecessary hard wait
await page.click('[data-testid="product-widget"]');
await page.waitForTimeout(3000); // Unnecessary
await page.click('[data-testid="add-to-cart"]');
await page.waitForTimeout(2000); // Unnecessary
// Step 4: Complete checkout (40 seconds)
await page.goto('/checkout');
await page.waitForTimeout(5000); // Unnecessary
await page.fill('[data-testid="credit-card"]', '4111111111111111');
// ... more form filling
await page.click('[data-testid="submit-order"]');
await page.waitForTimeout(10000); // Unnecessary
await expect(page.getByText('Order Confirmed')).toBeVisible();
// TOTAL: ~240 seconds (4 minutes)
});
// ✅ GOOD: 45-second test (API setup, parallel ops, deterministic waits)
test('user completes order', async ({ page, apiRequest }) => {
// Step 1: API setup (parallel, 5 seconds total)
const [user, product] = await Promise.all([
// Create user via API (fast)
apiRequest
.post('/api/users', {
data: createUser({
email: 'buyer@example.com',
emailVerified: true, // Skip verification
}),
})
.then((r) => r.json()),
// Create product via API (fast)
apiRequest
.post('/api/products', {
data: createProduct({
name: 'Widget',
price: 29.99,
stock: 10,
}),
})
.then((r) => r.json()),
]);
// Step 2: Auth setup via storage state (instant, 0 seconds)
await page.context().addCookies([
{
name: 'auth_token',
value: user.token,
domain: 'localhost',
path: '/',
},
]);
// Step 3: Network-first interception BEFORE navigation (10 seconds)
const cartPromise = page.waitForResponse('**/api/cart');
const orderPromise = page.waitForResponse('**/api/orders');
await page.goto(`/products/${product.id}`);
await page.click('[data-testid="add-to-cart"]');
await cartPromise; // Deterministic wait (no hard wait)
// Step 4: Checkout with network waits (30 seconds)
await page.goto('/checkout');
await page.fill('[data-testid="credit-card"]', '4111111111111111');
await page.fill('[data-testid="cvv"]', '123');
await page.fill('[data-testid="expiry"]', '12/25');
await page.click('[data-testid="submit-order"]');
  const order = await (await orderPromise).json(); // Deterministic wait (no hard wait)
  await expect(page.getByText('Order Confirmed')).toBeVisible();
  await expect(page.getByText(`Order #${order.id}`)).toBeVisible();
  // TOTAL: ~45 seconds (~5x faster)
});
// Cypress equivalent
describe('Order Flow', () => {
it('should complete purchase quickly', () => {
// Step 1: API setup (parallel, fast)
const user = createUser({ emailVerified: true });
const product = createProduct({ name: 'Widget', price: 29.99 });
cy.task('db:seed', { users: [user], products: [product] });
// Step 2: Auth setup via session (instant)
cy.setCookie('auth_token', user.token);
// Step 3: Network-first interception
cy.intercept('POST', '**/api/cart').as('addToCart');
cy.intercept('POST', '**/api/orders').as('createOrder');
cy.visit(`/products/${product.id}`);
cy.get('[data-cy="add-to-cart"]').click();
cy.wait('@addToCart'); // Deterministic wait
// Step 4: Checkout
cy.visit('/checkout');
cy.get('[data-cy="credit-card"]').type('4111111111111111');
cy.get('[data-cy="cvv"]').type('123');
cy.get('[data-cy="expiry"]').type('12/25');
cy.get('[data-cy="submit-order"]').click();
    cy.wait('@createOrder').then(({ response }) => {
      // Deterministic wait; assert on the order id the API returned
      cy.contains('Order Confirmed').should('be.visible');
      cy.contains(`Order #${response.body.id}`).should('be.visible');
    });
});
});
// Additional optimization: Shared auth state (0 seconds per test)
// playwright/support/global-setup.ts
export default async function globalSetup() {
const browser = await chromium.launch();
const page = await browser.newPage();
// Create admin user once for all tests
const admin = createUser({ role: 'admin', emailVerified: true });
await page.request.post('/api/users', { data: admin });
// Login once, save session
await page.goto('/login');
await page.fill('[data-testid="email"]', admin.email);
await page.fill('[data-testid="password"]', 'password123');
await page.click('[data-testid="login"]');
// Save auth state for reuse
await page.context().storageState({ path: 'playwright/.auth/admin.json' });
await browser.close();
}
// Use shared auth in tests (instant)
test.use({ storageState: 'playwright/.auth/admin.json' });
test('admin action', async ({ page }) => {
// Already logged in - no auth overhead (0 seconds)
await page.goto('/admin');
// ... test logic
});
```
**Key Points**:
- Use API for data setup (10-50x faster than UI)
- Run independent operations in parallel (`Promise.all`)
- Replace hard waits with deterministic waits (`waitForResponse`)
- Reuse auth sessions via `storageState` (Playwright) or `setCookie`/`cy.session()` (Cypress; see the sketch below)
- Skip unnecessary flows (email verification, multi-step signups)
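For Cypress, `cy.session()` caches and restores auth state so the UI login runs only once per session key. A minimal sketch, assuming the same login selectors and credentials used in the examples above:
```typescript
// cypress/support/e2e.ts or a spec file — minimal sketch using the login selectors from above
const loginAsAdmin = (admin: { email: string; password: string }) => {
  // cy.session() caches cookies/localStorage under this key and restores them instantly on later calls
  cy.session(['admin', admin.email], () => {
    cy.visit('/login');
    cy.get('[data-cy="email"]').type(admin.email);
    cy.get('[data-cy="password"]').type(admin.password);
    cy.get('[data-cy="login"]').click();
    cy.url().should('include', '/dashboard');
  });
};

describe('Admin area', () => {
  beforeEach(() => loginAsAdmin({ email: 'admin@example.com', password: 'password123' }));

  it('admin action', () => {
    cy.visit('/admin'); // Already authenticated - no login overhead
  });
});
```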
## Integration Points
- **Used in workflows**: `*atdd` (test generation quality), `*automate` (test expansion quality), `*test-review` (quality validation)
- **Related fragments**:
- `network-first.md` - Deterministic waiting strategies
- `data-factories.md` - Isolated, parallel-safe data patterns
- `fixture-architecture.md` - Setup extraction and cleanup
- `test-levels-framework.md` - Choosing appropriate test granularity for speed
## Core Quality Checklist
Every test must pass these criteria:
- [ ] **No Hard Waits** - Use `waitForResponse`, `waitForLoadState`, or element state (not `waitForTimeout`)
- [ ] **No Conditionals** - Tests execute the same path every time (no if/else, try/catch for flow control)
- [ ] **< 300 Lines** - Keep tests focused; split large tests or extract setup to fixtures
- [ ] **< 1.5 Minutes** - Optimize with API setup, parallel operations, and shared auth
- [ ] **Self-Cleaning** - Use fixtures with auto-cleanup (see the sketch below) or explicit `afterEach()` teardown
- [ ] **Explicit Assertions** - Keep `expect()` calls in test bodies, not hidden in helpers
- [ ] **Unique Data** - Use `faker` for dynamic data; never hardcode IDs or emails
- [ ] **Parallel-Safe** - Tests don't share state; run successfully with `--workers=4`
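A minimal sketch of the self-cleaning criterion: a fixture that tracks what it seeds and deletes it after the test, pass or fail (the `/api/users` endpoints are the hypothetical API used throughout these examples):
```typescript
// playwright/support/fixtures/seed-user-fixture.ts — illustrative sketch
import { test as base, expect } from '@playwright/test';
import { faker } from '@faker-js/faker';

type SeedUser = (overrides?: Record<string, unknown>) => Promise<{ id: string; email: string }>;

export const test = base.extend<{ seedUser: SeedUser }>({
  seedUser: async ({ request }, use) => {
    const createdIds: string[] = []; // Track everything this test seeds
    await use(async (overrides = {}) => {
      const response = await request.post('/api/users', {
        data: { email: faker.internet.email(), ...overrides },
      });
      expect(response.ok()).toBeTruthy();
      const user = await response.json();
      createdIds.push(user.id);
      return user;
    });
    // Auto-cleanup after the test body finishes, even on failure
    for (const id of createdIds) {
      await request.delete(`/api/users/${id}`);
    }
  },
});
```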
_Source: Murat quality checklist, Definition of Done requirements (lines 370-381, 406-422)._

View File

@@ -0,0 +1,372 @@
# Timing Debugging and Race Condition Fixes
## Principle
Race conditions arise when tests make assumptions about asynchronous timing (network, animations, state updates). **Deterministic waiting** eliminates flakiness by explicitly waiting for observable events (network responses, element state changes) instead of arbitrary timeouts.
## Rationale
**The Problem**: Tests pass locally but fail in CI (different timing), or pass/fail randomly (race conditions). Hard waits (`waitForTimeout`, `sleep`) mask timing issues without solving them.
**The Solution**: Replace all hard waits with event-based waits (`waitForResponse`, `waitFor({ state })`). Implement network-first pattern (intercept before navigate). Use explicit state checks (loading spinner detached, data loaded). This makes tests deterministic regardless of network speed or system load.
**Why This Matters**:
- Eliminates flaky tests (0 tolerance for timing-based failures)
- Works consistently across environments (local, CI, production-like)
- Faster test execution (no unnecessary waits)
- Clearer test intent (explicit about what we're waiting for)
## Pattern Examples
### Example 1: Race Condition Identification (Network-First Pattern)
**Context**: Prevent race conditions by intercepting network requests before navigation
**Implementation**:
```typescript
// tests/timing/race-condition-prevention.spec.ts
import { test, expect } from '@playwright/test';
test.describe('Race Condition Prevention Patterns', () => {
test('❌ Anti-Pattern: Navigate then intercept (race condition)', async ({ page, context }) => {
// BAD: Navigation starts before interception ready
await page.goto('/products'); // ⚠️ Race! API might load before route is set
await context.route('**/api/products', (route) => {
route.fulfill({ status: 200, body: JSON.stringify({ products: [] }) });
});
// Test may see real API response or mock (non-deterministic)
});
test('✅ Pattern: Intercept BEFORE navigate (deterministic)', async ({ page, context }) => {
// GOOD: Interception ready before navigation
await context.route('**/api/products', (route) => {
route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({
products: [
{ id: 1, name: 'Product A', price: 29.99 },
{ id: 2, name: 'Product B', price: 49.99 },
],
}),
});
});
const responsePromise = page.waitForResponse('**/api/products');
await page.goto('/products'); // Navigation happens AFTER route is ready
await responsePromise; // Explicit wait for network
// Test sees mock response reliably (deterministic)
await expect(page.getByText('Product A')).toBeVisible();
});
test('✅ Pattern: Wait for element state change (loading → loaded)', async ({ page }) => {
await page.goto('/dashboard');
// Wait for loading indicator to appear (confirms load started)
await page.getByTestId('loading-spinner').waitFor({ state: 'visible' });
// Wait for loading indicator to disappear (confirms load complete)
await page.getByTestId('loading-spinner').waitFor({ state: 'detached' });
// Content now reliably visible
await expect(page.getByTestId('dashboard-data')).toBeVisible();
});
test('✅ Pattern: Explicit visibility check (not just presence)', async ({ page }) => {
await page.goto('/modal-demo');
await page.getByRole('button', { name: 'Open Modal' }).click();
// ❌ Bad: Element exists but may not be visible yet
// await expect(page.getByTestId('modal')).toBeAttached()
// ✅ Good: Wait for visibility (accounts for animations)
await expect(page.getByTestId('modal')).toBeVisible();
await expect(page.getByRole('heading', { name: 'Modal Title' })).toBeVisible();
});
test('❌ Anti-Pattern: waitForLoadState("networkidle") in SPAs', async ({ page }) => {
    // ⚠️ Discouraged for SPAs (WebSocket connections never idle)
// await page.goto('/dashboard')
// await page.waitForLoadState('networkidle') // May timeout in SPAs
// ✅ Better: Wait for specific API response
const responsePromise = page.waitForResponse('**/api/dashboard');
await page.goto('/dashboard');
await responsePromise;
await expect(page.getByText('Dashboard loaded')).toBeVisible();
});
});
```
**Key Points**:
- Network-first: ALWAYS intercept before navigate (prevents race conditions)
- State changes: Wait for loading spinner detached (explicit load completion)
- Visibility vs presence: `toBeVisible()` accounts for animations, `toBeAttached()` doesn't
- Avoid networkidle: Unreliable in SPAs (WebSocket, polling connections)
- Explicit waits: Document exactly what we're waiting for
---
### Example 2: Deterministic Waiting Patterns (Event-Based, Not Time-Based)
**Context**: Replace all hard waits with observable event waits
**Implementation**:
```typescript
// tests/timing/deterministic-waits.spec.ts
import { test, expect } from '@playwright/test';
test.describe('Deterministic Waiting Patterns', () => {
test('waitForResponse() with URL pattern', async ({ page }) => {
const responsePromise = page.waitForResponse('**/api/products');
await page.goto('/products');
await responsePromise; // Deterministic (waits for exact API call)
await expect(page.getByText('Products loaded')).toBeVisible();
});
test('waitForResponse() with predicate function', async ({ page }) => {
const responsePromise = page.waitForResponse((resp) => resp.url().includes('/api/search') && resp.status() === 200);
await page.goto('/search');
await page.getByPlaceholder('Search').fill('laptop');
await page.getByRole('button', { name: 'Search' }).click();
await responsePromise; // Wait for successful search response
await expect(page.getByTestId('search-results')).toBeVisible();
});
test('waitForFunction() for custom conditions', async ({ page }) => {
await page.goto('/dashboard');
// Wait for custom JavaScript condition
await page.waitForFunction(() => {
const element = document.querySelector('[data-testid="user-count"]');
return element && parseInt(element.textContent || '0') > 0;
});
// User count now loaded
await expect(page.getByTestId('user-count')).not.toHaveText('0');
});
test('waitFor() element state (attached, visible, hidden, detached)', async ({ page }) => {
await page.goto('/products');
// Wait for element to be attached to DOM
await page.getByTestId('product-list').waitFor({ state: 'attached' });
// Wait for element to be visible (animations complete)
await page.getByTestId('product-list').waitFor({ state: 'visible' });
// Perform action
await page.getByText('Product A').click();
// Wait for modal to be hidden (close animation complete)
await page.getByTestId('modal').waitFor({ state: 'hidden' });
});
test('Cypress: cy.wait() with aliased intercepts', async () => {
// Cypress example (not Playwright)
/*
cy.intercept('GET', '/api/products').as('getProducts')
cy.visit('/products')
cy.wait('@getProducts') // Deterministic wait for specific request
cy.get('[data-testid="product-list"]').should('be.visible')
*/
});
});
```
**Key Points**:
- `waitForResponse()`: Wait for specific API calls (URL pattern or predicate)
- `waitForFunction()`: Wait for custom JavaScript conditions
- `waitFor({ state })`: Wait for element state changes (attached, visible, hidden, detached)
- Cypress `cy.wait('@alias')`: Deterministic wait for aliased intercepts
- All waits are event-based (not time-based)
---
### Example 3: Timing Anti-Patterns (What NEVER to Do)
**Context**: Common timing mistakes that cause flakiness
**Problem Examples**:
```typescript
// tests/timing/anti-patterns.spec.ts
import { test, expect } from '@playwright/test';
test.describe('Timing Anti-Patterns to Avoid', () => {
test('❌ NEVER: page.waitForTimeout() (arbitrary delay)', async ({ page }) => {
    // ✅ Good: Register the wait BEFORE navigating, then await the observable event
    const dashboardReady = page.waitForResponse('**/api/dashboard');
    await page.goto('/dashboard');
    // ❌ Bad: Arbitrary 3-second wait (flaky)
    // await page.waitForTimeout(3000)
    // Problem: Might be too short (CI slower) or too long (wastes time)
    await dashboardReady;
await expect(page.getByText('Dashboard loaded')).toBeVisible();
});
test('❌ NEVER: cy.wait(number) without alias (arbitrary delay)', async () => {
// Cypress example
/*
// ❌ Bad: Arbitrary delay
cy.visit('/products')
cy.wait(2000) // Flaky!
// ✅ Good: Wait for specific request
cy.intercept('GET', '/api/products').as('getProducts')
cy.visit('/products')
cy.wait('@getProducts') // Deterministic
*/
});
test('❌ NEVER: Multiple hard waits in sequence (compounding delays)', async ({ page }) => {
await page.goto('/checkout');
// ❌ Bad: Stacked hard waits (6+ seconds wasted)
// await page.waitForTimeout(2000) // Wait for form
// await page.getByTestId('email').fill('test@example.com')
// await page.waitForTimeout(1000) // Wait for validation
// await page.getByTestId('submit').click()
// await page.waitForTimeout(3000) // Wait for redirect
// ✅ Good: Event-based waits (no wasted time)
await page.getByTestId('checkout-form').waitFor({ state: 'visible' });
    const emailValidated = page.waitForResponse('**/api/validate-email');
    await page.getByTestId('email').fill('test@example.com');
    await emailValidated;
await page.getByTestId('submit').click();
await page.waitForURL('**/confirmation');
});
test('❌ NEVER: waitForLoadState("networkidle") in SPAs', async ({ page }) => {
// ❌ Bad: Unreliable in SPAs (WebSocket connections never idle)
// await page.goto('/dashboard')
// await page.waitForLoadState('networkidle') // Timeout in SPAs!
    // ✅ Good: Register waits for specific API responses BEFORE navigating
    const dashboardReady = page.waitForResponse('**/api/dashboard');
    const userReady = page.waitForResponse('**/api/user');
    await page.goto('/dashboard');
    await Promise.all([dashboardReady, userReady]);
await expect(page.getByTestId('dashboard-content')).toBeVisible();
});
test('❌ NEVER: Sleep/setTimeout in tests', async ({ page }) => {
await page.goto('/products');
// ❌ Bad: Node.js sleep (blocks test thread)
// await new Promise(resolve => setTimeout(resolve, 2000))
// ✅ Good: Playwright auto-waits for element
await expect(page.getByText('Products loaded')).toBeVisible();
});
});
```
**Why These Fail**:
- **Hard waits**: Arbitrary timeouts (too short → flaky, too long → slow)
- **Stacked waits**: Compound delays (wasteful, unreliable)
- **networkidle**: Broken in SPAs (WebSocket/polling never idle)
- **Sleep**: Blocks execution (wastes time, doesn't solve race conditions)
**Better Approach**: Use event-based waits from examples above
---
## Async Debugging Techniques
### Technique 1: Promise Chain Analysis
```typescript
import { test, expect } from '@playwright/test';

test('debug async waterfall with console logs', async ({ page }) => {
console.log('1. Starting navigation...');
await page.goto('/products');
console.log('2. Waiting for API response...');
const response = await page.waitForResponse('**/api/products');
console.log('3. API responded:', response.status());
console.log('4. Waiting for UI update...');
await expect(page.getByText('Products loaded')).toBeVisible();
console.log('5. Test complete');
// Console output shows exactly where timing issue occurs
});
```
### Technique 2: Network Waterfall Inspection (DevTools)
```typescript
import { test } from '@playwright/test';

test('inspect network timing with trace viewer', async ({ page }) => {
await page.goto('/dashboard');
// Generate trace for analysis
// npx playwright test --trace on
// npx playwright show-trace trace.zip
// In trace viewer:
// 1. Check Network tab for API call timing
// 2. Identify slow requests (>1s response time)
// 3. Find race conditions (overlapping requests)
// 4. Verify request order (dependencies)
});
```
### Technique 3: Trace Viewer for Timing Visualization
```typescript
import { test, expect } from '@playwright/test';

test('use trace viewer to debug timing', async ({ page }) => {
// Run with trace: npx playwright test --trace on
await page.goto('/checkout');
await page.getByTestId('submit').click();
// In trace viewer, examine:
// - Timeline: See exact timing of each action
// - Snapshots: Hover to see DOM state at each moment
// - Network: Identify slow/failed requests
// - Console: Check for async errors
await expect(page.getByText('Success')).toBeVisible();
});
```
---
## Race Condition Checklist
Before deploying tests:
- [ ] **Network-first pattern**: All routes intercepted BEFORE navigation (no race conditions)
- [ ] **Explicit waits**: Every navigation followed by `waitForResponse()` or state check
- [ ] **No hard waits**: Zero instances of `waitForTimeout()`, `cy.wait(number)`, `sleep()`
- [ ] **Element state waits**: Loading spinners use `waitFor({ state: 'detached' })`
- [ ] **Visibility checks**: Use `toBeVisible()` (accounts for animations), not just `toBeAttached()`
- [ ] **Response validation**: Wait for successful responses (`resp.ok()` or `status === 200`; see the sketch below)
- [ ] **Trace viewer analysis**: Generate traces to identify timing issues (network waterfall, console errors)
- [ ] **CI/local parity**: Tests pass reliably in both environments (no timing assumptions)
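A minimal sketch of the response-validation item (the route and test id are assumptions consistent with the examples above):
```typescript
import { test, expect } from '@playwright/test';

test('waits for a successful response, not just any response', async ({ page }) => {
  // The predicate rejects error responses, so a 4xx/5xx cannot satisfy the wait
  const usersLoaded = page.waitForResponse(
    (resp) => resp.url().includes('/api/users') && resp.ok(),
  );
  await page.goto('/admin/users');
  const response = await usersLoaded;
  expect(response.status()).toBe(200);
  await expect(page.getByTestId('user-table')).toBeVisible();
});
```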
## Integration Points
- **Used in workflows**: `*automate` (healing timing failures), `*test-review` (detect hard wait anti-patterns), `*framework` (configure timeout standards)
- **Related fragments**: `test-healing-patterns.md` (race condition diagnosis), `network-first.md` (interception patterns), `playwright-config.md` (timeout configuration), `visual-debugging.md` (trace viewer analysis)
- **Tools**: Playwright Inspector (`--debug`), Trace Viewer (`--trace on`), DevTools Network tab
_Source: Playwright timing best practices, network-first pattern from test-resources-for-ai, production race condition debugging_

View File

@@ -0,0 +1,524 @@
# Visual Debugging and Developer Ergonomics
## Principle
Fast feedback loops and transparent debugging artifacts are critical for maintaining test reliability and developer confidence. Visual debugging tools (trace viewers, screenshots, videos, HAR files) turn cryptic test failures into actionable insights, reducing triage time from hours to minutes.
## Rationale
**The Problem**: CI failures often provide minimal context—a timeout, a selector mismatch, or a network error—forcing developers to reproduce issues locally (if they can). This wastes time and discourages test maintenance.
**The Solution**: Capture rich debugging artifacts **only on failure** to balance storage costs with diagnostic value. Modern tools like Playwright Trace Viewer, Cypress Debug UI, and HAR recordings provide interactive, time-travel debugging that reveals exactly what the test saw at each step.
**Why This Matters**:
- Reduces failure triage time by 80-90% (visual context vs logs alone)
- Enables debugging without local reproduction
- Improves test maintenance confidence (clear failure root cause)
- Catches timing/race conditions that are hard to reproduce locally
## Pattern Examples
### Example 1: Playwright Trace Viewer Configuration (Production Pattern)
**Context**: Capture traces on first retry only (balances storage and diagnostics)
**Implementation**:
```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';
export default defineConfig({
use: {
// Visual debugging artifacts (space-efficient)
trace: 'on-first-retry', // Only when test fails once
screenshot: 'only-on-failure', // Not on success
video: 'retain-on-failure', // Delete on pass
// Context for debugging
baseURL: process.env.BASE_URL || 'http://localhost:3000',
// Timeout context
actionTimeout: 15_000, // 15s for clicks/fills
navigationTimeout: 30_000, // 30s for page loads
},
// CI-specific artifact retention
reporter: [
['html', { outputFolder: 'playwright-report', open: 'never' }],
['junit', { outputFile: 'results.xml' }],
['list'], // Console output
],
// Failure handling
retries: process.env.CI ? 2 : 0, // Retry in CI to capture trace
workers: process.env.CI ? 1 : undefined,
});
```
**Opening and Using Trace Viewer**:
```bash
# After test failure in CI, download trace artifact
# Then open locally:
npx playwright show-trace path/to/trace.zip
# Or serve trace viewer:
npx playwright show-report
```
**Key Features to Use in Trace Viewer**:
1. **Timeline**: See each action (click, navigate, assertion) with timing
2. **Snapshots**: Hover over timeline to see DOM state at that moment
3. **Network Tab**: Inspect all API calls, headers, payloads, timing
4. **Console Tab**: View console.log/error messages
5. **Source Tab**: See test code with execution markers
6. **Metadata**: Browser, OS, test duration, screenshots
**Why This Works**:
- `on-first-retry` skips tracing on first runs and records only when a test retries (saves storage)
- Screenshots + video give visual context without trace overhead
- Interactive timeline makes timing issues obvious (race conditions, slow API)
---
### Example 2: HAR File Recording for Network Debugging
**Context**: Capture all network activity for reproducible API debugging
**Implementation**:
```typescript
// tests/e2e/checkout-with-har.spec.ts
import { test, expect } from '@playwright/test';
import path from 'path';
test.describe('Checkout Flow with HAR Recording', () => {
test('should complete payment with full network capture', async ({ page, context }) => {
// Start HAR recording BEFORE navigation
await context.routeFromHAR(path.join(__dirname, '../fixtures/checkout.har'), {
url: '**/api/**', // Only capture API calls
update: true, // Update HAR if file exists
});
await page.goto('/checkout');
// Interact with page
await page.getByTestId('payment-method').selectOption('credit-card');
await page.getByTestId('card-number').fill('4242424242424242');
await page.getByTestId('submit-payment').click();
// Wait for payment confirmation
await expect(page.getByTestId('success-message')).toBeVisible();
// HAR file saved to fixtures/checkout.har
// Contains all network requests/responses for replay
});
});
```
**Using HAR for Deterministic Mocking**:
```typescript
// tests/e2e/checkout-replay-har.spec.ts
import { test, expect } from '@playwright/test';
import path from 'path';
test('should replay checkout flow from HAR', async ({ page, context }) => {
// Replay network from HAR (no real API calls)
await context.routeFromHAR(path.join(__dirname, '../fixtures/checkout.har'), {
url: '**/api/**',
update: false, // Read-only mode
});
await page.goto('/checkout');
// Same test, but network responses come from HAR file
await page.getByTestId('payment-method').selectOption('credit-card');
await page.getByTestId('card-number').fill('4242424242424242');
await page.getByTestId('submit-payment').click();
await expect(page.getByTestId('success-message')).toBeVisible();
});
```
**Key Points**:
- **`update: true`** records new HAR or updates existing (for flaky API debugging)
- **`update: false`** replays from HAR (deterministic, no real API)
- Filter by URL pattern (`**/api/**`) to avoid capturing static assets
- HAR files are human-readable JSON (easy to inspect/modify)
**When to Use HAR**:
- Debugging flaky tests caused by API timing/responses
- Creating deterministic mocks for integration tests
- Analyzing third-party API behavior (Stripe, Auth0)
- Reproducing production issues locally (record HAR in staging — see the CLI note below)
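If you prefer to record a HAR outside a test run, the Playwright CLI can capture one while you browse manually, for example `npx playwright open --save-har=checkout.har --save-har-glob="**/api/**" http://localhost:3000/checkout` (the URL and file paths here are placeholders).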
---
### Example 3: Custom Artifact Capture (Console Logs + Network on Failure)
**Context**: Capture additional debugging context automatically on test failure
**Implementation**:
```typescript
// playwright/support/fixtures/debug-fixture.ts
import { test as base } from '@playwright/test';
import fs from 'fs';
import path from 'path';
type DebugFixture = {
captureDebugArtifacts: () => Promise<void>;
};
export const test = base.extend<DebugFixture>({
captureDebugArtifacts: async ({ page }, use, testInfo) => {
const consoleLogs: string[] = [];
const networkRequests: Array<{ url: string; status: number; method: string }> = [];
// Capture console messages
page.on('console', (msg) => {
consoleLogs.push(`[${msg.type()}] ${msg.text()}`);
});
// Capture network requests
page.on('request', (request) => {
networkRequests.push({
url: request.url(),
method: request.method(),
status: 0, // Will be updated on response
});
});
page.on('response', (response) => {
const req = networkRequests.find((r) => r.url === response.url());
if (req) req.status = response.status();
});
await use(async () => {
// This function can be called manually in tests
// But it also runs automatically on failure via afterEach
});
// After test completes, save artifacts if failed
if (testInfo.status !== testInfo.expectedStatus) {
const artifactDir = path.join(testInfo.outputDir, 'debug-artifacts');
fs.mkdirSync(artifactDir, { recursive: true });
// Save console logs
fs.writeFileSync(path.join(artifactDir, 'console.log'), consoleLogs.join('\n'), 'utf-8');
// Save network summary
fs.writeFileSync(path.join(artifactDir, 'network.json'), JSON.stringify(networkRequests, null, 2), 'utf-8');
console.log(`Debug artifacts saved to: ${artifactDir}`);
}
},
});
```
**Usage in Tests**:
```typescript
// tests/e2e/payment-with-debug.spec.ts
import { test, expect } from '../support/fixtures/debug-fixture';
test('payment flow captures debug artifacts on failure', async ({ page, captureDebugArtifacts }) => {
await page.goto('/checkout');
// Test will automatically capture console + network on failure
await page.getByTestId('submit-payment').click();
await expect(page.getByTestId('success-message')).toBeVisible({ timeout: 5000 });
// If this fails, console.log and network.json saved automatically
});
```
**CI Integration (GitHub Actions)**:
```yaml
# .github/workflows/e2e.yml
name: E2E Tests with Artifacts
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version-file: '.nvmrc'
- name: Install dependencies
run: npm ci
- name: Run Playwright tests
run: npm run test:e2e
- name: Upload test artifacts on failure
if: failure()
uses: actions/upload-artifact@v4
with:
name: playwright-artifacts
path: |
test-results/
playwright-report/
retention-days: 30
```
**Key Points**:
- Fixtures automatically capture context without polluting test code
- Only saves artifacts on failure (storage-efficient)
- CI uploads artifacts for post-mortem analysis
- Steps guarded with `if: failure()` still run when the test step fails, so artifacts upload without `continue-on-error: true` (which would mark the job green and stop `failure()` from ever matching)
---
### Example 4: Accessibility Debugging Integration (axe-core in Trace Viewer)
**Context**: Catch accessibility regressions during visual debugging
**Implementation**:
```typescript
// playwright/support/fixtures/a11y-fixture.ts
import { test as base } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';
type A11yFixture = {
checkA11y: () => Promise<void>;
};
export const test = base.extend<A11yFixture>({
checkA11y: async ({ page }, use) => {
await use(async () => {
// Run axe accessibility scan
const results = await new AxeBuilder({ page }).analyze();
      // Log violations (they appear in the trace viewer's console tab)
if (results.violations.length > 0) {
console.log(`Found ${results.violations.length} accessibility violations:`);
results.violations.forEach((violation) => {
console.log(`- [${violation.impact}] ${violation.id}: ${violation.description}`);
console.log(` Help: ${violation.helpUrl}`);
});
throw new Error(`Accessibility violations found: ${results.violations.length}`);
}
});
},
});
```
**Usage with Visual Debugging**:
```typescript
// tests/e2e/checkout-a11y.spec.ts
import { test, expect } from '../support/fixtures/a11y-fixture';
test('checkout page is accessible', async ({ page, checkA11y }) => {
await page.goto('/checkout');
// Verify page loaded
await expect(page.getByRole('heading', { name: 'Checkout' })).toBeVisible();
// Run accessibility check
await checkA11y();
// If violations found, test fails and trace captures:
// - Screenshot showing the problematic element
// - Console log with violation details
// - Network tab showing any failed resource loads
});
```
**Trace Viewer Benefits**:
- **Screenshot shows visual context** of accessibility issue (contrast, missing labels)
- **Console tab shows axe-core violations** with impact level and helpUrl
- **DOM snapshot** allows inspecting ARIA attributes at failure point
- **Network tab** reveals if icon fonts or images failed (common a11y issue)
**Cypress Equivalent**:
```javascript
// cypress/support/commands.ts
import 'cypress-axe'; // Registers cy.injectAxe() and cy.checkA11y()
// Use a distinct name: re-registering 'checkA11y' would clash with cypress-axe's own command
Cypress.Commands.add('checkA11yAndLog', (context = null, options = {}) => {
  cy.injectAxe(); // Inject axe-core
  cy.checkA11y(context, options, (violations) => {
    cy.task('log', `Found ${violations.length} accessibility violations`);
    violations.forEach((violation) => {
      cy.task('log', `- [${violation.impact}] ${violation.id}: ${violation.description}`);
    });
  });
});
// tests/e2e/checkout-a11y.cy.ts
describe('Checkout Accessibility', () => {
it('should have no a11y violations', () => {
cy.visit('/checkout');
    cy.checkA11yAndLog(); // Custom command from the support file (injects axe + logs violations)
// On failure, Cypress UI shows:
// - Screenshot of page
// - Console log with violation details
// - Network tab with API calls
});
});
```
**Key Points**:
- Accessibility checks integrate seamlessly with visual debugging
- Violations are captured in trace viewer/Cypress UI automatically
- Provides actionable links (helpUrl) to fix issues
- Screenshots show visual context (contrast, layout)
---
### Example 5: Time-Travel Debugging Workflow (Playwright Inspector)
**Context**: Debug tests interactively with step-through execution
**Implementation**:
```typescript
// tests/e2e/checkout-debug.spec.ts
import { test, expect } from '@playwright/test';
test('debug checkout flow step-by-step', async ({ page }) => {
// Set breakpoint by uncommenting this:
// await page.pause()
await page.goto('/checkout');
// Use Playwright Inspector to:
// 1. Step through each action
// 2. Inspect DOM at each step
// 3. View network calls per action
// 4. Take screenshots manually
await page.getByTestId('payment-method').selectOption('credit-card');
// Pause here to inspect form state
// await page.pause()
await page.getByTestId('card-number').fill('4242424242424242');
await page.getByTestId('submit-payment').click();
await expect(page.getByTestId('success-message')).toBeVisible();
});
```
**Running with Inspector**:
```bash
# Open Playwright Inspector (GUI debugger)
npx playwright test --debug
# Or run headed (set slowMo via launchOptions in playwright.config.ts to slow actions down)
npx playwright test --headed
# Debug specific test
npx playwright test checkout-debug.spec.ts --debug
# Set environment variable for persistent debugging
PWDEBUG=1 npx playwright test
```
**Inspector Features**:
1. **Step-through execution**: Click "Next" to execute one action at a time
2. **DOM inspector**: Hover over elements to see selectors
3. **Network panel**: See API calls with timing
4. **Console panel**: View console.log output
5. **Pick locator**: Click element in browser to get selector
6. **Record mode**: Record interactions to generate test code
**Common Debugging Patterns**:
```typescript
import { test, expect } from '@playwright/test';

// Pattern 1: Debug selector issues
test('debug selector', async ({ page }) => {
await page.goto('/dashboard');
await page.pause(); // Inspector opens
// In Inspector console, test selectors:
// page.getByTestId('user-menu') ✅
// page.getByRole('button', { name: 'Profile' }) ✅
// page.locator('.btn-primary') ❌ (fragile)
});
// Pattern 2: Debug timing issues
test('debug network timing', async ({ page }) => {
await page.goto('/dashboard');
// Set up network listener BEFORE interaction
const responsePromise = page.waitForResponse('**/api/users');
await page.getByTestId('load-users').click();
await page.pause(); // Check network panel for timing
const response = await responsePromise;
expect(response.status()).toBe(200);
});
// Pattern 3: Debug state changes
test('debug state mutation', async ({ page }) => {
await page.goto('/cart');
// Check initial state
await expect(page.getByTestId('cart-count')).toHaveText('0');
await page.pause(); // Inspect DOM
await page.getByTestId('add-to-cart').click();
await page.pause(); // Inspect DOM again (compare state)
await expect(page.getByTestId('cart-count')).toHaveText('1');
});
```
**Key Points**:
- `page.pause()` opens Inspector at that exact moment
- Inspector shows DOM state, network activity, console at pause point
- "Pick locator" feature helps find robust selectors
- Record mode generates test code from manual interactions
---
## Visual Debugging Checklist
Before deploying tests to CI, ensure:
- [ ] **Artifact configuration**: `trace: 'on-first-retry'`, `screenshot: 'only-on-failure'`, `video: 'retain-on-failure'`
- [ ] **CI artifact upload**: GitHub Actions/GitLab CI configured to upload `test-results/` and `playwright-report/`
- [ ] **HAR recording**: Set up for flaky API tests (record once, replay deterministically)
- [ ] **Custom debug fixtures**: Console logs + network summary captured on failure
- [ ] **Accessibility integration**: axe-core violations visible in trace viewer
- [ ] **Trace viewer docs**: README explains how to open traces locally (`npx playwright show-trace`)
- [ ] **Inspector workflow**: Document `--debug` flag for interactive debugging
- [ ] **Storage optimization**: Artifacts deleted after 30 days (CI retention policy)
## Integration Points
- **Used in workflows**: `*framework` (initial setup), `*ci` (artifact upload), `*test-review` (validate artifact config)
- **Related fragments**: `playwright-config.md` (artifact configuration), `ci-burn-in.md` (CI artifact upload), `test-quality.md` (debugging best practices)
- **Tools**: Playwright Trace Viewer, Cypress Debug UI, axe-core, HAR files
_Source: Playwright official docs, Murat testing philosophy (visual debugging manifesto), SEON production debugging patterns_