
Test Automation Expansion

Workflow ID: bmad/bmm/testarch/automate
Version: 4.0 (BMad v6)


Overview

Expands test automation coverage by generating comprehensive test suites at appropriate levels (E2E, API, Component, Unit) with supporting infrastructure. This workflow operates in dual mode:

  1. BMad-Integrated Mode: Works WITH BMad artifacts (story, tech-spec, PRD, test-design) to expand coverage after story implementation
  2. Standalone Mode: Works WITHOUT BMad artifacts - analyzes existing codebase and generates tests independently

Core Principle: Generate prioritized, deterministic tests that avoid duplicate coverage and follow testing best practices.


Preflight Requirements

Flexible: This workflow can run with minimal prerequisites. Only HALT if framework is completely missing.

Required (Always)

  • Framework scaffolding configured (run framework workflow if missing)
  • Test framework configuration available (playwright.config.ts or cypress.config.ts)

Optional (BMad-Integrated Mode)

  • Story markdown with acceptance criteria (enhances coverage targeting)
  • Tech spec or PRD (provides architectural context)
  • Test design document (provides risk/priority context)

Optional (Standalone Mode)

  • Source code to analyze (feature implementation)
  • Existing tests (for gap analysis)

If framework is missing: HALT with message: "Framework scaffolding required. Run bmad tea *framework first."


Step 1: Determine Execution Mode and Load Context

Actions

  1. Detect Execution Mode

    Check if BMad artifacts are available:

    • If {story_file} variable is set → BMad-Integrated Mode
    • If {target_feature} or {target_files} set → Standalone Mode
    • If neither set → Auto-discover mode (scan codebase for features needing tests)
  2. Load BMad Artifacts (If Available)

    BMad-Integrated Mode:

    • Read story markdown from {story_file}
    • Extract acceptance criteria and technical requirements
    • Load tech-spec.md if {use_tech_spec} is true
    • Load test-design.md if {use_test_design} is true
    • Load PRD.md if {use_prd} is true
    • Note: These are optional enhancements, not hard requirements

    Standalone Mode:

    • Skip BMad artifact loading
    • Proceed directly to source code analysis
  3. Load Framework Configuration

    • Read test framework config (playwright.config.ts or cypress.config.ts)
    • Identify test directory structure from {test_dir}
    • Check existing test patterns in {test_dir}
    • Note test runner capabilities (parallel execution, fixtures, etc.)
  4. Analyze Existing Test Coverage

    If {analyze_coverage} is true:

    • Search {test_dir} for existing test files
    • Identify tested features vs untested features
    • Map tests to source files (coverage gaps)
    • Check existing fixture and factory patterns
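
    A rough gap scan can be scripted; the sketch below assumes spec files roughly mirror source file names and uses the glob package (heuristic only, helper name hypothetical):

    // coverage-gap-scan.ts (hypothetical helper, not part of the framework scaffold)
    import { globSync } from 'glob';
    import path from 'node:path';

    const stripExt = (file: string) => path.basename(file).replace(/\.(spec|test)?\.?(ts|tsx)$/, '');

    // Collect base names of files that already have a spec/test
    const testedNames = new Set(globSync('tests/**/*.{spec,test}.{ts,tsx}').map(stripExt));

    // Any source file without a same-named test is a candidate coverage gap
    const untested = globSync('src/**/*.{ts,tsx}').filter((file) => !testedNames.has(stripExt(file)));
    console.log('Source files with no matching test:', untested);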
  5. Load Knowledge Base Fragments

    Critical: Consult {project-root}/bmad/bmm/testarch/tea-index.csv to load:

    • test-levels-framework.md - Test level selection (E2E vs API vs Component vs Unit with decision matrix, 467 lines, 4 examples)
    • test-priorities-matrix.md - Priority classification (P0-P3 with automated scoring, risk mapping, 389 lines, 2 examples)
    • fixture-architecture.md - Test fixture patterns (pure function → fixture → mergeTests, auto-cleanup, 406 lines, 5 examples)
    • data-factories.md - Factory patterns with faker (overrides, nested factories, API seeding, 498 lines, 5 examples)
    • selective-testing.md - Targeted test execution strategies (tag-based, spec filters, diff-based, promotion rules, 727 lines, 4 examples)
    • ci-burn-in.md - Flaky test detection patterns (10-iteration burn-in, sharding, selective execution, 678 lines, 4 examples)
    • test-quality.md - Test design principles (deterministic, isolated, explicit assertions, length/time limits, 658 lines, 5 examples)
    • network-first.md - Route interception patterns (intercept before navigate, HAR capture, deterministic waiting, 489 lines, 5 examples)

    Healing Knowledge (If {auto_heal_failures} is true):

    • test-healing-patterns.md - Common failure patterns and automated fixes (stale selectors, race conditions, dynamic data, network errors, hard waits, 648 lines, 5 examples)
    • selector-resilience.md - Selector debugging and refactoring guide (data-testid > ARIA > text > CSS hierarchy, anti-patterns, 541 lines, 4 examples)
    • timing-debugging.md - Race condition identification and fixes (network-first, deterministic waiting, async debugging, 370 lines, 3 examples)

Step 2: Identify Automation Targets

Actions

  1. Determine What Needs Testing

    BMad-Integrated Mode (story available):

    • Map acceptance criteria from story to test scenarios
    • Identify features implemented in this story
    • Check if story has existing ATDD tests (from *atdd workflow)
    • Expand beyond ATDD with edge cases and negative paths

    Standalone Mode (no story):

    • If {target_feature} specified: Analyze that specific feature
    • If {target_files} specified: Analyze those specific files
    • If {auto_discover_features} is true: Scan {source_dir} for features
    • Prioritize features with:
      • No test coverage (highest priority)
      • Complex business logic
      • External integrations (API calls, database, auth)
      • Critical user paths (login, checkout, etc.)
  2. Apply Test Level Selection Framework

    Knowledge Base Reference: test-levels-framework.md

    For each feature or acceptance criterion, determine appropriate test level:

    E2E (End-to-End):

    • Critical user journeys (login, checkout, core workflows)
    • Multi-system integration
    • Full user-facing scenarios
    • Characteristics: High confidence, slow, brittle

    API (Integration):

    • Business logic validation
    • Service contracts and data transformations
    • Backend integration without UI
    • Characteristics: Fast feedback, stable, good balance

    Component:

    • UI component behavior (buttons, forms, modals)
    • Interaction testing (click, hover, keyboard)
    • State management within component
    • Characteristics: Fast, isolated, granular

    Unit:

    • Pure business logic and algorithms
    • Edge cases and error handling
    • Minimal dependencies
    • Characteristics: Fastest, most granular
  3. Avoid Duplicate Coverage

    Critical principle: Don't test the same behavior at multiple levels unless necessary

    • Use E2E for critical happy path only
    • Use API tests for business logic variations
    • Use component tests for UI interaction edge cases
    • Use unit tests for pure logic edge cases

    Example:

    • E2E: User can log in with valid credentials → Dashboard loads
    • API: POST /auth/login returns 401 for invalid credentials
    • API: POST /auth/login returns 200 and JWT token for valid credentials
    • Component: LoginForm disables submit button when fields are empty
    • Unit: validateEmail() returns false for malformed email addresses
  4. Assign Test Priorities

    Knowledge Base Reference: test-priorities-matrix.md

    P0 (Critical - Every commit):

    • Critical user paths that must always work
    • Security-critical functionality (auth, permissions)
    • Data integrity scenarios
    • Run in pre-commit hooks or PR checks

    P1 (High - PR to main):

    • Important features with high user impact
    • Integration points between systems
    • Error handling for common failures
    • Run before merging to main branch

    P2 (Medium - Nightly):

    • Edge cases with moderate impact
    • Less-critical feature variations
    • Performance/load testing
    • Run in nightly CI builds

    P3 (Low - On-demand):

    • Nice-to-have validations
    • Rarely-used features
    • Exploratory testing scenarios
    • Run manually or weekly

    Priority Variables:

    • {include_p0} - Always include (default: true)
    • {include_p1} - High priority (default: true)
    • {include_p2} - Medium priority (default: true)
    • {include_p3} - Low priority (default: false)
  5. Create Test Coverage Plan

    Document what will be tested at each level with priorities:

    ## Test Coverage Plan
    
    ### E2E Tests (P0)
    
    - User login with valid credentials → Dashboard loads
    - User logout → Redirects to login page
    
    ### API Tests (P1)
    
    - POST /auth/login - valid credentials → 200 + JWT token
    - POST /auth/login - invalid credentials → 401 + error message
    - POST /auth/login - missing fields → 400 + validation errors
    
    ### Component Tests (P1)
    
    - LoginForm - empty fields → submit button disabled
    - LoginForm - valid input → submit button enabled
    
    ### Unit Tests (P2)
    
    - validateEmail() - valid email → returns true
    - validateEmail() - malformed email → returns false
    

Step 3: Generate Test Infrastructure

Actions

  1. Enhance Fixture Architecture

    Knowledge Base Reference: fixture-architecture.md

    Check existing fixtures in tests/support/fixtures/:

    • If missing or incomplete, create fixture architecture
    • Use Playwright's test.extend() pattern
    • Ensure all fixtures have auto-cleanup in teardown

    Common fixtures to create/enhance:

    • authenticatedUser: User with valid session (auto-deletes user after test)
    • apiRequest: Authenticated API client with base URL and headers
    • mockNetwork: Network mocking for external services
    • testDatabase: Database with test data (auto-cleanup after test)

    Example fixture:

    // tests/support/fixtures/auth.fixture.ts
    import { test as base } from '@playwright/test';
    import { createUser, deleteUser } from '../factories/user.factory';
    
    export const test = base.extend({
      authenticatedUser: async ({ page }, use) => {
        // Setup: Create and authenticate user
        const user = await createUser();
        await page.goto('/login');
        await page.fill('[data-testid="email"]', user.email);
        await page.fill('[data-testid="password"]', user.password);
        await page.click('[data-testid="login-button"]');
        await page.waitForURL('/dashboard');
    
        // Provide to test
        await use(user);
    
        // Cleanup: Delete user automatically
        await deleteUser(user.id);
      },
    });
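
    The apiRequest fixture listed above can follow the same extend/cleanup shape. A minimal sketch, assuming the base URL and token come from environment variables (names hypothetical):

    // tests/support/fixtures/api.fixture.ts
    import { test as base, APIRequestContext } from '@playwright/test';

    export const test = base.extend<{ apiRequest: APIRequestContext }>({
      apiRequest: async ({ playwright }, use) => {
        // Setup: authenticated request context with base URL and bearer token
        const context = await playwright.request.newContext({
          baseURL: process.env.API_BASE_URL ?? 'http://localhost:3000',
          extraHTTPHeaders: { Authorization: `Bearer ${process.env.API_TOKEN ?? ''}` },
        });

        // Provide to test
        await use(context);

        // Cleanup: dispose the request context
        await context.dispose();
      },
    });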
    
  2. Enhance Data Factories

    Knowledge Base Reference: data-factories.md

    Check existing factories in tests/support/factories/:

    • If missing or incomplete, create factory architecture
    • Use @faker-js/faker for all random data (no hardcoded values)
    • Support overrides for specific test scenarios

    Common factories to create/enhance:

    • User factory (email, password, name, role)
    • Product factory (name, price, description, SKU)
    • Order factory (items, total, status, customer)

    Example factory:

    // tests/support/factories/user.factory.ts
    import { faker } from '@faker-js/faker';
    
    export const createUser = (overrides = {}) => ({
      id: faker.number.int(),
      email: faker.internet.email(),
      password: faker.internet.password(),
      name: faker.person.fullName(),
      role: 'user',
      createdAt: faker.date.recent().toISOString(),
      ...overrides,
    });
    
    export const createUsers = (count: number) => Array.from({ length: count }, () => createUser());
    
    // API helper for cleanup
    export const deleteUser = async (userId: number) => {
      await fetch(`/api/users/${userId}`, { method: 'DELETE' });
    };
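
    Nested factories compose the same way; a sketch of an order factory that reuses createUser() (field names are illustrative):

    // tests/support/factories/order.factory.ts
    import { faker } from '@faker-js/faker';
    import { createUser } from './user.factory';

    export const createOrder = (overrides = {}) => {
      // Random line items; the total is derived so assertions stay consistent
      const items = Array.from({ length: faker.number.int({ min: 1, max: 3 }) }, () => ({
        sku: faker.string.alphanumeric(8),
        price: Number(faker.commerce.price()),
        quantity: faker.number.int({ min: 1, max: 5 }),
      }));

      return {
        id: faker.number.int(),
        customer: createUser(), // Nested factory
        items,
        total: items.reduce((sum, item) => sum + item.price * item.quantity, 0),
        status: 'pending',
        ...overrides,
      };
    };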
    
  3. Create/Enhance Helper Utilities

    If {update_helpers} is true:

    Check tests/support/helpers/ for common utilities:

    • waitFor: Polling helper for complex conditions
    • retry: Retry helper for flaky operations
    • testData: Test data generation helpers
    • assertions: Custom assertion helpers

    Example helper:

    // tests/support/helpers/wait-for.ts
    export const waitFor = async (condition: () => Promise<boolean>, timeout = 5000, interval = 100): Promise<void> => {
      const startTime = Date.now();
      while (Date.now() - startTime < timeout) {
        if (await condition()) return;
        await new Promise((resolve) => setTimeout(resolve, interval));
      }
      throw new Error(`Condition not met within ${timeout}ms`);
    };
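
    The retry helper listed above can take the same shape; a minimal sketch:

    // tests/support/helpers/retry.ts
    export const retry = async <T>(operation: () => Promise<T>, attempts = 3, delayMs = 200): Promise<T> => {
      let lastError: unknown;
      for (let attempt = 1; attempt <= attempts; attempt++) {
        try {
          return await operation();
        } catch (error) {
          lastError = error;
          // Back off briefly before the next attempt
          await new Promise((resolve) => setTimeout(resolve, delayMs));
        }
      }
      throw lastError;
    };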
    

Step 4: Generate Test Files

Actions

  1. Create Test File Structure

    tests/
    ├── e2e/
    │   └── {feature-name}.spec.ts        # E2E tests (P0-P1)
    ├── api/
    │   └── {feature-name}.api.spec.ts    # API tests (P1-P2)
    ├── component/
    │   └── {ComponentName}.test.tsx      # Component tests (P1-P2)
    ├── unit/
    │   └── {module-name}.test.ts         # Unit tests (P2-P3)
    └── support/
        ├── fixtures/                      # Test fixtures
        ├── factories/                     # Data factories
        └── helpers/                       # Utility functions
    
  2. Write E2E Tests (If Applicable)

    Follow Given-When-Then format:

    import { test, expect } from '@playwright/test';
    
    test.describe('User Authentication', () => {
      test('[P0] should login with valid credentials and load dashboard', async ({ page }) => {
        // GIVEN: User is on login page
        await page.goto('/login');
    
        // WHEN: User submits valid credentials
        await page.fill('[data-testid="email-input"]', 'user@example.com');
        await page.fill('[data-testid="password-input"]', 'Password123!');
        await page.click('[data-testid="login-button"]');
    
        // THEN: User is redirected to dashboard
        await expect(page).toHaveURL('/dashboard');
        await expect(page.locator('[data-testid="user-name"]')).toBeVisible();
      });
    
      test('[P1] should display error for invalid credentials', async ({ page }) => {
        // GIVEN: User is on login page
        await page.goto('/login');
    
        // WHEN: User submits invalid credentials
        await page.fill('[data-testid="email-input"]', 'invalid@example.com');
        await page.fill('[data-testid="password-input"]', 'wrongpassword');
        await page.click('[data-testid="login-button"]');
    
        // THEN: Error message is displayed
        await expect(page.locator('[data-testid="error-message"]')).toHaveText('Invalid email or password');
      });
    });
    

    Critical patterns:

    • Tag tests with priority: [P0], [P1], [P2], [P3] in test name
    • One assertion per test (atomic tests)
    • Explicit waits (no hard waits/sleeps)
    • Network-first approach (route interception before navigation)
    • data-testid selectors for stability
    • Clear Given-When-Then structure
  3. Write API Tests (If Applicable)

    import { test, expect } from '@playwright/test';
    
    test.describe('User Authentication API', () => {
      test('[P1] POST /api/auth/login - should return token for valid credentials', async ({ request }) => {
        // GIVEN: Valid user credentials
        const credentials = {
          email: 'user@example.com',
          password: 'Password123!',
        };
    
        // WHEN: Logging in via API
        const response = await request.post('/api/auth/login', {
          data: credentials,
        });
    
        // THEN: Returns 200 and JWT token
        expect(response.status()).toBe(200);
        const body = await response.json();
        expect(body).toHaveProperty('token');
        expect(body.token).toMatch(/^[A-Za-z0-9-_]+\.[A-Za-z0-9-_]+\.[A-Za-z0-9-_]+$/); // JWT format
      });
    
      test('[P1] POST /api/auth/login - should return 401 for invalid credentials', async ({ request }) => {
        // GIVEN: Invalid credentials
        const credentials = {
          email: 'invalid@example.com',
          password: 'wrongpassword',
        };
    
        // WHEN: Attempting login
        const response = await request.post('/api/auth/login', {
          data: credentials,
        });
    
        // THEN: Returns 401 with error
        expect(response.status()).toBe(401);
        const body = await response.json();
        expect(body).toMatchObject({
          error: 'Invalid credentials',
        });
      });
    });
    
  4. Write Component Tests (If Applicable)

    Knowledge Base Reference: component-tdd.md

    import { test, expect } from '@playwright/experimental-ct-react';
    import { LoginForm } from './LoginForm';
    
    test.describe('LoginForm Component', () => {
      test('[P1] should disable submit button when fields are empty', async ({ mount }) => {
        // GIVEN: LoginForm is mounted
        const component = await mount(<LoginForm />);
    
        // WHEN: Form is initially rendered
        const submitButton = component.locator('button[type="submit"]');
    
        // THEN: Submit button is disabled
        await expect(submitButton).toBeDisabled();
      });
    
      test('[P1] should enable submit button when fields are filled', async ({ mount }) => {
        // GIVEN: LoginForm is mounted
        const component = await mount(<LoginForm />);
    
        // WHEN: User fills in email and password
        await component.locator('[data-testid="email-input"]').fill('user@example.com');
        await component.locator('[data-testid="password-input"]').fill('Password123!');
    
        // THEN: Submit button is enabled
        const submitButton = component.locator('button[type="submit"]');
        await expect(submitButton).toBeEnabled();
      });
    });
    
  5. Write Unit Tests (If Applicable)

    // Unit tests run under Vitest (see the test:unit script in package.json)
    import { describe, expect, test } from 'vitest';
    import { validateEmail } from './validation';
    
    describe('Email Validation', () => {
      test('[P2] should return true for valid email', () => {
        // GIVEN: Valid email address
        const email = 'user@example.com';
    
        // WHEN: Validating email
        const result = validateEmail(email);
    
        // THEN: Returns true
        expect(result).toBe(true);
      });
    
      test('[P2] should return false for malformed email', () => {
        // GIVEN: Malformed email addresses
        const invalidEmails = ['notanemail', '@example.com', 'user@', 'user @example.com'];
    
        // WHEN/THEN: Each should fail validation
        invalidEmails.forEach((email) => {
          expect(validateEmail(email)).toBe(false);
        });
      });
    });
    
  6. Apply Network-First Pattern (E2E tests)

    Knowledge Base Reference: network-first.md

    Critical pattern to prevent race conditions:

    test('should load user dashboard after login', async ({ page }) => {
      // CRITICAL: Intercept routes BEFORE navigation
      await page.route('**/api/user', (route) =>
        route.fulfill({
          status: 200,
          body: JSON.stringify({ id: 1, name: 'Test User' }),
        }),
      );
    
      // NOW navigate
      await page.goto('/dashboard');
    
      await expect(page.locator('[data-testid="user-name"]')).toHaveText('Test User');
    });
    
  7. Enforce Quality Standards

    For every test:

    • Uses Given-When-Then format
    • Has clear, descriptive name with priority tag
    • One assertion per test (atomic)
    • No hard waits or sleeps (use explicit waits)
    • Self-cleaning (uses fixtures with auto-cleanup)
    • Deterministic (no flaky patterns)
    • Fast (under {max_test_duration} seconds)
    • Lean (test file under {max_file_lines} lines)

    Forbidden patterns:

    • Hard waits: await page.waitForTimeout(2000)
    • Conditional flow: if (await element.isVisible()) { ... }
    • Try-catch for test logic (use for cleanup only)
    • Hardcoded test data (use factories)
    • Page objects (keep tests simple and direct)
    • Shared state between tests

Step 5: Execute, Validate & Heal Generated Tests (NEW - Phase 2.5)

Purpose: Automatically validate generated tests and heal common failures before delivery

Actions

  1. Validate Generated Tests

    Validation always runs (auto_validate is fixed to true):

    • Run generated tests to verify they work
    • Continue with healing if config.tea_use_mcp_enhancements is true
  2. Run Generated Tests

    Execute the full test suite that was just generated:

    npx playwright test {generated_test_files}
    

    Capture results:

    • Total tests run
    • Passing tests count
    • Failing tests count
    • Error messages and stack traces for failures
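
    One way to capture these counts programmatically is Playwright's JSON reporter; a sketch (field names follow the JSON reporter and may differ across versions):

    // run-and-capture.ts (hypothetical helper)
    import { spawnSync } from 'node:child_process';

    // spawnSync does not throw when tests fail, so the report is still captured
    const result = spawnSync('npx', ['playwright', 'test', ...process.argv.slice(2), '--reporter=json'], {
      encoding: 'utf-8',
    });

    const report = JSON.parse(result.stdout);
    console.log(`expected: ${report.stats.expected}, unexpected: ${report.stats.unexpected}, flaky: ${report.stats.flaky}`);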
  3. Evaluate Results

    If ALL tests pass:

    • Generate report with success summary
    • Proceed to Step 6 (Documentation and Scripts)

    If tests FAIL:

    • Check config.tea_use_mcp_enhancements setting
    • If true: Enter healing loop (Step 5.4)
    • If false: Document failures for manual review, proceed to Step 6
  4. Healing Loop (If config.tea_use_mcp_enhancements is true)

    Iteration limit: 3 attempts per test (constant)

    For each failing test:

    A. Load Healing Knowledge Fragments

    Consult tea-index.csv to load healing patterns:

    • test-healing-patterns.md - Common failure patterns and fixes
    • selector-resilience.md - Selector debugging and refactoring
    • timing-debugging.md - Race condition identification and fixes

    B. Identify Failure Pattern

    Analyze error message and stack trace to classify failure type:

    Stale Selector Failure:

    • Error contains: "locator resolved to 0 elements", "element not found", "unable to find element"
    • Extract selector from error message
    • Apply selector healing (knowledge from selector-resilience.md):
      • If CSS class → Replace with page.getByTestId()
      • If nth() → Replace with filter({ hasText })
      • If ID → Replace with data-testid
      • If complex XPath → Replace with ARIA role

    Race Condition Failure:

    • Error contains: "timeout waiting for", "element not visible", "timed out retrying"
    • Detect missing network waits or hard waits in test code
    • Apply timing healing (knowledge from timing-debugging.md):
      • Add network-first interception before navigate
      • Replace waitForTimeout() with waitForResponse()
      • Add explicit element state waits (waitFor({ state: 'visible' }))

    Dynamic Data Failure:

    • Error contains: "Expected 'User 123' but received 'User 456'", timestamp mismatches
    • Identify hardcoded assertions
    • Apply data healing (knowledge from test-healing-patterns.md):
      • Replace hardcoded IDs with regex (/User \d+/)
      • Replace hardcoded dates with dynamic generation
      • Capture dynamic values and use in assertions

    Network Error Failure:

    • Error contains: "API call failed", "500 error", "network error"
    • Detect missing route interception
    • Apply network healing (knowledge from test-healing-patterns.md):
      • Add page.route() or cy.intercept() for API mocking
      • Mock error scenarios (500, 429, timeout)

    Hard Wait Detection:

    • Scan test code for page.waitForTimeout(), cy.wait(number), sleep()
    • Apply hard wait healing (knowledge from timing-debugging.md):
      • Replace with event-based waits
      • Add network response waits
      • Use element state changes
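
    Put together, a typical before/after for the selector, timing, and hard-wait fixes above (selectors and route are illustrative):

    // BEFORE (brittle): hard wait plus CSS-class selector
    await page.waitForTimeout(2000);
    await page.locator('.btn-primary').click();
    await expect(page.locator('.toast')).toBeVisible();

    // AFTER (healed): deterministic response wait plus resilient selectors
    const responsePromise = page.waitForResponse('**/api/orders');
    await page.getByTestId('submit-button').click();
    await responsePromise;
    await expect(page.getByRole('status')).toBeVisible();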

    C. MCP Healing Mode (If MCP Tools Available)

    If Playwright MCP tools are available in your IDE:

    Use MCP tools for interactive healing:

    • playwright_test_debug_test: Pause on failure for visual inspection
    • browser_snapshot: Capture visual context at failure point
    • browser_console_messages: Retrieve console logs for JS errors
    • browser_network_requests: Analyze network activity
    • browser_generate_locator: Generate better selectors interactively

    Apply MCP-generated fixes to test code.

    D. Pattern-Based Healing Mode (Fallback)

    If MCP unavailable, use pattern-based analysis:

    • Parse error message and stack trace
    • Match against failure patterns from knowledge base
    • Apply fixes programmatically:
      • Selector fixes: Use suggestions from selector-resilience.md
      • Timing fixes: Apply patterns from timing-debugging.md
      • Data fixes: Use patterns from test-healing-patterns.md
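
    A pattern-based classifier can be as simple as matching the error strings listed above (a sketch; patterns are illustrative, not exhaustive):

    type FailureKind = 'stale-selector' | 'race-condition' | 'dynamic-data' | 'network-error' | 'unknown';

    const classifyFailure = (errorMessage: string): FailureKind => {
      if (/resolved to 0 elements|element not found|unable to find element/i.test(errorMessage)) return 'stale-selector';
      if (/timeout waiting for|timed out retrying|element not visible/i.test(errorMessage)) return 'race-condition';
      if (/expected .+ but received/i.test(errorMessage)) return 'dynamic-data';
      if (/\b(500|429)\b|network error|api call failed/i.test(errorMessage)) return 'network-error';
      return 'unknown';
    };

    Each classification then selects the matching fix family (selector, timing, data, or network) from the fragments above.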

    E. Apply Healing Fix

    • Modify test file with healed code
    • Re-run test to validate fix
    • If test passes: Mark as healed, move to next failure
    • If test fails: Increment iteration count, try different pattern

    F. Iteration Limit Handling

    After 3 failed healing attempts:

    Always mark unfixable tests:

    • Mark test with test.fixme() instead of test()
    • Add detailed comment explaining:
      • What failure occurred
      • What healing was attempted (3 iterations)
      • Why healing failed
      • Manual investigation needed
    test.fixme('[P1] should handle complex interaction', async ({ page }) => {
      // FIXME: Test healing failed after 3 attempts
      // Failure: "Locator 'button[data-action="submit"]' resolved to 0 elements"
      // Attempted fixes:
      //   1. Replaced with page.getByTestId('submit-button') - still failing
      //   2. Replaced with page.getByRole('button', { name: 'Submit' }) - still failing
      //   3. Added waitForLoadState('networkidle') - still failing
      // Manual investigation needed: Selector may require application code changes
      // TODO: Review with team, may need data-testid added to button component
      // Original test code...
    });
    

    Note: Workflow continues even with unfixable tests (marked as test.fixme() for manual review)

  5. Generate Healing Report

    Document healing outcomes:

    ## Test Healing Report
    
    **Auto-Heal Enabled**: {auto_heal_failures}
    **Healing Mode**: {use_mcp_healing ? "MCP-assisted" : "Pattern-based"}
    **Iterations Allowed**: {max_healing_iterations}
    
    ### Validation Results
    
    - **Total tests**: {total_tests}
    - **Passing**: {passing_tests}
    - **Failing**: {failing_tests}
    
    ### Healing Outcomes
    
    **Successfully Healed ({healed_count} tests):**
    
    - `tests/e2e/login.spec.ts:15` - Stale selector (CSS class → data-testid)
    - `tests/e2e/checkout.spec.ts:42` - Race condition (added network-first interception)
    - `tests/api/users.spec.ts:28` - Dynamic data (hardcoded ID → regex pattern)
    
    **Unable to Heal ({unfixable_count} tests):**
    
    - `tests/e2e/complex-flow.spec.ts:67` - Marked as test.fixme() with manual investigation needed
      - Failure: Locator not found after 3 healing attempts
      - Requires application code changes (add data-testid to component)
    
    ### Healing Patterns Applied
    
    - **Selector fixes**: 2 (CSS class → data-testid, nth() → filter())
    - **Timing fixes**: 1 (added network-first interception)
    - **Data fixes**: 1 (hardcoded ID → regex)
    
    ### Knowledge Base References
    
    - `test-healing-patterns.md` - Common failure patterns
    - `selector-resilience.md` - Selector refactoring guide
    - `timing-debugging.md` - Race condition prevention
    
  6. Update Test Files with Healing Results

    • Save healed test code to files
    • Mark unfixable tests with test.fixme() and detailed comments
    • Preserve original test logic in comments (for debugging)

Step 6: Update Documentation and Scripts

Actions

  1. Update Test README

    If {update_readme} is true:

    Create or update tests/README.md with:

    • Overview of test suite structure
    • How to run tests (all, specific files, by priority)
    • Fixture and factory usage examples
    • Priority tagging convention ([P0], [P1], [P2], [P3])
    • How to write new tests
    • Common patterns and anti-patterns

    Example section:

    ## Running Tests
    
    ```bash
    # Run all tests
    npm run test:e2e
    
    # Run by priority
    npm run test:e2e -- --grep "\[P0\]"
    npm run test:e2e -- --grep "\[P1\]"
    
    # Run specific file
    npm run test:e2e -- user-authentication.spec.ts
    
    # Run in headed mode
    npm run test:e2e -- --headed
    
    # Debug specific test
    npm run test:e2e -- user-authentication.spec.ts --debug
    ```
    

    ## Priority Tags

    - [P0]: Critical paths, run every commit
    - [P1]: High priority, run on PR to main
    - [P2]: Medium priority, run nightly
    - [P3]: Low priority, run on-demand
    
    
  2. Update package.json Scripts

    If {update_package_scripts} is true:

    Add or update test execution scripts:

    {
      "scripts": {
        "test:e2e": "playwright test",
        "test:e2e:p0": "playwright test --grep '@P0'",
        "test:e2e:p1": "playwright test --grep '@P1|@P0'",
        "test:api": "playwright test tests/api",
        "test:component": "playwright test tests/component",
        "test:unit": "vitest"
      }
    }
    
  3. Run Test Suite

    If {run_tests_after_generation} is true:

    • Run full test suite locally
    • Capture results (passing/failing counts)
    • Verify no flaky patterns (tests should be deterministic)
    • Document any setup requirements or known issues
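
    A quick local burn-in pass helps surface flaky patterns before CI; a sketch using Playwright's --repeat-each flag (iteration count borrowed from ci-burn-in.md, helper name hypothetical):

    // local-burn-in.ts (hypothetical helper)
    import { spawnSync } from 'node:child_process';

    // Repeat each generated spec to expose nondeterministic behavior
    const result = spawnSync('npx', ['playwright', 'test', 'tests/e2e', '--repeat-each=10'], { stdio: 'inherit' });
    process.exit(result.status ?? 1);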

Step 7: Generate Automation Summary

Actions

  1. Create Automation Summary Document

    Save to {output_summary} with:

    BMad-Integrated Mode:

    # Automation Summary - {feature_name}
    
    **Date:** {date}
    **Story:** {story_id}
    **Coverage Target:** {coverage_target}
    
    ## Tests Created
    
    ### E2E Tests (P0-P1)
    
    - `tests/e2e/user-authentication.spec.ts` (2 tests, 87 lines)
      - [P0] Login with valid credentials → Dashboard loads
      - [P1] Display error for invalid credentials
    
    ### API Tests (P1-P2)
    
    - `tests/api/auth.api.spec.ts` (3 tests, 102 lines)
      - [P1] POST /auth/login - valid credentials → 200 + token
      - [P1] POST /auth/login - invalid credentials → 401 + error
      - [P2] POST /auth/login - missing fields → 400 + validation
    
    ### Component Tests (P1)
    
    - `tests/component/LoginForm.test.tsx` (2 tests, 45 lines)
      - [P1] Empty fields → submit button disabled
      - [P1] Valid input → submit button enabled
    
    ## Infrastructure Created
    
    ### Fixtures
    
    - `tests/support/fixtures/auth.fixture.ts` - authenticatedUser with auto-cleanup
    
    ### Factories
    
    - `tests/support/factories/user.factory.ts` - createUser(), deleteUser()
    
    ### Helpers
    
    - `tests/support/helpers/wait-for.ts` - Polling helper for complex conditions
    
    ## Test Execution
    
    ```bash
    # Run all new tests
    npm run test:e2e
    
    # Run by priority
    npm run test:e2e:p0  # Critical paths only
    npm run test:e2e:p1  # P0 + P1 tests
    ```
    

    ## Coverage Analysis
    
    **Total Tests:** 7
    
    - P0: 1 test (critical path)
    - P1: 5 tests (high priority)
    - P2: 1 test (medium priority)
    
    **Test Levels:**
    
    - E2E: 2 tests (user journeys)
    - API: 3 tests (business logic)
    - Component: 2 tests (UI behavior)
    
    **Coverage Status:**
    
    - All acceptance criteria covered
    - Happy path covered (E2E + API)
    - Error cases covered (API)
    - UI validation covered (Component)
    - ⚠️ Edge case: Password reset flow not yet covered (future story)
    
    ## Definition of Done
    
    - All tests follow Given-When-Then format
    - All tests use data-testid selectors
    - All tests have priority tags
    - All tests are self-cleaning (fixtures with auto-cleanup)
    - No hard waits or flaky patterns
    - Test files under 300 lines
    - All tests run under 1.5 minutes each
    - README updated with test execution instructions
    - package.json scripts updated
    
    ## Next Steps
    
    1. Review generated tests with team
    2. Run tests in CI pipeline: `npm run test:e2e`
    3. Integrate with quality gate: `bmad tea *gate`
    4. Monitor for flaky tests in burn-in loop
    
    Standalone Mode:

    # Automation Summary - {target_feature}
    
    **Date:** {date}
    **Target:** {target_feature} (standalone analysis)
    **Coverage Target:** {coverage_target}
    
    ## Feature Analysis
    
    **Source Files Analyzed:**
    - `src/auth/login.ts` - Login logic and validation
    - `src/auth/session.ts` - Session management
    - `src/auth/validation.ts` - Email/password validation
    
    **Existing Coverage:**
    - E2E tests: 0 found
    - API tests: 0 found
    - Component tests: 0 found
    - Unit tests: 0 found
    
    **Coverage Gaps Identified:**
    - ❌ No E2E tests for login flow
    - ❌ No API tests for /auth/login endpoint
    - ❌ No component tests for LoginForm
    - ❌ No unit tests for validateEmail()
    
    ## Tests Created
    
    {Same structure as BMad-Integrated Mode}
    
    ## Recommendations
    
    1. **High Priority (P0-P1):**
       - Add E2E test for password reset flow
       - Add API tests for token refresh endpoint
       - Add component tests for logout button
    
    2. **Medium Priority (P2):**
       - Add unit tests for session timeout logic
       - Add E2E test for "remember me" functionality
    
    3. **Future Enhancements:**
       - Consider contract testing for auth API
       - Add visual regression tests for login page
       - Set up burn-in loop for flaky test detection
    
    ## Definition of Done
    
    {Same checklist as BMad-Integrated Mode}
    
  2. Provide Summary to User

    Output concise summary:

    ## Automation Complete
    
    **Coverage:** {total_tests} tests created across {test_levels} levels
    **Priority Breakdown:** P0: {p0_count}, P1: {p1_count}, P2: {p2_count}, P3: {p3_count}
    **Infrastructure:** {fixture_count} fixtures, {factory_count} factories
    **Output:** {output_summary}
    
    **Run tests:** `npm run test:e2e`
    **Next steps:** Review tests, run in CI, integrate with quality gate
    

Important Notes

Dual-Mode Operation

BMad-Integrated Mode (story available):

  • Uses story acceptance criteria for coverage targeting
  • Aligns with test-design risk/priority assessment
  • Expands ATDD tests with edge cases and negative paths
  • Updates BMad status tracking

Standalone Mode (no story):

  • Analyzes source code independently
  • Identifies coverage gaps automatically
  • Generates tests based on code analysis
  • Works with any project (BMad or non-BMad)

Auto-discover Mode (no targets specified):

  • Scans codebase for features needing tests
  • Prioritizes features with no coverage
  • Generates comprehensive test plan

Avoid Duplicate Coverage

Critical principle: Don't test the same behavior at multiple levels

Good coverage:

  • E2E: User can login → Dashboard loads (critical happy path)
  • API: POST /auth/login returns correct status codes (variations)
  • Component: LoginForm validates input (UI edge cases)

Bad coverage (duplicate):

  • E2E: User can login → Dashboard loads
  • E2E: User can login with different emails → Dashboard loads (unnecessary duplication)
  • API: POST /auth/login returns 200 (already covered in E2E)

Use E2E sparingly for critical paths. Use API/Component for variations and edge cases.

Priority Tagging

Tag every test with priority in test name:

test('[P0] should login with valid credentials', async ({ page }) => { ... });
test('[P1] should display error for invalid credentials', async ({ page }) => { ... });
test('[P2] should remember login preference', async ({ page }) => { ... });

Enables selective test execution:

# Run only P0 tests (critical paths)
npm run test:e2e -- --grep "\[P0\]"

# Run P0 + P1 tests (pre-merge)
npm run test:e2e -- --grep "\[P0\]|\[P1\]"

No Page Objects

Do NOT create page object classes. Keep tests simple and direct:

// ✅ CORRECT: Direct test
test('should login', async ({ page }) => {
  await page.goto('/login');
  await page.fill('[data-testid="email"]', 'user@example.com');
  await page.click('[data-testid="login-button"]');
  await expect(page).toHaveURL('/dashboard');
});

// ❌ WRONG: Page object abstraction
class LoginPage {
  async login(email, password) { ... }
}

Use fixtures for setup/teardown, not page objects for actions.

Deterministic Tests Only

No flaky patterns allowed:

// ❌ WRONG: Hard wait
await page.waitForTimeout(2000);

// ✅ CORRECT: Explicit wait
await page.waitForSelector('[data-testid="user-name"]');
await expect(page.locator('[data-testid="user-name"]')).toBeVisible();

// ❌ WRONG: Conditional flow
if (await element.isVisible()) {
  await element.click();
}

// ✅ CORRECT: Deterministic assertion
await expect(element).toBeVisible();
await element.click();

// ❌ WRONG: Try-catch for test logic
try {
  await element.click();
} catch (e) {
  // Test shouldn't catch errors
}

// ✅ CORRECT: Let test fail if element not found
await element.click();

Self-Cleaning Tests

Every test must clean up its data:

// ✅ CORRECT: Fixture with auto-cleanup
export const test = base.extend({
  testUser: async ({ page }, use) => {
    const user = await createUser();
    await use(user);
    await deleteUser(user.id); // Auto-cleanup
  },
});

// ❌ WRONG: Manual cleanup (can be forgotten)
test('should login', async ({ page }) => {
  const user = await createUser();
  // ... test logic ...
  // Forgot to delete user!
});

File Size Limits

Keep test files lean (under {max_file_lines} lines):

  • If file exceeds limit, split into multiple files by feature area
  • Group related tests in describe blocks
  • Extract common setup to fixtures

Knowledge Base Integration

Core Fragments (Auto-loaded in Step 1):

  • test-levels-framework.md - E2E vs API vs Component vs Unit decision framework with characteristics matrix (467 lines, 4 examples)
  • test-priorities-matrix.md - P0-P3 classification with automated scoring and risk mapping (389 lines, 2 examples)
  • fixture-architecture.md - Pure function → fixture → mergeTests composition with auto-cleanup (406 lines, 5 examples)
  • data-factories.md - Factory patterns with faker: overrides, nested factories, API seeding (498 lines, 5 examples)
  • selective-testing.md - Tag-based, spec filters, diff-based selection, promotion rules (727 lines, 4 examples)
  • ci-burn-in.md - 10-iteration burn-in loop, parallel sharding, selective execution (678 lines, 4 examples)
  • test-quality.md - Deterministic tests, isolated with cleanup, explicit assertions, length/time optimization (658 lines, 5 examples)
  • network-first.md - Intercept before navigate, HAR capture, deterministic waiting strategies (489 lines, 5 examples)

Healing Fragments (Auto-loaded if {auto_heal_failures} enabled):

  • test-healing-patterns.md - Common failure patterns: stale selectors, race conditions, dynamic data, network errors, hard waits (648 lines, 5 examples)
  • selector-resilience.md - Selector hierarchy (data-testid > ARIA > text > CSS), dynamic patterns, anti-patterns refactoring (541 lines, 4 examples)
  • timing-debugging.md - Race condition prevention, deterministic waiting, async debugging techniques (370 lines, 3 examples)

Manual Reference (Optional):

  • Use tea-index.csv to find additional specialized fragments as needed

Output Summary

After completing this workflow, provide a summary:

## Automation Complete

**Mode:** {standalone_mode ? "Standalone" : "BMad-Integrated"}
**Target:** {story_id || target_feature || "Auto-discovered features"}

**Tests Created:**

- E2E: {e2e_count} tests ({p0_count} P0, {p1_count} P1, {p2_count} P2)
- API: {api_count} tests ({p0_count} P0, {p1_count} P1, {p2_count} P2)
- Component: {component_count} tests ({p1_count} P1, {p2_count} P2)
- Unit: {unit_count} tests ({p2_count} P2, {p3_count} P3)

**Infrastructure:**

- Fixtures: {fixture_count} created/enhanced
- Factories: {factory_count} created/enhanced
- Helpers: {helper_count} created/enhanced

**Documentation Updated:**

- ✅ Test README with execution instructions
- ✅ package.json scripts for test execution

**Test Execution:**

```bash
# Run all tests
npm run test:e2e

# Run by priority
npm run test:e2e:p0  # Critical paths only
npm run test:e2e:p1  # P0 + P1 tests

# Run specific file
npm run test:e2e -- {first_test_file}
```

**Coverage Status:**

- {coverage_percentage}% of features covered
- All P0 scenarios covered
- All P1 scenarios covered
- ⚠️ {gap_count} coverage gaps identified (documented in summary)

**Quality Checks:**

- All tests follow Given-When-Then format
- All tests have priority tags
- All tests use data-testid selectors
- All tests are self-cleaning
- No hard waits or flaky patterns
- All test files under {max_file_lines} lines

**Output File:** {output_summary}

**Next Steps:**

1. Review generated tests with team
2. Run tests in CI pipeline
3. Monitor for flaky tests in burn-in loop
4. Integrate with quality gate: `bmad tea *gate`

**Knowledge Base References Applied:**

- Test level selection framework (E2E vs API vs Component vs Unit)
- Priority classification (P0-P3)
- Fixture architecture patterns with auto-cleanup
- Data factory patterns using faker
- Selective testing strategies
- Test quality principles

---

## Validation

After completing all steps, verify:

- [ ] Execution mode determined (BMad-Integrated, Standalone, or Auto-discover)
- [ ] BMad artifacts loaded if available (story, tech-spec, test-design, PRD)
- [ ] Framework configuration loaded
- [ ] Existing test coverage analyzed (gaps identified)
- [ ] Knowledge base fragments loaded (test-levels, test-priorities, fixture-architecture, data-factories, selective-testing)
- [ ] Automation targets identified (what needs testing)
- [ ] Test levels selected appropriately (E2E, API, Component, Unit)
- [ ] Duplicate coverage avoided (same behavior not tested at multiple levels)
- [ ] Test priorities assigned (P0, P1, P2, P3)
- [ ] Fixture architecture created/enhanced (with auto-cleanup)
- [ ] Data factories created/enhanced (using faker)
- [ ] Helper utilities created/enhanced (if needed)
- [ ] E2E tests written (Given-When-Then, priority tags, data-testid selectors)
- [ ] API tests written (Given-When-Then, priority tags, comprehensive coverage)
- [ ] Component tests written (Given-When-Then, priority tags, UI behavior)
- [ ] Unit tests written (Given-When-Then, priority tags, pure logic)
- [ ] Network-first pattern applied (route interception before navigation)
- [ ] Quality standards enforced (no hard waits, no flaky patterns, self-cleaning, deterministic)
- [ ] Test README updated (execution instructions, priority tagging, patterns)
- [ ] package.json scripts updated (test execution commands)
- [ ] Test suite run locally (results captured)
- [ ] Tests validated (if auto_validate enabled)
- [ ] Failures healed (if auto_heal_failures enabled)
- [ ] Healing report generated (if healing attempted)
- [ ] Unfixable tests marked with test.fixme() (if any)
- [ ] Automation summary created (tests, infrastructure, coverage, healing, DoD)
- [ ] Output file formatted correctly

Refer to `checklist.md` for comprehensive validation criteria.