Test Automation Expansion
Workflow ID: bmad/bmm/testarch/automate
Version: 4.0 (BMad v6)
Overview
Expands test automation coverage by generating comprehensive test suites at appropriate levels (E2E, API, Component, Unit) with supporting infrastructure. This workflow operates in dual mode:
- BMad-Integrated Mode: Works WITH BMad artifacts (story, tech-spec, PRD, test-design) to expand coverage after story implementation
- Standalone Mode: Works WITHOUT BMad artifacts - analyzes existing codebase and generates tests independently
Core Principle: Generate prioritized, deterministic tests that avoid duplicate coverage and follow testing best practices.
Preflight Requirements
Flexible: This workflow can run with minimal prerequisites. Only HALT if framework is completely missing.
Required (Always)
- ✅ Framework scaffolding configured (run the `framework` workflow if missing)
- ✅ Test framework configuration available (`playwright.config.ts` or `cypress.config.ts`)
Optional (BMad-Integrated Mode)
- Story markdown with acceptance criteria (enhances coverage targeting)
- Tech spec or PRD (provides architectural context)
- Test design document (provides risk/priority context)
Optional (Standalone Mode)
- Source code to analyze (feature implementation)
- Existing tests (for gap analysis)
If framework is missing: HALT with message: "Framework scaffolding required. Run bmad tea *framework first."
Step 1: Determine Execution Mode and Load Context
Actions
- Detect Execution Mode
Check if BMad artifacts are available (a minimal sketch follows this list):
- If `{story_file}` is set → BMad-Integrated Mode
- If `{target_feature}` or `{target_files}` is set → Standalone Mode
- If neither is set → Auto-discover mode (scan the codebase for features needing tests)
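A minimal sketch of this mode selection, assuming the workflow variables arrive as optional values (names mirror the placeholders above):

```typescript
// Sketch of execution-mode selection; variable names mirror the workflow placeholders.
type Mode = 'bmad-integrated' | 'standalone' | 'auto-discover';

interface WorkflowVars {
  story_file?: string;
  target_feature?: string;
  target_files?: string[];
}

const detectMode = (vars: WorkflowVars): Mode => {
  if (vars.story_file) return 'bmad-integrated'; // story drives coverage targeting
  if (vars.target_feature || vars.target_files?.length) return 'standalone';
  return 'auto-discover'; // scan the codebase for untested features
};
```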
- Load BMad Artifacts (If Available)
BMad-Integrated Mode:
- Read story markdown from `{story_file}`
- Extract acceptance criteria and technical requirements
- Load tech-spec.md if `{use_tech_spec}` is true
- Load test-design.md if `{use_test_design}` is true
- Load PRD.md if `{use_prd}` is true
- Note: These are optional enhancements, not hard requirements
Standalone Mode:
- Skip BMad artifact loading
- Proceed directly to source code analysis
- Load Framework Configuration (an illustrative config follows this list)
- Read test framework config (`playwright.config.ts` or `cypress.config.ts`)
- Identify test directory structure from `{test_dir}`
- Check existing test patterns in `{test_dir}`
- Note test runner capabilities (parallel execution, fixtures, etc.)
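For reference, a framework config of the kind this step reads might look like the sketch below; the values are illustrative, not mandated by this workflow:

```typescript
// playwright.config.ts - illustrative values only
import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: './tests', // maps to {test_dir}
  fullyParallel: true, // parallel execution capability noted in this step
  retries: process.env.CI ? 2 : 0,
  use: {
    baseURL: 'http://localhost:3000',
    trace: 'on-first-retry',
  },
});
```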
- Analyze Existing Test Coverage
If `{analyze_coverage}` is true:
- Search `{test_dir}` for existing test files
- Identify tested features vs. untested features
- Map tests to source files to find coverage gaps (see the sketch after this list)
- Check existing fixture and factory patterns
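A rough gap-analysis sketch, assuming tests live under `tests/` and are named after the source modules they cover (the directories and naming convention are assumptions):

```typescript
// Sketch: list source modules that have no matching *.spec.ts / *.test.ts file.
import { readdirSync, statSync } from 'node:fs';
import { join, basename } from 'node:path';

const walk = (dir: string): string[] =>
  readdirSync(dir).flatMap((entry) => {
    const full = join(dir, entry);
    return statSync(full).isDirectory() ? walk(full) : [full];
  });

const sourceFiles = walk('src').filter((f) => f.endsWith('.ts'));
const testFiles = walk('tests');

// Assumption: a module "login.ts" counts as covered if any test file name contains "login".
const gaps = sourceFiles.filter((src) => {
  const name = basename(src, '.ts');
  return !testFiles.some((t) => t.includes(name) && /\.(spec|test)\.tsx?$/.test(t));
});

console.log('Untested modules:', gaps);
```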
- Load Knowledge Base Fragments
Critical: Consult `{project-root}/bmad/bmm/testarch/tea-index.csv` to load:
- `test-levels-framework.md` - Test level selection (E2E vs API vs Component vs Unit with decision matrix, 467 lines, 4 examples)
- `test-priorities-matrix.md` - Priority classification (P0-P3 with automated scoring, risk mapping, 389 lines, 2 examples)
- `fixture-architecture.md` - Test fixture patterns (pure function → fixture → mergeTests, auto-cleanup, 406 lines, 5 examples)
- `data-factories.md` - Factory patterns with faker (overrides, nested factories, API seeding, 498 lines, 5 examples)
- `selective-testing.md` - Targeted test execution strategies (tag-based, spec filters, diff-based, promotion rules, 727 lines, 4 examples)
- `ci-burn-in.md` - Flaky test detection patterns (10-iteration burn-in, sharding, selective execution, 678 lines, 4 examples)
- `test-quality.md` - Test design principles (deterministic, isolated, explicit assertions, length/time limits, 658 lines, 5 examples)
- `network-first.md` - Route interception patterns (intercept before navigate, HAR capture, deterministic waiting, 489 lines, 5 examples)

Healing Knowledge (if `{auto_heal_failures}` is true):
- `test-healing-patterns.md` - Common failure patterns and automated fixes (stale selectors, race conditions, dynamic data, network errors, hard waits, 648 lines, 5 examples)
- `selector-resilience.md` - Selector debugging and refactoring guide (data-testid > ARIA > text > CSS hierarchy, anti-patterns, 541 lines, 4 examples)
- `timing-debugging.md` - Race condition identification and fixes (network-first, deterministic waiting, async debugging, 370 lines, 3 examples)
Step 2: Identify Automation Targets
Actions
- Determine What Needs Testing
BMad-Integrated Mode (story available):
- Map acceptance criteria from the story to test scenarios
- Identify features implemented in this story
- Check if the story has existing ATDD tests (from the `*atdd` workflow)
- Expand beyond ATDD with edge cases and negative paths
Standalone Mode (no story):
- If `{target_feature}` is specified: analyze that specific feature
- If `{target_files}` is specified: analyze those specific files
- If `{auto_discover_features}` is true: scan `{source_dir}` for features
- Prioritize features with (see the scoring sketch after this list):
  - No test coverage (highest priority)
  - Complex business logic
  - External integrations (API calls, database, auth)
  - Critical user paths (login, checkout, etc.)
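One possible prioritization heuristic is sketched below; the fields and weights are illustrative assumptions, not workflow constants:

```typescript
// Illustrative prioritization heuristic for auto-discovered features.
interface FeatureInfo {
  name: string;
  hasTests: boolean;
  usesExternalIntegrations: boolean; // API calls, database, auth
  isCriticalUserPath: boolean; // login, checkout, etc.
  cyclomaticComplexity: number; // proxy for complex business logic
}

const score = (f: FeatureInfo): number =>
  (f.hasTests ? 0 : 40) + // missing coverage weighs heaviest
  (f.isCriticalUserPath ? 30 : 0) +
  (f.usesExternalIntegrations ? 20 : 0) +
  Math.min(f.cyclomaticComplexity, 10);

const prioritize = (features: FeatureInfo[]) => [...features].sort((a, b) => score(b) - score(a));
```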
- Apply Test Level Selection Framework
Knowledge Base Reference: `test-levels-framework.md`
For each feature or acceptance criterion, determine the appropriate test level:
E2E (End-to-End):
- Critical user journeys (login, checkout, core workflows)
- Multi-system integration
- Full user-facing scenarios
- Characteristics: High confidence, slow, brittle
API (Integration):
- Business logic validation
- Service contracts and data transformations
- Backend integration without UI
- Characteristics: Fast feedback, stable, good balance
Component:
- UI component behavior (buttons, forms, modals)
- Interaction testing (click, hover, keyboard)
- State management within component
- Characteristics: Fast, isolated, granular
Unit:
- Pure business logic and algorithms
- Edge cases and error handling
- Minimal dependencies
- Characteristics: Fastest, most granular
- Avoid Duplicate Coverage
Critical principle: Don't test the same behavior at multiple levels unless necessary.
- Use E2E for critical happy path only
- Use API tests for business logic variations
- Use component tests for UI interaction edge cases
- Use unit tests for pure logic edge cases
Example:
- E2E: User can log in with valid credentials → Dashboard loads
- API: POST /auth/login returns 401 for invalid credentials
- API: POST /auth/login returns 200 and JWT token for valid credentials
- Component: LoginForm disables submit button when fields are empty
- Unit: validateEmail() returns false for malformed email addresses
- Assign Test Priorities
Knowledge Base Reference: `test-priorities-matrix.md`
P0 (Critical - Every commit):
- Critical user paths that must always work
- Security-critical functionality (auth, permissions)
- Data integrity scenarios
- Run in pre-commit hooks or PR checks
P1 (High - PR to main):
- Important features with high user impact
- Integration points between systems
- Error handling for common failures
- Run before merging to main branch
P2 (Medium - Nightly):
- Edge cases with moderate impact
- Less-critical feature variations
- Performance/load testing
- Run in nightly CI builds
P3 (Low - On-demand):
- Nice-to-have validations
- Rarely-used features
- Exploratory testing scenarios
- Run manually or weekly
Priority Variables (a selection sketch follows this list):
- `{include_p0}` - Always include (default: true)
- `{include_p1}` - High priority (default: true)
- `{include_p2}` - Medium priority (default: true)
- `{include_p3}` - Low priority (default: false)
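As a hedged sketch, these flags could be translated into a Playwright `--grep` filter along these lines (assuming bracketed `[P0]`-style tags in test titles; if titles use `@P0`-style tags instead, drop the bracket escaping):

```typescript
// Hypothetical helper: build a Playwright --grep pattern from the include_* flags.
const flags = { include_p0: true, include_p1: true, include_p2: true, include_p3: false };

const priorities = (['P0', 'P1', 'P2', 'P3'] as const).filter(
  (p) => flags[`include_${p.toLowerCase()}` as keyof typeof flags],
);

// Titles are tagged "[P0]", "[P1]", ... so the brackets must be escaped in the regex.
const grep = priorities.map((p) => `\\[${p}\\]`).join('|');

// e.g. npx playwright test --grep "\[P0\]|\[P1\]|\[P2\]"
console.log(`npx playwright test --grep "${grep}"`);
```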
- Create Test Coverage Plan
Document what will be tested at each level with priorities:

```markdown
## Test Coverage Plan

### E2E Tests (P0)

- User login with valid credentials → Dashboard loads
- User logout → Redirects to login page

### API Tests (P1)

- POST /auth/login - valid credentials → 200 + JWT token
- POST /auth/login - invalid credentials → 401 + error message
- POST /auth/login - missing fields → 400 + validation errors

### Component Tests (P1)

- LoginForm - empty fields → submit button disabled
- LoginForm - valid input → submit button enabled

### Unit Tests (P2)

- validateEmail() - valid email → returns true
- validateEmail() - malformed email → returns false
```
Step 3: Generate Test Infrastructure
Actions
- Enhance Fixture Architecture
Knowledge Base Reference: `fixture-architecture.md`
Check existing fixtures in `tests/support/fixtures/`:
- If missing or incomplete, create the fixture architecture
- Use Playwright's `test.extend()` pattern
- Ensure all fixtures have auto-cleanup in teardown
Common fixtures to create/enhance:
- authenticatedUser: User with valid session (auto-deletes user after test)
- apiRequest: Authenticated API client with base URL and headers
- mockNetwork: Network mocking for external services
- testDatabase: Database with test data (auto-cleanup after test)
Example fixture:

```typescript
// tests/support/fixtures/auth.fixture.ts
import { test as base } from '@playwright/test';
import { createUser, deleteUser } from '../factories/user.factory';

export const test = base.extend({
  authenticatedUser: async ({ page }, use) => {
    // Setup: Create and authenticate user
    const user = await createUser();
    await page.goto('/login');
    await page.fill('[data-testid="email"]', user.email);
    await page.fill('[data-testid="password"]', user.password);
    await page.click('[data-testid="login-button"]');
    await page.waitForURL('/dashboard');

    // Provide to test
    await use(user);

    // Cleanup: Delete user automatically
    await deleteUser(user.id);
  },
});
```
- Enhance Data Factories
Knowledge Base Reference: `data-factories.md`
Check existing factories in `tests/support/factories/`:
- If missing or incomplete, create the factory architecture
- Use `@faker-js/faker` for all random data (no hardcoded values)
- Support overrides for specific test scenarios
Common factories to create/enhance:
- User factory (email, password, name, role)
- Product factory (name, price, description, SKU)
- Order factory (items, total, status, customer)
Example factory:

```typescript
// tests/support/factories/user.factory.ts
import { faker } from '@faker-js/faker';

export const createUser = (overrides = {}) => ({
  id: faker.number.int(),
  email: faker.internet.email(),
  password: faker.internet.password(),
  name: faker.person.fullName(),
  role: 'user',
  createdAt: faker.date.recent().toISOString(),
  ...overrides,
});

export const createUsers = (count: number) => Array.from({ length: count }, () => createUser());

// API helper for cleanup
export const deleteUser = async (userId: number) => {
  await fetch(`/api/users/${userId}`, { method: 'DELETE' });
};
```
- Create/Enhance Helper Utilities
If `{update_helpers}` is true:
Check `tests/support/helpers/` for common utilities:
- waitFor: Polling helper for complex conditions
- retry: Retry helper for flaky operations
- testData: Test data generation helpers
- assertions: Custom assertion helpers
Example helper:
```typescript
// tests/support/helpers/wait-for.ts
export const waitFor = async (condition: () => Promise<boolean>, timeout = 5000, interval = 100): Promise<void> => {
  const startTime = Date.now();
  while (Date.now() - startTime < timeout) {
    if (await condition()) return;
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
  throw new Error(`Condition not met within ${timeout}ms`);
};
```
Step 4: Generate Test Files
Actions
- Create Test File Structure

```
tests/
├── e2e/
│   └── {feature-name}.spec.ts         # E2E tests (P0-P1)
├── api/
│   └── {feature-name}.api.spec.ts     # API tests (P1-P2)
├── component/
│   └── {ComponentName}.test.tsx       # Component tests (P1-P2)
├── unit/
│   └── {module-name}.test.ts          # Unit tests (P2-P3)
└── support/
    ├── fixtures/                      # Test fixtures
    ├── factories/                     # Data factories
    └── helpers/                       # Utility functions
```
- Write E2E Tests (If Applicable)
Follow Given-When-Then format:

```typescript
import { test, expect } from '@playwright/test';

test.describe('User Authentication', () => {
  test('[P0] should login with valid credentials and load dashboard', async ({ page }) => {
    // GIVEN: User is on login page
    await page.goto('/login');

    // WHEN: User submits valid credentials
    await page.fill('[data-testid="email-input"]', 'user@example.com');
    await page.fill('[data-testid="password-input"]', 'Password123!');
    await page.click('[data-testid="login-button"]');

    // THEN: User is redirected to dashboard
    await expect(page).toHaveURL('/dashboard');
    await expect(page.locator('[data-testid="user-name"]')).toBeVisible();
  });

  test('[P1] should display error for invalid credentials', async ({ page }) => {
    // GIVEN: User is on login page
    await page.goto('/login');

    // WHEN: User submits invalid credentials
    await page.fill('[data-testid="email-input"]', 'invalid@example.com');
    await page.fill('[data-testid="password-input"]', 'wrongpassword');
    await page.click('[data-testid="login-button"]');

    // THEN: Error message is displayed
    await expect(page.locator('[data-testid="error-message"]')).toHaveText('Invalid email or password');
  });
});
```

Critical patterns:
- Tag tests with priority: `[P0]`, `[P1]`, `[P2]`, `[P3]` in the test name
- One assertion per test (atomic tests)
- Explicit waits (no hard waits/sleeps)
- Network-first approach (route interception before navigation)
- data-testid selectors for stability
- Clear Given-When-Then structure
- Write API Tests (If Applicable)

```typescript
import { test, expect } from '@playwright/test';

test.describe('User Authentication API', () => {
  test('[P1] POST /api/auth/login - should return token for valid credentials', async ({ request }) => {
    // GIVEN: Valid user credentials
    const credentials = {
      email: 'user@example.com',
      password: 'Password123!',
    };

    // WHEN: Logging in via API
    const response = await request.post('/api/auth/login', {
      data: credentials,
    });

    // THEN: Returns 200 and JWT token
    expect(response.status()).toBe(200);
    const body = await response.json();
    expect(body).toHaveProperty('token');
    expect(body.token).toMatch(/^[A-Za-z0-9-_]+\.[A-Za-z0-9-_]+\.[A-Za-z0-9-_]+$/); // JWT format
  });

  test('[P1] POST /api/auth/login - should return 401 for invalid credentials', async ({ request }) => {
    // GIVEN: Invalid credentials
    const credentials = {
      email: 'invalid@example.com',
      password: 'wrongpassword',
    };

    // WHEN: Attempting login
    const response = await request.post('/api/auth/login', {
      data: credentials,
    });

    // THEN: Returns 401 with error
    expect(response.status()).toBe(401);
    const body = await response.json();
    expect(body).toMatchObject({
      error: 'Invalid credentials',
    });
  });
});
```
- Write Component Tests (If Applicable)
Knowledge Base Reference: `component-tdd.md`

```tsx
import { test, expect } from '@playwright/experimental-ct-react';
import { LoginForm } from './LoginForm';

test.describe('LoginForm Component', () => {
  test('[P1] should disable submit button when fields are empty', async ({ mount }) => {
    // GIVEN: LoginForm is mounted
    const component = await mount(<LoginForm />);

    // WHEN: Form is initially rendered
    const submitButton = component.locator('button[type="submit"]');

    // THEN: Submit button is disabled
    await expect(submitButton).toBeDisabled();
  });

  test('[P1] should enable submit button when fields are filled', async ({ mount }) => {
    // GIVEN: LoginForm is mounted
    const component = await mount(<LoginForm />);

    // WHEN: User fills in email and password
    await component.locator('[data-testid="email-input"]').fill('user@example.com');
    await component.locator('[data-testid="password-input"]').fill('Password123!');

    // THEN: Submit button is enabled
    const submitButton = component.locator('button[type="submit"]');
    await expect(submitButton).toBeEnabled();
  });
});
```
- Write Unit Tests (If Applicable)

```typescript
import { validateEmail } from './validation';

describe('Email Validation', () => {
  test('[P2] should return true for valid email', () => {
    // GIVEN: Valid email address
    const email = 'user@example.com';

    // WHEN: Validating email
    const result = validateEmail(email);

    // THEN: Returns true
    expect(result).toBe(true);
  });

  test('[P2] should return false for malformed email', () => {
    // GIVEN: Malformed email addresses
    const invalidEmails = ['notanemail', '@example.com', 'user@', 'user @example.com'];

    // WHEN/THEN: Each should fail validation
    invalidEmails.forEach((email) => {
      expect(validateEmail(email)).toBe(false);
    });
  });
});
```
- Apply Network-First Pattern (E2E tests)
Knowledge Base Reference: `network-first.md`
Critical pattern to prevent race conditions:

```typescript
test('should load user dashboard after login', async ({ page }) => {
  // CRITICAL: Intercept routes BEFORE navigation
  await page.route('**/api/user', (route) =>
    route.fulfill({
      status: 200,
      body: JSON.stringify({ id: 1, name: 'Test User' }),
    }),
  );

  // NOW navigate
  await page.goto('/dashboard');

  await expect(page.locator('[data-testid="user-name"]')).toHaveText('Test User');
});
```
- Enforce Quality Standards
For every test:
- ✅ Uses Given-When-Then format
- ✅ Has clear, descriptive name with priority tag
- ✅ One assertion per test (atomic)
- ✅ No hard waits or sleeps (use explicit waits)
- ✅ Self-cleaning (uses fixtures with auto-cleanup)
- ✅ Deterministic (no flaky patterns)
- ✅ Fast (under {max_test_duration} seconds)
- ✅ Lean (test file under {max_file_lines} lines)
Forbidden patterns:
- ❌ Hard waits: `await page.waitForTimeout(2000)`
- ❌ Conditional flow: `if (await element.isVisible()) { ... }`
- ❌ Try-catch for test logic (use for cleanup only)
- ❌ Hardcoded test data (use factories)
- ❌ Page objects (keep tests simple and direct)
- ❌ Shared state between tests
Step 5: Execute, Validate & Heal Generated Tests (NEW - Phase 2.5)
Purpose: Automatically validate generated tests and heal common failures before delivery
Actions
- Validate Generated Tests
Always validate (`auto_validate` is always true):
- Run generated tests to verify they work
- Continue with healing if config.tea_use_mcp_enhancements is true
- Run Generated Tests
Execute the full test suite that was just generated:

```bash
npx playwright test {generated_test_files}
```

Capture results (a capture sketch follows this list):
- Total tests run
- Passing tests count
- Failing tests count
- Error messages and stack traces for failures
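One way to capture these counts is to run the suite with Playwright's JSON reporter and read the report file, as in this sketch (the exact report shape may vary by Playwright version):

```typescript
// Sketch: run the generated specs with the JSON reporter and summarize results.
import { execSync } from 'node:child_process';
import { readFileSync } from 'node:fs';

try {
  execSync('npx playwright test --reporter=json', {
    env: { ...process.env, PLAYWRIGHT_JSON_OUTPUT_NAME: 'test-results/report.json' },
    stdio: 'inherit',
  });
} catch {
  // A non-zero exit code just means some tests failed; the report is still written.
}

const report = JSON.parse(readFileSync('test-results/report.json', 'utf-8'));
console.log(`Passing: ${report.stats.expected}, Failing: ${report.stats.unexpected}, Flaky: ${report.stats.flaky}`);
```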
- Evaluate Results
If ALL tests pass:
- ✅ Generate report with success summary
- Proceed to Step 6 (Documentation and Scripts)
If tests FAIL:
- Check config.tea_use_mcp_enhancements setting
- If true: Enter healing loop (Step 5.4)
- If false: Document failures for manual review, proceed to Step 6
- Healing Loop (If `config.tea_use_mcp_enhancements` is true)
Iteration limit: 3 attempts per test (constant)
For each failing test:
A. Load Healing Knowledge Fragments
Consult `tea-index.csv` to load healing patterns:
- `test-healing-patterns.md` - Common failure patterns and fixes
- `selector-resilience.md` - Selector debugging and refactoring
- `timing-debugging.md` - Race condition identification and fixes
B. Identify Failure Pattern
Analyze error message and stack trace to classify failure type:
Stale Selector Failure:
- Error contains: "locator resolved to 0 elements", "element not found", "unable to find element"
- Extract selector from error message
- Apply selector healing (knowledge from `selector-resilience.md`); an illustrative fix follows this list:
  - If CSS class → Replace with `page.getByTestId()`
  - If `nth()` → Replace with `filter({ hasText })`
  - If ID → Replace with data-testid
  - If complex XPath → Replace with ARIA role
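An illustrative before/after for this kind of selector heal (the element names are hypothetical):

```typescript
// Before: brittle CSS-class selector that breaks when styles change
await page.click('.btn.btn-primary.submit');

// After: resilient selectors following the data-testid > ARIA hierarchy
await page.getByTestId('submit-button').click();
// or, when no test id exists on the element:
await page.getByRole('button', { name: 'Submit' }).click();
```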
Race Condition Failure:
- Error contains: "timeout waiting for", "element not visible", "timed out retrying"
- Detect missing network waits or hard waits in test code
- Apply timing healing (knowledge from `timing-debugging.md`):
  - Add network-first interception before navigate
  - Replace `waitForTimeout()` with `waitForResponse()`
  - Add explicit element state waits (`waitFor({ state: 'visible' })`)
Dynamic Data Failure:
- Error contains: "Expected 'User 123' but received 'User 456'", timestamp mismatches
- Identify hardcoded assertions
- Apply data healing (knowledge from `test-healing-patterns.md`); an illustrative fix follows this list:
  - Replace hardcoded IDs with regex (`/User \d+/`)
  - Replace hardcoded dates with dynamic generation
  - Capture dynamic values and use them in assertions
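An illustrative before/after for a dynamic-data heal (the locator and greeting text are hypothetical):

```typescript
// Before: hardcoded assertion that breaks when the seeded user changes
await expect(page.locator('[data-testid="greeting"]')).toHaveText('Welcome, User 123');

// After: assert on the stable part and match the dynamic part with a regex
await expect(page.locator('[data-testid="greeting"]')).toHaveText(/Welcome, User \d+/);
```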
Network Error Failure:
- Error contains: "API call failed", "500 error", "network error"
- Detect missing route interception
- Apply network healing (knowledge from `test-healing-patterns.md`):
  - Add `page.route()` or `cy.intercept()` for API mocking
  - Mock error scenarios (500, 429, timeout)
Hard Wait Detection:
- Scan test code for `page.waitForTimeout()`, `cy.wait(number)`, `sleep()`
- Apply hard wait healing (knowledge from `timing-debugging.md`); an illustrative fix follows this list:
  - Replace with event-based waits
  - Add network response waits
  - Use element state changes
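An illustrative before/after for a hard-wait heal (the endpoint and test ids are hypothetical):

```typescript
// Before: arbitrary sleep that is both slow and flaky
await page.click('[data-testid="save-button"]');
await page.waitForTimeout(2000);

// After: wait for the actual event the test depends on
const responsePromise = page.waitForResponse('**/api/items');
await page.click('[data-testid="save-button"]');
await responsePromise;
await expect(page.locator('[data-testid="save-confirmation"]')).toBeVisible();
```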
C. MCP Healing Mode (If MCP Tools Available)
If Playwright MCP tools are available in your IDE:
Use MCP tools for interactive healing:
- `playwright_test_debug_test`: Pause on failure for visual inspection
- `browser_snapshot`: Capture visual context at failure point
- `browser_console_messages`: Retrieve console logs for JS errors
- `browser_network_requests`: Analyze network activity
- `browser_generate_locator`: Generate better selectors interactively
Apply MCP-generated fixes to test code.
D. Pattern-Based Healing Mode (Fallback)
If MCP unavailable, use pattern-based analysis:
- Parse error message and stack trace
- Match against failure patterns from knowledge base
- Apply fixes programmatically:
  - Selector fixes: Use suggestions from `selector-resilience.md`
  - Timing fixes: Apply patterns from `timing-debugging.md`
  - Data fixes: Use patterns from `test-healing-patterns.md`
E. Apply Healing Fix
- Modify test file with healed code
- Re-run test to validate fix
- If test passes: Mark as healed, move to next failure
- If test fails: Increment iteration count, try different pattern
F. Iteration Limit Handling
After 3 failed healing attempts:
Always mark unfixable tests:
- Mark the test with `test.fixme()` instead of `test()`
- Add a detailed comment explaining:
- What failure occurred
- What healing was attempted (3 iterations)
- Why healing failed
- Manual investigation needed
```typescript
test.fixme('[P1] should handle complex interaction', async ({ page }) => {
  // FIXME: Test healing failed after 3 attempts
  // Failure: "Locator 'button[data-action="submit"]' resolved to 0 elements"
  // Attempted fixes:
  //   1. Replaced with page.getByTestId('submit-button') - still failing
  //   2. Replaced with page.getByRole('button', { name: 'Submit' }) - still failing
  //   3. Added waitForLoadState('networkidle') - still failing
  // Manual investigation needed: Selector may require application code changes
  // TODO: Review with team, may need data-testid added to button component

  // Original test code...
});
```

Note: The workflow continues even with unfixable tests (marked as `test.fixme()` for manual review).
- Generate Healing Report
Document healing outcomes:

```markdown
## Test Healing Report

**Auto-Heal Enabled**: {auto_heal_failures}
**Healing Mode**: {use_mcp_healing ? "MCP-assisted" : "Pattern-based"}
**Iterations Allowed**: {max_healing_iterations}

### Validation Results

- **Total tests**: {total_tests}
- **Passing**: {passing_tests}
- **Failing**: {failing_tests}

### Healing Outcomes

**Successfully Healed ({healed_count} tests):**

- `tests/e2e/login.spec.ts:15` - Stale selector (CSS class → data-testid)
- `tests/e2e/checkout.spec.ts:42` - Race condition (added network-first interception)
- `tests/api/users.spec.ts:28` - Dynamic data (hardcoded ID → regex pattern)

**Unable to Heal ({unfixable_count} tests):**

- `tests/e2e/complex-flow.spec.ts:67` - Marked as test.fixme() with manual investigation needed
  - Failure: Locator not found after 3 healing attempts
  - Requires application code changes (add data-testid to component)

### Healing Patterns Applied

- **Selector fixes**: 2 (CSS class → data-testid, nth() → filter())
- **Timing fixes**: 1 (added network-first interception)
- **Data fixes**: 1 (hardcoded ID → regex)

### Knowledge Base References

- `test-healing-patterns.md` - Common failure patterns
- `selector-resilience.md` - Selector refactoring guide
- `timing-debugging.md` - Race condition prevention
```
- Update Test Files with Healing Results
- Save healed test code to files
- Mark unfixable tests with `test.fixme()` and detailed comments
- Preserve original test logic in comments (for debugging)
Step 6: Update Documentation and Scripts
Actions
- Update Test README
If `{update_readme}` is true:
Create or update `tests/README.md` with:
- Overview of test suite structure
- How to run tests (all, specific files, by priority)
- Fixture and factory usage examples
- Priority tagging convention ([P0], [P1], [P2], [P3])
- How to write new tests
- Common patterns and anti-patterns
Example section:

````markdown
## Running Tests

```bash
# Run all tests
npm run test:e2e

# Run by priority
npm run test:e2e -- --grep "@P0"
npm run test:e2e -- --grep "@P1"

# Run specific file
npm run test:e2e -- user-authentication.spec.ts

# Run in headed mode
npm run test:e2e -- --headed

# Debug specific test
npm run test:e2e -- user-authentication.spec.ts --debug
```
````

Priority Tags:
- [P0]: Critical paths, run every commit
- [P1]: High priority, run on PR to main
- [P2]: Medium priority, run nightly
- [P3]: Low priority, run on-demand
- Update package.json Scripts
If `{update_package_scripts}` is true:
Add or update test execution scripts:

```json
{
  "scripts": {
    "test:e2e": "playwright test",
    "test:e2e:p0": "playwright test --grep '@P0'",
    "test:e2e:p1": "playwright test --grep '@P1|@P0'",
    "test:api": "playwright test tests/api",
    "test:component": "playwright test tests/component",
    "test:unit": "vitest"
  }
}
```
- Run Test Suite
If `{run_tests_after_generation}` is true:
- Run the full test suite locally
- Capture results (passing/failing counts)
- Verify no flaky patterns (tests should be deterministic)
- Document any setup requirements or known issues
Step 7: Generate Automation Summary
Actions
- Create Automation Summary Document
Save to `{output_summary}` with:

BMad-Integrated Mode:

````markdown
# Automation Summary - {feature_name}

**Date:** {date}
**Story:** {story_id}
**Coverage Target:** {coverage_target}

## Tests Created

### E2E Tests (P0-P1)

- `tests/e2e/user-authentication.spec.ts` (2 tests, 87 lines)
  - [P0] Login with valid credentials → Dashboard loads
  - [P1] Display error for invalid credentials

### API Tests (P1-P2)

- `tests/api/auth.api.spec.ts` (3 tests, 102 lines)
  - [P1] POST /auth/login - valid credentials → 200 + token
  - [P1] POST /auth/login - invalid credentials → 401 + error
  - [P2] POST /auth/login - missing fields → 400 + validation

### Component Tests (P1)

- `tests/component/LoginForm.test.tsx` (2 tests, 45 lines)
  - [P1] Empty fields → submit button disabled
  - [P1] Valid input → submit button enabled

## Infrastructure Created

### Fixtures

- `tests/support/fixtures/auth.fixture.ts` - authenticatedUser with auto-cleanup

### Factories

- `tests/support/factories/user.factory.ts` - createUser(), deleteUser()

### Helpers

- `tests/support/helpers/wait-for.ts` - Polling helper for complex conditions

## Test Execution

```bash
# Run all new tests
npm run test:e2e

# Run by priority
npm run test:e2e:p0   # Critical paths only
npm run test:e2e:p1   # P0 + P1 tests
```

## Coverage Analysis

Total Tests: 7

- P0: 1 test (critical path)
- P1: 5 tests (high priority)
- P2: 1 test (medium priority)

Test Levels:

- E2E: 2 tests (user journeys)
- API: 3 tests (business logic)
- Component: 2 tests (UI behavior)

Coverage Status:

- ✅ All acceptance criteria covered
- ✅ Happy path covered (E2E + API)
- ✅ Error cases covered (API)
- ✅ UI validation covered (Component)
- ⚠️ Edge case: Password reset flow not yet covered (future story)

## Definition of Done

- All tests follow Given-When-Then format
- All tests use data-testid selectors
- All tests have priority tags
- All tests are self-cleaning (fixtures with auto-cleanup)
- No hard waits or flaky patterns
- Test files under 300 lines
- All tests run under 1.5 minutes each
- README updated with test execution instructions
- package.json scripts updated

## Next Steps

- Review generated tests with team
- Run tests in CI pipeline: `npm run test:e2e`
- Integrate with quality gate: `bmad tea *gate`
- Monitor for flaky tests in burn-in loop
````
Standalone Mode:

```markdown
# Automation Summary - {target_feature}

**Date:** {date}
**Target:** {target_feature} (standalone analysis)
**Coverage Target:** {coverage_target}

## Feature Analysis

**Source Files Analyzed:**

- `src/auth/login.ts` - Login logic and validation
- `src/auth/session.ts` - Session management
- `src/auth/validation.ts` - Email/password validation

**Existing Coverage:**

- E2E tests: 0 found
- API tests: 0 found
- Component tests: 0 found
- Unit tests: 0 found

**Coverage Gaps Identified:**

- ❌ No E2E tests for login flow
- ❌ No API tests for /auth/login endpoint
- ❌ No component tests for LoginForm
- ❌ No unit tests for validateEmail()

## Tests Created

{Same structure as BMad-Integrated Mode}

## Recommendations

1. **High Priority (P0-P1):**
   - Add E2E test for password reset flow
   - Add API tests for token refresh endpoint
   - Add component tests for logout button

2. **Medium Priority (P2):**
   - Add unit tests for session timeout logic
   - Add E2E test for "remember me" functionality

3. **Future Enhancements:**
   - Consider contract testing for auth API
   - Add visual regression tests for login page
   - Set up burn-in loop for flaky test detection

## Definition of Done

{Same checklist as BMad-Integrated Mode}
```
- Provide Summary to User
Output a concise summary:

```markdown
## Automation Complete

**Coverage:** {total_tests} tests created across {test_levels} levels
**Priority Breakdown:** P0: {p0_count}, P1: {p1_count}, P2: {p2_count}, P3: {p3_count}
**Infrastructure:** {fixture_count} fixtures, {factory_count} factories
**Output:** {output_summary}

**Run tests:** `npm run test:e2e`
**Next steps:** Review tests, run in CI, integrate with quality gate
```
Important Notes
Dual-Mode Operation
BMad-Integrated Mode (story available):
- Uses story acceptance criteria for coverage targeting
- Aligns with test-design risk/priority assessment
- Expands ATDD tests with edge cases and negative paths
- Updates BMad status tracking
Standalone Mode (no story):
- Analyzes source code independently
- Identifies coverage gaps automatically
- Generates tests based on code analysis
- Works with any project (BMad or non-BMad)
Auto-discover Mode (no targets specified):
- Scans codebase for features needing tests
- Prioritizes features with no coverage
- Generates comprehensive test plan
Avoid Duplicate Coverage
Critical principle: Don't test the same behavior at multiple levels.
Good coverage:
- E2E: User can login → Dashboard loads (critical happy path)
- API: POST /auth/login returns correct status codes (variations)
- Component: LoginForm validates input (UI edge cases)
Bad coverage (duplicate):
- E2E: User can login → Dashboard loads
- E2E: User can login with different emails → Dashboard loads (unnecessary duplication)
- API: POST /auth/login returns 200 (already covered in E2E)
Use E2E sparingly for critical paths. Use API/Component for variations and edge cases.
Priority Tagging
Tag every test with its priority in the test name:

```typescript
test('[P0] should login with valid credentials', async ({ page }) => { ... });
test('[P1] should display error for invalid credentials', async ({ page }) => { ... });
test('[P2] should remember login preference', async ({ page }) => { ... });
```

This enables selective test execution:

```bash
# Run only P0 tests (critical paths)
npm run test:e2e -- --grep "@P0"

# Run P0 + P1 tests (pre-merge)
npm run test:e2e -- --grep "@P0|@P1"
```
No Page Objects
Do NOT create page object classes. Keep tests simple and direct:

```typescript
// ✅ CORRECT: Direct test
test('should login', async ({ page }) => {
  await page.goto('/login');
  await page.fill('[data-testid="email"]', 'user@example.com');
  await page.click('[data-testid="login-button"]');
  await expect(page).toHaveURL('/dashboard');
});

// ❌ WRONG: Page object abstraction
class LoginPage {
  async login(email, password) { ... }
}
```

Use fixtures for setup/teardown, not page objects for actions.
Deterministic Tests Only
No flaky patterns allowed:

```typescript
// ❌ WRONG: Hard wait
await page.waitForTimeout(2000);

// ✅ CORRECT: Explicit wait
await page.waitForSelector('[data-testid="user-name"]');
await expect(page.locator('[data-testid="user-name"]')).toBeVisible();

// ❌ WRONG: Conditional flow
if (await element.isVisible()) {
  await element.click();
}

// ✅ CORRECT: Deterministic assertion
await expect(element).toBeVisible();
await element.click();

// ❌ WRONG: Try-catch for test logic
try {
  await element.click();
} catch (e) {
  // Test shouldn't catch errors
}

// ✅ CORRECT: Let test fail if element not found
await element.click();
```
Self-Cleaning Tests
Every test must clean up its data:

```typescript
// ✅ CORRECT: Fixture with auto-cleanup
export const test = base.extend({
  testUser: async ({ page }, use) => {
    const user = await createUser();
    await use(user);
    await deleteUser(user.id); // Auto-cleanup
  },
});

// ❌ WRONG: Manual cleanup (can be forgotten)
test('should login', async ({ page }) => {
  const user = await createUser();
  // ... test logic ...
  // Forgot to delete user!
});
```
File Size Limits
Keep test files lean (under {max_file_lines} lines):
- If file exceeds limit, split into multiple files by feature area
- Group related tests in describe blocks
- Extract common setup to fixtures
Knowledge Base Integration
Core Fragments (Auto-loaded in Step 1):
- `test-levels-framework.md` - E2E vs API vs Component vs Unit decision framework with characteristics matrix (467 lines, 4 examples)
- `test-priorities-matrix.md` - P0-P3 classification with automated scoring and risk mapping (389 lines, 2 examples)
- `fixture-architecture.md` - Pure function → fixture → mergeTests composition with auto-cleanup (406 lines, 5 examples)
- `data-factories.md` - Factory patterns with faker: overrides, nested factories, API seeding (498 lines, 5 examples)
- `selective-testing.md` - Tag-based, spec filters, diff-based selection, promotion rules (727 lines, 4 examples)
- `ci-burn-in.md` - 10-iteration burn-in loop, parallel sharding, selective execution (678 lines, 4 examples)
- `test-quality.md` - Deterministic tests, isolated with cleanup, explicit assertions, length/time optimization (658 lines, 5 examples)
- `network-first.md` - Intercept before navigate, HAR capture, deterministic waiting strategies (489 lines, 5 examples)
Healing Fragments (Auto-loaded if `{auto_heal_failures}` is enabled):
- `test-healing-patterns.md` - Common failure patterns: stale selectors, race conditions, dynamic data, network errors, hard waits (648 lines, 5 examples)
- `selector-resilience.md` - Selector hierarchy (data-testid > ARIA > text > CSS), dynamic patterns, anti-pattern refactoring (541 lines, 4 examples)
- `timing-debugging.md` - Race condition prevention, deterministic waiting, async debugging techniques (370 lines, 3 examples)
Manual Reference (Optional):
- Use `tea-index.csv` to find additional specialized fragments as needed
Output Summary
After completing this workflow, provide a summary:
## Automation Complete
**Mode:** {standalone_mode ? "Standalone" : "BMad-Integrated"}
**Target:** {story_id || target_feature || "Auto-discovered features"}
**Tests Created:**
- E2E: {e2e_count} tests ({p0_count} P0, {p1_count} P1, {p2_count} P2)
- API: {api_count} tests ({p0_count} P0, {p1_count} P1, {p2_count} P2)
- Component: {component_count} tests ({p1_count} P1, {p2_count} P2)
- Unit: {unit_count} tests ({p2_count} P2, {p3_count} P3)
**Infrastructure:**
- Fixtures: {fixture_count} created/enhanced
- Factories: {factory_count} created/enhanced
- Helpers: {helper_count} created/enhanced
**Documentation Updated:**
- ✅ Test README with execution instructions
- ✅ package.json scripts for test execution
**Test Execution:**
```bash
# Run all tests
npm run test:e2e
# Run by priority
npm run test:e2e:p0 # Critical paths only
npm run test:e2e:p1 # P0 + P1 tests
# Run specific file
npm run test:e2e -- {first_test_file}
```
Coverage Status:
- ✅ {coverage_percentage}% of features covered
- ✅ All P0 scenarios covered
- ✅ All P1 scenarios covered
- ⚠️ {gap_count} coverage gaps identified (documented in summary)
Quality Checks:
- ✅ All tests follow Given-When-Then format
- ✅ All tests have priority tags
- ✅ All tests use data-testid selectors
- ✅ All tests are self-cleaning
- ✅ No hard waits or flaky patterns
- ✅ All test files under {max_file_lines} lines
Output File: {output_summary}
Next Steps:
- Review generated tests with team
- Run tests in CI pipeline
- Monitor for flaky tests in burn-in loop
- Integrate with quality gate: `bmad tea *gate`
Knowledge Base References Applied:
- Test level selection framework (E2E vs API vs Component vs Unit)
- Priority classification (P0-P3)
- Fixture architecture patterns with auto-cleanup
- Data factory patterns using faker
- Selective testing strategies
- Test quality principles
---
## Validation
After completing all steps, verify:
- [ ] Execution mode determined (BMad-Integrated, Standalone, or Auto-discover)
- [ ] BMad artifacts loaded if available (story, tech-spec, test-design, PRD)
- [ ] Framework configuration loaded
- [ ] Existing test coverage analyzed (gaps identified)
- [ ] Knowledge base fragments loaded (test-levels, test-priorities, fixture-architecture, data-factories, selective-testing)
- [ ] Automation targets identified (what needs testing)
- [ ] Test levels selected appropriately (E2E, API, Component, Unit)
- [ ] Duplicate coverage avoided (same behavior not tested at multiple levels)
- [ ] Test priorities assigned (P0, P1, P2, P3)
- [ ] Fixture architecture created/enhanced (with auto-cleanup)
- [ ] Data factories created/enhanced (using faker)
- [ ] Helper utilities created/enhanced (if needed)
- [ ] E2E tests written (Given-When-Then, priority tags, data-testid selectors)
- [ ] API tests written (Given-When-Then, priority tags, comprehensive coverage)
- [ ] Component tests written (Given-When-Then, priority tags, UI behavior)
- [ ] Unit tests written (Given-When-Then, priority tags, pure logic)
- [ ] Network-first pattern applied (route interception before navigation)
- [ ] Quality standards enforced (no hard waits, no flaky patterns, self-cleaning, deterministic)
- [ ] Test README updated (execution instructions, priority tagging, patterns)
- [ ] package.json scripts updated (test execution commands)
- [ ] Test suite run locally (results captured)
- [ ] Tests validated (if auto_validate enabled)
- [ ] Failures healed (if auto_heal_failures enabled)
- [ ] Healing report generated (if healing attempted)
- [ ] Unfixable tests marked with test.fixme() (if any)
- [ ] Automation summary created (tests, infrastructure, coverage, healing, DoD)
- [ ] Output file formatted correctly
Refer to `checklist.md` for comprehensive validation criteria.