Test Design and Risk Assessment
Workflow ID: bmad/bmm/testarch/test-design
Version: 4.0 (BMad v6)
Overview
Plans comprehensive test coverage strategy with risk assessment, priority classification, and execution ordering. This workflow generates a test design document that identifies high-risk areas, maps requirements to test levels, prioritizes scenarios (P0-P3), and provides resource estimates for the testing effort.
Preflight Requirements
Critical: Verify these requirements before proceeding. If any fail, HALT and notify the user.
- ✅ Story markdown with acceptance criteria available
- ✅ PRD or epic documentation exists for context
- ✅ Architecture documents available (optional but recommended)
- ✅ Requirements are clear and testable
Step 1: Load Context and Requirements
Actions
-
Read Requirements Documentation
- Load PRD.md for high-level product requirements
- Read epics.md or specific epic for feature scope
- Read story markdown for detailed acceptance criteria
- Identify all testable requirements
-
Load Architecture Context
- Read architecture.md for system design
- Read tech-spec for implementation details
- Identify technical constraints and dependencies
- Note integration points and external systems
-
Analyze Existing Test Coverage
- Search for existing test files in {test_dir}
- Identify coverage gaps
- Note areas with insufficient testing
- Check for flaky or outdated tests
-
Load Knowledge Base Fragments
Critical: Consult {project-root}/bmad/bmm/testarch/tea-index.csv to load:
- risk-governance.md - Risk classification framework (6 categories: TECH, SEC, PERF, DATA, BUS, OPS), automated scoring, gate decision engine, owner tracking (625 lines, 4 examples)
- probability-impact.md - Risk scoring methodology (probability × impact matrix, automated classification, dynamic re-assessment, gate integration, 604 lines, 4 examples)
- test-levels-framework.md - Test level selection guidance (E2E vs API vs Component vs Unit with decision matrix, characteristics, when to use each, 467 lines, 4 examples)
- test-priorities-matrix.md - P0-P3 prioritization criteria (automated priority calculation, risk-based mapping, tagging strategy, time budgets, 389 lines, 2 examples)
Halt Condition: If story data or acceptance criteria are missing, check whether brownfield exploration is needed. If neither requirements nor exploration is possible, HALT with the message: "Test design requires clear requirements, acceptance criteria, or brownfield app URL for exploration"
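The halt condition above can be sketched as a small check. This is only an illustration: `LoadedContext` and `check_halt` are hypothetical names that do not come from the workflow, and the sketch assumes acceptance criteria arrive as a list of strings.

```python
from dataclasses import dataclass
from typing import List, Optional

# Message text taken from the Halt Condition above.
HALT_MESSAGE = (
    "Test design requires clear requirements, acceptance criteria, "
    "or brownfield app URL for exploration"
)


@dataclass
class LoadedContext:
    """Hypothetical container for what Step 1 managed to load."""
    acceptance_criteria: List[str]
    exploration_url: Optional[str] = None  # brownfield app URL, if any


def check_halt(ctx: LoadedContext) -> Optional[str]:
    """Return the HALT message when neither requirements nor exploration is possible."""
    has_requirements = any(c.strip() for c in ctx.acceptance_criteria)
    can_explore = bool(ctx.exploration_url)
    if has_requirements or can_explore:
        return None
    return HALT_MESSAGE
```

A `None` result means the workflow may continue; any other result is the message to show the user before halting.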
Step 1.5: Mode Selection (NEW - Phase 2.5)
Actions
-
Detect Planning Mode
Determine mode based on context:
Requirements-Based Mode (DEFAULT):
- Have clear story/PRD with acceptance criteria
- Uses: Existing workflow (Steps 2-4)
- Appropriate for: Documented features, greenfield projects
Exploratory Mode (OPTIONAL - Brownfield):
- Missing/incomplete requirements AND brownfield application exists
- Uses: UI exploration to discover functionality
- Appropriate for: Undocumented brownfield apps, legacy systems
-
Requirements-Based Mode (DEFAULT - Skip to Step 2)
If requirements are clear:
- Continue with existing workflow (Step 2: Assess and Classify Risks)
- Use loaded requirements from Step 1
- Proceed with risk assessment based on documented requirements
-
Exploratory Mode (OPTIONAL - Brownfield Apps)
If exploring brownfield application:
A. Check MCP Availability
If config.tea_use_mcp_enhancements is true AND Playwright MCP tools available:
- Use MCP-assisted exploration (section B below)
If MCP unavailable OR config.tea_use_mcp_enhancements is false:
- Use manual exploration fallback (section C below)
B. MCP-Assisted Exploration (If MCP Tools Available)
Use Playwright MCP browser tools to explore UI:
Setup:
1. Use planner_setup_page to initialize browser
2. Navigate to {exploration_url}
3. Capture initial state with browser_snapshot
Exploration Process:
4. Use browser_navigate to explore different pages
5. Use browser_click to interact with buttons, links, forms
6. Use browser_hover to reveal hidden menus/tooltips
7. Capture browser_snapshot at each significant state
8. Take browser_screenshot for documentation
9. Monitor browser_console_messages for JavaScript errors
10. Track browser_network_requests to identify API calls
11. Map user flows and interactive elements
12. Document discovered functionality
Discovery Documentation:
- Create list of discovered features (pages, workflows, forms)
- Identify user journeys (navigation paths)
- Map API endpoints (from network requests)
- Note error states (from console messages)
- Capture screenshots for visual reference
Convert to Test Scenarios:
- Transform discoveries into testable requirements
- Prioritize based on user flow criticality
- Identify risks from discovered functionality
- Continue with Step 2 (Assess and Classify Risks) using discovered requirements
C. Manual Exploration Fallback (If MCP Unavailable)
If Playwright MCP is not available:
Notify User:
Exploratory mode enabled but Playwright MCP unavailable.
**Manual exploration required:**
1. Open application at: {exploration_url}
2. Explore all pages, workflows, and features
3. Document findings in markdown:
   - List of pages/features discovered
   - User journeys identified
   - API endpoints observed (DevTools Network tab)
   - JavaScript errors noted (DevTools Console)
   - Critical workflows mapped
4. Provide exploration findings to continue workflow
**Alternative:** Disable exploratory_mode and provide requirements documentation
Wait for user to provide exploration findings, then:
- Parse user-provided discovery documentation
- Convert to testable requirements
- Continue with Step 2 (risk assessment)
-
Proceed to Risk Assessment
After mode selection (Requirements-Based OR Exploratory):
- Continue to Step 2: Assess and Classify Risks
- Use requirements from documentation (Requirements-Based) OR discoveries (Exploratory)
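The mode selection described in this step can be sketched as a single decision function. The names `Mode` and `select_mode` are hypothetical, and the boolean inputs are assumed to have been derived from Step 1 and from config.tea_use_mcp_enhancements; how they are actually computed is outside this sketch.

```python
from enum import Enum


class Mode(Enum):
    REQUIREMENTS_BASED = "requirements-based"   # default path, continue to Step 2
    EXPLORATORY_MCP = "exploratory-mcp"         # section B: MCP-assisted exploration
    EXPLORATORY_MANUAL = "exploratory-manual"   # section C: manual fallback


def select_mode(
    has_clear_requirements: bool,
    brownfield_app_exists: bool,
    tea_use_mcp_enhancements: bool,
    mcp_tools_available: bool,
) -> Mode:
    """Pick the planning mode; raise when neither mode is possible."""
    if has_clear_requirements:
        return Mode.REQUIREMENTS_BASED
    if not brownfield_app_exists:
        # Corresponds to the Step 1 halt condition.
        raise ValueError("No requirements and no brownfield app to explore")
    if tea_use_mcp_enhancements and mcp_tools_available:
        return Mode.EXPLORATORY_MCP
    return Mode.EXPLORATORY_MANUAL
```

Requirements-based mode is checked first because the workflow treats it as the default; exploration is only considered when requirements are missing or incomplete.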
Step 2: Assess and Classify Risks
Actions
-
Identify Genuine Risks
Filter requirements to isolate actual risks (not just features):
- Unresolved technical gaps
- Security vulnerabilities
- Performance bottlenecks
- Data loss or corruption potential
- Business impact failures
- Operational deployment issues
-
Classify Risks by Category
Use these standard risk categories:
TECH (Technical/Architecture):
- Architecture flaws
- Integration failures
- Scalability issues
- Technical debt
SEC (Security):
- Missing access controls
- Authentication bypass
- Data exposure
- Injection vulnerabilities
PERF (Performance):
- SLA violations
- Response time degradation
- Resource exhaustion
- Scalability limits
DATA (Data Integrity):
- Data loss
- Data corruption
- Inconsistent state
- Migration failures
BUS (Business Impact):
- User experience degradation
- Business logic errors
- Revenue impact
- Compliance violations
OPS (Operations):
- Deployment failures
- Configuration errors
- Monitoring gaps
- Rollback issues
-
Score Risk Probability
Rate likelihood (1-3):
- 1 (Unlikely): <10% chance, edge case
- 2 (Possible): 10-50% chance, known scenario
- 3 (Likely): >50% chance, common occurrence
-
Score Risk Impact
Rate severity (1-3):
- 1 (Minor): Cosmetic, workaround exists, limited users
- 2 (Degraded): Feature impaired, workaround difficult, affects many users
- 3 (Critical): System failure, data loss, no workaround, blocks usage
-
Calculate Risk Score
Risk Score = Probability × Impact
Scores:
1-2: Low risk (monitor)
3-4: Medium risk (plan mitigation)
6-9: High risk (immediate mitigation required)
-
Highlight High-Priority Risks
Flag all risks with score ≥6 for immediate attention.
-
Request Clarification
If evidence is missing or assumptions required:
- Document assumptions clearly
- Request user clarification
- Do NOT speculate on business impact
-
Plan Mitigations
For each high-priority risk:
- Define mitigation strategy
- Assign owner (dev, QA, ops)
- Set timeline
- Update residual risk expectation
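The scoring and classification rules in this step can be sketched as follows. The `Risk` class and `high_priority` helper are hypothetical names used for illustration; the category codes, the 1-3 scales, and the thresholds come from the step above.

```python
from dataclasses import dataclass
from typing import List

CATEGORIES = {"TECH", "SEC", "PERF", "DATA", "BUS", "OPS"}


@dataclass(frozen=True)
class Risk:
    risk_id: str
    category: str
    description: str
    probability: int  # 1 (Unlikely), 2 (Possible), 3 (Likely)
    impact: int       # 1 (Minor), 2 (Degraded), 3 (Critical)

    def __post_init__(self) -> None:
        if self.category not in CATEGORIES:
            raise ValueError(f"Unknown category: {self.category}")
        if self.probability not in (1, 2, 3) or self.impact not in (1, 2, 3):
            raise ValueError("Probability and impact must be 1, 2, or 3")

    @property
    def score(self) -> int:
        """Risk Score = Probability x Impact."""
        return self.probability * self.impact

    @property
    def level(self) -> str:
        if self.score >= 6:
            return "high"    # immediate mitigation required
        if self.score >= 3:
            return "medium"  # plan mitigation
        return "low"         # monitor


def high_priority(risks: List[Risk]) -> List[Risk]:
    """Return the risks flagged for immediate attention (score >= 6)."""
    return [r for r in risks if r.score >= 6]
```

With both factors limited to 1-3, the only possible scores are 1, 2, 3, 4, 6, and 9, so the three bands cover every case without overlap.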
Step 3: Design Test Coverage
Actions
-
Break Down Acceptance Criteria
Convert each acceptance criterion into atomic test scenarios:
- One scenario per testable behavior
- Scenarios are independent
- Scenarios are repeatable
- Scenarios tie back to risk mitigations
-
Select Appropriate Test Levels
Knowledge Base Reference: test-levels-framework.md
Map requirements to optimal test levels (avoid duplication):
E2E (End-to-End):
- Critical user journeys
- Multi-system integration
- Production-like environment
- Highest confidence, slowest execution
API (Integration):
- Service contracts
- Business logic validation
- Fast feedback
- Good for complex scenarios
Component:
- UI component behavior
- Interaction testing
- Visual regression
- Fast, isolated
Unit:
- Business logic
- Edge cases
- Error handling
- Fastest, most granular
Avoid duplicate coverage: Don't test the same behavior at multiple levels unless necessary.
-
Assign Priority Levels
Knowledge Base Reference: test-priorities-matrix.md
P0 (Critical):
- Blocks core user journey
- High-risk areas (score ≥6)
- Revenue-impacting
- Security-critical
- Run on every commit
P1 (High):
- Important user features
- Medium-risk areas (score 3-4)
- Common workflows
- Run on PR to main
P2 (Medium):
- Secondary features
- Low-risk areas (score 1-2)
- Edge cases
- Run nightly or weekly
P3 (Low):
- Nice-to-have
- Exploratory
- Performance benchmarks
- Run on-demand
-
Outline Data and Tooling Prerequisites
For each test scenario, identify:
- Test data requirements (factories, fixtures)
- External services (mocks, stubs)
- Environment setup
- Tools and dependencies
-
Define Execution Order
Recommend test execution sequence:
- Smoke tests (P0 subset, <5 min)
- P0 tests (critical paths, <10 min)
- P1 tests (important features, <30 min)
- P2/P3 tests (full regression, <60 min)
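The execution order and its time budgets can be checked with a short sketch like the one below. It assumes each stage is budgeted independently (the workflow does not say whether the P0 budget includes the smoke subset) and that per-test durations in minutes are already known; `stage_for` and `over_budget` are hypothetical names.

```python
from typing import Dict, List, Tuple

# Budgets from the recommended execution sequence; each is an upper bound in minutes.
BUDGET_MINUTES: Dict[str, float] = {"smoke": 5, "P0": 10, "P1": 30, "P2/P3": 60}
STAGE_ORDER = ["smoke", "P0", "P1", "P2/P3"]

# (test name, priority, is part of the smoke subset, estimated minutes)
TestEntry = Tuple[str, str, bool, float]


def stage_for(priority: str, is_smoke: bool = False) -> str:
    """Map a test to its execution stage; smoke tests are a subset of P0."""
    if priority not in ("P0", "P1", "P2", "P3"):
        raise ValueError(f"Unknown priority: {priority}")
    if is_smoke and priority == "P0":
        return "smoke"
    if priority in ("P2", "P3"):
        return "P2/P3"
    return priority


def over_budget(tests: List[TestEntry]) -> List[str]:
    """Return the stages whose total estimated time reaches or exceeds the budget."""
    totals = {stage: 0.0 for stage in STAGE_ORDER}
    for _name, priority, is_smoke, minutes in tests:
        totals[stage_for(priority, is_smoke)] += minutes
    return [s for s in STAGE_ORDER if totals[s] >= BUDGET_MINUTES[s]]
```

A non-empty result suggests moving tests to a lower priority, splitting the stage, or parallelizing it before the plan is finalized.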
Step 4: Generate Deliverables
Actions
-
Create Risk Assessment Matrix
Use template structure:
| Risk ID | Category | Description | Probability | Impact | Score | Mitigation      |
| ------- | -------- | ----------- | ----------- | ------ | ----- | --------------- |
| R-001   | SEC      | Auth bypass | 2           | 3      | 6     | Add authz check |
-
Create Coverage Matrix
| Requirement | Test Level | Priority | Risk Link | Test Count | Owner |
| ----------- | ---------- | -------- | --------- | ---------- | ----- |
| Login flow  | E2E        | P0       | R-001     | 3          | QA    |
-
Document Execution Order
### Smoke Tests (<5 min)
- Login successful
- Dashboard loads
### P0 Tests (<10 min)
- [Full P0 list]
### P1 Tests (<30 min)
- [Full P1 list]
-
Include Resource Estimates
### Test Effort Estimates
- P0 scenarios: 15 tests × 2 hours = 30 hours
- P1 scenarios: 25 tests × 1 hour = 25 hours
- P2 scenarios: 40 tests × 0.5 hour = 20 hours
- **Total:** 75 hours (~10 days)
-
Add Gate Criteria
### Quality Gate Criteria
- All P0 tests pass (100%)
- P1 tests pass rate ≥95%
- No high-risk (score ≥6) items unmitigated
- Test coverage ≥80% for critical paths
-
Write to Output File
Save to {output_folder}/test-design-epic-{epic_num}.md using template structure.
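Two of the deliverables above involve simple arithmetic and formatting, which the following sketch illustrates. The per-test hours are the example figures from the effort estimate template, the 8-hour working day is an assumption (the template only says "~10 days"), and `estimate_effort` and `risk_matrix_row` are hypothetical helper names.

```python
import math
from typing import Dict, Tuple

# Example hours per test, taken from the effort estimate template above.
HOURS_PER_TEST: Dict[str, float] = {"P0": 2.0, "P1": 1.0, "P2": 0.5}


def estimate_effort(
    counts: Dict[str, int], hours_per_day: float = 8.0
) -> Tuple[Dict[str, float], float, int]:
    """Return (hours per priority, total hours, whole days rounded up)."""
    hours = {p: counts.get(p, 0) * h for p, h in HOURS_PER_TEST.items()}
    total = sum(hours.values())
    return hours, total, math.ceil(total / hours_per_day)


def risk_matrix_row(
    risk_id: str,
    category: str,
    description: str,
    probability: int,
    impact: int,
    mitigation: str,
) -> str:
    """Format one row of the risk assessment matrix, computing the score."""
    score = probability * impact
    return (
        f"| {risk_id} | {category} | {description} | "
        f"{probability} | {impact} | {score} | {mitigation} |"
    )


hours, total, days = estimate_effort({"P0": 15, "P1": 25, "P2": 40})
print(hours, total, days)
print(risk_matrix_row("R-001", "SEC", "Auth bypass", 2, 3, "Add authz check"))
```

Computing the score inside the row formatter keeps the Score column consistent with the Probability and Impact columns, so the matrix cannot drift when a rating is revised.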
Important Notes
Risk Category Definitions
TECH (Technical/Architecture):
- Architecture flaws or technical debt
- Integration complexity
- Scalability concerns
SEC (Security):
- Missing security controls
- Authentication/authorization gaps
- Data exposure risks
PERF (Performance):
- SLA risk or performance degradation
- Resource constraints
- Scalability bottlenecks
DATA (Data Integrity):
- Data loss or corruption potential
- State consistency issues
- Migration risks
BUS (Business Impact):
- User experience harm
- Business logic errors
- Revenue or compliance impact
OPS (Operations):
- Deployment or runtime failures
- Configuration issues
- Monitoring/observability gaps
Risk Scoring Methodology
Probability × Impact = Risk Score
Examples:
- High likelihood (3) × Critical impact (3) = Score 9 (highest priority)
- Possible (2) × Critical (3) = Score 6 (high priority threshold)
- Unlikely (1) × Minor (1) = Score 1 (low priority)
Threshold: Scores ≥6 require immediate mitigation.
Test Level Selection Strategy
Avoid duplication:
- Don't test the same behavior at both E2E and API level
- Use E2E for critical paths only
- Use API tests for complex business logic
- Use unit tests for edge cases
Tradeoffs:
- E2E: High confidence, slow execution, brittle
- API: Good balance, fast, stable
- Unit: Fastest feedback, narrow scope
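One way to spot duplicate coverage is to group planned scenarios by the behavior they verify and list any behavior that appears at more than one level. This is a minimal sketch: `duplicated_behaviors` is a hypothetical helper, and it assumes each scenario is recorded as a (behavior, level) pair with consistent behavior names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def duplicated_behaviors(
    scenarios: Iterable[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Map each behavior covered at more than one test level to those levels."""
    levels_by_behavior = defaultdict(set)
    for behavior, level in scenarios:
        levels_by_behavior[behavior].add(level)
    return {
        behavior: sorted(levels)
        for behavior, levels in levels_by_behavior.items()
        if len(levels) > 1
    }


plan = [
    ("login rejects bad password", "E2E"),
    ("login rejects bad password", "API"),
    ("cart total rounding", "Unit"),
]
print(duplicated_behaviors(plan))
```

A flagged behavior is not automatically wrong, since some overlap may be deliberate, but each one deserves a stated reason for being tested twice.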
Priority Assignment Guidelines
P0 criteria (all must be true):
- Blocks core functionality
- High-risk (score ≥6)
- No workaround exists
- Affects majority of users
P1 criteria:
- Important feature
- Medium risk (score 3-4)
- Workaround exists but difficult
P2/P3: Everything else, prioritized by value
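These guidelines can be sketched as a function. The P0 rule follows the "all must be true" wording directly. The P1 rule is an interpretation: the guidelines do not say whether every P1 criterion must hold, so this sketch assumes that an important feature or any score of 3 or higher is enough. `assign_priority` is a hypothetical name.

```python
VALID_SCORES = (1, 2, 3, 4, 6, 9)  # products of probability (1-3) and impact (1-3)


def assign_priority(
    blocks_core: bool,
    risk_score: int,
    workaround_exists: bool,
    affects_majority: bool,
    important_feature: bool = False,
) -> str:
    """Return "P0", "P1", or "P2/P3" for one scenario."""
    if risk_score not in VALID_SCORES:
        raise ValueError(f"Impossible risk score: {risk_score}")
    # P0: every criterion must hold.
    if blocks_core and risk_score >= 6 and not workaround_exists and affects_majority:
        return "P0"
    # P1 (assumed rule): important feature, or at least medium risk.
    if important_feature or risk_score >= 3:
        return "P1"
    # Everything else is ranked by value within P2/P3.
    return "P2/P3"
```

Under this reading, a high-risk scenario that has a workaround lands in P1 instead of dropping to P2/P3, which seems closer to the intent of the guidelines than a strict literal reading.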
Knowledge Base Integration
Core Fragments (Auto-loaded in Step 1):
- risk-governance.md - Risk classification (6 categories), automated scoring, gate decision engine, coverage traceability, owner tracking (625 lines, 4 examples)
- probability-impact.md - Probability × impact matrix, automated classification thresholds, dynamic re-assessment, gate integration (604 lines, 4 examples)
- test-levels-framework.md - E2E vs API vs Component vs Unit decision framework with characteristics matrix (467 lines, 4 examples)
- test-priorities-matrix.md - P0-P3 automated priority calculation, risk-based mapping, tagging strategy, time budgets (389 lines, 2 examples)
Reference for Test Planning:
- selective-testing.md - Execution strategy: tag-based, spec filters, diff-based selection, promotion rules (727 lines, 4 examples)
- fixture-architecture.md - Data setup patterns: pure function → fixture → mergeTests, auto-cleanup (406 lines, 5 examples)
Manual Reference (Optional):
- Use tea-index.csv to find additional specialized fragments as needed
Evidence-Based Assessment
Critical principle: Base risk assessment on evidence, not speculation.
Evidence sources:
- PRD and user research
- Architecture documentation
- Historical bug data
- User feedback
- Security audit results
Avoid:
- Guessing business impact
- Assuming user behavior
- Inventing requirements
When uncertain: Document assumptions and request clarification from user.
Output Summary
After completing this workflow, provide a summary:
## Test Design Complete
**Epic**: {epic_num}
**Scope**: {design_level}
**Risk Assessment**:
- Total risks identified: {count}
- High-priority risks (≥6): {high_count}
- Categories: {categories}
**Coverage Plan**:
- P0 scenarios: {p0_count} ({p0_hours} hours)
- P1 scenarios: {p1_count} ({p1_hours} hours)
- P2/P3 scenarios: {p2p3_count} ({p2p3_hours} hours)
- **Total effort**: {total_hours} hours (~{total_days} days)
**Test Levels**:
- E2E: {e2e_count}
- API: {api_count}
- Component: {component_count}
- Unit: {unit_count}
**Quality Gate Criteria**:
- P0 pass rate: 100%
- P1 pass rate: ≥95%
- High-risk mitigations: 100%
- Coverage: ≥80%
**Output File**: {output_file}
**Next Steps**:
1. Review risk assessment with team
2. Prioritize mitigation for high-risk items (score ≥6)
3. Run `atdd` workflow to generate failing tests for P0 scenarios
4. Allocate resources per effort estimates
5. Set up test data factories and fixtures
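The quality gate criteria reported in the summary can be evaluated mechanically once results are available. The sketch below assumes pass rates and coverage are supplied as fractions between 0 and 1; `evaluate_gate` is a hypothetical name, and how the inputs are collected is not specified by this workflow.

```python
from typing import List


def evaluate_gate(
    p0_pass_rate: float,
    p1_pass_rate: float,
    unmitigated_high_risks: int,
    critical_path_coverage: float,
) -> List[str]:
    """Return the list of failed gate criteria; an empty list means the gate passes."""
    failures = []
    if p0_pass_rate < 1.0:
        failures.append("P0 pass rate below 100%")
    if p1_pass_rate < 0.95:
        failures.append("P1 pass rate below 95%")
    if unmitigated_high_risks > 0:
        failures.append("High-risk (score >= 6) items remain unmitigated")
    if critical_path_coverage < 0.80:
        failures.append("Critical-path coverage below 80%")
    return failures
```

Returning the failed criteria instead of a single boolean makes it easier to report exactly what blocks the gate.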
Validation
After completing all steps, verify:
- Risk assessment complete with all categories
- All risks scored (probability × impact)
- High-priority risks (≥6) flagged
- Coverage matrix maps requirements to test levels
- Priority levels assigned (P0-P3)
- Execution order defined
- Resource estimates provided
- Quality gate criteria defined
- Output file created and formatted correctly
Refer to checklist.md for comprehensive validation criteria.