Test Design and Risk Assessment Workflow
Plans a comprehensive test coverage strategy with risk assessment (probability × impact scoring), priority classification (P0-P3), and resource estimation. This workflow generates a test design document that identifies high-risk areas, maps requirements to appropriate test levels, and defines an execution order for fast feedback.
Usage
bmad tea *test-design
The TEA agent runs this workflow when:
- Planning test coverage before development starts
 - Assessing risks for an epic or story
 - Prioritizing test scenarios by business impact
 - Estimating testing effort and resources
 
Inputs
Required Context Files:
- Story markdown: Acceptance criteria and requirements
 - PRD or epics.md: High-level product context
 - Architecture docs (optional): Technical constraints and integration points
 
Workflow Variables:
- epic_num: Epic number for scoped design
- story_path: Specific story for design (optional)
- design_level: full/targeted/minimal (default: full)
- risk_threshold: Score for high-priority flag (default: 6)
- risk_categories: TECH,SEC,PERF,DATA,BUS,OPS (all enabled)
- priority_levels: P0,P1,P2,P3 (all enabled)
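For reference, these variables can be modeled as a typed structure. The sketch below is illustrative only; the type and field names are ours, and BMad defines the actual configuration format.

```typescript
// Hypothetical model of the workflow variables above; illustrative only.
type DesignLevel = "full" | "targeted" | "minimal";
type RiskCategory = "TECH" | "SEC" | "PERF" | "DATA" | "BUS" | "OPS";
type Priority = "P0" | "P1" | "P2" | "P3";

interface TestDesignVariables {
  epic_num: number;                // epic number for scoped design
  story_path?: string;             // specific story for design (optional)
  design_level: DesignLevel;       // default: "full"
  risk_threshold: number;          // default: 6 (score that triggers the high-priority flag)
  risk_categories: RiskCategory[]; // all six enabled by default
  priority_levels: Priority[];     // all four enabled by default
}
```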
Outputs
Primary Deliverable:
Test Design Document (test-design-epic-{N}.md):
1. Risk Assessment Matrix
   - Risk ID, category, description
   - Probability (1-3) × Impact (1-3) = Score
   - Scores ≥6 flagged as high-priority
   - Mitigation plans with owners and timelines
2. Coverage Matrix
   - Requirement → test level (E2E/API/Component/Unit)
   - Priority assignment (P0-P3)
   - Risk linkage
   - Test count estimates
3. Execution Order
   - Smoke tests (P0 subset, <5 min)
   - P0 tests (critical paths, <10 min)
   - P1 tests (important features, <30 min)
   - P2/P3 tests (full regression, <60 min)
4. Resource Estimates
   - Hours per priority level
   - Total effort in days
   - Tooling and data prerequisites
5. Quality Gate Criteria
   - P0 pass rate: 100%
   - P1 pass rate: ≥95%
   - High-risk mitigations: 100%
   - Coverage target: ≥80%
Key Features
Risk Scoring Framework
Probability × Impact = Risk Score
Probability (1-3):
- 1 (Unlikely): <10% chance
 - 2 (Possible): 10-50% chance
 - 3 (Likely): >50% chance
 
Impact (1-3):
- 1 (Minor): Cosmetic, workaround exists
 - 2 (Degraded): Feature impaired, difficult workaround
 - 3 (Critical): System failure, no workaround
 
Scores:
- 1-2: Low risk (monitor)
 - 3-4: Medium risk (plan mitigation)
 - 6-9: High risk (immediate mitigation required)
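The scoring rule is mechanical enough to express directly. Below is a minimal TypeScript sketch of it; the function names are ours for illustration, not part of the workflow.

```typescript
type Probability = 1 | 2 | 3; // 1 = unlikely (<10%), 2 = possible (10-50%), 3 = likely (>50%)
type Impact = 1 | 2 | 3;      // 1 = minor, 2 = degraded, 3 = critical

// With 1-3 on both axes, the only reachable scores are 1, 2, 3, 4, 6, and 9,
// which is why the bands skip 5, 7, and 8.
function riskScore(probability: Probability, impact: Impact): number {
  return probability * impact;
}

function riskBand(score: number): "low" | "medium" | "high" {
  if (score >= 6) return "high";   // immediate mitigation required
  if (score >= 3) return "medium"; // plan mitigation
  return "low";                    // monitor
}

// Example from this guide: payment bypass, possible (2) × critical (3) = 6 → high
console.log(riskScore(2, 3), riskBand(riskScore(2, 3))); // 6 "high"
```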
 
Risk Categories (6 types)
TECH (Technical/Architecture):
- Architecture flaws, integration failures
 - Scalability issues, technical debt
 
SEC (Security):
- Missing access controls, auth bypass
 - Data exposure, injection vulnerabilities
 
PERF (Performance):
- SLA violations, response time degradation
 - Resource exhaustion, scalability limits
 
DATA (Data Integrity):
- Data loss/corruption, inconsistent state
 - Migration failures
 
BUS (Business Impact):
- UX degradation, business logic errors
 - Revenue impact, compliance violations
 
OPS (Operations):
- Deployment failures, configuration errors
 - Monitoring gaps, rollback issues
 
Priority Classification (P0-P3)
P0 (Critical) - Run on every commit:
- Blocks core user journey
 - High-risk (score ≥6)
 - Revenue-impacting or security-critical
 
P1 (High) - Run on PR to main:
- Important user features
 - Medium-risk (score 3-4)
 - Common workflows
 
P2 (Medium) - Run nightly/weekly:
- Secondary features
 - Low-risk (score 1-2)
 - Edge cases
 
P3 (Low) - Run on-demand:
- Nice-to-have, exploratory
 - Performance benchmarks
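A minimal sketch of that decision rule, assuming a scenario record with the criteria above (our own encoding; the actual workflow weighs additional context such as workflow frequency and workaround availability):

```typescript
type Priority = "P0" | "P1" | "P2" | "P3";

interface Scenario {
  blocksCoreJourney: boolean; // core user journey cannot complete without it
  securityCritical: boolean;
  revenueImpacting: boolean;
  exploratory: boolean;       // nice-to-have checks, performance benchmarks
  riskScore: number;          // probability × impact (1-9)
}

// Encodes the P0-P3 criteria above; a simplified sketch, not the workflow's code.
function assignPriority(s: Scenario): Priority {
  if (s.blocksCoreJourney || s.securityCritical || s.revenueImpacting || s.riskScore >= 6) {
    return "P0"; // critical: run on every commit
  }
  if (s.riskScore >= 3) return "P1"; // high: run on PR to main
  if (s.exploratory) return "P3";    // low: run on demand
  return "P2";                       // medium: run nightly/weekly
}
```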
 
Test Level Selection
E2E (End-to-End):
- Critical user journeys
 - Multi-system integration
 - Highest confidence, slowest
 
API (Integration):
- Service contracts
 - Business logic validation
 - Fast feedback, stable
 
Component:
- UI component behavior
 - Visual regression
 - Fast, isolated
 
Unit:
- Business logic, edge cases
 - Error handling
 - Fastest, most granular
 
Key principle: avoid duplicate coverage; don't test the same behavior at multiple levels.
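One way to make this principle operational is to cover each behavior once, at the cheapest level that can still observe it. A heuristic sketch (our illustration, not the workflow's actual selection algorithm):

```typescript
type TestLevel = "E2E" | "API" | "Component" | "Unit";

interface Behavior {
  criticalJourney: boolean;        // multi-system, user-facing critical path
  crossesServiceBoundary: boolean; // service contract or business logic behind an API
  rendersUI: boolean;              // observable only through a UI component
}

// Choose the cheapest level that can observe the behavior, so each
// behavior is covered exactly once.
function selectLevel(b: Behavior): TestLevel {
  if (b.criticalJourney) return "E2E";        // highest confidence, slowest
  if (b.crossesServiceBoundary) return "API"; // fast feedback, stable
  if (b.rendersUI) return "Component";        // fast, isolated
  return "Unit";                              // fastest, most granular
}
```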
Exploratory Mode (NEW - Phase 2.5)
test-design supports UI exploration for brownfield applications with missing documentation.
Activation: automatic when requirements are missing or incomplete for a brownfield app.
- If config.tea_use_mcp_enhancements is true and MCP is available → MCP-assisted exploration
- Otherwise → manual exploration with user-provided documentation
 
When to Use Exploratory Mode:
- ✅ Brownfield projects with missing documentation
 - ✅ Legacy systems lacking requirements
 - ✅ Undocumented features needing test coverage
 - ✅ Unknown user journeys requiring discovery
 - ❌ NOT for greenfield projects with clear requirements
 
Exploration Modes:
1. MCP-Assisted Exploration (if Playwright MCP available):
   - Interactive browser exploration using MCP tools:
     - planner_setup_page - Initialize browser
     - browser_navigate - Explore pages
     - browser_click - Interact with UI elements
     - browser_hover - Reveal hidden menus
     - browser_snapshot - Capture state at each step
     - browser_screenshot - Document visually
     - browser_console_messages - Find JavaScript errors
     - browser_network_requests - Identify API endpoints
2. Manual Exploration (fallback without MCP):
   - User explores the application manually
   - Documents findings in markdown:
     - Pages/features discovered
     - User journeys identified
     - API endpoints observed (DevTools Network)
     - JavaScript errors noted (DevTools Console)
     - Critical workflows mapped
   - Provides exploration findings to the workflow
Exploration Workflow:
1. Enable exploratory_mode and set exploration_url
2. IF MCP available:
   - Use planner_setup_page to init browser
   - Explore UI with browser_* tools
   - Capture snapshots and screenshots
   - Monitor console and network
   - Document discoveries
3. IF MCP unavailable:
   - Notify user to explore manually
   - Wait for exploration findings
4. Convert discoveries to testable requirements
5. Continue with standard risk assessment (Step 2)
Example Output from Exploratory Mode:
## Exploration Findings - Legacy Admin Panel
**Exploration URL**: https://admin.example.com
**Mode**: MCP-Assisted
### Discovered Features:
1. User Management (/admin/users)
   - List users (table with 10 columns)
   - Edit user (modal form)
   - Delete user (confirmation dialog)
   - Export to CSV (download button)
2. Reporting Dashboard (/admin/reports)
   - Date range picker
   - Filter by department
   - Generate PDF report
   - Email report to stakeholders
3. API Endpoints Discovered:
   - GET /api/admin/users
   - PUT /api/admin/users/:id
   - DELETE /api/admin/users/:id
   - POST /api/reports/generate
### User Journeys Mapped:
1. Admin deletes inactive user
   - Navigate to /admin/users
   - Click delete icon
   - Confirm in modal
   - User removed from table
2. Admin generates monthly report
   - Navigate to /admin/reports
   - Select date range (last month)
   - Click generate
   - Download PDF
### Risks Identified (from exploration):
- R-001 (SEC): No RBAC check observed (any admin can delete any user)
- R-002 (DATA): No confirmation on bulk delete
- R-003 (PERF): User table loads slowly (5s for 1000 rows)
**Next**: Proceed to risk assessment with discovered requirements
Graceful Degradation:
- Exploratory mode is OPTIONAL (default: disabled)
 - Works without Playwright MCP (manual fallback)
 - If exploration fails, the mode can be disabled and requirements documentation provided instead
 - Seamlessly transitions to standard risk assessment workflow
 
Knowledge Base Integration
Automatically consults the TEA knowledge base:
- risk-governance.md - Risk classification framework
- probability-impact.md - Risk scoring methodology
- test-levels-framework.md - Test level selection
- test-priorities-matrix.md - P0-P3 prioritization
Integration with Other Workflows
Before test-design:
- prd (Phase 2): Creates PRD and epics
 - architecture (Phase 3): Defines technical approach
 - tech-spec (Phase 3): Implementation details
 
After test-design:
- atdd: Generate failing tests for P0 scenarios
 - automate: Expand coverage for P1/P2 scenarios
 - trace (Phase 2): Use quality gate criteria for release decisions
 
Coordinates with:
- framework: Test infrastructure must exist
 - ci: Execution order maps to CI stages
 
Updates:
bmm-workflow-status.md: Adds test design to Quality & Testing Progress
Important Notes
Evidence-Based Assessment
Critical principle: Base risk assessment on evidence, not speculation.
Evidence sources:
- PRD and user research
 - Architecture documentation
 - Historical bug data
 - User feedback
 - Security audit results
 
When uncertain: Document assumptions, request user clarification.
Avoid:
- Guessing business impact
 - Assuming user behavior
 - Inventing requirements
 
Resource Estimation Formula
P0: 2 hours per test (setup + complex scenarios)
P1: 1 hour per test (standard coverage)
P2: 0.5 hours per test (simple scenarios)
P3: 0.25 hours per test (exploratory)
Total Days = Total Hours / 8
Example:
- 15 P0 × 2h = 30h
 - 25 P1 × 1h = 25h
 - 40 P2 × 0.5h = 20h
 - Total: 75 hours (~10 days)
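The same arithmetic generalizes to a small estimator. A sketch using the default per-test hours from the formula above (tune these per team):

```typescript
type Priority = "P0" | "P1" | "P2" | "P3";

// Default hours per test by priority, from the formula above.
const HOURS_PER_TEST: Record<Priority, number> = { P0: 2, P1: 1, P2: 0.5, P3: 0.25 };

function estimateEffort(counts: Partial<Record<Priority, number>>): { hours: number; days: number } {
  let hours = 0;
  for (const p of Object.keys(counts) as Priority[]) {
    hours += (counts[p] ?? 0) * HOURS_PER_TEST[p];
  }
  return { hours, days: hours / 8 };
}

// The worked example above: 30h + 25h + 20h = 75h, 75 / 8 ≈ 9.4 days (quoted as ~10).
console.log(estimateEffort({ P0: 15, P1: 25, P2: 40 })); // { hours: 75, days: 9.375 }
```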
 
Execution Order Strategy
Smoke tests (subset of P0, <5 min):
- Login successful
 - Dashboard loads
 - Core API responds
 
Purpose: Fast feedback, catch build-breaking issues immediately.
P0 tests (critical paths, <10 min):
- All scenarios blocking user journeys
 - Security-critical flows
 
P1 tests (important features, <30 min):
- Common workflows
 - Medium-risk areas
 
P2/P3 tests (full regression, <60 min):
- Edge cases
 - Performance benchmarks
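In CI terms, this order maps to sequential gated stages with time budgets. A sketch of that mapping (stage names are ours; the ci workflow defines the real pipeline):

```typescript
interface Stage {
  name: string;
  selects: string;       // which tests run at this stage
  budgetMinutes: number; // flag the stage if it runs longer
}

// Earlier stages gate later ones: a smoke failure stops the pipeline
// before the slower suites spend any time.
const executionOrder: Stage[] = [
  { name: "smoke",      selects: "P0 subset (login, dashboard, core API)", budgetMinutes: 5 },
  { name: "critical",   selects: "all P0 (critical paths)",                budgetMinutes: 10 },
  { name: "important",  selects: "all P1 (common workflows)",              budgetMinutes: 30 },
  { name: "regression", selects: "P2 + P3 (edge cases, benchmarks)",       budgetMinutes: 60 },
];
```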
 
Quality Gate Criteria
Pass/Fail thresholds:
- P0: 100% pass (no exceptions)
 - P1: ≥95% pass (2-3 failures acceptable with waivers)
 - P2/P3: ≥90% pass (informational)
 - High-risk items: All mitigated or have approved waivers
 
Coverage targets:
- Critical paths: ≥80%
 - Security scenarios: 100%
 - Business logic: ≥70%
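Taken together, the thresholds reduce to a simple predicate. A sketch (assumes pass rates as fractions; real gate decisions may also honor approved waivers, which this omits):

```typescript
interface GateInput {
  p0PassRate: number;           // fraction, 0-1
  p1PassRate: number;           // fraction, 0-1
  highRisksMitigated: boolean;  // every score ≥6 risk mitigated or formally waived
  criticalPathCoverage: number; // fraction, 0-1
}

// Encodes the thresholds above; P2/P3 rates are informational and not gated.
function gatePasses(g: GateInput): boolean {
  return (
    g.p0PassRate === 1 &&          // P0: 100%, no exceptions
    g.p1PassRate >= 0.95 &&        // P1: ≥95%
    g.highRisksMitigated &&        // high-risk mitigations: 100%
    g.criticalPathCoverage >= 0.8  // coverage target: ≥80%
  );
}
```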
 
Validation Checklist
After workflow completion:
- Risk assessment complete (all categories)
 - Risks scored (probability × impact)
 - High-priority risks (≥6) flagged
 - Coverage matrix maps requirements to test levels
 - Priorities assigned (P0-P3)
 - Execution order defined
 - Resource estimates provided
 - Quality gate criteria defined
 - Output file created
 
Refer to checklist.md for comprehensive validation.
Example Execution
Scenario: E-commerce checkout epic
bmad tea *test-design
# Epic 3: Checkout flow redesign
# Risk Assessment identifies:
- R-001 (SEC): Payment bypass, P=2 × I=3 = 6 (HIGH)
- R-002 (PERF): Cart load time, P=3 × I=2 = 6 (HIGH)
- R-003 (BUS): Order confirmation email, P=2 × I=2 = 4 (MEDIUM)
# Coverage Plan:
P0 scenarios: 12 tests (payment security, order creation)
P1 scenarios: 18 tests (cart management, promo codes)
P2 scenarios: 25 tests (edge cases, error handling)
Total effort: 54.5 hours (~7 days)
# Test Levels:
- E2E: 8 tests (critical checkout path)
- API: 30 tests (business logic, payment processing)
- Unit: 17 tests (calculations, validations)
# Execution Order:
1. Smoke: Payment successful, order created (2 min)
2. P0: All payment & security flows (8 min)
3. P1: Cart & promo codes (20 min)
4. P2: Edge cases (40 min)
# Quality Gates:
- P0 pass rate: 100%
- P1 pass rate: ≥95%
- R-001 mitigated: Add payment validation layer
- R-002 mitigated: Implement cart caching
Troubleshooting
Issue: "Unable to score risks - missing context"
- Cause: Insufficient documentation
 - Solution: Request PRD, architecture docs, or user clarification
 
Issue: "All tests marked as P0"
- Cause: Over-prioritization
 - Solution: Apply strict P0 criteria (blocks core journey + high risk + no workaround)
 
Issue: "Duplicate coverage at multiple test levels"
- Cause: Not following test pyramid
 - Solution: Use E2E for critical paths only, API for logic, unit for edge cases
 
Issue: "Resource estimates too high"
- Cause: Complex test setup or insufficient automation
 - Solution: Invest in fixtures/factories upfront, reduce per-test setup time
 
Related Workflows
- atdd: Generate failing tests → atdd/README.md
 - automate: Expand regression coverage → automate/README.md
 - trace: Traceability and quality gate decisions → trace/README.md
 - framework: Test infrastructure → framework/README.md
 
Version History
- v4.0 (BMad v6): Pure markdown instructions, risk scoring framework, template-based output
 - v3.x: XML format instructions
 - v2.x: Legacy task-based approach