Test Quality Review: {test_filename}
Quality Score: {score}/100 ({grade} - {assessment})
Review Date: {YYYY-MM-DD}
Review Scope: {single | directory | suite}
Reviewer: {user_name or TEA Agent}
Executive Summary
Overall Assessment: {Excellent | Good | Acceptable | Needs Improvement | Critical Issues}
Recommendation: {Approve | Approve with Comments | Request Changes | Block}
Key Strengths
✅ {strength_1}
✅ {strength_2}
✅ {strength_3}
Key Weaknesses
❌ {weakness_1}
❌ {weakness_2}
❌ {weakness_3}
Summary
{1-2 paragraph summary of overall test quality, highlighting major findings and recommendation rationale}
Quality Criteria Assessment
| Criterion | Status | Violations | Notes |
|---|---|---|---|
| BDD Format (Given-When-Then) | {✅ PASS | ⚠️ WARN | ❌ FAIL} | {count} | {brief_note} |
| Test IDs | {✅ PASS | ⚠️ WARN | ❌ FAIL} | {count} | {brief_note} |
| Priority Markers (P0/P1/P2/P3) | {✅ PASS | ⚠️ WARN | ❌ FAIL} | {count} | {brief_note} |
| Hard Waits (sleep, waitForTimeout) | {✅ PASS | ⚠️ WARN | ❌ FAIL} | {count} | {brief_note} |
| Determinism (no conditionals) | {✅ PASS | ⚠️ WARN | ❌ FAIL} | {count} | {brief_note} |
| Isolation (cleanup, no shared state) | {✅ PASS | ⚠️ WARN | ❌ FAIL} | {count} | {brief_note} |
| Fixture Patterns | {✅ PASS | ⚠️ WARN | ❌ FAIL} | {count} | {brief_note} |
| Data Factories | {✅ PASS | ⚠️ WARN | ❌ FAIL} | {count} | {brief_note} |
| Network-First Pattern | {✅ PASS | ⚠️ WARN | ❌ FAIL} | {count} | {brief_note} |
| Explicit Assertions | {✅ PASS | ⚠️ WARN | ❌ FAIL} | {count} | {brief_note} |
| Test Length (≤300 lines) | {✅ PASS | ⚠️ WARN | ❌ FAIL} | {lines} | {brief_note} |
| Test Duration (≤1.5 min) | {✅ PASS | ⚠️ WARN | ❌ FAIL} | {duration} | {brief_note} |
| Flakiness Patterns | {✅ PASS | ⚠️ WARN | ❌ FAIL} | {count} | {brief_note} |
Total Violations: {critical_count} Critical, {high_count} High, {medium_count} Medium, {low_count} Low
Quality Score Breakdown
Starting Score: 100
Critical Violations: -{critical_count} × 10 = -{critical_deduction}
High Violations: -{high_count} × 5 = -{high_deduction}
Medium Violations: -{medium_count} × 2 = -{medium_deduction}
Low Violations: -{low_count} × 1 = -{low_deduction}
Bonus Points:
Excellent BDD: +{0|5}
Comprehensive Fixtures: +{0|5}
Data Factories: +{0|5}
Network-First: +{0|5}
Perfect Isolation: +{0|5}
All Test IDs: +{0|5}
--------
Total Bonus: +{bonus_total}
Final Score: {final_score}/100
Grade: {grade}
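The arithmetic in the breakdown above can be sketched in TypeScript. This is a minimal illustration, not the workflow's actual implementation: the names `computeQualityScore`, `ViolationCounts`, and `ScoreBreakdown` are hypothetical, and clamping the result to the 0-100 range is an assumption made because the score is reported out of 100. The mapping from score to grade is not specified here, so it is left out.

```typescript
// Points deducted per violation, taken from the breakdown above.
const DEDUCTION_WEIGHTS = { critical: 10, high: 5, medium: 2, low: 1 } as const;

type Severity = keyof typeof DEDUCTION_WEIGHTS;
type ViolationCounts = Record<Severity, number>;

// Each bonus category is worth either 0 or 5 points.
const BONUS_POINTS = 5;

interface ScoreBreakdown {
  deductions: number;
  bonusTotal: number;
  finalScore: number;
}

// Hypothetical helper: bonusesEarned is how many bonus categories were awarded.
function computeQualityScore(counts: ViolationCounts, bonusesEarned: number): ScoreBreakdown {
  const deductions = (Object.keys(DEDUCTION_WEIGHTS) as Severity[]).reduce(
    (sum, severity) => sum + counts[severity] * DEDUCTION_WEIGHTS[severity],
    0,
  );
  const bonusTotal = bonusesEarned * BONUS_POINTS;
  // Assumption: keep the result within 0-100 because it is reported as {final_score}/100.
  const finalScore = Math.max(0, Math.min(100, 100 - deductions + bonusTotal));
  return { deductions, bonusTotal, finalScore };
}

// 1 critical, 2 high, 3 medium, 4 low violations and 2 bonuses earned:
// deductions = 10 + 10 + 6 + 4 = 30, bonusTotal = 10, finalScore = 80
const example = computeQualityScore({ critical: 1, high: 2, medium: 3, low: 4 }, 2);
console.log(example);
```

Keeping the weights in a single constant means the deduction lines in the report and the computed score cannot drift apart.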
Critical Issues (Must Fix)
{If no critical issues: "No critical issues detected. ✅"}
{For each critical issue:}
{issue_number}. {Issue Title}
Severity: P0 (Critical)
Location: {filename}:{line_number}
Criterion: {criterion_name}
Knowledge Base: {fragment_name}
Issue Description: {Detailed explanation of what the problem is and why it's critical}
Current Code:
// ❌ Bad (current implementation)
{code_snippet_showing_problem}
Recommended Fix:
// ✅ Good (recommended approach)
{code_snippet_showing_solution}
Why This Matters: {Explanation of impact - flakiness risk, maintainability, reliability}
Related Violations: {If similar issue appears elsewhere, note line numbers}
Recommendations (Should Fix)
{If no recommendations: "No additional recommendations. Test quality is excellent. ✅"}
{For each recommendation:}
{rec_number}. {Recommendation Title}
Severity: {P1 (High) | P2 (Medium) | P3 (Low)}
Location: {filename}:{line_number}
Criterion: {criterion_name}
Knowledge Base: {fragment_name}
Issue Description: {Detailed explanation of what could be improved and why}
Current Code:
// ⚠️ Could be improved (current implementation)
{code_snippet_showing_current_approach}
Recommended Improvement:
// ✅ Better approach (recommended)
{code_snippet_showing_improvement}
Benefits: {Explanation of benefits - maintainability, readability, reusability}
Priority: {Why this is P1/P2/P3 - urgency and impact}
Best Practices Found
{If good patterns found, highlight them}
{For each best practice:}
{practice_number}. {Best Practice Title}
Location: {filename}:{line_number}
Pattern: {pattern_name}
Knowledge Base: {fragment_name}
Why This Is Good: {Explanation of why this pattern is excellent}
Code Example:
// ✅ Excellent pattern demonstrated in this test
{code_snippet_showing_best_practice}
Use as Reference: {Encourage using this pattern in other tests}
Test File Analysis
File Metadata
- File Path: {relative_path_from_project_root}
- File Size: {line_count} lines, {kb_size} KB
- Test Framework: {Playwright | Jest | Cypress | Vitest | Other}
- Language: {TypeScript | JavaScript}
Test Structure
- Describe Blocks: {describe_count}
- Test Cases (it/test): {test_count}
- Average Test Length: {avg_lines_per_test} lines per test
- Fixtures Used: {fixture_count} ({fixture_names})
- Data Factories Used: {factory_count} ({factory_names})
Test Coverage Scope
- Test IDs: {test_id_list}
- Priority Distribution:
  - P0 (Critical): {p0_count} tests
  - P1 (High): {p1_count} tests
  - P2 (Medium): {p2_count} tests
  - P3 (Low): {p3_count} tests
  - Unknown: {unknown_count} tests
Assertions Analysis
- Total Assertions: {assertion_count}
- Assertions per Test: {avg_assertions_per_test} (avg)
- Assertion Types: {assertion_types_used}
Context and Integration
Related Artifacts
{If story file found:}
- Story File: {story_filename}
- Acceptance Criteria Mapped: {ac_mapped}/{ac_total} ({ac_coverage}%)
{If test-design found:}
- Test Design: {test_design_filename}
- Risk Assessment: {risk_level}
- Priority Framework: P0-P3 applied
Acceptance Criteria Validation
{If story file available, map tests to ACs:}
| Acceptance Criterion | Test ID | Status | Notes |
|---|---|---|---|
| {AC_1} | {test_id} | {✅ Covered | ❌ Missing} | {notes} |
| {AC_2} | {test_id} | {✅ Covered | ❌ Missing} | {notes} |
| {AC_3} | {test_id} | {✅ Covered | ❌ Missing} | {notes} |
Coverage: {covered_count}/{total_count} criteria covered ({coverage_percentage}%)
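One way the coverage line above might be derived from the mapping table is sketched below. The `CriterionMapping` shape and `summarizeCoverage` function are hypothetical; treating a criterion as covered when at least one test ID maps to it, rounding to a whole percent, and reporting 0% for an empty list are all assumptions.

```typescript
// Hypothetical shape for one row of the acceptance criteria table.
interface CriterionMapping {
  criterion: string;
  testIds: string[]; // empty when no test covers the criterion
}

interface CoverageSummary {
  covered: number;
  total: number;
  percentage: number;
}

function summarizeCoverage(mappings: CriterionMapping[]): CoverageSummary {
  const total = mappings.length;
  // Assumption: a criterion counts as covered if any test ID is mapped to it.
  const covered = mappings.filter((m) => m.testIds.length > 0).length;
  // Assumption: whole-number percentage, and 0 when there are no criteria.
  const percentage = total === 0 ? 0 : Math.round((covered / total) * 100);
  return { covered, total, percentage };
}

const summary = summarizeCoverage([
  { criterion: "AC_1", testIds: ["T-001"] },
  { criterion: "AC_2", testIds: ["T-002", "T-003"] },
  { criterion: "AC_3", testIds: [] },
]);
// 2 of 3 criteria covered, which rounds to 67%
console.log(`Coverage: ${summary.covered}/${summary.total} criteria covered (${summary.percentage}%)`);
```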
Knowledge Base References
This review consulted the following knowledge base fragments:
- test-quality.md - Definition of Done for tests (no hard waits, ≤300 lines, ≤1.5 min, self-cleaning)
- fixture-architecture.md - Pure function → Fixture → mergeTests pattern
- network-first.md - Route intercept before navigate (race condition prevention)
- data-factories.md - Factory functions with overrides, API-first setup
- test-levels-framework.md - E2E vs API vs Component vs Unit appropriateness
- tdd-cycles.md - Red-Green-Refactor patterns
- selective-testing.md - Duplicate coverage detection
- ci-burn-in.md - Flakiness detection patterns (10-iteration loop)
- test-priorities.md - P0/P1/P2/P3 classification framework
- traceability.md - Requirements-to-tests mapping
See tea-index.csv for complete knowledge base.
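As an illustration of the "factory functions with overrides" idea listed above, here is a minimal TypeScript sketch. It is not taken from data-factories.md: the `User` shape, the `createUser` name, and the counter-based unique values are hypothetical choices made for the example.

```typescript
// Hypothetical entity used only for this example.
interface User {
  id: string;
  email: string;
  role: "admin" | "member";
}

// A simple counter keeps generated values unique across calls.
let userSequence = 0;

// Factory with overrides: sensible defaults, and each test states only what it cares about.
function createUser(overrides: Partial<User> = {}): User {
  userSequence += 1;
  return {
    id: `user-${userSequence}`,
    email: `user-${userSequence}@example.com`,
    role: "member",
    ...overrides, // explicit values win over defaults
  };
}

const member = createUser();
const admin = createUser({ role: "admin" });
console.log(member.role, admin.role); // member admin
```

Because every call produces distinct default values, tests that use such a factory are less likely to collide on hardcoded data, which is the kind of issue the Data Factories and Isolation criteria look for.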
Next Steps
Immediate Actions (Before Merge)
- {action_1} - {description}
  - Priority: {P0 | P1 | P2}
  - Owner: {team_or_person}
  - Estimated Effort: {time_estimate}
- {action_2} - {description}
  - Priority: {P0 | P1 | P2}
  - Owner: {team_or_person}
  - Estimated Effort: {time_estimate}
Follow-up Actions (Future PRs)
- {action_1} - {description}
  - Priority: {P2 | P3}
  - Target: {next_sprint | backlog}
- {action_2} - {description}
  - Priority: {P2 | P3}
  - Target: {next_sprint | backlog}
Re-Review Needed?
{✅ No re-review needed - approve as-is}
{⚠️ Re-review after critical fixes - request changes, then re-review}
{❌ Major refactor required - block merge, pair programming recommended}
Decision
Recommendation: {Approve | Approve with Comments | Request Changes | Block}
Rationale: {1-2 paragraph explanation of recommendation based on findings}
For Approve:
Test quality is excellent/good, with a score of {score}/100. {Minor issues noted can be addressed in follow-up PRs.} Tests are production-ready and follow best practices.
For Approve with Comments:
Test quality is acceptable, with a score of {score}/100. {High-priority recommendations should be addressed but don't block merge.} Critical issues are resolved, but further improvements would enhance maintainability.
For Request Changes:
Test quality needs improvement, with a score of {score}/100. {Critical issues must be fixed before merge.} {X} critical violations were detected that pose flakiness or maintainability risks.
For Block:
Test quality is insufficient, with a score of {score}/100. {Multiple critical issues make tests unsuitable for production.} Recommend a pairing session with a QA engineer to apply patterns from the knowledge base.
Appendix
Violation Summary by Location
{Table of all violations sorted by line number:}
| Line | Severity | Criterion | Issue | Fix |
|---|---|---|---|---|
| {line} | {P0/P1/P2/P3} | {criterion} | {brief_issue} | {brief_fix} |
| {line} | {P0/P1/P2/P3} | {criterion} | {brief_issue} | {brief_fix} |
Quality Trends
{If reviewing same file multiple times, show trend:}
| Review Date | Score | Grade | Critical Issues | Trend |
|---|---|---|---|---|
| {YYYY-MM-DD} | {score_1}/100 | {grade_1} | {count_1} | ⬆️ Improved |
| {YYYY-MM-DD} | {score_2}/100 | {grade_2} | {count_2} | ⬇️ Declined |
| {YYYY-MM-DD} | {score_3}/100 | {grade_3} | {count_3} | ➡️ Stable |
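The Trend column could be filled in by comparing each review's score with the one before it. The sketch below assumes exactly that rule, which this template does not state; the `scoreTrends` function is hypothetical, and the first review gets `null` because it has no predecessor to compare against.

```typescript
type Trend = "Improved" | "Declined" | "Stable";

// Scores are expected in chronological order, oldest first.
function scoreTrends(scores: number[]): (Trend | null)[] {
  return scores.map((score, index) => {
    if (index === 0) return null; // no earlier review to compare against
    const previous = scores[index - 1];
    if (score > previous) return "Improved";
    if (score < previous) return "Declined";
    return "Stable";
  });
}

// Four reviews of the same file: null, Improved, Stable, Declined
console.log(scoreTrends([72, 85, 85, 80]));
```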
Related Reviews
{If reviewing multiple files in directory/suite:}
| File | Score | Grade | Critical | Status |
|---|---|---|---|---|
| {file_1} | {score}/100 | {grade} | {count} | {Approved/Blocked} |
| {file_2} | {score}/100 | {grade} | {count} | {Approved/Blocked} |
| {file_3} | {score}/100 | {grade} | {count} | {Approved/Blocked} |
Suite Average: {avg_score}/100 ({avg_grade})
Review Metadata
Generated By: BMad TEA Agent (Test Architect)
Workflow: testarch-test-review v4.0
Review ID: test-review-{filename}-{YYYYMMDD}
Timestamp: {YYYY-MM-DD HH:MM:SS}
Version: 1.0
Feedback on This Review
If you have questions or feedback on this review:
- Review patterns in knowledge base: testarch/knowledge/
- Consult tea-index.csv for detailed guidance
- Request clarification on specific violations
- Pair with QA engineer to apply patterns
This review is guidance, not rigid rules. Context matters - if a pattern is justified, document it with a comment.