Files

2025-11-01 19:22:39 +08:00

21 KiB

Raw Blame History

Test Healing Patterns

Principle

Common test failures follow predictable patterns (stale selectors, race conditions, dynamic data assertions, network errors, hard waits). Automated healing identifies failure signatures and applies pattern-based fixes. Manual healing captures these patterns for future automation.

Rationale

The Problem: Test failures waste developer time on repetitive debugging. Teams manually fix the same selector issues, timing bugs, and data mismatches repeatedly across test suites.

The Solution: Catalog common failure patterns with diagnostic signatures and automated fixes. When a test fails, match the error message/stack trace against known patterns and apply the corresponding fix. This transforms test maintenance from reactive debugging to proactive pattern application.

Why This Matters:

Reduces test maintenance time by 60-80% (pattern-based fixes vs manual debugging)
Prevents flakiness regression (same bug fixed once, applied everywhere)
Builds institutional knowledge (failure catalog grows over time)
Enables self-healing test suites (automate workflow validates and heals)

Pattern Examples

Example 1: Common Failure Pattern - Stale Selectors (Element Not Found)

Context: Test fails with "Element not found" or "Locator resolved to 0 elements" errors

Diagnostic Signature:

// src/testing/healing/selector-healing.ts

export type SelectorFailure = {
  errorMessage: string;
  stackTrace: string;
  selector: string;
  testFile: string;
  lineNumber: number;
};

/**
 * Detect stale selector failures
 */
export function isSelectorFailure(error: Error): boolean {
  const patterns = [
    /locator.*resolved to 0 elements/i,
    /element not found/i,
    /waiting for locator.*to be visible/i,
    /selector.*did not match any elements/i,
    /unable to find element/i,
  ];

  return patterns.some((pattern) => pattern.test(error.message));
}

/**
 * Extract selector from error message
 */
export function extractSelector(errorMessage: string): string | null {
  // Playwright: "locator('button[type=\"submit\"]') resolved to 0 elements"
  const playwrightMatch = errorMessage.match(/locator\('([^']+)'\)/);
  if (playwrightMatch) return playwrightMatch[1];

  // Cypress: "Timed out retrying: Expected to find element: '.submit-button'"
  const cypressMatch = errorMessage.match(/Expected to find element: ['"]([^'"]+)['"]/i);
  if (cypressMatch) return cypressMatch[1];

  return null;
}

/**
 * Suggest better selector based on hierarchy
 */
export function suggestBetterSelector(badSelector: string): string {
  // If using CSS class → suggest data-testid
  if (badSelector.startsWith('.') || badSelector.includes('class=')) {
    const elementName = badSelector.match(/class=["']([^"']+)["']/)?.[1] || badSelector.slice(1);
    return `page.getByTestId('${elementName}') // Prefer data-testid over CSS class`;
  }

  // If using ID → suggest data-testid
  if (badSelector.startsWith('#')) {
    return `page.getByTestId('${badSelector.slice(1)}') // Prefer data-testid over ID`;
  }

  // If using nth() → suggest filter() or more specific selector
  if (badSelector.includes('.nth(')) {
    return `page.locator('${badSelector.split('.nth(')[0]}').filter({ hasText: 'specific text' }) // Avoid brittle nth(), use filter()`;
  }

  // If using complex CSS → suggest ARIA role
  if (badSelector.includes('>') || badSelector.includes('+')) {
    return `page.getByRole('button', { name: 'Submit' }) // Prefer ARIA roles over complex CSS`;
  }

  return `page.getByTestId('...') // Add data-testid attribute to element`;
}

Healing Implementation:

// tests/healing/selector-healing.spec.ts
import { test, expect } from '@playwright/test';
import { isSelectorFailure, extractSelector, suggestBetterSelector } from '../../src/testing/healing/selector-healing';

test('heal stale selector failures automatically', async ({ page }) => {
  await page.goto('/dashboard');

  try {
    // Original test with brittle CSS selector
    await page.locator('.btn-primary').click();
  } catch (error: any) {
    if (isSelectorFailure(error)) {
      const badSelector = extractSelector(error.message);
      const suggestion = badSelector ? suggestBetterSelector(badSelector) : null;

      console.log('HEALING SUGGESTION:', suggestion);

      // Apply healed selector
      await page.getByTestId('submit-button').click(); // Fixed!
    } else {
      throw error; // Not a selector issue, rethrow
    }
  }

  await expect(page.getByText('Success')).toBeVisible();
});

Key Points:

Diagnosis: Error message contains "locator resolved to 0 elements" or "element not found"
Fix: Replace brittle selector (CSS class, ID, nth) with robust alternative (data-testid, ARIA role)
Prevention: Follow selector hierarchy (data-testid > ARIA > text > CSS)
Automation: Pattern matching on error message + stack trace

Example 2: Common Failure Pattern - Race Conditions (Timing Errors)

Context: Test fails with "timeout waiting for element" or "element not visible" errors

Diagnostic Signature:

// src/testing/healing/timing-healing.ts

export type TimingFailure = {
  errorMessage: string;
  testFile: string;
  lineNumber: number;
  actionType: 'click' | 'fill' | 'waitFor' | 'expect';
};

/**
 * Detect race condition failures
 */
export function isTimingFailure(error: Error): boolean {
  const patterns = [
    /timeout.*waiting for/i,
    /element is not visible/i,
    /element is not attached to the dom/i,
    /waiting for element to be visible.*exceeded/i,
    /timed out retrying/i,
    /waitForLoadState.*timeout/i,
  ];

  return patterns.some((pattern) => pattern.test(error.message));
}

/**
 * Detect hard wait anti-pattern
 */
export function hasHardWait(testCode: string): boolean {
  const hardWaitPatterns = [/page\.waitForTimeout\(/, /cy\.wait\(\d+\)/, /await.*sleep\(/, /setTimeout\(/];

  return hardWaitPatterns.some((pattern) => pattern.test(testCode));
}

/**
 * Suggest deterministic wait replacement
 */
export function suggestDeterministicWait(testCode: string): string {
  if (testCode.includes('page.waitForTimeout')) {
    return `
// ❌ Bad: Hard wait (flaky)
// await page.waitForTimeout(3000)

// ✅ Good: Wait for network response
await page.waitForResponse(resp => resp.url().includes('/api/data') && resp.status() === 200)

// OR wait for element state
await page.getByTestId('loading-spinner').waitFor({ state: 'detached' })
    `.trim();
  }

  if (testCode.includes('cy.wait(') && /cy\.wait\(\d+\)/.test(testCode)) {
    return `
// ❌ Bad: Hard wait (flaky)
// cy.wait(3000)

// ✅ Good: Wait for aliased network request
cy.intercept('GET', '/api/data').as('getData')
cy.visit('/page')
cy.wait('@getData')
    `.trim();
  }

  return `
// Add network-first interception BEFORE navigation:
await page.route('**/api/**', route => route.continue())
const responsePromise = page.waitForResponse('**/api/data')
await page.goto('/page')
await responsePromise
  `.trim();
}

Healing Implementation:

// tests/healing/timing-healing.spec.ts
import { test, expect } from '@playwright/test';
import { isTimingFailure, hasHardWait, suggestDeterministicWait } from '../../src/testing/healing/timing-healing';

test('heal race condition with network-first pattern', async ({ page, context }) => {
  // Setup interception BEFORE navigation (prevent race)
  await context.route('**/api/products', (route) => {
    route.fulfill({
      status: 200,
      body: JSON.stringify({ products: [{ id: 1, name: 'Product A' }] }),
    });
  });

  const responsePromise = page.waitForResponse('**/api/products');

  await page.goto('/products');
  await responsePromise; // Deterministic wait

  // Element now reliably visible (no race condition)
  await expect(page.getByText('Product A')).toBeVisible();
});

test('heal hard wait with event-based wait', async ({ page }) => {
  await page.goto('/dashboard');

  // ❌ Original (flaky): await page.waitForTimeout(3000)

  // ✅ Healed: Wait for spinner to disappear
  await page.getByTestId('loading-spinner').waitFor({ state: 'detached' });

  // Element now reliably visible
  await expect(page.getByText('Dashboard loaded')).toBeVisible();
});

Key Points:

Diagnosis: Error contains "timeout" or "not visible", often after navigation
Fix: Replace hard waits with network-first pattern or element state waits
Prevention: ALWAYS intercept before navigate, use waitForResponse()
Automation: Detect page.waitForTimeout() or cy.wait(number) in test code

Example 3: Common Failure Pattern - Dynamic Data Assertions (Non-Deterministic IDs)

Context: Test fails with "Expected 'User 123' but received 'User 456'" or timestamp mismatches

Diagnostic Signature:

// src/testing/healing/data-healing.ts

export type DataFailure = {
  errorMessage: string;
  expectedValue: string;
  actualValue: string;
  testFile: string;
  lineNumber: number;
};

/**
 * Detect dynamic data assertion failures
 */
export function isDynamicDataFailure(error: Error): boolean {
  const patterns = [
    /expected.*\d+.*received.*\d+/i, // ID mismatches
    /expected.*\d{4}-\d{2}-\d{2}.*received/i, // Date mismatches
    /expected.*user.*\d+/i, // Dynamic user IDs
    /expected.*order.*\d+/i, // Dynamic order IDs
    /expected.*to.*contain.*\d+/i, // Numeric assertions
  ];

  return patterns.some((pattern) => pattern.test(error.message));
}

/**
 * Suggest flexible assertion pattern
 */
export function suggestFlexibleAssertion(errorMessage: string): string {
  if (/expected.*user.*\d+/i.test(errorMessage)) {
    return `
// ❌ Bad: Hardcoded ID
// await expect(page.getByText('User 123')).toBeVisible()

// ✅ Good: Regex pattern for any user ID
await expect(page.getByText(/User \\d+/)).toBeVisible()

// OR use partial match
await expect(page.locator('[data-testid="user-name"]')).toContainText('User')
    `.trim();
  }

  if (/expected.*\d{4}-\d{2}-\d{2}/i.test(errorMessage)) {
    return `
// ❌ Bad: Hardcoded date
// await expect(page.getByText('2024-01-15')).toBeVisible()

// ✅ Good: Dynamic date validation
const today = new Date().toISOString().split('T')[0]
await expect(page.getByTestId('created-date')).toHaveText(today)

// OR use date format regex
await expect(page.getByTestId('created-date')).toHaveText(/\\d{4}-\\d{2}-\\d{2}/)
    `.trim();
  }

  if (/expected.*order.*\d+/i.test(errorMessage)) {
    return `
// ❌ Bad: Hardcoded order ID
// const orderId = '12345'

// ✅ Good: Capture dynamic order ID
const orderText = await page.getByTestId('order-id').textContent()
const orderId = orderText?.match(/Order #(\\d+)/)?.[1]
expect(orderId).toBeTruthy()

// Use captured ID in later assertions
await expect(page.getByText(\`Order #\${orderId} confirmed\`)).toBeVisible()
    `.trim();
  }

  return `Use regex patterns, partial matching, or capture dynamic values instead of hardcoding`;
}

Healing Implementation:

// tests/healing/data-healing.spec.ts
import { test, expect } from '@playwright/test';

test('heal dynamic ID assertion with regex', async ({ page }) => {
  await page.goto('/users');

  // ❌ Original (fails with random IDs): await expect(page.getByText('User 123')).toBeVisible()

  // ✅ Healed: Regex pattern matches any user ID
  await expect(page.getByText(/User \d+/)).toBeVisible();
});

test('heal timestamp assertion with dynamic generation', async ({ page }) => {
  await page.goto('/dashboard');

  // ❌ Original (fails daily): await expect(page.getByText('2024-01-15')).toBeVisible()

  // ✅ Healed: Generate expected date dynamically
  const today = new Date().toISOString().split('T')[0];
  await expect(page.getByTestId('last-updated')).toContainText(today);
});

test('heal order ID assertion with capture', async ({ page, request }) => {
  // Create order via API (dynamic ID)
  const response = await request.post('/api/orders', {
    data: { productId: '123', quantity: 1 },
  });
  const { orderId } = await response.json();

  // ✅ Healed: Use captured dynamic ID
  await page.goto(`/orders/${orderId}`);
  await expect(page.getByText(`Order #${orderId}`)).toBeVisible();
});

Key Points:

Diagnosis: Error message shows expected vs actual value mismatch with IDs/timestamps
Fix: Use regex patterns (/User \d+/), partial matching, or capture dynamic values
Prevention: Never hardcode IDs, timestamps, or random data in assertions
Automation: Parse error message for expected/actual values, suggest regex patterns

Example 4: Common Failure Pattern - Network Errors (Missing Route Interception)

Context: Test fails with "API call failed" or "500 error" during test execution

Diagnostic Signature:

// src/testing/healing/network-healing.ts

export type NetworkFailure = {
  errorMessage: string;
  url: string;
  statusCode: number;
  method: string;
};

/**
 * Detect network failure
 */
export function isNetworkFailure(error: Error): boolean {
  const patterns = [
    /api.*call.*failed/i,
    /request.*failed/i,
    /network.*error/i,
    /500.*internal server error/i,
    /503.*service unavailable/i,
    /fetch.*failed/i,
  ];

  return patterns.some((pattern) => pattern.test(error.message));
}

/**
 * Suggest route interception
 */
export function suggestRouteInterception(url: string, method: string): string {
  return `
// ❌ Bad: Real API call (unreliable, slow, external dependency)

// ✅ Good: Mock API response with route interception
await page.route('${url}', route => {
  route.fulfill({
    status: 200,
    contentType: 'application/json',
    body: JSON.stringify({
      // Mock response data
      id: 1,
      name: 'Test User',
      email: 'test@example.com'
    })
  })
})

// Then perform action
await page.goto('/page')
  `.trim();
}

Healing Implementation:

// tests/healing/network-healing.spec.ts
import { test, expect } from '@playwright/test';

test('heal network failure with route mocking', async ({ page, context }) => {
  // ✅ Healed: Mock API to prevent real network calls
  await context.route('**/api/products', (route) => {
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({
        products: [
          { id: 1, name: 'Product A', price: 29.99 },
          { id: 2, name: 'Product B', price: 49.99 },
        ],
      }),
    });
  });

  await page.goto('/products');

  // Test now reliable (no external API dependency)
  await expect(page.getByText('Product A')).toBeVisible();
  await expect(page.getByText('$29.99')).toBeVisible();
});

test('heal 500 error with error state mocking', async ({ page, context }) => {
  // Mock API failure scenario
  await context.route('**/api/products', (route) => {
    route.fulfill({ status: 500, body: JSON.stringify({ error: 'Internal Server Error' }) });
  });

  await page.goto('/products');

  // Verify error handling (not crash)
  await expect(page.getByText('Unable to load products')).toBeVisible();
  await expect(page.getByRole('button', { name: 'Retry' })).toBeVisible();
});

Key Points:

Diagnosis: Error message contains "API call failed", "500 error", or network-related failures
Fix: Add page.route() or cy.intercept() to mock API responses
Prevention: Mock ALL external dependencies (APIs, third-party services)
Automation: Extract URL from error message, generate route interception code

Example 5: Common Failure Pattern - Hard Waits (Unreliable Timing)

Context: Test fails intermittently with "timeout exceeded" or passes/fails randomly

Diagnostic Signature:

// src/testing/healing/hard-wait-healing.ts

/**
 * Detect hard wait anti-pattern in test code
 */
export function detectHardWaits(testCode: string): Array<{ line: number; code: string }> {
  const lines = testCode.split('\n');
  const violations: Array<{ line: number; code: string }> = [];

  lines.forEach((line, index) => {
    if (line.includes('page.waitForTimeout(') || /cy\.wait\(\d+\)/.test(line) || line.includes('sleep(') || line.includes('setTimeout(')) {
      violations.push({ line: index + 1, code: line.trim() });
    }
  });

  return violations;
}

/**
 * Suggest event-based wait replacement
 */
export function suggestEventBasedWait(hardWaitLine: string): string {
  if (hardWaitLine.includes('page.waitForTimeout')) {
    return `
// ❌ Bad: Hard wait (flaky)
${hardWaitLine}

// ✅ Good: Wait for network response
await page.waitForResponse(resp => resp.url().includes('/api/') && resp.ok())

// OR wait for element state change
await page.getByTestId('loading-spinner').waitFor({ state: 'detached' })
await page.getByTestId('content').waitFor({ state: 'visible' })
    `.trim();
  }

  if (/cy\.wait\(\d+\)/.test(hardWaitLine)) {
    return `
// ❌ Bad: Hard wait (flaky)
${hardWaitLine}

// ✅ Good: Wait for aliased request
cy.intercept('GET', '/api/data').as('getData')
cy.visit('/page')
cy.wait('@getData') // Deterministic
    `.trim();
  }

  return 'Replace hard waits with event-based waits (waitForResponse, waitFor state changes)';
}

Healing Implementation:

// tests/healing/hard-wait-healing.spec.ts
import { test, expect } from '@playwright/test';

test('heal hard wait with deterministic wait', async ({ page }) => {
  await page.goto('/dashboard');

  // ❌ Original (flaky): await page.waitForTimeout(3000)

  // ✅ Healed: Wait for loading spinner to disappear
  await page.getByTestId('loading-spinner').waitFor({ state: 'detached' });

  // OR wait for specific network response
  await page.waitForResponse((resp) => resp.url().includes('/api/dashboard') && resp.ok());

  await expect(page.getByText('Dashboard ready')).toBeVisible();
});

test('heal implicit wait with explicit network wait', async ({ page }) => {
  const responsePromise = page.waitForResponse('**/api/products');

  await page.goto('/products');

  // ❌ Original (race condition): await page.getByText('Product A').click()

  // ✅ Healed: Wait for network first
  await responsePromise;
  await page.getByText('Product A').click();

  await expect(page).toHaveURL(/\/products\/\d+/);
});

Key Points:

Diagnosis: Test code contains page.waitForTimeout() or cy.wait(number)
Fix: Replace with waitForResponse(), waitFor({ state }), or aliased intercepts
Prevention: NEVER use hard waits, always use event-based/response-based waits
Automation: Scan test code for hard wait patterns, suggest deterministic replacements

Healing Pattern Catalog

Failure Type	Diagnostic Signature	Healing Strategy	Prevention Pattern
Stale Selector	"locator resolved to 0 elements"	Replace with data-testid or ARIA role	Selector hierarchy (testid > ARIA > text)
Race Condition	"timeout waiting for element"	Add network-first interception	Intercept before navigate
Dynamic Data	"Expected 'User 123' but got 'User 456'"	Use regex or capture dynamic values	Never hardcode IDs/timestamps
Network Error	"API call failed", "500 error"	Add route mocking	Mock all external dependencies
Hard Wait	Test contains `waitForTimeout()` or `wait(n)`	Replace with event-based waits	Always use deterministic waits

Healing Workflow

Run test → Capture failure
Identify pattern → Match error against diagnostic signatures
Apply fix → Use pattern-based healing strategy
Re-run test → Validate fix (max 3 iterations)
Mark unfixable → Use test.fixme() if healing fails after 3 attempts

Healing Checklist

Before enabling auto-healing in workflows:

Failure catalog documented: Common patterns identified (selectors, timing, data, network, hard waits)
Diagnostic signatures defined: Error message patterns for each failure type
Healing strategies documented: Fix patterns for each failure type
Prevention patterns documented: Best practices to avoid recurrence
Healing iteration limit set: Max 3 attempts before marking test.fixme()
MCP integration optional: Graceful degradation without Playwright MCP
Pattern-based fallback: Use knowledge base patterns when MCP unavailable
Healing report generated: Document what was healed and how

Integration Points

Used in workflows: *automate (auto-healing after test generation), *atdd (optional healing for acceptance tests)
Related fragments: selector-resilience.md (selector debugging), timing-debugging.md (race condition fixes), network-first.md (interception patterns), data-factories.md (dynamic data handling)
Tools: Error message parsing, AST analysis for code patterns, Playwright MCP (optional), pattern matching

Source: Playwright test-healer patterns, production test failure analysis, common anti-patterns from test-resources-for-ai

21 KiB Raw Blame History

Test Healing Patterns

Principle

Rationale

Pattern Examples

Example 1: Common Failure Pattern - Stale Selectors (Element Not Found)

Example 2: Common Failure Pattern - Race Conditions (Timing Errors)

Example 3: Common Failure Pattern - Dynamic Data Assertions (Non-Deterministic IDs)

Example 4: Common Failure Pattern - Network Errors (Missing Route Interception)

Example 5: Common Failure Pattern - Hard Waits (Unreliable Timing)

Healing Pattern Catalog

Healing Workflow

Healing Checklist

Integration Points

21 KiB

Raw Blame History