Files

2025-11-01 19:22:39 +08:00

14 KiB

Raw Blame History

Timing Debugging and Race Condition Fixes

Principle

Race conditions arise when tests make assumptions about asynchronous timing (network, animations, state updates). Deterministic waiting eliminates flakiness by explicitly waiting for observable events (network responses, element state changes) instead of arbitrary timeouts.

Rationale

The Problem: Tests pass locally but fail in CI (different timing), or pass/fail randomly (race conditions). Hard waits (waitForTimeout, sleep) mask timing issues without solving them.

The Solution: Replace all hard waits with event-based waits (waitForResponse, waitFor({ state })). Implement network-first pattern (intercept before navigate). Use explicit state checks (loading spinner detached, data loaded). This makes tests deterministic regardless of network speed or system load.

Why This Matters:

Eliminates flaky tests (0 tolerance for timing-based failures)
Works consistently across environments (local, CI, production-like)
Faster test execution (no unnecessary waits)
Clearer test intent (explicit about what we're waiting for)

Pattern Examples

Example 1: Race Condition Identification (Network-First Pattern)

Context: Prevent race conditions by intercepting network requests before navigation

Implementation:

// tests/timing/race-condition-prevention.spec.ts
import { test, expect } from '@playwright/test';

test.describe('Race Condition Prevention Patterns', () => {
  test('❌ Anti-Pattern: Navigate then intercept (race condition)', async ({ page, context }) => {
    // BAD: Navigation starts before interception ready
    await page.goto('/products'); // ⚠️ Race! API might load before route is set

    await context.route('**/api/products', (route) => {
      route.fulfill({ status: 200, body: JSON.stringify({ products: [] }) });
    });

    // Test may see real API response or mock (non-deterministic)
  });

  test('✅ Pattern: Intercept BEFORE navigate (deterministic)', async ({ page, context }) => {
    // GOOD: Interception ready before navigation
    await context.route('**/api/products', (route) => {
      route.fulfill({
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify({
          products: [
            { id: 1, name: 'Product A', price: 29.99 },
            { id: 2, name: 'Product B', price: 49.99 },
          ],
        }),
      });
    });

    const responsePromise = page.waitForResponse('**/api/products');

    await page.goto('/products'); // Navigation happens AFTER route is ready
    await responsePromise; // Explicit wait for network

    // Test sees mock response reliably (deterministic)
    await expect(page.getByText('Product A')).toBeVisible();
  });

  test('✅ Pattern: Wait for element state change (loading → loaded)', async ({ page }) => {
    await page.goto('/dashboard');

    // Wait for loading indicator to appear (confirms load started)
    await page.getByTestId('loading-spinner').waitFor({ state: 'visible' });

    // Wait for loading indicator to disappear (confirms load complete)
    await page.getByTestId('loading-spinner').waitFor({ state: 'detached' });

    // Content now reliably visible
    await expect(page.getByTestId('dashboard-data')).toBeVisible();
  });

  test('✅ Pattern: Explicit visibility check (not just presence)', async ({ page }) => {
    await page.goto('/modal-demo');

    await page.getByRole('button', { name: 'Open Modal' }).click();

    // ❌ Bad: Element exists but may not be visible yet
    // await expect(page.getByTestId('modal')).toBeAttached()

    // ✅ Good: Wait for visibility (accounts for animations)
    await expect(page.getByTestId('modal')).toBeVisible();
    await expect(page.getByRole('heading', { name: 'Modal Title' })).toBeVisible();
  });

  test('❌ Anti-Pattern: waitForLoadState("networkidle") in SPAs', async ({ page }) => {
    // ⚠️ Deprecated for SPAs (WebSocket connections never idle)
    // await page.goto('/dashboard')
    // await page.waitForLoadState('networkidle') // May timeout in SPAs

    // ✅ Better: Wait for specific API response
    const responsePromise = page.waitForResponse('**/api/dashboard');
    await page.goto('/dashboard');
    await responsePromise;

    await expect(page.getByText('Dashboard loaded')).toBeVisible();
  });
});

Key Points:

Network-first: ALWAYS intercept before navigate (prevents race conditions)
State changes: Wait for loading spinner detached (explicit load completion)
Visibility vs presence: toBeVisible() accounts for animations, toBeAttached() doesn't
Avoid networkidle: Unreliable in SPAs (WebSocket, polling connections)
Explicit waits: Document exactly what we're waiting for

Example 2: Deterministic Waiting Patterns (Event-Based, Not Time-Based)

Context: Replace all hard waits with observable event waits

Implementation:

// tests/timing/deterministic-waits.spec.ts
import { test, expect } from '@playwright/test';

test.describe('Deterministic Waiting Patterns', () => {
  test('waitForResponse() with URL pattern', async ({ page }) => {
    const responsePromise = page.waitForResponse('**/api/products');

    await page.goto('/products');
    await responsePromise; // Deterministic (waits for exact API call)

    await expect(page.getByText('Products loaded')).toBeVisible();
  });

  test('waitForResponse() with predicate function', async ({ page }) => {
    const responsePromise = page.waitForResponse((resp) => resp.url().includes('/api/search') && resp.status() === 200);

    await page.goto('/search');
    await page.getByPlaceholder('Search').fill('laptop');
    await page.getByRole('button', { name: 'Search' }).click();

    await responsePromise; // Wait for successful search response

    await expect(page.getByTestId('search-results')).toBeVisible();
  });

  test('waitForFunction() for custom conditions', async ({ page }) => {
    await page.goto('/dashboard');

    // Wait for custom JavaScript condition
    await page.waitForFunction(() => {
      const element = document.querySelector('[data-testid="user-count"]');
      return element && parseInt(element.textContent || '0') > 0;
    });

    // User count now loaded
    await expect(page.getByTestId('user-count')).not.toHaveText('0');
  });

  test('waitFor() element state (attached, visible, hidden, detached)', async ({ page }) => {
    await page.goto('/products');

    // Wait for element to be attached to DOM
    await page.getByTestId('product-list').waitFor({ state: 'attached' });

    // Wait for element to be visible (animations complete)
    await page.getByTestId('product-list').waitFor({ state: 'visible' });

    // Perform action
    await page.getByText('Product A').click();

    // Wait for modal to be hidden (close animation complete)
    await page.getByTestId('modal').waitFor({ state: 'hidden' });
  });

  test('Cypress: cy.wait() with aliased intercepts', async () => {
    // Cypress example (not Playwright)
    /*
    cy.intercept('GET', '/api/products').as('getProducts')
    cy.visit('/products')
    cy.wait('@getProducts') // Deterministic wait for specific request

    cy.get('[data-testid="product-list"]').should('be.visible')
    */
  });
});

Key Points:

waitForResponse(): Wait for specific API calls (URL pattern or predicate)
waitForFunction(): Wait for custom JavaScript conditions
waitFor({ state }): Wait for element state changes (attached, visible, hidden, detached)
Cypress cy.wait('@alias'): Deterministic wait for aliased intercepts
All waits are event-based (not time-based)

Example 3: Timing Anti-Patterns (What NEVER to Do)

Context: Common timing mistakes that cause flakiness

Problem Examples:

// tests/timing/anti-patterns.spec.ts
import { test, expect } from '@playwright/test';

test.describe('Timing Anti-Patterns to Avoid', () => {
  test('❌ NEVER: page.waitForTimeout() (arbitrary delay)', async ({ page }) => {
    await page.goto('/dashboard');

    // ❌ Bad: Arbitrary 3-second wait (flaky)
    // await page.waitForTimeout(3000)
    // Problem: Might be too short (CI slower) or too long (wastes time)

    // ✅ Good: Wait for observable event
    await page.waitForResponse('**/api/dashboard');
    await expect(page.getByText('Dashboard loaded')).toBeVisible();
  });

  test('❌ NEVER: cy.wait(number) without alias (arbitrary delay)', async () => {
    // Cypress example
    /*
    // ❌ Bad: Arbitrary delay
    cy.visit('/products')
    cy.wait(2000) // Flaky!

    // ✅ Good: Wait for specific request
    cy.intercept('GET', '/api/products').as('getProducts')
    cy.visit('/products')
    cy.wait('@getProducts') // Deterministic
    */
  });

  test('❌ NEVER: Multiple hard waits in sequence (compounding delays)', async ({ page }) => {
    await page.goto('/checkout');

    // ❌ Bad: Stacked hard waits (6+ seconds wasted)
    // await page.waitForTimeout(2000) // Wait for form
    // await page.getByTestId('email').fill('test@example.com')
    // await page.waitForTimeout(1000) // Wait for validation
    // await page.getByTestId('submit').click()
    // await page.waitForTimeout(3000) // Wait for redirect

    // ✅ Good: Event-based waits (no wasted time)
    await page.getByTestId('checkout-form').waitFor({ state: 'visible' });
    await page.getByTestId('email').fill('test@example.com');
    await page.waitForResponse('**/api/validate-email');
    await page.getByTestId('submit').click();
    await page.waitForURL('**/confirmation');
  });

  test('❌ NEVER: waitForLoadState("networkidle") in SPAs', async ({ page }) => {
    // ❌ Bad: Unreliable in SPAs (WebSocket connections never idle)
    // await page.goto('/dashboard')
    // await page.waitForLoadState('networkidle') // Timeout in SPAs!

    // ✅ Good: Wait for specific API responses
    await page.goto('/dashboard');
    await page.waitForResponse('**/api/dashboard');
    await page.waitForResponse('**/api/user');
    await expect(page.getByTestId('dashboard-content')).toBeVisible();
  });

  test('❌ NEVER: Sleep/setTimeout in tests', async ({ page }) => {
    await page.goto('/products');

    // ❌ Bad: Node.js sleep (blocks test thread)
    // await new Promise(resolve => setTimeout(resolve, 2000))

    // ✅ Good: Playwright auto-waits for element
    await expect(page.getByText('Products loaded')).toBeVisible();
  });
});

Why These Fail:

Hard waits: Arbitrary timeouts (too short → flaky, too long → slow)
Stacked waits: Compound delays (wasteful, unreliable)
networkidle: Broken in SPAs (WebSocket/polling never idle)
Sleep: Blocks execution (wastes time, doesn't solve race conditions)

Better Approach: Use event-based waits from examples above

Async Debugging Techniques

Technique 1: Promise Chain Analysis

test('debug async waterfall with console logs', async ({ page }) => {
  console.log('1. Starting navigation...');
  await page.goto('/products');

  console.log('2. Waiting for API response...');
  const response = await page.waitForResponse('**/api/products');
  console.log('3. API responded:', response.status());

  console.log('4. Waiting for UI update...');
  await expect(page.getByText('Products loaded')).toBeVisible();
  console.log('5. Test complete');

  // Console output shows exactly where timing issue occurs
});

Technique 2: Network Waterfall Inspection (DevTools)

test('inspect network timing with trace viewer', async ({ page }) => {
  await page.goto('/dashboard');

  // Generate trace for analysis
  // npx playwright test --trace on
  // npx playwright show-trace trace.zip

  // In trace viewer:
  // 1. Check Network tab for API call timing
  // 2. Identify slow requests (>1s response time)
  // 3. Find race conditions (overlapping requests)
  // 4. Verify request order (dependencies)
});

Technique 3: Trace Viewer for Timing Visualization

test('use trace viewer to debug timing', async ({ page }) => {
  // Run with trace: npx playwright test --trace on

  await page.goto('/checkout');
  await page.getByTestId('submit').click();

  // In trace viewer, examine:
  // - Timeline: See exact timing of each action
  // - Snapshots: Hover to see DOM state at each moment
  // - Network: Identify slow/failed requests
  // - Console: Check for async errors

  await expect(page.getByText('Success')).toBeVisible();
});

Race Condition Checklist

Before deploying tests:

Network-first pattern: All routes intercepted BEFORE navigation (no race conditions)
Explicit waits: Every navigation followed by waitForResponse() or state check
No hard waits: Zero instances of waitForTimeout(), cy.wait(number), sleep()
Element state waits: Loading spinners use waitFor({ state: 'detached' })
Visibility checks: Use toBeVisible() (accounts for animations), not just toBeAttached()
Response validation: Wait for successful responses (resp.ok() or status === 200)
Trace viewer analysis: Generate traces to identify timing issues (network waterfall, console errors)
CI/local parity: Tests pass reliably in both environments (no timing assumptions)

Integration Points

Used in workflows: *automate (healing timing failures), *test-review (detect hard wait anti-patterns), *framework (configure timeout standards)
Related fragments: test-healing-patterns.md (race condition diagnosis), network-first.md (interception patterns), playwright-config.md (timeout configuration), visual-debugging.md (trace viewer analysis)
Tools: Playwright Inspector (--debug), Trace Viewer (--trace on), DevTools Network tab

Source: Playwright timing best practices, network-first pattern from test-resources-for-ai, production race condition debugging

14 KiB Raw Blame History