claudekit/skills/playwright/SKILL.md

name: playwright
description: Use when writing, debugging, or configuring E2E tests with Playwright. Trigger for any mention of end-to-end testing, browser automation, page objects, visual regression, storageState auth, playwright.config, or cross-browser testing. Also use when setting up E2E in CI, testing critical user flows, or debugging flaky browser tests.

Playwright E2E Testing

Overview

The definitive E2E testing reference for web apps built with Next.js, FastAPI, Django, NestJS, Express, and React. Covers test structure, locator strategy, authentication reuse, API mocking, visual regression, accessibility, CI sharding, and framework-specific setup.

When to Use

  • Testing critical user flows end-to-end (login, checkout, onboarding)
  • Cross-browser testing (Chromium, Firefox, WebKit)
  • Visual regression testing with toHaveScreenshot()
  • Accessibility auditing with @axe-core/playwright
  • Testing Server Components, SSR pages, or full-stack flows
  • Mobile/responsive testing via device emulation

When NOT to Use

  • Unit testing isolated functions — use pytest or vitest
  • Component testing React components in isolation — use vitest + Testing Library (faster feedback loop)
  • API-only testing with no browser interaction — use httpx / supertest directly
  • Load/performance testing — use k6, Artillery, or Locust

Quick Reference

| I need... | Go to |
| --- | --- |
| Production-grade config to copy | templates/playwright.config.ts |
| Page Object, auth, mocking patterns | references/e2e-patterns.md |
| Locator strategy | § Locators below |
| Auth reuse with storageState | § Authentication below |
| CI setup (GitHub Actions + sharding) | § CI Integration below |
| Framework-specific webServer | § Framework Integration below |

Core Patterns

Test Structure

import { test, expect } from '@playwright/test';

test.describe('Checkout flow', () => {
  test('guest can complete purchase', async ({ page }) => {
    await page.goto('/products/widget-pro');
    await page.getByRole('button', { name: 'Add to cart' }).click();
    await page.getByRole('link', { name: 'Cart' }).click();
    await page.getByRole('button', { name: 'Checkout' }).click();

    await page.getByLabel('Email').fill('guest@example.com');
    await page.getByRole('button', { name: 'Place order' }).click();

    await expect(page.getByText('Order confirmed')).toBeVisible();
  });
});

Locators — the priority order

Always prefer role-based and user-visible locators. They survive refactors and match how users interact with the page.

| Priority | Locator | When |
| --- | --- | --- |
| 1 | getByRole('button', { name: '...' }) | Interactive elements with accessible names |
| 2 | getByLabel('...') | Form fields with <label> |
| 3 | getByText('...') | Static visible text |
| 4 | getByPlaceholder('...') | Inputs without labels (fix the label instead) |
| 5 | getByTestId('...') | Last resort, when no semantic locator works |

Avoid page.locator('.css-class'), page.locator('#id'), and XPath. They couple tests to markup and styling, so they break when either is refactored.
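When a single role locator is ambiguous (for example, one "Delete" button per table row), chaining and filtering semantic locators usually removes the need for a CSS selector. The sketch below is illustrative only: rowButton is a hypothetical helper, and the two small interfaces are local stand-ins for the parts of Playwright's Page and Locator it touches, declared here so the snippet type-checks on its own. Playwright's real Page should satisfy them structurally.

```typescript
// Local stand-ins (assumptions for illustration) for the subset of
// Playwright's Page/Locator API used below: getByRole() and filter().
interface Queryable {
  getByRole(role: string, options?: { name?: string }): ScopedLocator;
}

interface ScopedLocator extends Queryable {
  filter(options: { hasText: string }): ScopedLocator;
}

// Hypothetical helper: locate a button inside the table row that
// contains `rowText`, using only role-based locators.
function rowButton(root: Queryable, rowText: string, buttonName: string): ScopedLocator {
  return root
    .getByRole('row')
    .filter({ hasText: rowText })
    .getByRole('button', { name: buttonName });
}

// In a spec this would read:
//   await rowButton(page, 'usr_abc123', 'Delete').click();
```

The chain narrows from all rows, to the row containing the text, to the button inside that row, so the test keeps working when class names or nesting change.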

Assertions

// Visibility
await expect(page.getByText('Welcome')).toBeVisible();
await expect(page.getByRole('alert')).not.toBeVisible();

// Content
await expect(page.getByRole('heading')).toHaveText('Dashboard');
await expect(page.getByRole('table')).toContainText('usr_abc123');

// Navigation
await expect(page).toHaveURL('/dashboard');
await expect(page).toHaveTitle('Dashboard | Acme');

// Count
await expect(page.getByRole('listitem')).toHaveCount(5);

// Attribute / state
await expect(page.getByRole('button', { name: 'Submit' })).toBeEnabled();
await expect(page.getByRole('checkbox')).toBeChecked();

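Assertions on a locator or page (await expect(locator)...) auto-retry until the timeout (default 5s), so no waitForSelector is needed. Assertions on plain values, such as expect(count).toBe(5), run once and do not retry.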

Fixtures

Extend test to share setup logic without inheritance chains.

// fixtures.ts
import { test as base, expect, type Page } from '@playwright/test';

type Fixtures = {
  adminPage: Page;
};

export const test = base.extend<Fixtures>({
  adminPage: async ({ browser }, use) => {
    const context = await browser.newContext({
      storageState: 'e2e/.auth/admin.json',
    });
    const page = await context.newPage();
    await use(page);
    await context.close();
  },
});

export { expect };

// admin.spec.ts
import { test, expect } from './fixtures';

test('admin can view users', async ({ adminPage }) => {
  await adminPage.goto('/admin/users');
  await expect(adminPage.getByRole('table')).toBeVisible();
});

Authentication

Use storageState to log in once in globalSetup and reuse across all tests. Eliminates login page interaction from every test.

// e2e/global-setup.ts
import { chromium, FullConfig } from '@playwright/test';

async function globalSetup(config: FullConfig) {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  await page.goto('http://localhost:3000/login');
  await page.getByLabel('Email').fill('admin@example.com');
  await page.getByLabel('Password').fill('test-password');
  await page.getByRole('button', { name: 'Sign in' }).click();
  // No baseURL applies to this manually created page, so match the full URL with a glob.
  await page.waitForURL('**/dashboard');

  await page.context().storageState({ path: 'e2e/.auth/admin.json' });
  await browser.close();
}

export default globalSetup;

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  globalSetup: './e2e/global-setup.ts',
  projects: [
    { name: 'authenticated', use: { storageState: 'e2e/.auth/admin.json' } },
    { name: 'guest', use: { storageState: undefined } },
  ],
});

Multiple roles: create a separate storage state file per authenticated role (admin.json, member.json; guests need none) and use Playwright projects or fixtures to select which role each test suite uses.
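One way to keep the role-to-file mapping in a single place is a small helper that the config imports. This is a sketch under assumptions: storageStateFor and roleProjects are hypothetical names, the e2e/.auth directory follows the examples above, and the *.admin.spec.ts style file naming is an invented convention for routing specs to a role.

```typescript
type Role = 'admin' | 'member' | 'guest';

// Guests start with an empty state: no cookies and no stored origins.
const EMPTY_STATE = { cookies: [], origins: [] };

// Hypothetical helper: resolve the storageState value for a role.
function storageStateFor(role: Role, authDir = 'e2e/.auth') {
  return role === 'guest' ? EMPTY_STATE : `${authDir}/${role}.json`;
}

// Hypothetical helper: build one Playwright project per role.
// Assumes spec files are named like users.admin.spec.ts (invented convention).
function roleProjects(roles: Role[]) {
  return roles.map((role) => ({
    name: role,
    testMatch: new RegExp(`\\.${role}\\.spec\\.ts$`),
    use: { storageState: storageStateFor(role) },
  }));
}

// In playwright.config.ts this might be used as:
//   export default defineConfig({ projects: roleProjects(['admin', 'member', 'guest']) });
```

Passing an explicit empty state for guests makes the intent visible in the config instead of relying on an unset option.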


API Mocking

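Use page.route() to intercept network requests made by the browser. Prefer this over MSW for E2E: it runs at the browser level and doesn't require service worker setup. It does not see requests made on the server (for example, fetches inside Next.js Server Components), so those still hit the real backend.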

test('shows error on API failure', async ({ page }) => {
  await page.route('**/api/v1/users', (route) =>
    route.fulfill({
      status: 500,
      contentType: 'application/problem+json',
      body: JSON.stringify({
        type: 'https://api.example.com/problems/internal-error',
        title: 'Internal server error',
        status: 500,
      }),
    }),
  );

  await page.goto('/users');
  await expect(page.getByRole('alert')).toContainText('Something went wrong');
});

When to mock vs use real backend:

  • Mock: error paths, edge cases, third-party integrations, rate-limit scenarios
  • Real backend: happy-path smoke tests, data integrity flows, auth flows
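When several tests mock error paths, a small builder keeps the application/problem+json payloads consistent. The following is a minimal sketch: problemResponse and the ProblemDetails interface are hypothetical names, and the field set mirrors the example above plus an optional detail field.

```typescript
// Shape of the problem document used in the mocking example above.
interface ProblemDetails {
  type: string;
  title: string;
  status: number;
  detail?: string;
}

// Hypothetical helper: build the options object accepted by route.fulfill().
function problemResponse(problem: ProblemDetails) {
  return {
    status: problem.status,
    contentType: 'application/problem+json',
    body: JSON.stringify(problem),
  };
}

// In a spec this might be used as:
//   await page.route('**/api/v1/users', (route) =>
//     route.fulfill(problemResponse({
//       type: 'https://api.example.com/problems/rate-limited',
//       title: 'Too many requests',
//       status: 429,
//     })),
//   );
```

Deriving the HTTP status from the problem document avoids the two values drifting apart between tests.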

Framework Integration

Next.js

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  webServer: {
    command: 'pnpm dev',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
    timeout: 120_000,
  },
  use: { baseURL: 'http://localhost:3000' },
});

For App Router with Server Components — test the rendered output, not the server component directly. Playwright sees the final HTML the browser receives.

FastAPI / Django (Python backends)

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  webServer: [
    {
      command: 'uvicorn app.main:app --port 8000',
      url: 'http://localhost:8000/health',
      reuseExistingServer: !process.env.CI,
      timeout: 30_000,
    },
    {
      command: 'pnpm dev',
      url: 'http://localhost:3000',
      reuseExistingServer: !process.env.CI,
    },
  ],
  use: { baseURL: 'http://localhost:3000' },
});

webServer accepts an array — spin up both backend and frontend in one config.

NestJS / Express

Same pattern as FastAPI — use webServer with the backend's start command (nest start --watch or node dist/main.js). Point the health check URL at the backend's /health endpoint.
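For reference, the same two-server layout for a NestJS or Express backend might look like the sketch below. It assumes the API has already been built, listens on port 3001, and exposes /health; the port is a placeholder rather than anything this guide prescribes.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  webServer: [
    {
      // Assumption: the backend was compiled beforehand and serves on port 3001.
      command: 'node dist/main.js',
      url: 'http://localhost:3001/health',
      reuseExistingServer: !process.env.CI,
      timeout: 30_000,
    },
    {
      command: 'pnpm dev',
      url: 'http://localhost:3000',
      reuseExistingServer: !process.env.CI,
    },
  ],
  use: { baseURL: 'http://localhost:3000' },
});
```

Playwright waits for each url to respond before starting tests, which is why the backend entry points at a cheap health endpoint.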


CI Integration (GitHub Actions)

# .github/workflows/e2e.yml
name: E2E Tests
on:
  pull_request:
  push:
    branches: [main]

jobs:
  e2e:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1/4, 2/4, 3/4, 4/4]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20' }
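      # Node 20 ships Corepack; enabling it puts pnpm on PATH (hosted runners do not preinstall pnpm).
      - run: corepack enable
      - run: pnpm install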
      - run: pnpm exec playwright install --with-deps chromium

      - run: pnpm exec playwright test --shard=${{ matrix.shard }}

      - uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: playwright-report-${{ strategy.job-index }}
          path: playwright-report/
          retention-days: 7

      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: test-traces-${{ strategy.job-index }}
          path: test-results/
          retention-days: 3

Sharding splits tests across N parallel runners. Use fail-fast: false so one shard failure doesn't kill the others.
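To make the --shard=i/n notation concrete, the sketch below parses a shard spec and takes the matching contiguous slice of a list. It only illustrates the idea: parseShard and shardSlice are hypothetical helpers, and Playwright's own scheduler decides the real assignment and may split tests differently.

```typescript
// Parse a spec such as "2/4" into its 1-based index and total.
function parseShard(spec: string): { current: number; total: number } {
  const match = /^(\d+)\/(\d+)$/.exec(spec);
  if (!match) throw new Error(`Invalid shard spec: ${spec}`);
  const current = Number(match[1]);
  const total = Number(match[2]);
  if (total < 1 || current < 1 || current > total) {
    throw new Error(`Shard index out of range: ${spec}`);
  }
  return { current, total };
}

// Illustration only: give each shard a contiguous, non-overlapping slice.
function shardSlice<T>(items: T[], spec: string): T[] {
  const { current, total } = parseShard(spec);
  const start = Math.floor(((current - 1) * items.length) / total);
  const end = Math.floor((current * items.length) / total);
  return items.slice(start, end);
}
```

Every item lands in exactly one slice, which is the property that lets four runners each execute a quarter of the suite without overlap.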

Artifacts: always upload playwright-report/ (HTML report) and test-results/ on failure (traces for debugging).


Accessibility Testing

import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test('homepage has no a11y violations', async ({ page }) => {
  await page.goto('/');

  const results = await new AxeBuilder({ page })
    .withTags(['wcag2a', 'wcag2aa', 'wcag21a', 'wcag21aa'])
    .analyze();

  expect(results.violations).toEqual([]);
});

Run accessibility audits on every critical page. Integrate into the main E2E suite — don't create a separate "a11y suite" that gets ignored. Use .withTags() to target specific WCAG levels.
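A bare toEqual([]) failure prints the whole violations array, which can be hard to scan in CI logs. One option is to summarize the violations and pass the summary as the assertion message. The sketch below assumes only the id, impact, help, and nodes[].target fields of an axe result; formatViolations and the AxeViolation interface are hypothetical names.

```typescript
// Subset of an axe-core violation used for the summary (an assumption
// about the fields available; real results carry more).
interface AxeViolation {
  id: string;
  impact?: string | null;
  help: string;
  nodes: { target: unknown[] }[];
}

// Hypothetical helper: one line per violation with the affected selectors.
function formatViolations(violations: AxeViolation[]): string {
  return violations
    .map((violation) => {
      const targets = violation.nodes.map((node) => node.target.join(' ')).join(', ');
      return `[${violation.impact ?? 'unknown'}] ${violation.id}: ${violation.help} (${targets})`;
    })
    .join('\n');
}

// In a spec this might be used as:
//   expect(results.violations, formatViolations(results.violations)).toEqual([]);
```

The second argument to expect() is a custom message, so the summary appears at the top of the failure output.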


Visual Regression

test('dashboard matches screenshot', async ({ page }) => {
  await page.goto('/dashboard');

  // Wait for dynamic content to settle
  await expect(page.getByRole('table')).toBeVisible();

  await expect(page).toHaveScreenshot('dashboard.png', {
    maxDiffPixelRatio: 0.01,
    animations: 'disabled',
    mask: [page.getByTestId('timestamp')],
  });
});
  • animations: 'disabled' — prevents CSS/JS animation flicker from causing false diffs
  • mask — hides dynamic content (timestamps, avatars, random IDs) that changes between runs
  • maxDiffPixelRatio — allows minor anti-aliasing differences across environments
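If every screenshot assertion should share these options, they can be set once in the config instead of repeated per test. A minimal sketch, assuming the same values as the example above; mask stays per test because it needs page locators.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  expect: {
    toHaveScreenshot: {
      // Defaults applied to every toHaveScreenshot() call; per-call options override them.
      maxDiffPixelRatio: 0.01,
      animations: 'disabled',
    },
  },
});
```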

Update baselines: pnpm exec playwright test --update-snapshots

For team-scale visual regression with review UIs, pair with Argos, Percy, or Chromatic.


Debugging

| Situation | Tool |
| --- | --- |
| Writing tests | npx playwright test --ui (interactive test explorer) |
| Test just failed in CI | Download test-results/ artifact → npx playwright show-trace trace.zip |
| Flaky test | npx playwright test --repeat-each=10 to reproduce |
| Step-by-step inspection | await page.pause() in code, then run headed (--headed or --debug) so the Inspector opens |
| Generate test from actions | npx playwright codegen http://localhost:3000 |

Trace-on-first-retry — the most cost-effective trace strategy for CI:

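// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  retries: process.env.CI ? 2 : 0, // 'on-first-retry' records nothing when retries is 0
  use: { trace: 'on-first-retry' },
});

Records a trace only on the first retry of a failed test, so retries must be at least 1. You get debugging info without the storage cost of tracing every test.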


File Organization

playwright.config.ts       # At the repo root, so the './e2e/...' paths in the config resolve
e2e/
├── global-setup.ts
├── fixtures.ts            # Shared custom fixtures
├── .auth/                 # storageState files (gitignored)
│   ├── admin.json
│   └── member.json
├── pages/                 # Page objects (if used)
│   ├── login.page.ts
│   └── dashboard.page.ts
├── specs/                 # Test files
│   ├── auth.spec.ts
│   ├── checkout.spec.ts
│   └── dashboard.spec.ts
└── helpers/               # Shared utilities
    └── api.ts             # API helpers for seeding data

Keep E2E tests in a top-level e2e/ directory, separate from unit/integration tests. This keeps vitest and playwright from interfering with each other's config/discovery.


Common Pitfalls

  1. page.waitForTimeout(): never use hard waits. Use expect() auto-retry or page.waitForResponse() instead. Hard waits are a leading source of flaky tests.
  2. CSS/XPath selectors: they break on refactors. Use role/label/text locators. If you can't find a semantic locator, add a data-testid attribute (and fix the accessibility).
  3. Test interdependence: tests that share state or must run in order. Every test should work in isolation. Use storageState + API calls to seed data, not prior tests.
  4. Testing implementation details: checking CSS classes, DOM structure, or internal state. Test what the user sees and does.
  5. Running all browsers in CI: run Chromium-only on every PR by default, since it catches most bugs. Run multi-browser on a nightly schedule.
  6. Forgetting --with-deps in CI: playwright install without --with-deps skips system dependencies (fonts, libs) and causes cryptic failures.
  7. No trace on failure: without trace: 'on-first-retry' (plus retries of at least 1) and artifact upload, CI failures are hard to debug remotely.
  8. Giant spec files: split by feature, not by page. checkout.spec.ts, auth.spec.ts, search.spec.ts, each focused on one flow.
  9. Mocking everything: E2E tests that mock the entire backend aren't E2E tests. Mock only third-party services and error scenarios; let happy paths hit the real stack.
  10. No visual regression baseline management: screenshots checked into git without review. Use --update-snapshots deliberately, review diffs in PRs.
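Pitfalls 5 and 7 can both be addressed in the config. The sketch below runs Chromium by default and adds Firefox and WebKit only when an environment variable is set; ALL_BROWSERS is a hypothetical variable that a nightly workflow would export, and the retry count is an example value.

```typescript
// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

// Hypothetical switch: a nightly job sets ALL_BROWSERS=1, PR runs leave it unset.
const allBrowsers = Boolean(process.env.ALL_BROWSERS);

export default defineConfig({
  // Retries must be at least 1 for 'on-first-retry' to record anything.
  retries: process.env.CI ? 2 : 0,
  use: { trace: 'on-first-retry' },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    ...(allBrowsers
      ? [
          { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
          { name: 'webkit', use: { ...devices['Desktop Safari'] } },
        ]
      : []),
  ],
});
```

Keeping the browser list in one config means the PR and nightly workflows differ only in an environment variable, and the nightly job would also need to install the extra browsers.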

Related Skills

  • vitest: unit/integration testing for TypeScript/JavaScript (complement to E2E)
  • pytest: unit/integration testing for Python
  • testing-anti-patterns: patterns that make tests unreliable (applies to E2E too)
  • test-driven-development: TDD methodology (use Playwright for the "integration test" step)
  • github-actions: CI/CD pipeline configuration for running E2E