How to wire axe-core into CI — webaccessibility.wiki

axe-core is the open-source accessibility rules engine maintained by Deque and used inside browser DevTools, Lighthouse, and many third-party scanners. This how-to wires the same engine into your CI pipeline via Playwright, so every pull request runs the rules against your built site and fails if a new violation appears.

axe-core is one of several reasonable choices: Pa11y, the Lighthouse accessibility category, and IBM Equal Access all run similar rule sets with subtly different defaults. Picking one tool and pinning its version matters more than which vendor you pick.

Why automate

Manual screen-reader testing finds issues that no automated rule catches — meaningful link text, sensible reading order, accurate live-region announcements. Automated rules catch the regressions you do not have time to retest manually on every PR: a removed aria-label, a new colour token that drops below 4.5:1, a forgotten heading.

A reasonable rule of thumb: automated tools catch roughly 30–50% of WCAG failures. Within that share they are deterministic, so a violation a rule has caught once will be caught again if it reappears as a regression. That is enough to be worth wiring in.

Install

pnpm add -D @playwright/test @axe-core/playwright
pnpm exec playwright install --with-deps chromium

@axe-core/playwright is the official adapter and is licensed MPL-2.0. Pinning a specific axe-core version is wise: rule sets evolve, and a new release can flag pages that previously passed.

A first test

tests/a11y/axe.spec.ts:

import { test, expect } from "@playwright/test";
import AxeBuilder from "@axe-core/playwright";

const ROUTES = [
  "/",
  "/about",
  "/how-to",
  "/eaa-checklist",
  "/framework-guide",
];

for (const route of ROUTES) {
  test(`a11y: ${route}`, async ({ page }) => {
    await page.goto(route);
    const results = await new AxeBuilder({ page })
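      .withTags(["wcag2a", "wcag2aa", "wcag21a", "wcag21aa", "wcag22aa"])
      .analyze();
    expect(results.violations).toEqual([]);
  });
}

withTags restricts the run to rules tagged for WCAG 2.0 and 2.1 at Levels A and AA, plus the rules added for WCAG 2.2 AA. The empty-array assertion is a hard zero: even one new violation fails CI. The relative routes assume that baseURL is set in your Playwright config.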
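
When the assertion fails, the test runner prints the whole violations array, which is hard to scan in a CI log. The following is a minimal sketch of a summarizer, assuming the id, impact, help, and nodes[].target fields that axe-core reports for each violation. The ViolationLike type and the summarizeViolations name are illustrative and not part of any library.

```typescript
// Simplified shape of one entry in results.violations.
// Assumption: only the fields used below; real axe results carry more.
interface ViolationLike {
  id: string;
  impact?: string | null;
  help: string;
  nodes: { target: unknown[] }[];
}

// Produce one readable line per affected element.
function summarizeViolations(violations: ViolationLike[]): string {
  const lines: string[] = [];
  for (const v of violations) {
    const impact = v.impact ? v.impact : "unknown";
    for (const n of v.nodes) {
      lines.push(`[${impact}] ${v.id}: ${v.help} (${n.target.join(" ")})`);
    }
  }
  return lines.join("\n");
}
```

One way to use it is to pass the summary as the assertion message, as in expect(results.violations, summarizeViolations(results.violations)).toEqual([]), which keeps the hard-zero gate while making the failure readable.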

GitHub Actions wiring

.github/workflows/a11y.yml:

name: a11y
on:
  pull_request:
  push:
    branches: [main]
  schedule:
    - cron: "0 6 * * *" # nightly drift check
jobs:
  axe:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with: { node-version: 22 }
      - run: pnpm install --frozen-lockfile
      - run: pnpm exec playwright install --with-deps chromium
      - run: pnpm run build
      - run: pnpm exec playwright test tests/a11y
      - if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: playwright-report/

Make the job a required check in branch protection. Nightly cron catches drift from external dependencies (a font CDN ships a new file with an unlabelled SVG, an embed widget starts injecting a contrast failure).
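
The test file navigates to relative routes and the workflow builds the site without starting a server, so something has to supply a base URL and serve the build. The following is a minimal sketch of a playwright.config.ts that does both. It assumes a hypothetical "preview" script that serves the built site on port 4173; adjust the command and port to your project.

```typescript
import { defineConfig } from "@playwright/test";

export default defineConfig({
  testDir: "tests",
  // The workflow's upload step expects the HTML report in playwright-report/.
  reporter: [["html", { open: "never" }]],
  use: {
    // Lets page.goto("/about") resolve against the locally served build.
    baseURL: "http://localhost:4173",
  },
  webServer: {
    // Assumption: package.json defines a "preview" script that serves the build output.
    command: "pnpm run preview",
    url: "http://localhost:4173",
    // Start a fresh server in CI; reuse a running one during local development.
    reuseExistingServer: !process.env.CI,
  },
});
```

With webServer configured, Playwright starts the server before the tests and stops it afterwards, so the workflow needs no separate "start server" step.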

Baseline existing violations

If your existing site has 200 known violations, you cannot block all PRs on zero. Two strategies:

Snapshot baseline. Commit the violation list as a JSON artefact; fail the build only if the new run contains a violation that is not in the baseline. axe-core has no built-in baseline feature, so this needs a custom comparison over results.violations. AxeBuilder's disableRules can switch off whole rules in the meantime, but it also hides new violations of those rules.
Per-route allow-list. Keep an explicit list of legacy routes that are exempt from the zero-violation gate; gate greenfield routes first, then march through the legacy set.

Baseline strategies are a pragmatic compromise. Schedule a recurring “reduce baseline by N” task in your sprint cadence.
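
The snapshot comparison can be sketched as two small functions. This is one possible approach and not an axe-core feature: the BaselineViolation type, the function names, and the fingerprint format are all illustrative, and the violation shape is reduced to the id and nodes[].target fields.

```typescript
// Simplified violation shape.
// Assumption: only the fields needed for fingerprinting.
interface BaselineViolation {
  id: string;
  nodes: { target: unknown[] }[];
}

// One stable string per (route, rule, element) so two runs can be compared.
function fingerprints(route: string, violations: BaselineViolation[]): string[] {
  const out: string[] = [];
  for (const v of violations) {
    for (const n of v.nodes) {
      out.push(`${route} | ${v.id} | ${n.target.join(" ")}`);
    }
  }
  return out.sort();
}

// Entries in the current run that the committed baseline does not contain.
function newViolations(baseline: string[], current: string[]): string[] {
  const known = new Set(baseline);
  return current.filter((fp) => !known.has(fp));
}
```

In a test, the gate would then assert that newViolations(baseline, fingerprints(route, results.violations)) is empty, with baseline loaded from the committed JSON file. Selectors shift when markup changes, so fingerprints like these are approximate; expect to regenerate the baseline after large refactors.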

Multi-tool comparison without bias

Each scanner has a slightly different rule set. Where vendors disagree, the W3C spec wins:

axe-core is the broadest open-source rule set; in our experience it has the highest signal, but it is also the most opinionated in edge cases.
Pa11y is more permissive on color-contrast, more strict on landmarks. It is useful as a second opinion.
Lighthouse is the simplest entry point and runs in DevTools without install, but its rule set is a curated subset of axe-core.
IBM Equal Access has stronger PDF and document-format coverage, weaker SPA support.

A reasonable CI policy is to run axe-core as the primary gate (zero violations) and Pa11y as an advisory check (warnings, not failures), then periodically reconcile diffs.

How to test the test

A deliberately failing fixture guards against a gate that passes vacuously, for example because axe never ran or the page under test failed to render:

test("intentional violation fails the gate", async ({ page }) => {
  await page.setContent(
    `<html><body><img src="x"></body></html>`, // missing alt
  );
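  const results = await new AxeBuilder({ page }).analyze();
  // Assert the specific rule so unrelated findings (missing title, lang) cannot mask a broken check.
  expect(results.violations.map((v) => v.id)).toContain("image-alt");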
});

If this passes, axe is running and reporting violations, so your harness is sound. Keep it in the suite as a canary, or move it to a separate verification suite that runs alongside the gate.

When this is hard

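Authentication. Login-gated pages need a Playwright fixture that signs in before the axe run. Use a dedicated service account, keep its credentials in CI secrets, and reuse the signed-in session (for example via Playwright's storageState) so that credentials never appear in CI logs.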
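Iframes. axe-core descends into same-origin iframes by default. In the Playwright adapter's default mode, cross-origin frames are analysed as well, because Playwright can reach them; in legacy mode (setLegacyMode) they are skipped and reported by the frame-tested rule. For third-party embeds you cannot fix (Stripe, embedded videos), exclude the frame, audit its content separately, or accept the gap.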
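Dynamic content. Wait for an explicit ready signal (the loading indicator gone, the main heading visible) before invoking axe; Playwright's documentation discourages relying on networkidle alone. A check that runs while the page is still rendering will produce false negatives.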
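
The wait for an explicit loading state can be sketched as a small helper. It is written against structural stand-ins for Playwright's Page and Locator so that it can be unit-tested without a browser; the PageLike and LocatorLike types, the waitUntilReady name, and the selectors are all hypothetical.

```typescript
// Structural stand-ins for the parts of Playwright's Locator and Page used here.
// Assumption: a real Page object satisfies PageLike.
interface LocatorLike {
  waitFor(options: { state: "hidden" | "visible" }): Promise<void>;
}
interface PageLike {
  locator(selector: string): LocatorLike;
}

// Resolve once the busy indicator is gone and the ready marker is visible.
async function waitUntilReady(
  page: PageLike,
  selectors: { busy: string; ready: string },
): Promise<void> {
  await page.locator(selectors.busy).waitFor({ state: "hidden" });
  await page.locator(selectors.ready).waitFor({ state: "visible" });
}
```

In a test, call it between page.goto and the AxeBuilder run, for example await waitUntilReady(page, { busy: ".spinner", ready: "main h1" }). For a third-party embed that you have decided to audit separately, AxeBuilder's exclude method can take the iframe's selector so the gate does not fail on markup you cannot change.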

Cross-links

How to fix WCAG 1.4.3 contrast covers the most common automated finding.
How to fix WCAG 4.1.2 name role value covers the second most common.
How to test with NVDA covers the manual layer that automation cannot replace.
Reference: axe-core documentation; rule descriptions.

The simplest summary: automated checks are necessary but not sufficient. Wire them in to catch regressions; keep the manual screen-reader pass for real coverage.