axe-core is the open-source accessibility rules engine maintained by Deque and used inside browser DevTools, Lighthouse, and many third-party scanners. This how-to wires the same engine into your CI pipeline via Playwright, so every pull request runs the rules against your built site and fails if a new violation appears.
axe-core is one of several reasonable choices: Pa11y, Lighthouse's accessibility category, and IBM Equal Access all run similar rule sets with subtly different defaults. Picking one and pinning a version is more important than picking a particular vendor.
Why automate
Manual screen-reader testing finds issues that no automated rule catches
— meaningful link text, sensible reading order, accurate live-region
announcements. Automated rules catch the regressions you do not have time
to retest manually on every PR: a removed aria-label, a new colour token
that drops below 4.5:1, a forgotten heading.
A reasonable rule of thumb: automated tools catch roughly 30–50% of WCAG failures, but a much larger share of the mechanical ones that recur as regressions. That is enough to be worth wiring in.
Install
pnpm add -D @playwright/test @axe-core/playwright
pnpm exec playwright install --with-deps chromium
@axe-core/playwright is the official adapter and is licensed MPL-2.0.
Pinning a specific axe-core version is wise: rule sets evolve and new
versions can surface previously-passing pages.
A first test
tests/a11y/axe.spec.ts:
import { test, expect } from "@playwright/test";
import AxeBuilder from "@axe-core/playwright";

// Relative routes resolve against `use.baseURL` in playwright.config.ts.
const ROUTES = [
  "/",
  "/about",
  "/how-to",
  "/eaa-checklist",
  "/framework-guide",
];

for (const route of ROUTES) {
  test(`a11y: ${route}`, async ({ page }) => {
    await page.goto(route);
    const results = await new AxeBuilder({ page })
      // Level A and AA rules for WCAG 2.0, 2.1, and 2.2.
      .withTags(["wcag2a", "wcag2aa", "wcag21a", "wcag21aa", "wcag22aa"])
      .analyze();
    expect(results.violations).toEqual([]);
  });
}
withTags restricts the run to rules carrying the listed WCAG 2.x Level A and AA tags. The empty-array assertion is a hard zero: even one new violation fails CI.
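When the assertion fails, the default diff prints the full violation objects, which can run to hundreds of lines. A minimal sketch of a summarising helper follows. summarizeViolations and the Violation interface are hypothetical names, and the interface mirrors only the fields of the axe-core results that the helper reads.

```typescript
// Structural subset of one entry in `results.violations`.
// Assumption: only the fields this sketch reads are declared here.
interface Violation {
  id: string;
  impact?: string | null;
  help: string;
  helpUrl: string;
  nodes: unknown[];
}

// Hypothetical helper: one line per violated rule, most-affected rule first.
function summarizeViolations(violations: Violation[]): string[] {
  return [...violations]
    .sort((a, b) => b.nodes.length - a.nodes.length)
    .map(
      (v) =>
        `${v.id} [${v.impact ?? "n/a"}] x${v.nodes.length}: ${v.help} (${v.helpUrl})`,
    );
}
```

In the spec, expect(summarizeViolations(results.violations)).toEqual([]) keeps the hard-zero gate while reducing the failure output to one line per rule.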
GitHub Actions wiring
.github/workflows/a11y.yml:
name: a11y
on:
  pull_request:
  push:
    branches: [main]
  schedule:
    - cron: "0 6 * * *" # nightly drift check
jobs:
  axe:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with: { node-version: 22 }
      - run: pnpm install --frozen-lockfile
      - run: pnpm exec playwright install --with-deps chromium
      - run: pnpm run build
      - run: pnpm exec playwright test tests/a11y
      - if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: playwright-report/
Make the job a required check in branch protection. Nightly cron catches drift from external dependencies (a font CDN ships a new file with an unlabelled SVG, an embed widget starts injecting a contrast failure).
Baseline existing violations
If your existing site has 200 known violations, you cannot block all PRs on zero. Two strategies:
- Snapshot baseline. Commit the violation list as a JSON artefact; fail the build only if the new run contains a violation that is not in the baseline. This takes a custom matcher that compares each run against the committed list; AxeBuilder's disableRules option can additionally switch off whole rules you have decided to defer.
- Per-route allow-list. Tag legacy routes with a comment exempting them from CI; prioritise greenfield routes for zero-violation gating first, then march through the legacy set.
Baseline strategies are a pragmatic compromise. Schedule a recurring “reduce baseline by N” task in your sprint cadence.
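The snapshot-baseline strategy can be sketched as a pure comparison function. This is one possible shape, not a feature of axe-core: fingerprints, newViolations, and the two interfaces are hypothetical names, and the fingerprint format (route, rule id, element selector) is an assumption you may want to adapt.

```typescript
// Structural subset of axe-core results used by this sketch.
interface AxeNode {
  target: unknown[]; // selector path to the offending element
}
interface AxeViolation {
  id: string;
  nodes: AxeNode[];
}

// One fingerprint per (route, rule, element). JSON.stringify keeps nested
// selector paths distinguishable from flat ones.
function fingerprints(route: string, violations: AxeViolation[]): string[] {
  return violations.flatMap((v) =>
    v.nodes.map((n) => `${route}|${v.id}|${JSON.stringify(n.target)}`),
  );
}

// Returns the fingerprints found in the current run that the committed
// baseline does not already contain.
function newViolations(
  route: string,
  violations: AxeViolation[],
  baseline: readonly string[],
): string[] {
  const known = new Set(baseline);
  return fingerprints(route, violations).filter((fp) => !known.has(fp));
}
```

In a spec, the gate would become expect(newViolations(route, results.violations, baseline)).toEqual([]), with baseline loaded from the committed JSON file. Selectors change when markup moves, so expect some fingerprint churn and regenerate the baseline deliberately rather than automatically.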
Multi-tool comparison without bias
Each scanner has a slightly different rule set. Where vendors disagree, the W3C spec wins:
- axe-core is the broadest open-source rule set; it has the highest signal in our experience but also the most opinion in edge cases.
- Pa11y is more permissive on color-contrast, more strict on landmarks. It is useful as a second opinion.
- Lighthouse is the simplest entry point and runs in DevTools without install, but its rule set is a curated subset of axe-core.
- IBM Equal Access has stronger PDF and document-format coverage, weaker SPA support.
A reasonable CI policy is to run axe-core as the primary gate (zero violations) and Pa11y as an advisory check (warnings, not failures), then periodically reconcile diffs.
How to test the test
A failing reference fixture catches the case where the test passes because the page itself failed to render:
test("intentional violation fails the gate", async ({ page }) => {
  await page.setContent(
    `<html><body><img src="x"></body></html>`, // missing alt
  );
  const results = await new AxeBuilder({ page }).analyze();
  // Assert on the specific rule: this bare page also trips unrelated rules.
  expect(results.violations.map((v) => v.id)).toContain("image-alt");
});
If this passes, axe is loading and reporting as expected. The test passes precisely when the violation is detected, so do not mark it as expected-fail; keep it in the main suite or move it to a separate verification suite.
When this is hard
- Authentication. Login-gated pages need a Playwright fixture that signs in before the axe run. Use a service-account session; do not bake user credentials into CI logs.
- Iframes. axe-core descends into same-origin iframes by default; for cross-origin (Stripe, embedded videos), you cannot inject it. Audit iframe content separately or accept the gap.
- Dynamic content. Wait for networkidle and any explicit loading state before invoking axe. A check that runs while the page is still rendering will produce false negatives.
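For the dynamic-content case, the wait can be wrapped in a small helper. This is a sketch under assumptions: settle and SettlablePage are hypothetical names, the interface is a narrow slice intended to be satisfied by Playwright's page object, and the default [aria-busy='true'] selector stands in for whatever loading indicator your app actually renders.

```typescript
// Minimal structural slice of Playwright's Page that this helper calls.
interface SettlablePage {
  waitForLoadState(state: "networkidle"): Promise<void>;
  waitForSelector(
    selector: string,
    options: { state: "hidden" },
  ): Promise<unknown>;
}

// Hypothetical helper: wait for the network to go quiet, then for the app's
// own loading indicator to be hidden or removed from the DOM.
async function settle(
  page: SettlablePage,
  busySelector = "[aria-busy='true']", // assumption: adjust to your app
): Promise<void> {
  await page.waitForLoadState("networkidle");
  await page.waitForSelector(busySelector, { state: "hidden" });
}
```

Call await settle(page) between page.goto(route) and the AxeBuilder run. Both waits are subject to Playwright's timeouts, so a page that never settles fails the test instead of being scanned half-rendered.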
Cross-links
- How to fix WCAG 1.4.3 contrast is the most-common automated finding.
- How to fix WCAG 4.1.2 name role value is the second-most-common.
- How to test with NVDA covers the manual layer that automation cannot replace.
- Reference: axe-core documentation; rule descriptions.
The simplest summary: automated checks are necessary but not sufficient. Wire them in to catch regressions; keep the manual screen-reader pass for real coverage.