Content authenticity policy

Reader-facing summary. The operational version with regex enforcement lives at editorial/CONTENT_AUTHENTICITY_POLICY.md in our repository.

What we promise

Why the policy exists

Search engines (Google, Bing, DuckDuckGo, Qwant) have moved against content produced "at scale" with the primary purpose of manipulating rankings. Google's March 2024 spam policy update is the clearest written statement of that enforcement: its "scaled content abuse" rule applies regardless of whether AI was involved. The trigger is scale without value, not AI specifically.

We are deliberately a small, slow site. Our editorial cap is five articles per week per editor at the most AI-assisted tier (extensive). At the lower tiers, the cap is set by editor capacity, not AI throughput. The limit is structural, not a vibes-based promise.

The four AI-assist tiers

none
The editor wrote every word. No AI-assist at any stage.
outline-only
An LLM helped brainstorm or sanity-check the outline. All prose is the editor's.
draft-then-edit
An LLM produced a first draft. The editor materially rewrote (≥ 30% of words substantively changed) and added at least one first-hand observation.
extensive
An LLM produced most prose. The editor reviewed line by line, verified every citation against the primary source, and added first-hand expertise.

Build-time enforcement

Our CI runs the five audits below on every pull request, and the link-integrity audit runs again nightly. Any failure blocks merge to main:

Content audit (scripts/content-audit.mjs)
Indexed articles must contain zero phrases from our blocked-pattern list (e.g. "delve into", "in today's digital landscape", "a testament to"). They must also carry a named reviewer, a substantive (≥ 30 character) contribution description, and at least one first-hand source URL.
Freshness audit (scripts/freshness-check.mjs)
Indexed articles older than 12 months (per lastReviewed) hard-fail the build; articles older than 9 months trigger a warning and a visible "review overdue" banner.
Duplicate audit (scripts/duplicate-check.mjs)
We hash every paragraph (≥ 12 words) across all articles. If the same paragraph appears in two indexed articles, the build fails. This catches the "scaled content abuse" pattern even when individual phrases pass the AI-tell filter.
Link integrity (scripts/link-check.mjs)
Every outbound URL in an indexed article is probed with a HEAD request, falling back to GET when the server answers 405 or 403. Any 4xx or 5xx response hard-fails the build; network timeouts produce warnings only. The audit also runs nightly on a cron schedule in case external sources rot.
SEO audit (scripts/seo-audit.mjs)
After build, verifies: (a) every indexed article appears in the sitemap, (b) no draft appears in the sitemap, (c) every built page has exactly one self-referential canonical, (d) every indexed article has valid Schema.org JSON-LD with author Person and publisher Organization, (e) no indexed article is orphaned (every indexed article has at least one internal link to another indexed article).
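
As an illustration of the duplicate audit described in the list above, here is a minimal sketch. It is not the contents of scripts/duplicate-check.mjs: the function names, the article shape ({ slug, body }), the blank-line paragraph split, and the whitespace and case normalisation are all assumptions. To stay free of imports it uses a small 32-bit FNV-1a hash; a real script would more likely use a cryptographic digest such as SHA-256 from node:crypto.

```javascript
// Hypothetical sketch of a duplicate-paragraph check.
// Function names and the { slug, body } article shape are assumptions.

const MIN_WORDS = 12; // paragraphs shorter than this are ignored, per the policy

// 32-bit FNV-1a, used only so the sketch needs no imports.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Split on blank lines, collapse whitespace, lowercase (normalisation is an assumption),
// then keep only paragraphs long enough to count.
function paragraphsOf(body) {
  return body
    .split(/\n\s*\n/)
    .map((p) => p.trim().replace(/\s+/g, " ").toLowerCase())
    .filter((p) => p.split(" ").length >= MIN_WORDS);
}

// Returns one entry for each paragraph that reappears in a different article.
function findDuplicates(articles) {
  const seen = new Map(); // hash -> { slug, text } of the first occurrence
  const duplicates = [];
  for (const { slug, body } of articles) {
    for (const text of paragraphsOf(body)) {
      const hash = fnv1a(text);
      const first = seen.get(hash);
      if (!first) {
        seen.set(hash, { slug, text });
      } else if (first.slug !== slug && first.text === text) {
        // Comparing the text as well guards against 32-bit hash collisions.
        duplicates.push({ hash, slugs: [first.slug, slug] });
      }
    }
  }
  return duplicates;
}

// Made-up sample data: two articles sharing one long paragraph.
const shared =
  "This paragraph is long enough to be hashed because it has well over twelve words in it.";
const report = findDuplicates([
  { slug: "a", body: `Short intro.\n\n${shared}` },
  { slug: "b", body: `${shared}\n\nA different closing line.` },
]);
console.log(report.length > 0 ? "build fails" : "ok", report);
```

In this sample the shared paragraph is reported once, for the pair of slugs a and b, while the short paragraphs are skipped because they fall under the 12-word floor.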

The blocked-pattern list lives in editorial/AI_PHRASE_BLOCKLIST.md; the regex enforcement in scripts/content-audit.mjs; the freshness ceiling in scripts/freshness-check.mjs. All scripts are open in our public repository.
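
To make the per-article rules concrete, the sketch below combines the content and freshness checks in one function. It is an illustration under stated assumptions, not the code in scripts/content-audit.mjs or scripts/freshness-check.mjs. The field names reviewer, contribution, and firstHandSources are hypothetical (only lastReviewed appears in the policy). The phrase list holds just the three examples quoted above, and "older than N months" is read as "more than N calendar months after lastReviewed".

```javascript
// Hypothetical sketch of per-article checks.
// Field names other than lastReviewed are assumptions.

// Only the three example phrases quoted in the policy; the real list is longer.
const BLOCKED_PHRASES = ["delve into", "in today's digital landscape", "a testament to"];

const escapeRegex = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
const BLOCKED_PATTERNS = BLOCKED_PHRASES.map((phrase) => ({
  phrase,
  regex: new RegExp(`\\b${escapeRegex(phrase)}\\b`, "i"),
}));

const FAIL_AFTER_MONTHS = 12; // hard-fail threshold from the policy
const WARN_AFTER_MONTHS = 9; // warning threshold from the policy

// Adds calendar months in UTC. Month-end dates can overflow into the next month,
// which is acceptable for a sketch.
function addMonths(date, n) {
  const d = new Date(date.getTime());
  d.setUTCMonth(d.getUTCMonth() + n);
  return d;
}

function freshness(lastReviewed, now) {
  if (now > addMonths(lastReviewed, FAIL_AFTER_MONTHS)) return "fail";
  if (now > addMonths(lastReviewed, WARN_AFTER_MONTHS)) return "warn";
  return "ok";
}

function checkArticle(article, now = new Date()) {
  const errors = [];
  const warnings = [];

  // Treat curly apostrophes like straight ones before matching.
  const body = String(article.body || "").replace(/’/g, "'");
  for (const { phrase, regex } of BLOCKED_PATTERNS) {
    if (regex.test(body)) errors.push(`blocked phrase: "${phrase}"`);
  }

  if (!article.reviewer || !article.reviewer.trim()) errors.push("missing named reviewer");
  if ((article.contribution || "").trim().length < 30) {
    errors.push("contribution description under 30 characters");
  }
  if (!Array.isArray(article.firstHandSources) || article.firstHandSources.length === 0) {
    errors.push("no first-hand source URL");
  }

  const lastReviewed = new Date(article.lastReviewed);
  if (Number.isNaN(lastReviewed.getTime())) {
    errors.push("missing or invalid lastReviewed");
  } else {
    const state = freshness(lastReviewed, now);
    if (state === "fail") errors.push("last reviewed more than 12 months ago");
    if (state === "warn") warnings.push("review overdue");
  }

  return { errors, warnings };
}

// Made-up sample article, checked against a fixed "now" so the result is repeatable.
const result = checkArticle(
  {
    body: "We tested the dialog with a screen reader and found two focus-order problems.",
    reviewer: "A. Editor",
    contribution: "Re-ran every test by hand and rewrote the findings section.",
    firstHandSources: ["https://example.com/test-notes"],
    lastReviewed: "2024-01-15",
  },
  new Date("2024-11-01")
);
console.log(result);
```

The sample article passes every hard check but was last reviewed more than nine months before the fixed date, so it yields no errors and a single "review overdue" warning. A build script would block the merge whenever errors is non-empty.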

How to flag a suspicious article

If an indexed article reads like it slipped through the gate, mail editor@webaccessibility.wiki with the URL and the passage you are concerned about. We re-audit; if the gate slipped, we move the article back to draft pending re-review and log the incident in our quarterly transparency report.

Quarterly audit

Each quarter we re-run the build-time audits across all indexed articles, sample 10% of indexed articles for a manual spot-check, and confirm that the first-hand-source URLs still resolve. Results are published in our transparency report.

Currently indexed

At v0.1 we have 0 indexed articles and 11 drafts under editorial review. Drafts are visible to editors at /_drafts/; they are not visible to search engines. This is the honest state: the site shipped with AI-drafted content that has not yet completed substantive review, and we do not promote that content to indexed status until a named editor signs off.