
Auditing hallucinated citations in your back catalogue

The model invents a Forrester benchmark. The percentage looks plausible, the analyst’s name is real, the journal title is half-real. You ship the post; a reader clicks the footnote; the URL 404s. The next morning you start auditing which figures in the back catalogue are real and which are fabrications you signed off on.

This post is the audit-side companion to “How to cite a marketing statistic without becoming a vendor pitch”. That post is about authoring discipline upstream; this one is about auditing what you have already shipped. Both lean on the Playbook’s evidence standards.

Why models fabricate citations

The training data is full of citation patterns: (Author, Year) shapes, “according to a 2023 Forrester study”, “as published in Journal of Marketing Research”. The easiest path to a confident-sounding sentence is to inhabit the pattern. Ask a model for a “supporting statistic” when its context contains none and, at default temperature and without retrieval grounding, it will tend to write a sentence that fits the pattern without resolving to a real source.

The fabrication is not random. It looks like a real cite because the pattern it inhabits is a real pattern. The text says “Forrester (2024) found that 67% of CMOs…”; Forrester exists, 2024 exists, the sentence shape is correct, and the figure is invented.

The audit checklist for a single bolded figure

Run each bolded figure in your back catalogue through the same four checks:

  1. Click the URL. Does it resolve to a real page on the publisher’s domain? A 404 or a redirect to the publisher’s homepage is the first failure mode.
  2. Confirm the publication. The publisher’s name (Forrester, Gartner, McKinsey) is the easy part; the model gets the well-known names right. The report title is where fabrication lives, so search the publisher’s site for the title and confirm that it returns a real page.
  3. Confirm the figure. Open the source. Does the page contain the percentage, dollar amount, or count your copy bolded? The figure your post cites must appear on the page you cited, in the year you named.
  4. Confirm the date. A 2024 figure cited as a 2025 figure is the same hallucination shape with one digit off. The Playbook’s rule is that any figure older than twenty-four months has its year named explicitly in prose, and the audit confirms the year.

A figure that fails any of the four is replaced or cut. The Playbook’s standard explicitly disallows “find a secondary source because the primary is not findable” — secondary sources are how the fabrication propagates.
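The four checks can be sketched as a first-pass filter that runs before a human opens the source. This is a minimal sketch, not the Playbook’s tooling: `FetchedSource`, `audit_figure`, and the `example.com` values are hypothetical names introduced for illustration, and the page is assumed to have been fetched already. Check 2 is simplified to “the report title appears on the cited page”, and the substring matches are deliberately crude, so a pass here still needs a reviewer to read the page.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class FetchedSource:
    """What a reviewer (or a fetch script) recorded about the cited URL."""
    status: int      # final HTTP status after redirects
    final_url: str   # URL the request ended on
    text: str        # visible page text


def audit_figure(figure, report_title, cited_year, publisher_domain, source):
    """Return the names of the failed checks; an empty list means a pass."""
    failures = []
    parsed = urlparse(source.final_url)
    host = parsed.hostname or ""
    on_domain = host == publisher_domain or host.endswith("." + publisher_domain)
    is_homepage = parsed.path in ("", "/")

    # 1. The URL resolves to a real page on the publisher's domain
    #    (a 404 or a redirect to the homepage fails).
    if source.status != 200 or not on_domain or is_homepage:
        failures.append("url")

    page = source.text.lower()
    # 2. Simplification: the report title appears on the cited page.
    if report_title.lower() not in page:
        failures.append("publication")
    # 3. The bolded figure appears on the page.
    if figure.lower() not in page:
        failures.append("figure")
    # 4. The year named in the copy appears on the page.
    if str(cited_year) not in page:
        failures.append("date")
    return failures


# Hypothetical example: a footnote that redirected to the homepage.
redirected = FetchedSource(200, "https://www.example.com/", "Welcome")
print(audit_figure("67%", "CMO Survey", 2024, "example.com", redirected))
```

Returning the list of failed checks, rather than a single boolean, gives the audit spreadsheet something to record alongside the pass/fail flag.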

The audit cadence for the back catalogue

A 200-piece back catalogue is too much to audit in a sprint. The cadence:

  • Sample roughly one in ten bolded figures per month, distributed across content types. A spreadsheet with the figure, the URL, the audit date, and a pass/fail flag is enough.
  • If the failure rate in the sample climbs above one in twenty, expand the sample. At that rate the catalogue has a systemic issue, and a one-in-ten sample leaves most of the failing figures unchecked.
  • Errata-on-find. A failure becomes an entry in your equivalent of the errata channel — the figure is corrected, the post is re-published, the audit row is closed.
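The sampling and the expansion trigger can be sketched in a few lines. This assumes each bolded figure is tracked as a `(figure_id, content_type)` pair; `monthly_sample` and `should_expand` are hypothetical helper names. The one-in-ten rate and one-in-twenty threshold are the ones given above, and sampling within each content type is one way to read “distributed across content types”.

```python
import random
from collections import defaultdict


def monthly_sample(figures, rate=0.10, seed=None):
    """Sample roughly `rate` of the figures from each content type.

    figures: iterable of (figure_id, content_type) pairs.
    Every content type contributes at least one figure.
    """
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for figure_id, content_type in figures:
        by_type[content_type].append(figure_id)

    sample = []
    for content_type in sorted(by_type):
        ids = by_type[content_type]
        k = max(1, round(len(ids) * rate))
        sample.extend(rng.sample(ids, k))
    return sample


def should_expand(results, threshold=1 / 20):
    """results: list of booleans, True meaning the figure passed the audit.

    Returns True when the failure rate is strictly above the threshold.
    """
    if not results:
        return False
    failures = sum(1 for passed in results if not passed)
    return failures / len(results) > threshold


# Hypothetical catalogue: 30 figures in blog posts, 10 in reports.
catalogue = [(f"blog-{i}", "blog") for i in range(30)]
catalogue += [(f"report-{i}", "report") for i in range(10)]
print(monthly_sample(catalogue, seed=1))
```

Passing a `seed` makes a month’s sample reproducible, which helps when someone needs to re-check which figures were drawn.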

What the Playbook does upstream

The Playbook’s verified case-study register is the worked example of the discipline. Every case has a named source URL, a retrieval date, a vendor-origination flag, and an explicit caveat in prose. The bar each entry has to clear is stated publicly on the case study standards page so a reader can hold the work to it; the register itself reads inside the subscriber library. The same shape — source, date, flag, caveat — is the row your team’s audit spreadsheet should carry.
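As an illustration, the row shape could look like the sketch below. It combines the register’s four fields (source, date, flag, caveat) with the audit date and pass/fail flag from the cadence section. The class and field names are assumptions made for this example, not the Playbook’s actual schema.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass
class AuditRow:
    figure: str              # the bolded figure as it appears in the post
    source_url: str          # named source URL
    retrieval_date: date     # when the source was last retrieved
    vendor_originated: bool  # vendor-origination flag
    caveat: str              # the caveat stated in prose
    audit_date: date         # when this row was last audited
    passed: bool             # pass/fail flag from the four checks


def to_csv(rows):
    """Serialise audit rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(AuditRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Hypothetical row for a figure that failed the audit.
row = AuditRow(
    figure="67%",
    source_url="https://example.com/report",
    retrieval_date=date(2025, 1, 15),
    vendor_originated=True,
    caveat="Vendor-run survey",
    audit_date=date(2025, 2, 1),
    passed=False,
)
print(to_csv([row]))
```

Plain CSV keeps the audit log importable into whatever spreadsheet the team already uses.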

Auditing a back catalogue is the cheaper version of the discipline. Building the catalogue without fabrications in the first place is the Playbook’s full standard.

The full evidence standard — three source tiers, the rules on vendor-published cases, the statistic-age cut-offs — is at /standards/evidence. The free sample chapter reads at full length in your browser; the full Playbook is the rest of the work.

