Visual Regression Plan

The goal is to catch unintended HTML/CSS/JS changes caused by edits to the shared Hugo generator. The test system should build representative template sites and live workspaces, serve them locally, capture browser screenshots, and compare them against approved baselines.

Sites Under Test

Production-like websites:

quantalumin/quantalumin.com
henry/henrysheehy.com
vicky/technoantiques
tutorlumin/tutorlumin.co.uk

Template families:

archimedes/website
einstein/website
mariecurie/website
rosalindfranklin/website
richardfeynman/website
eurekadynamicsgroup/website
moleculargroup/website
symmetrygroup/website
wavefrontresearch/website

The template list should be generated from the Lab workspace template catalog where possible:

Lab/inventory/workspace-templates.yml

Page Types

Each site should expose a small manifest of representative paths:

homepage: /
listpage: /blog/
normal_page: /about/
particular_pages:
  - /contact/
  - /publications/

The test runner should support automatic fallbacks:

  • homepage: /
  • listpage: first page whose Hugo kind is section, or the first page of a configured section
  • normal page: first non-draft regular page
  • particular pages: explicit per-site list only

Explicit manifests should win over discovery so important pages remain stable.
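The precedence rule above can be sketched as a small merge function. This is only an illustration: resolvePaths and the shape of the discovered object are hypothetical names, not part of the existing runner, and discovery itself (reading page kinds from the Hugo build) is left out.

```javascript
// Hypothetical helper: merge an explicit manifest with discovered fallbacks.
// `discovered` stands in for whatever the runner learns from the Hugo build,
// e.g. the first section page and the first non-draft regular page.
function resolvePaths(manifest = {}, discovered = {}) {
  return {
    // The homepage fallback is always "/".
    homepage: manifest.homepage ?? "/",
    // Explicit entries win; discovery only fills the gaps.
    listpage: manifest.listpage ?? discovered.listpage ?? null,
    normal_page: manifest.normal_page ?? discovered.normal_page ?? null,
    // Particular pages are never discovered: explicit per-site list only.
    particular_pages: [...(manifest.particular_pages ?? [])],
  };
}

const resolved = resolvePaths(
  { listpage: "/blog/", particular_pages: ["/contact/"] },
  { listpage: "/posts/", normal_page: "/about/" }
);
console.log(resolved.listpage, resolved.normal_page);
```

Here the manifest's /blog/ overrides the discovered /posts/, while /about/ is filled in from discovery because the manifest did not name a normal page.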

Viewports

Capture at least:

desktop: 1440x1100
tablet:  834x1112
mobile:  390x844

For each page, capture:

  • full-page screenshot
  • viewport screenshot above the fold
  • optional DOM summary JSON for structure-sensitive pages
  • optional console/network error log
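As a rough sketch, the viewport table and per-page outputs could be expanded into a flat job list like this. The names VIEWPORTS and captureJobs, and the output labels, are assumptions made for illustration.

```javascript
// Viewport sizes taken from the list above.
const VIEWPORTS = {
  desktop: { width: 1440, height: 1100 },
  tablet: { width: 834, height: 1112 },
  mobile: { width: 390, height: 844 },
};

// Hypothetical helper: one job per page and viewport.
// Both screenshots are always requested; the DOM summary and the
// console/network error log are opt-in.
function captureJobs(pages, { domSummary = false, errorLog = false } = {}) {
  const jobs = [];
  for (const path of pages) {
    for (const [viewport, size] of Object.entries(VIEWPORTS)) {
      const outputs = ["full-page", "above-the-fold"];
      if (domSummary) outputs.push("dom-summary");
      if (errorLog) outputs.push("error-log");
      jobs.push({ path, viewport, ...size, outputs });
    }
  }
  return jobs;
}

console.log(captureJobs(["/", "/about/"]).length); // 6 (2 pages x 3 viewports)
```

With Playwright, each job would typically map to page.setViewportSize({ width, height }) followed by page.screenshot({ fullPage: true }) for the full-page capture and a plain page.screenshot() for the above-the-fold capture.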

Baseline Policy

Baselines should be committed only for approved generator states:

tests/visual/baselines/<site>/<page>/<viewport>.png
tests/visual/baselines/<site>/<page>/<viewport>.json

New comparisons should write artifacts to:

tests/visual/artifacts/<run-id>/
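A minimal sketch of how those locations might be derived from a site, page path, and viewport. The slug rule (mapping / to home and joining nested segments with hyphens) is an assumption for illustration; the layout above does not say how <page> is named.

```javascript
// Hypothetical slug rule: "/" -> "home", "/blog/" -> "blog", "/a/b/" -> "a-b".
function pageSlug(path) {
  const trimmed = path.replace(/^\/+|\/+$/g, "");
  return trimmed === "" ? "home" : trimmed.replace(/\//g, "-");
}

// Baseline PNG/JSON pair for one site, page, and viewport.
function baselineFiles(site, path, viewport, root = "tests/visual/baselines") {
  const base = `${root}/${site}/${pageSlug(path)}/${viewport}`;
  return { png: `${base}.png`, json: `${base}.json` };
}

// Artifact directory for one run.
function artifactDir(runId, root = "tests/visual/artifacts") {
  return `${root}/${runId}/`;
}

console.log(baselineFiles("henry/henrysheehy.com", "/research/", "mobile").png);
// tests/visual/baselines/henry/henrysheehy.com/research/mobile.png
```

Because site identifiers such as henry/henrysheehy.com contain a slash, this sketch simply lets them become nested directories.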

The runner should fail if:

  • a page does not build
  • a page does not load
  • the page has uncaught runtime errors
  • text or controls overlap enough to create obvious layout regressions
  • screenshot difference exceeds a configured threshold
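The failure conditions above could be collected by a single check per page. This is a sketch under assumed names: the result fields (built, loaded, runtimeErrors, overlaps, diffRatio) and the maxDiffRatio option are hypothetical, and how overlap or pixel difference is measured is left to the runner.

```javascript
// Hypothetical result shape; every field name here is an assumption.
function failureReasons(result, { maxDiffRatio = 0 } = {}) {
  const reasons = [];
  if (!result.built) reasons.push("page did not build");
  if (!result.loaded) reasons.push("page did not load");
  if ((result.runtimeErrors ?? []).length > 0) reasons.push("uncaught runtime errors");
  if ((result.overlaps ?? []).length > 0) reasons.push("overlapping text or controls");
  if ((result.diffRatio ?? 0) > maxDiffRatio) reasons.push("screenshot difference above threshold");
  return reasons;
}

const reasons = failureReasons(
  { built: true, loaded: true, diffRatio: 0.02 },
  { maxDiffRatio: 0.01 }
);
console.log(reasons);
```

Returning a list of reasons rather than a boolean lets the summary report show every problem for a page at once; the run fails if any page returns a non-empty list.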

Build And Serve Model

Use isolated build directories so visual tests do not mutate site worktrees:

/tmp/quantalumin-visual/<site>/public

For each site:

  1. Build with Hugo using the candidate generator checkout.
  2. Serve the generated public directory on a unique localhost port.
  3. Capture pages with Playwright.
  4. Compare against the baseline.
  5. Write artifacts and a summary report.
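Steps 1 and 2 might be planned per site roughly as follows. This is a sketch, not the runner's actual code: buildPlan, the site.name/site.dir fields, the themesDir/theme options, and the base port 4100 are all assumptions. The Hugo flags used (--source, --destination, --themesDir, --theme, --baseURL) are standard Hugo CLI options.

```javascript
// Hypothetical planner: where to build a site and which port to serve it on.
// Assumes the candidate generator checkout is usable as a Hugo theme
// named `theme` inside `themesDir`.
function buildPlan(site, index, { themesDir, theme, basePort = 4100 }) {
  const destination = `/tmp/quantalumin-visual/${site.name}/public`;
  const port = basePort + index; // unique port per site in the matrix
  const baseURL = `http://127.0.0.1:${port}/`;
  return {
    destination,
    port,
    baseURL,
    command: "hugo",
    args: [
      "--source", site.dir,
      "--destination", destination,
      "--themesDir", themesDir,
      "--theme", theme,
      "--baseURL", baseURL,
    ],
  };
}

const plan = buildPlan(
  { name: "vicky/technoantiques", dir: "/path/to/site" },
  2,
  { themesDir: "/path/to/generator/themes", theme: "generator" }
);
console.log(plan.command, plan.args.join(" "));
```

The command and args can be passed to child_process.spawnSync(plan.command, plan.args). Writing to the /tmp destination instead of the site's own public directory is what keeps the worktree untouched.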

The runner should not require public internet access for ordinary pages. Tests that depend on third-party widgets should either stub those requests or mark the network dependency explicitly.
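One way to enforce this is a small policy function that decides, per request URL, whether to let it through or stub it. The function name, the return values, and the allow-list parameter are assumptions for illustration.

```javascript
// Hosts that count as "local" for the served site.
const LOCAL_HOSTS = new Set(["localhost", "127.0.0.1"]);

// Hypothetical policy: "continue" for local or explicitly allowed requests,
// "stub" for everything else. Expects absolute URLs.
function requestPolicy(url, allowedThirdParty = []) {
  const { protocol, hostname } = new URL(url);
  // Inline resources never touch the network.
  if (protocol === "data:" || protocol === "blob:") return "continue";
  if (LOCAL_HOSTS.has(hostname)) return "continue";
  // Explicitly marked network dependencies are allowed through.
  if (allowedThirdParty.includes(hostname)) return "continue";
  return "stub";
}

console.log(requestPolicy("http://127.0.0.1:4100/css/site.css")); // continue
console.log(requestPolicy("https://widgets.example.com/embed.js")); // stub
```

In Playwright this could be wired through page.route("**/*", handler), calling route.continue() for allowed requests and route.fulfill() or route.abort() for stubbed ones.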

Proposed Commands

Implemented smoke check:

npm run visual:smoke

This creates an isolated temporary Hugo site, builds it with a temporary copy of the generator theme, and captures a Playwright screenshot. It then mutates the temporary Hugo layout, rebuilds, captures again, and fails unless the screenshot hash changes. A passing run shows that the harness detects generator-side Hugo changes without touching real site worktrees.

Implemented real-site commands:

npm run ci:generator
npm run visual:capture
npm run visual:compare
npm run visual:update
npm run visual:approve
npm run visual:docs
npm run visual:capture:docs
npm run visual:precommit

Local Versus Forgejo CI

Keep the full real-site visual workflow local for now. The matrix in tests/visual/sites.json points at workstation/Lab paths such as /home/henry/Workspaces/...; it is designed for interactive review on the machine that has all website workspaces available.

Use this locally before committing generator changes:

npm run visual:precommit
npm run visual:capture
npm run visual:approve

Use this in Forgejo CI:

npm run ci:generator

ci:generator is intentionally portable: it syntax-checks the visual scripts, runs the isolated synthetic visual smoke check, and builds the generator docs. It does not require the real website workspaces to exist on the runner.

The real-site matrix is configured in tests/visual/sites.json. It builds each site into a temporary directory with the candidate generator checkout as the theme source, serves the generated public directory locally, captures homepage/list/single/custom pages across desktop/tablet/mobile viewports, and writes run artifacts to tests/visual/artifacts/<run-id>/.

Each capture/compare/update run writes:

tests/visual/artifacts/<run-id>/index.html
tests/visual/artifacts/<run-id>/summary.json
tests/visual/artifacts/latest.json

Open index.html to review the run. Each capture appears as a thumbnail card showing the site, page, viewport, route mode, hash, metadata links, and any runtime/network notes.

To publish the latest visual report under the generated docs output, run:

npm run visual:docs

This rebuilds public-docs/ and copies the latest artifact run (which is ignored by version control) to:

public-docs/visual/index.html
public-docs/visual/runs/<run-id>/index.html

For the full local loop in one command, run:

npm run visual:capture:docs

If the generated docs site is exposed through a protected domain, the visual report is then available as a docs subpath, for example /visual/, without a separate visual-review application.

Each path in tests/visual/sites.json can be given a mode:

{ "path": "/research/", "mode": "compare" }
{ "path": "/", "mode": "capture", "reason": "dynamic homepage" }

compare paths are eligible for committed baselines and hash comparison. capture paths still build, load, and write screenshots/metadata, but do not block visual:compare on pixel drift. Promote a capture route to compare only after repeated captures show it is stable enough or after volatile regions have been masked.

visual:update writes approved baseline PNG/JSON pairs under tests/visual/baselines/. visual:compare captures fresh screenshots and compares them against those baselines by hash, leaving actual screenshots and mismatch JSON in the artifacts directory for review. Exact pixel comparison is best used for stable routes; dynamic hero/runtime pages should first be reviewed from capture artifacts before being promoted to committed baselines.
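The gating rule can be sketched as a filter over capture results: only compare routes whose hash differs from the baseline block the run. The result fields used here (mode, baselineHash, actualHash) are assumed names, not the runner's actual schema.

```javascript
// Hypothetical gate: capture-mode routes never block, whatever their hash.
function blockingMismatches(results) {
  return results.filter(
    (r) => r.mode === "compare" && r.baselineHash !== r.actualHash
  );
}

const results = [
  { path: "/research/", mode: "compare", baselineHash: "aaa", actualHash: "aaa" },
  { path: "/publications/", mode: "compare", baselineHash: "bbb", actualHash: "ccc" },
  { path: "/", mode: "capture", baselineHash: null, actualHash: "ddd" },
];
console.log(blockingMismatches(results).map((r) => r.path)); // [ '/publications/' ]
```

The dynamic homepage is still captured and reported, but only the drifted /publications/ route would fail visual:compare.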

Approval workflow:

  1. Run npm run visual:capture and inspect the newest artifact directory.
  2. If a capture-only page is stable and worth gating, change its path mode from capture to compare.
  3. Run npm run visual:update to write/update baselines for compare routes.
  4. Commit the generator change, tests/visual/sites.json mode change if any, and the matching tests/visual/baselines/ files together.
  5. Future npm run visual:compare runs will fail on unexpected drift for those approved compare routes.

For the common “approve the latest reviewed capture” case, run:

npm run visual:approve

This reads tests/visual/artifacts/latest.json and promotes the screenshots and metadata from the latest run into tests/visual/baselines/, for compare routes only.
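As an illustration of that promotion step, the sketch below turns a parsed latest.json into a list of file copies. The structure assumed for latest.json (a captures array with site, page, viewport, mode, screenshot, and metadata fields) is hypothetical; the real file may be organised differently.

```javascript
// Hypothetical: plan the copies needed to promote the latest run.
// Only compare-mode captures are promoted; capture-mode routes are skipped.
function selectPromotions(latest, baselineRoot = "tests/visual/baselines") {
  return latest.captures
    .filter((c) => c.mode === "compare")
    .flatMap((c) => {
      const base = `${baselineRoot}/${c.site}/${c.page}/${c.viewport}`;
      return [
        { from: c.screenshot, to: `${base}.png` },
        { from: c.metadata, to: `${base}.json` },
      ];
    });
}

const latest = {
  captures: [
    { site: "henry/henrysheehy.com", page: "research", viewport: "desktop", mode: "compare",
      screenshot: "run-1/research-desktop.png", metadata: "run-1/research-desktop.json" },
    { site: "henry/henrysheehy.com", page: "home", viewport: "desktop", mode: "capture",
      screenshot: "run-1/home-desktop.png", metadata: "run-1/home-desktop.json" },
  ],
};
console.log(selectPromotions(latest).map((p) => p.to));
```

Separating the plan from the copy itself makes it easy to print or review what would be promoted before any baseline file is overwritten.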

visual:precommit is a local developer hook. It runs the deterministic Hugo-change smoke check and then captures the real-site matrix. This catches broken Hugo builds, unreachable pages, browser crashes, and obvious artifact-generation problems without making every commit depend on exact pixel hashes for dynamic pages. Do not use it as the default Forgejo CI gate until the website fixture checkout paths are made portable.

A repository pre-commit hook is available at .githooks/pre-commit; enable it with:

git config core.hooksPath .githooks

Initial implementation:

npm run visual:plan
npm run visual:capture
npm run visual:compare
npm run visual:update

Suggested script names:

scripts/visual/site-matrix.mjs
scripts/visual/build-sites.mjs
scripts/visual/capture-sites.mjs
scripts/visual/compare-screenshots.mjs

First Implementation Phase

Start with four sites and three paths each:

quantalumin/quantalumin.com: /, /about/, /contact/
henry/henrysheehy.com: /, /research/, /publications/
vicky/technoantiques: /, /about/, /contact/
tutorlumin/tutorlumin.co.uk: /, /courses/, /contact/
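For a sense of scale, the first-phase matrix can be written as data and expanded across the three viewports. The object layout and the expandMatrix helper are illustrative assumptions; the actual format of tests/visual/sites.json is not shown here.

```javascript
// First-phase sites and paths, as listed above.
const FIRST_PHASE = {
  "quantalumin/quantalumin.com": ["/", "/about/", "/contact/"],
  "henry/henrysheehy.com": ["/", "/research/", "/publications/"],
  "vicky/technoantiques": ["/", "/about/", "/contact/"],
  "tutorlumin/tutorlumin.co.uk": ["/", "/courses/", "/contact/"],
};
const VIEWPORT_NAMES = ["desktop", "tablet", "mobile"];

// Hypothetical helper: one entry per site, path, and viewport.
function expandMatrix(matrix, viewports) {
  return Object.entries(matrix).flatMap(([site, paths]) =>
    paths.flatMap((path) => viewports.map((viewport) => ({ site, path, viewport })))
  );
}

console.log(expandMatrix(FIRST_PHASE, VIEWPORT_NAMES).length); // 36
```

Four sites with three paths each at three viewports gives 36 captures per run, which is small enough to review by hand while the workflow is being established.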

Then add template families in batches. The first template batch should include:

einstein/website
mariecurie/website
eurekadynamicsgroup/website

Review Workflow

When changing generator code:

  1. Run targeted tests for the changed layer.
  2. Run visual:capture for the affected sites.
  3. Compare against baselines.
  4. Inspect artifacts for any screenshot diffs.
  5. If the change is intended, update baselines in the same commit.
  6. If the change is unintended, fix the generator before merging.

This gives us frontend regression coverage for HTML/CSS/JS without requiring someone to open many websites by hand after every shared generator change.

Published Report

When a visual capture has been produced, publish it into the generated docs output with:

npm run visual:docs

For a one-command local capture and docs publication:

npm run visual:capture:docs

The generated report is written to:

public-docs/visual/index.html

If the docs output is deployed to a public or protected docs domain, the latest visual report is served as /visual/ beneath that same site.