Reproducibility and Artifacts

How the notebooks, caches, and website outputs are built.

This project follows a deliberate execution policy so that scientific outputs on the website do not drift silently between commits.

Execution Classes

Narrative .qmd pages and lightweight notebooks are executed in CI as part of the normal test and docs-deploy cycle. Heavy notebooks are executed only via a manual refresh workflow.

Heavy Notebooks

The heavy notebooks are:

  • bayesian.ipynb — MCMC posterior and convergence diagnostics
  • sensitivity.ipynb — noise injection, sparsity sweeps, Bode’s Law bias
  • validation.ipynb — detectability, hindcast, simulation-based calibration

These notebooks may use local caches under data/cache/, which is intentionally ignored by git. Their committed outputs on the website come from the refresh workflow, not from ad-hoc local runs.
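The split between the two execution classes can be sketched as a simple lookup. This is a minimal illustration, not the project's actual CI configuration: the names HEAVY_NOTEBOOKS and execution_class are hypothetical, and only the three notebook filenames come from the list above.

```python
from pathlib import PurePosixPath

# The three heavy notebooks named above. Everything else is assumed to be
# a narrative page or a lightweight notebook.
HEAVY_NOTEBOOKS = frozenset(
    {"bayesian.ipynb", "sensitivity.ipynb", "validation.ipynb"}
)


def execution_class(path: str) -> str:
    """Return "refresh" for heavy notebooks and "ci" for everything else."""
    name = PurePosixPath(path).name
    return "refresh" if name in HEAVY_NOTEBOOKS else "ci"
```

Under this sketch, execution_class("docs/bayesian.ipynb") returns "refresh", while a narrative page such as "docs/index.qmd" returns "ci".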

Refresh Workflow

The refresh-doc-artifacts GitHub Actions workflow regenerates the heavy notebook outputs. Trigger it from the Actions UI when the scientific result changes intentionally. It:

  1. Installs the project with uv sync --frozen.
  2. Executes the three heavy notebooks into a staging directory rather than in place, so the committed copies are not overwritten during the run.
  3. Copies the fresh outputs into docs/ and runs scripts/sanitize_notebooks.py.
  4. Renders the Quarto site against the refreshed notebooks.
  5. Uploads both the refreshed notebooks and the rendered site as artifacts for review before anyone commits them.
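Steps 1 to 4 can be approximated locally. The sketch below only builds the command lines and does not run anything, and it is not the workflow's actual definition. Assumptions: the notebooks live directly under docs/, the staging directory is called staging, notebooks are executed with jupyter nbconvert, and scripts/sanitize_notebooks.py is invoked without arguments. Step 5 (artifact upload) is specific to GitHub Actions and is left out.

```python
from typing import List

HEAVY_NOTEBOOKS = ("bayesian.ipynb", "sensitivity.ipynb", "validation.ipynb")


def refresh_commands(staging: str = "staging", docs: str = "docs") -> List[List[str]]:
    """Build argv lists for steps 1-4 of the refresh, without running them."""
    commands = [["uv", "sync", "--frozen"]]  # step 1: locked install
    for name in HEAVY_NOTEBOOKS:
        # Step 2: execute into the staging directory so the source
        # notebook is not modified. (Assumed tool: jupyter nbconvert.)
        commands.append([
            "uv", "run", "--frozen", "jupyter", "nbconvert",
            "--to", "notebook", "--execute",
            "--output-dir", staging,
            f"{docs}/{name}",
        ])
    for name in HEAVY_NOTEBOOKS:
        # Step 3a: copy the fresh outputs back into docs/.
        commands.append(["cp", f"{staging}/{name}", f"{docs}/{name}"])
    # Step 3b: sanitize. The script's arguments are not documented here,
    # so calling it bare is an assumption.
    commands.append(
        ["uv", "run", "--frozen", "python", "scripts/sanitize_notebooks.py"]
    )
    commands.append(["quarto", "render", docs])  # step 4: render the site
    return commands
```

Each returned list can be passed to subprocess.run(cmd, check=True) in order; printing them first is a cheap way to review what would run.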

Commit Hygiene

Committed notebook outputs must not include local absolute paths (/Users/..., /home/runner/...), worktree paths (.worktrees/...), or known transient warnings. scripts/sanitize_notebooks.py enforces this; tests/test_notebook_docs.py asserts it on every CI run.
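The kind of check described above might look like the following. This is a hedged sketch, not the contents of scripts/sanitize_notebooks.py or tests/test_notebook_docs.py: the regular expression and the function name find_leaks are assumptions, and transient warnings are not covered because the list of known warnings is not given here.

```python
import json
import re
from typing import List, Tuple

# Patterns for the path classes named above. The exact regex is an
# assumption; the real sanitizer may match more or fewer cases.
FORBIDDEN = re.compile(r"/Users/|/home/runner/|\.worktrees/")


def find_leaks(notebook_json: str) -> List[Tuple[int, str]]:
    """Return (cell_index, offending_line) pairs found in cell outputs."""
    notebook = json.loads(notebook_json)
    leaks = []
    for index, cell in enumerate(notebook.get("cells", [])):
        for output in cell.get("outputs", []):
            # Stream outputs store text under "text", rich outputs under
            # data["text/plain"], errors under "traceback". Each may be a
            # single string or a list of strings.
            chunks = [
                output.get("text", ""),
                output.get("data", {}).get("text/plain", ""),
                output.get("traceback", []),
            ]
            for chunk in chunks:
                text = "\n".join(chunk) if isinstance(chunk, list) else chunk
                for line in text.splitlines():
                    if FORBIDDEN.search(line):
                        leaks.append((index, line))
    return leaks
```

A test in the spirit of the CI assertion would load each committed notebook and require that find_leaks returns an empty list.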

Uranus Observation Animations

scripts/render_uranus_observation_animation.py produces 2D Matplotlib animations of Earth-to-Uranus sight lines accumulating over time. The --observation-set flag picks which dataset drives the rays:

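Set                       | Span      | Rays | Use case
--------------------------|-----------|------|------------------------------------------------
historical_all            | 1690–1846 | 26   | Full Le Verrier record, including pre-discovery
historical_post_discovery | 1781–1846 | 7    | Post-discovery only (sparse)
proxy                     | 1781–1846 | 66   | Annual JPL-derived proxy (dense, default)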

Regenerate the committed GIFs from data/:

for set in historical_all historical_post_discovery proxy; do
  name=$(echo "$set" | sed 's/historical_post_discovery/historical-post/; s/historical_all/historical-all/')
  raw=docs/assets/animations/uranus-observations-${name}-raw.gif
  out=docs/assets/animations/uranus-observations-${name}.gif

  uv run --frozen python scripts/render_uranus_observation_animation.py \
    --observation-set "$set" \
    --output "$raw" \
    --frames 360 --fps 30 --intro-frames 120 --dpi 80

  # Re-encode with one shared 128-color palette; delete the raw GIF only if ffmpeg succeeds.
  ffmpeg -y -i "$raw" \
    -vf "split[s0][s1];[s0]palettegen=max_colors=128[p];[s1][p]paletteuse=dither=bayer:bayer_scale=5" \
    -loop 0 "$out" && rm "$raw"
done

The Pillow GIF writer produces large files (~5–9 MB at default fidelity) because it writes an unoptimized palette per frame. The ffmpeg palettegen post-process shrinks the committed assets to ~1 MB each with no visible quality loss. If ffmpeg isn’t available on your system, the raw Pillow GIFs are still valid output; they are just larger.
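For callers that drive the render from Python instead of the shell, the naming and palette step can be sketched as below. The filter string and the renaming are copied from the shell loop; the function names output_name and optimize_command are hypothetical, and returning None when ffmpeg is missing is one possible way to express the fallback to the raw Pillow GIF.

```python
import shutil
from typing import List, Optional

# Same filter graph as the shell loop: generate one palette of at most
# 128 colors, then apply it with ordered (Bayer) dithering.
PALETTE_FILTER = (
    "split[s0][s1];[s0]palettegen=max_colors=128[p];"
    "[s1][p]paletteuse=dither=bayer:bayer_scale=5"
)


def output_name(observation_set: str) -> str:
    """Mirror the sed renaming used in the shell loop."""
    renames = {
        "historical_all": "historical-all",
        "historical_post_discovery": "historical-post",
    }
    return renames.get(observation_set, observation_set)


def optimize_command(raw: str, out: str) -> Optional[List[str]]:
    """Return the ffmpeg argv, or None when ffmpeg is not on PATH."""
    if shutil.which("ffmpeg") is None:
        return None  # caller keeps the raw Pillow GIF as the output
    return ["ffmpeg", "-y", "-i", raw, "-vf", PALETTE_FILTER, "-loop", "0", out]
```

When optimize_command returns a list, pass it to subprocess.run(cmd, check=True) and remove the raw file afterwards; when it returns None, keep the raw file.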

The historical_* sets read data/leverrier_historical_observations.csv (see the bundled README for provenance). Pre-discovery rays (1690–1771) render in a cooler hue than post-discovery rays so that the 1781 transition is visible at a glance. In the historical_all animation, the discovery × marker pulses around 1781, mid-timeline.

Local review renders should go under the git-ignored artifacts/ directory or /tmp. docs/assets/animations/ is reserved for intentionally accepted website assets, not ad-hoc render experiments.