Tutorial: From Zero to Neptune

Rediscover Neptune from scratch using N-body simulation and optimization.
Author

Jonathan Whitmore

Published

April 3, 2026

Introduction

In 1846 Urbain Le Verrier predicted the existence and position of an unseen planet from nothing more than decades of anomalous Uranus observations. This tutorial recreates that discovery computationally. You will:

  1. Load 66 historical Uranus longitude observations (1781–1846).
  2. Build an N-body simulation of the outer solar system with REBOUND.
  3. See the ~130 arcsec residual signal that Neptune imprints on Uranus.
  4. Search for Neptune with scipy.optimize.differential_evolution.
  5. Visualize the before-and-after fit.

Everything runs offline using bundled CSV data — no network access required.

Setup

Clone the repository and install dependencies:

git clone https://github.com/jbwhit/discoverneptune.git
cd discoverneptune
uv sync

Then load the two bundled data files. The observations file contains 66 heliocentric Uranus longitudes in radians, sampled roughly once per year from Herschel’s 1781 discovery through the eve of Neptune’s identification. The state-vectors file gives the Cartesian positions and velocities of Jupiter, Saturn, and Uranus at the simulation epoch (1781-01-01).

Load bundled CSV data
from pathlib import Path

import numpy as np
import pandas as pd
from discoverneptune.data import find_bundled_data_dir

data_dir = find_bundled_data_dir()

observations = pd.read_csv(data_dir / "uranus_observations_1781_1846.csv")
outer_planets = pd.read_csv(data_dir / "outer_planet_state_vectors_1781_01_01.csv")

print(f"Observations: {len(observations)} rows")
print(f"State vectors: {len(outer_planets)} rows")
observations.head()
Observations: 66 rows
State vectors: 3 rows
datetime_jd longitude_rad
0 2371557.5 1.564843
1 2371922.5 1.641149
2 2372287.5 1.717966
3 2372652.5 1.795309
4 2373018.5 1.873409

Uranus observations

The raw longitudes trace Uranus’s slow march around the ecliptic — about one full orbit every 84 years. Plotting them shows the smooth progression that 18th- and 19th-century astronomers tracked with increasing precision.

Animated top-down view of the solar system accumulating 66 annual Earth-to-Uranus sight lines from 1781 to 1846

The 66 annual observations as Earth-to-Uranus sight lines accumulating across the 1781–1846 window — the raw geometry behind every plot that follows.
Plot raw Uranus longitudes
import matplotlib.pyplot as plt

obs_years = (observations["datetime_jd"].values - 2371557.5) / 365.25 + 1781.0
_lon_deg = np.degrees(observations["longitude_rad"].values)

def _build(theme="light"):
    apply_style(theme)
    pal = palette(theme)
    cycle = category_cycle(theme)
    fig, ax = plt.subplots(figsize=fig_size("wide"))
    ax.scatter(obs_years, _lon_deg, s=12, color=cycle[0], zorder=3)
    ax.set_xlabel("Year")
    ax.set_ylabel("Longitude (degrees)")
    ax.set_title("Uranus heliocentric longitude")
    return fig

display(dual_render(_build, alt="Heliocentric longitude of Uranus 1781–1846, one point per annual observation"))
Figure 1: Heliocentric longitude of Uranus, 1781–1846. Each point is one annual observation in degrees.

Build the forward model

The forward model places the Sun, Jupiter, Saturn, and Uranus into a REBOUND simulation using JPL state vectors at the 1781 epoch. An optional Neptune candidate — specified by mass, semi-major axis, eccentricity, and mean longitude — can be injected via orbital elements.

A few details matter:

  • Units first. sim.units = ("yr", "AU", "Msun") must be set before any particles are added; REBOUND uses the unit declaration to scale its internal constants.
  • Velocity conversion. The bundled CSV stores velocities in AU/day (the JPL Horizons default). REBOUND expects AU/yr, so every velocity component is multiplied by 365.25.
  • Centre of mass. sim.move_to_com() shifts the coordinate origin to the system barycenter, which the WHFast symplectic integrator requires for accurate long-term integration.
  • Heliocentric longitude. After integration, Uranus’s longitude is extracted as arctan2(y_uranus − y_sun, x_uranus − x_sun).
Define build_simulation()
import rebound

PLANET_MASSES = {
    "jupiter": 1 / 1047.35,
    "saturn": 1 / 3501.6,
    "uranus": 1 / 22869.0,
}
DAYS_PER_YEAR = 365.25


def build_simulation(
    neptune_params: dict[str, float] | None = None,
) -> rebound.Simulation:
    """Create a REBOUND simulation of the outer solar system.

    Args:
        neptune_params: If provided, must contain keys "mass_solar", "a",
            "e", and "l_rad" (mean longitude in radians).  Neptune is
            added via orbital elements.

    Returns:
        A configured REBOUND Simulation ready to integrate.
    """
    sim = rebound.Simulation()
    sim.units = ("yr", "AU", "Msun")

    # Sun
    sim.add(m=1.0)

    # Jupiter, Saturn, Uranus from JPL state vectors
    for row in outer_planets.itertuples(index=False):
        sim.add(
            m=PLANET_MASSES[row.planet],
            x=float(row.x),
            y=float(row.y),
            z=float(row.z),
            vx=float(row.vx) * DAYS_PER_YEAR,
            vy=float(row.vy) * DAYS_PER_YEAR,
            vz=float(row.vz) * DAYS_PER_YEAR,
            hash=row.planet.title(),
        )

    # Optional Neptune candidate
    if neptune_params is not None:
        sim.add(
            m=neptune_params["mass_solar"],
            a=neptune_params["a"],
            e=neptune_params["e"],
            l=neptune_params["l_rad"],
            hash="Neptune",
        )

    sim.move_to_com()
    sim.integrator = "whfast"
    sim.dt = 0.05
    return sim


print("build_simulation() defined.")
build_simulation() defined.

The companion function integrates the simulation forward and extracts Uranus’s heliocentric longitude at each observation epoch.

Define simulate_uranus_longitudes()
observation_times_jd = observations["datetime_jd"].values.astype(float)
observed_longitudes_rad = observations["longitude_rad"].values.astype(float)
epoch_jd = float(observation_times_jd[0])


def simulate_uranus_longitudes(
    neptune_params: dict[str, float] | None = None,
) -> np.ndarray:
    """Integrate and return Uranus heliocentric longitudes in radians.

    Args:
        neptune_params: Optional Neptune candidate parameters.

    Returns:
        Array of longitudes (radians) at each observation epoch.
    """
    sim = build_simulation(neptune_params)
    longitudes = np.empty(len(observation_times_jd))

    for i, jd in enumerate(observation_times_jd):
        t_yr = (jd - epoch_jd) / DAYS_PER_YEAR
        sim.integrate(t_yr)
        sun = sim.particles[0]
        uranus = sim.particles["Uranus"]
        longitudes[i] = np.arctan2(
            uranus.y - sun.y, uranus.x - sun.x
        )

    return longitudes


print("simulate_uranus_longitudes() defined.")
simulate_uranus_longitudes() defined.

Neptune is placed in the ecliptic plane with its pericenter along the reference direction (inclination, argument of pericenter, and longitude of ascending node all default to zero). This coplanar assumption matches Le Verrier’s approach and keeps the search at 4 free parameters.

See the anomaly

With no Neptune in the simulation, the modeled Uranus longitudes should match the observations if the four-body model (Sun + Jupiter + Saturn + Uranus) were complete. The residuals reveal that it is not.

Angle-wrapping keeps the residual in the range (−π, π) before converting to arcseconds. The signal grows to roughly 130 arcsec over 65 years — the gravitational fingerprint of the missing planet.

Compute and plot residuals without Neptune
sim_longitudes_no_neptune = simulate_uranus_longitudes(neptune_params=None)

residuals_rad = (observed_longitudes_rad - sim_longitudes_no_neptune + np.pi) % (2 * np.pi) - np.pi
residuals_arcsec = np.degrees(residuals_rad) * 3600.0

def _build(theme="light"):
    apply_style(theme)
    pal = palette(theme)
    cycle = category_cycle(theme)
    fig, ax = plt.subplots(figsize=fig_size("wide"))
    ax.scatter(obs_years, residuals_arcsec, s=12, color=cycle[1], zorder=3)
    truth_line(ax, 0, axis="y", theme=theme)
    ax.set_xlabel("Year")
    ax.set_ylabel("Residual (arcsec)")
    ax.set_title("Uranus residuals — no Neptune")
    return fig

display(dual_render(_build, alt="Uranus longitude residuals 1781–1846 with no Neptune, showing ~130 arcsec drift"))

print(f"Peak residual: {np.max(np.abs(residuals_arcsec)):.1f} arcsec")
Figure 2: Uranus residuals (observed minus modeled) with no Neptune. The ~130 arcsec drift is the signal Le Verrier used.
Peak residual: 114.0 arcsec

Manual exploration

Before unleashing an optimizer, it helps to build intuition by trying a few Neptune candidates by hand. Each candidate differs in semi-major axis and starting longitude — the two parameters the residuals are most sensitive to.

Try three Neptune candidates manually
candidates = [
    {"mass_solar": 5e-5, "a": 28.0, "e": 0.05, "l_rad": 1.0},
    {"mass_solar": 5e-5, "a": 35.0, "e": 0.05, "l_rad": 3.5},
    {"mass_solar": 5e-5, "a": 40.0, "e": 0.05, "l_rad": 5.5},
]

# Compute residuals outside builder so simulation runs only once
_candidate_residuals = []
for _params in candidates:
    _sim_lon = simulate_uranus_longitudes(neptune_params=_params)
    _res_rad = (observed_longitudes_rad - _sim_lon + np.pi) % (2 * np.pi) - np.pi
    _res_arcsec = np.degrees(_res_rad) * 3600.0
    _rms = np.sqrt(np.mean(_res_arcsec**2))
    _candidate_residuals.append((_params, _res_arcsec, _rms))

def _build(theme="light"):
    apply_style(theme)
    pal = palette(theme)
    cycle = category_cycle(theme)
    fig, axes = plt.subplots(1, 3, figsize=fig_size("double"), sharey=True)
    for ax, (params, res_arcsec, rms) in zip(axes, _candidate_residuals):
        ax.scatter(obs_years, res_arcsec, s=10, color=cycle[0], zorder=3)
        truth_line(ax, 0, axis="y", theme=theme)
        ax.set_xlabel("Year")
        ax.set_title(f"a={params['a']}, l={params['l_rad']:.1f}\nRMS={rms:.1f}″")
    axes[0].set_ylabel("Residual (arcsec)")
    return fig

display(dual_render(_build, alt="Uranus residuals for three hand-picked Neptune candidates with varying semi-major axis and longitude"))
Figure 3: Residuals for three hand-picked Neptune candidates. None is a good fit, but the residual shape responds strongly to (a, l).

The residual shape changes dramatically with different parameters, confirming that the signal carries real information about Neptune’s orbit. But finding the right combination by hand is impractical with four free parameters — time to optimize.

Optimize

We use scipy.optimize.differential_evolution, a global optimizer well suited to problems with broad, possibly multimodal search spaces. The objective function computes the sum of squared angle-wrapped residuals across all 66 observation epochs.

The bounds below are deliberately wide: mass spans an order of magnitude, semi-major axis covers 25–45 AU (well beyond the true ~30 AU), and the mean longitude is unconstrained over the full circle.

Note

Tutorial runtime configuration. The code cells below use workers=1 (single-core, for reproducible cell output across platforms) and polish=True (SciPy’s built-in L-BFGS-B polish). The production solver in src/discoverneptune/solver.py uses workers=-1 (parallel) and runs L-BFGS-B as an explicit second stage with polish=False for better control and logging. Both approaches find the same answer; the difference is presentation, not correctness.

Run differential evolution search
from scipy.optimize import differential_evolution


def objective(x: np.ndarray) -> float:
    """Sum of squared angle-wrapped residuals."""
    params = {"mass_solar": x[0], "a": x[1], "e": x[2], "l_rad": x[3]}
    sim_lon = simulate_uranus_longitudes(neptune_params=params)
    res_rad = (observed_longitudes_rad - sim_lon + np.pi) % (2 * np.pi) - np.pi
    return float(np.sum(res_rad**2))


bounds = [
    (1e-5, 1e-4),    # mass (solar masses)
    (25.0, 45.0),    # semi-major axis (AU)
    (0.0, 0.15),     # eccentricity
    (0.0, 2 * np.pi),  # mean longitude (radians)
]

result = differential_evolution(
    objective,
    bounds=bounds,
    maxiter=100,
    popsize=15,
    seed=42,
    polish=True,
    workers=1,      # readers can use -1 for parallel evaluation
    updating="deferred",
)

best = {
    "mass_solar": result.x[0],
    "a": result.x[1],
    "e": result.x[2],
    "l_rad": result.x[3],
}

print("Best-fit Neptune parameters:")
print(f"  Mass:            {best['mass_solar']:.2e} M_sun")
print(f"  Semi-major axis: {best['a']:.2f} AU")
print(f"  Eccentricity:    {best['e']:.4f}")
print(f"  Mean longitude:  {np.degrees(best['l_rad']):.1f}°")
print(f"  Optimizer cost:  {result.fun:.4e}")
Best-fit Neptune parameters:
  Mass:            6.58e-05 M_sun
  Semi-major axis: 32.74 AU
  Eccentricity:    0.1496
  Mean longitude:  215.4°
  Optimizer cost:  9.6828e-09

Visualize the result

The before-and-after comparison makes the improvement concrete. On the left, the unperturbed model drifts by over a hundred arcseconds. On the right, the optimized Neptune brings the residuals down to a few arcseconds — the model noise floor representing the gap between our simplified five-body Newtonian model and the modern JPL Horizons ephemeris.

Side-by-side residual comparison
sim_lon_best = simulate_uranus_longitudes(neptune_params=best)
res_best_rad = (observed_longitudes_rad - sim_lon_best + np.pi) % (2 * np.pi) - np.pi
res_best_arcsec = np.degrees(res_best_rad) * 3600.0

rms_before = np.sqrt(np.mean(residuals_arcsec**2))
rms_after = np.sqrt(np.mean(res_best_arcsec**2))

def _build(theme="light"):
    apply_style(theme)
    pal = palette(theme)
    cycle = category_cycle(theme)
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=fig_size("double"), sharey=True)

    ax1.scatter(obs_years, residuals_arcsec, s=12, color=cycle[1], zorder=3)
    truth_line(ax1, 0, axis="y", theme=theme)
    ax1.set_xlabel("Year")
    ax1.set_ylabel("Residual (arcsec)")
    ax1.set_title(f"No Neptune — RMS = {rms_before:.1f}″")

    ax2.scatter(obs_years, res_best_arcsec, s=12, color=cycle[2], zorder=3)
    truth_line(ax2, 0, axis="y", theme=theme)
    ax2.set_xlabel("Year")
    ax2.set_title(f"Best-fit Neptune — RMS = {rms_after:.1f}″")

    return fig

display(dual_render(_build, alt="Uranus residuals before and after adding best-fit Neptune, showing reduction from ~130 arcsec to a few arcsec"))

print(f"RMS before Neptune: {rms_before:.1f} arcsec")
print(f"RMS after Neptune:  {rms_after:.1f} arcsec")
Figure 4: Uranus residuals before (left) and after (right) adding the best-fit Neptune. The optimizer reduces the signal from ~130 arcsec to a few arcsec.
RMS before Neptune: 31.5 arcsec
RMS after Neptune:  2.5 arcsec

Starting from 66 longitude measurements and a simple N-body integrator, the optimizer recovers a Neptune-like planet that collapses the residuals by more than an order of magnitude. The remaining few-arcsecond scatter is the model noise floor — Le Verrier’s mathematical triumph, reproduced in a few seconds of computation.


Where to go next