Neural ODEs on Biological Dynamics: Part 1 — The FHN Model and Why Continuity Matters

Overview

ODEs are the natural language of biological dynamics, yet machine learning has mostly exercised Neural ODEs on toy spirals. This series points a Neural ODE at the FitzHugh-Nagumo (FHN) neuron model, the canonical 2D simplification of Hodgkin-Huxley excitability.

From Synthetic Spirals to Real Dynamical Systems

The original Neural ODE paper used 2D spirals as its demonstration: clean phase portrait, low dimension, non-trivial flow. But spirals were never where Neural ODEs would matter most. The natural application is to dynamical systems that genuinely live in continuous time, which is to say, biology.

FitzHugh-Nagumo

FitzHugh (1961) and Nagumo et al. (1962) introduced a two-variable simplification of the four-variable Hodgkin-Huxley action-potential model:

$$\frac{dv}{dt} = v - \frac{v^3}{3} - w + I, \qquad \frac{dw}{dt} = \frac{v + a - b w}{\tau}.$$

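Standard parameters: $a = 0.7$, $b = 0.8$, $\tau = 12.5$. $I$ is an external current. For $I$ above $\sim 0.34$ the system settles onto a stable limit cycle (sustained spiking); below that threshold it rests at a stable fixed point (quiescent). At much larger currents (linear stability analysis puts the upper threshold near $I \approx 1.4$) the fixed point becomes stable again, so the spiking regime is a band of currents rather than everything above the lower threshold. This is the canonical reduced model for "is the neuron firing right now?"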
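To make the two regimes concrete, here is a minimal sketch in plain Python that integrates the equations above with a hand-written fixed-step RK4 scheme. The function names (`fhn_rhs`, `rk4_step`, `simulate`, `late_amplitude`), the initial state $(0, 0)$, the step size, and the two test currents $I = 0$ and $I = 0.5$ are illustrative assumptions, not settings taken from the experiments in this series.

```python
# FitzHugh-Nagumo with the standard parameters quoted in the text.
A, B, TAU = 0.7, 0.8, 12.5


def fhn_rhs(v, w, current):
    """Right-hand side (dv/dt, dw/dt) of the FHN equations."""
    dv = v - v**3 / 3.0 - w + current
    dw = (v + A - B * w) / TAU
    return dv, dw


def rk4_step(v, w, current, dt):
    """Advance (v, w) by one classical fourth-order Runge-Kutta step."""
    k1v, k1w = fhn_rhs(v, w, current)
    k2v, k2w = fhn_rhs(v + 0.5 * dt * k1v, w + 0.5 * dt * k1w, current)
    k3v, k3w = fhn_rhs(v + 0.5 * dt * k2v, w + 0.5 * dt * k2w, current)
    k4v, k4w = fhn_rhs(v + dt * k3v, w + dt * k3w, current)
    v_next = v + dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    w_next = w + dt * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
    return v_next, w_next


def simulate(current, t_end=400.0, dt=0.05, v0=0.0, w0=0.0):
    """Return the v values sampled every dt from t = 0 to t_end."""
    n_steps = int(round(t_end / dt))
    v, w = v0, w0
    vs = [v]
    for _ in range(n_steps):
        v, w = rk4_step(v, w, current, dt)
        vs.append(v)
    return vs


def late_amplitude(vs):
    """Peak-to-peak range of v over the second half of the run."""
    tail = vs[len(vs) // 2:]
    return max(tail) - min(tail)


# Illustrative currents: one below and one above the lower threshold.
quiescent = late_amplitude(simulate(current=0.0))
spiking = late_amplitude(simulate(current=0.5))
print(f"I = 0.0: late peak-to-peak v = {quiescent:.4f}")
print(f"I = 0.5: late peak-to-peak v = {spiking:.4f}")
```

Discarding the first half of each run removes the transient, so a near-zero peak-to-peak value indicates the fixed point and a large one indicates sustained spiking.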

Why Continuous-Time Inductive Biases?

Discrete residual RNNs ($y_{t+1} = y_t + g(y_t)$) bake a fixed timestep into $g$. If we train on data sampled at $\Delta t = 0.5$ and want to predict at $\Delta t = 0.25$, the model has no way to take a half-step: every application of $g$ advances the state by one full training interval. The whole representation is bound to the sampling rate.

Neural ODEs ($dy/dt = f(y)$) have no built-in timestep. We can integrate with whatever solver and step size we want at inference, as long as $f$ approximates the true vector field. The model lives in the same continuous space as the underlying physics.
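The contrast can be sketched in a few lines of plain Python. This is a toy under stated assumptions: the true FHN right-hand side (with an illustrative $I = 0.5$) stands in for a learned $f$, and the hypothetical `residual_step` stands in for a trained residual RNN by hard-coding one Euler step of size $0.5$. No network is trained here.

```python
A, B, TAU, CURRENT = 0.7, 0.8, 12.5, 0.5
DT_TRAIN = 0.5  # sampling interval the discrete model is assumed to be trained on


def vector_field(y):
    """Stand-in for a learned f(y): here simply the true FHN right-hand side."""
    v, w = y
    return (v - v**3 / 3.0 - w + CURRENT, (v + A - B * w) / TAU)


def rk4_step(f, y, dt):
    """One classical RK4 step for dy/dt = f(y), with y a tuple of floats."""
    k1 = f(y)
    k2 = f(tuple(yi + 0.5 * dt * ki for yi, ki in zip(y, k1)))
    k3 = f(tuple(yi + 0.5 * dt * ki for yi, ki in zip(y, k2)))
    k4 = f(tuple(yi + dt * ki for yi, ki in zip(y, k3)))
    return tuple(
        yi + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    )


def integrate(f, y0, t_end, dt):
    """Integrate from t = 0 to t_end with a step size chosen by the caller."""
    y = y0
    for _ in range(int(round(t_end / dt))):
        y = rk4_step(f, y, dt)
    return y


def residual_step(y):
    """Hypothetical discrete model y_{t+1} = y_t + g(y_t), dt = 0.5 baked into g."""
    return tuple(yi + DT_TRAIN * fi for yi, fi in zip(y, vector_field(y)))


def max_abs_diff(a, b):
    return max(abs(ai - bi) for ai, bi in zip(a, b))


y0 = (0.0, 0.0)
t_end = 5.0

# Continuous model: the same f, integrated at two different step sizes.
coarse = integrate(vector_field, y0, t_end, dt=0.5)
fine = integrate(vector_field, y0, t_end, dt=0.25)

# Discrete model applied once per sample on a 0.25 grid: each call still
# advances by 0.5, so 20 calls land at t = 10 instead of t = 5.
misused = y0
for _ in range(int(round(t_end / 0.25))):
    misused = residual_step(misused)

ode_gap = max_abs_diff(coarse, fine)
rnn_gap = max_abs_diff(misused, fine)
print(f"ODE, dt 0.5 vs 0.25: max gap = {ode_gap:.5f}")
print(f"residual map on 0.25 grid vs ODE: max gap = {rnn_gap:.5f}")
```

The continuous model changes step size without any change to $f$, while the residual map would need retraining (or an explicit rescaling of $g$) to follow the finer grid.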

For biology this is decisive. Calcium imaging at 30 Hz. Patch-clamp at 10 kHz. RNA-seq at one point per hour. None of these are the natural timescale of the underlying processes; they are instrument sampling rates.

Autonomous vs Non-Autonomous Dynamics

The original Neural ODE paper used the non-autonomous form $dy/dt = f(t, y)$ for maximum flexibility. For most biological systems the true dynamics are autonomous — the RHS depends on the current state but not on absolute clock time. Including $t$ in $f$ gives the model freedom to overfit to specific time intervals.

In our experiments this matters quantitatively. Including $t$ in the Neural ODE input causes extrapolation MSE to balloon from $0.02$ to $5.0$ on the same data. For FHN and most biological models, autonomous dynamics are the correct inductive bias.
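One way to see why the autonomous form is the safer bias is time-translation invariance: an autonomous field produces the same trajectory from the same state no matter when the clock starts, whereas a field with explicit $t$ does not. The sketch below is a toy illustration, not the experiment reported above. `time_dependent_field` is a hypothetical $f(t, y)$ built by adding an arbitrary sinusoidal clock term to FHN, and the start times $0$ and $100$ and the current $I = 0.5$ are illustrative.

```python
import math

A, B, TAU, CURRENT = 0.7, 0.8, 12.5, 0.5


def autonomous_field(t, y):
    """FHN right-hand side; takes t for a uniform signature but ignores it."""
    v, w = y
    return (v - v**3 / 3.0 - w + CURRENT, (v + A - B * w) / TAU)


def time_dependent_field(t, y):
    """Hypothetical f(t, y): FHN plus an explicit clock-time term in dv/dt."""
    dv, dw = autonomous_field(t, y)
    return (dv + 0.3 * math.sin(0.1 * t), dw)


def rk4_step(f, t, y, dt):
    """One classical RK4 step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * dt, tuple(yi + 0.5 * dt * ki for yi, ki in zip(y, k1)))
    k3 = f(t + 0.5 * dt, tuple(yi + 0.5 * dt * ki for yi, ki in zip(y, k2)))
    k4 = f(t + dt, tuple(yi + dt * ki for yi, ki in zip(y, k3)))
    return tuple(
        yi + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    )


def trajectory(f, y0, t0, duration=20.0, dt=0.05):
    """States visited when integrating from (t0, y0) for a fixed duration."""
    y = y0
    states = [y]
    for i in range(int(round(duration / dt))):
        y = rk4_step(f, t0 + i * dt, y, dt)
        states.append(y)
    return states


def max_gap(traj_a, traj_b):
    """Largest componentwise difference between two equally long trajectories."""
    return max(
        abs(a - b)
        for ya, yb in zip(traj_a, traj_b)
        for a, b in zip(ya, yb)
    )


y0 = (0.0, 0.0)

# Same initial state, clock started at t = 0 versus t = 100.
gap_autonomous = max_gap(
    trajectory(autonomous_field, y0, 0.0),
    trajectory(autonomous_field, y0, 100.0),
)
gap_time_dependent = max_gap(
    trajectory(time_dependent_field, y0, 0.0),
    trajectory(time_dependent_field, y0, 100.0),
)
print(f"autonomous field, shifted start: max gap = {gap_autonomous}")
print(f"time-dependent field, shifted start: max gap = {gap_time_dependent:.4f}")
```

A learned $f(t, y)$ only ever sees $t$ values inside the training window, so any dependence on $t$ it picks up is unconstrained once integration runs past that window.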

What We Aim to Demonstrate

Trajectory fit. The Neural ODE can fit FHN spike trajectories with low MSE.
Extrapolation. The same Neural ODE predicts trajectories 4× longer than the training horizon without re-training.
Phase-portrait recovery. The learned vector field $f$ matches the ground-truth FHN flow with high cosine similarity — evidence the model learned the dynamics, not just the time series.
Comparison with a discrete RNN. A parameter-matched residual RNN fits the time series equally well but its learned vector field is not a true continuous flow — it lives in step-space.
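The phase-portrait criterion above can be sketched as a mean cosine similarity between two vector fields evaluated on a grid of states. In this minimal plain-Python version, `stand_in_learned_field` is a hypothetical placeholder for a trained network (a mildly distorted copy of the true field), and the grid bounds, resolution, and current $I = 0.5$ are illustrative assumptions.

```python
import math

A, B, TAU, CURRENT = 0.7, 0.8, 12.5, 0.5


def true_field(v, w):
    """Ground-truth FHN vector field (dv/dt, dw/dt)."""
    return (v - v**3 / 3.0 - w + CURRENT, (v + A - B * w) / TAU)


def stand_in_learned_field(v, w):
    """Hypothetical placeholder for a trained model: a distorted true field."""
    dv, dw = true_field(v, w)
    return (1.1 * dv, 0.9 * dw)


def mean_cosine_similarity(field_a, field_b, n=21,
                           v_range=(-2.5, 2.5), w_range=(-1.0, 2.0)):
    """Average cosine similarity of two 2D fields over an n-by-n state grid."""
    total, count = 0.0, 0
    for i in range(n):
        v = v_range[0] + (v_range[1] - v_range[0]) * i / (n - 1)
        for j in range(n):
            w = w_range[0] + (w_range[1] - w_range[0]) * j / (n - 1)
            a = field_a(v, w)
            b = field_b(v, w)
            norm = math.hypot(a[0], a[1]) * math.hypot(b[0], b[1])
            if norm < 1e-12:
                continue  # direction is undefined for (near-)zero vectors
            total += (a[0] * b[0] + a[1] * b[1]) / norm
            count += 1
    return total / count


self_similarity = mean_cosine_similarity(true_field, true_field)
similarity = mean_cosine_similarity(true_field, stand_in_learned_field)
print(f"true vs true: {self_similarity:.6f}")
print(f"true vs stand-in: {similarity:.6f}")
```

Cosine similarity compares direction only, so a field that is uniformly too fast or too slow would still score 1; pairing it with a magnitude error (for example, MSE between the two fields on the same grid) covers that gap.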