ODEs are the native language of biological dynamics, yet machine learning has mostly demonstrated Neural ODEs on toy spirals. This series points a Neural ODE at the FitzHugh-Nagumo neuron model, the canonical 2D simplification of Hodgkin-Huxley excitability.
From Synthetic Spirals to Real Dynamical Systems
The original Neural ODE paper used 2D spirals as the demonstration — clean phase portrait, low dimension, non-trivial flow. But spirals are not where Neural ODEs were going to matter. The natural application is to dynamical systems that genuinely live in continuous time — which is to say, biology.
FitzHugh-Nagumo
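FitzHugh (1961) and Nagumo et al. (1962) introduced a two-variable simplification of the four-variable Hodgkin-Huxley action-potential model. In its standard form, with $v$ the fast membrane-potential variable and $w$ the slow recovery variable:

$$\frac{dv}{dt} = v - \frac{v^3}{3} - w + I, \qquad \frac{dw}{dt} = \frac{v + a - b\,w}{\tau}$$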
Standard parameters: $a = 0.7$, $b = 0.8$, $\tau = 12.5$. $I$ is an external current. For $I$ above $\sim 0.34$ (and below a second threshold near $1.4$), the system has a stable limit cycle (sustained spiking). Below $\sim 0.34$, it settles to a stable fixed point (quiescent). This is the canonical reduced model for "is the neuron firing right now?"
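To make the two regimes concrete, here is a minimal simulation sketch using only the Python standard library. It assumes the common form $dv/dt = v - v^3/3 - w + I$, $dw/dt = (v + a - b w)/\tau$; the helper names (`fhn_rhs`, `rk4_step`, `simulate`, `late_amplitude`), the initial state, the step size, and the test currents are illustrative choices, not taken from the experiments in this series.

```python
# Minimal FitzHugh-Nagumo simulation (standard-library Python only).
# Assumed form: dv/dt = v - v^3/3 - w + I,  dw/dt = (v + a - b*w) / tau.

A, B, TAU = 0.7, 0.8, 12.5  # standard parameters quoted in the text


def fhn_rhs(state, current):
    """Right-hand side of the (assumed) FHN equations."""
    v, w = state
    dv = v - v**3 / 3.0 - w + current
    dw = (v + A - B * w) / TAU
    return (dv, dw)


def rk4_step(state, current, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(slope, scale):
        return tuple(s + scale * k for s, k in zip(state, slope))

    k1 = fhn_rhs(state, current)
    k2 = fhn_rhs(shifted(k1, dt / 2), current)
    k3 = fhn_rhs(shifted(k2, dt / 2), current)
    k4 = fhn_rhs(shifted(k3, dt), current)
    return tuple(
        s + dt / 6.0 * (p + 2 * q + 2 * r + u)
        for s, p, q, r, u in zip(state, k1, k2, k3, k4)
    )


def simulate(current, t_end=400.0, dt=0.05, state=(0.0, 0.0)):
    """Integrate from `state` and return the list of v values."""
    vs = [state[0]]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, current, dt)
        vs.append(state[0])
    return vs


def late_amplitude(vs):
    """Peak-to-peak amplitude of v over the second half of the run."""
    tail = vs[len(vs) // 2:]
    return max(tail) - min(tail)


if __name__ == "__main__":
    for current in (0.0, 0.5):  # one quiescent and one spiking example
        print(f"I = {current}: late amplitude = {late_amplitude(simulate(current)):.4f}")
```

With $I = 0$ the late-time amplitude should collapse toward zero (the trajectory has settled on the fixed point), while with $I = 0.5$ it should stay large, reflecting sustained spiking.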
Why Continuous-Time Inductive Biases?
Discrete RNNs ($y_{t+1} = y_t + g(y_t)$) effectively assume a fixed timestep baked into $g$. If we train on data sampled at $\Delta t = 0.5$ and want to predict at $\Delta t = 0.25$, the model has no way to take a half-step. The whole representation is bound to the sampling rate.
Neural ODEs ($dy/dt = f(y)$) have no notion of timestep. We can integrate with whatever solver and step size we want at inference, as long as $f$ approximates the true vector field. The model lives in the same continuous space the underlying physics lives in.
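The contrast between the two model classes can be sketched without training anything. In the snippet below, the FHN right-hand side stands in for a perfectly trained $f$, and `g` is a hypothetical residual update built by taking one solver step at the training interval $\Delta t = 0.5$, which is the best a discrete model could do. All names and the choice $I = 0.5$ are assumptions made for illustration.

```python
# Step-size independence of a vector field vs. a residual update with a
# baked-in step. FHN is used as a stand-in for trained models (assumption).

A, B, TAU, I_EXT = 0.7, 0.8, 12.5, 0.5
TRAIN_DT = 0.5  # sampling interval the discrete model was "trained" at


def f(y):
    """Continuous-time vector field dy/dt = f(y)."""
    v, w = y
    return (v - v**3 / 3.0 - w + I_EXT, (v + A - B * w) / TAU)


def rk4_step(y, dt):
    """One classical RK4 step of size dt."""
    def shifted(slope, scale):
        return tuple(yi + scale * si for yi, si in zip(y, slope))

    k1 = f(y)
    k2 = f(shifted(k1, dt / 2))
    k3 = f(shifted(k2, dt / 2))
    k4 = f(shifted(k3, dt))
    return tuple(yi + dt / 6.0 * (p + 2 * q + 2 * r + s)
                 for yi, p, q, r, s in zip(y, k1, k2, k3, k4))


def integrate(y0, t_end, dt):
    """Integrate the vector field to t_end with any step size dt."""
    y = y0
    for _ in range(int(round(t_end / dt))):
        y = rk4_step(y, dt)
    return y


def g(y):
    """Residual update y_{t+1} = y_t + g(y_t), valid only for TRAIN_DT."""
    nxt = rk4_step(y, TRAIN_DT)
    return tuple(n - yi for n, yi in zip(nxt, y))


def rollout_discrete(y0, n_steps):
    """Apply the residual update n_steps times."""
    y = y0
    for _ in range(n_steps):
        y = tuple(yi + gi for yi, gi in zip(y, g(y)))
    return y


y0 = (0.0, 0.0)
coarse = integrate(y0, 1.0, 0.5)      # two steps of 0.5 -> state at t = 1
fine = integrate(y0, 1.0, 0.25)       # four steps of 0.25 -> state at t = 1
discrete = rollout_discrete(y0, 4)    # four samples, but each advances 0.5
```

Here `coarse` and `fine` agree up to solver error because both integrate the same field to $t = 1$. Applying `g` once per sample on a $\Delta t = 0.25$ grid lands at $t = 2$ instead, since every application of `g` advances the state by the training interval regardless of what the caller intended.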
For biology this is decisive. Calcium imaging at 30 Hz. Patch-clamp at 10 kHz. RNA-seq at one point per hour. None of these are the natural timescale of the underlying processes; they are instrument sampling rates.
Autonomous vs Non-Autonomous Dynamics
The original Neural ODE paper used the non-autonomous form $dy/dt = f(t, y)$ for maximum flexibility. For most biological systems the true dynamics are autonomous — the RHS depends on the current state but not on absolute clock time. Including $t$ in $f$ gives the model freedom to overfit to specific time intervals.
In our experiments this matters quantitatively. Including $t$ in the Neural ODE input causes extrapolation MSE to balloon from $0.02$ to $5.0$ on the same data. For FHN and most biological models, autonomous dynamics are the correct inductive bias.
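One way to see what the autonomous restriction buys is a time-translation check: start two rollouts from the same state at different clock times and compare them. The sketch below is a toy illustration, not the experiment reported above. `f_nonautonomous` adds a hypothetical clock-dependent drift (`0.002 * t`) to the FHN field as a stand-in for a network that has picked up spurious dependence on $t$; the drift size, shift, and duration are arbitrary assumptions.

```python
# Time-translation check: an autonomous field gives the same rollout no matter
# when it starts; a clock-dependent field does not. Toy stand-ins only.

A, B, TAU, I_EXT = 0.7, 0.8, 12.5, 0.5


def f_autonomous(t, y):
    """dy/dt = f(y): the clock argument is accepted but ignored."""
    v, w = y
    return (v - v**3 / 3.0 - w + I_EXT, (v + A - B * w) / TAU)


def f_nonautonomous(t, y):
    """dy/dt = f(t, y): same field plus a hypothetical clock-dependent drift."""
    dv, dw = f_autonomous(t, y)
    return (dv + 0.002 * t, dw)


def trajectory(field, y0, t0, duration, dt=0.05):
    """RK4 rollout starting at clock time t0; returns the list of v values."""
    def shifted(y, slope, scale):
        return tuple(yi + scale * si for yi, si in zip(y, slope))

    y, vs = y0, [y0[0]]
    for step in range(int(round(duration / dt))):
        t = t0 + step * dt
        k1 = field(t, y)
        k2 = field(t + dt / 2, shifted(y, k1, dt / 2))
        k3 = field(t + dt / 2, shifted(y, k2, dt / 2))
        k4 = field(t + dt, shifted(y, k3, dt))
        y = tuple(yi + dt / 6.0 * (p + 2 * q + 2 * r + s)
                  for yi, p, q, r, s in zip(y, k1, k2, k3, k4))
        vs.append(y[0])
    return vs


def translation_gap(field, y0=(0.0, 0.0), shift=100.0, duration=50.0):
    """Max difference in v between rollouts that differ only in start time."""
    base = trajectory(field, y0, 0.0, duration)
    moved = trajectory(field, y0, shift, duration)
    return max(abs(p - q) for p, q in zip(base, moved))


if __name__ == "__main__":
    print("autonomous gap:    ", translation_gap(f_autonomous))
    print("non-autonomous gap:", translation_gap(f_nonautonomous))
```

The autonomous field has a gap of exactly zero by construction, while any dependence on $t$ makes predictions change with the absolute start time, which is the freedom that can hurt once the model is asked to run past the training window.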
What We Aim to Demonstrate
- Trajectory fit. The Neural ODE can fit FHN spike trajectories with low MSE.
- Extrapolation. The same Neural ODE predicts trajectories 4× longer than the training horizon without re-training.
- Phase-portrait recovery. The learned vector field $f$ matches the ground-truth FHN flow with high cosine similarity — evidence the model learned the dynamics, not just the time series.
- Comparison with a discrete RNN. A parameter-matched residual RNN fits the time series equally well, but its learned update $g$ is a per-step map tied to the training sampling interval, not a continuous-time vector field.
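The phase-portrait goal in the list above needs a concrete metric. One plausible sketch is to average the cosine similarity between the learned and true vector fields over a grid in the $(v, w)$ plane. The grid bounds, resolution, and the `rotated_field` stand-in (the true field rotated by a fixed angle, used here only because no trained model is available) are assumptions for illustration.

```python
import math

A, B, TAU, I_EXT = 0.7, 0.8, 12.5, 0.5


def true_field(v, w):
    """Ground-truth FHN vector field (assumed standard form)."""
    return (v - v**3 / 3.0 - w + I_EXT, (v + A - B * w) / TAU)


def rotated_field(v, w, angle=0.1):
    """Hypothetical stand-in for a learned f: true field rotated by `angle` rad."""
    dv, dw = true_field(v, w)
    c, s = math.cos(angle), math.sin(angle)
    return (c * dv - s * dw, s * dv + c * dw)


def make_grid(v_range=(-2.5, 2.5), w_range=(-1.0, 2.0), n=21):
    """Regular n-by-n grid of (v, w) points covering the phase plane."""
    vs = [v_range[0] + i * (v_range[1] - v_range[0]) / (n - 1) for i in range(n)]
    ws = [w_range[0] + j * (w_range[1] - w_range[0]) / (n - 1) for j in range(n)]
    return [(v, w) for v in vs for w in ws]


def mean_cosine_similarity(field_a, field_b, grid, eps=1e-12):
    """Average cosine of the angle between two fields over the grid points."""
    total, count = 0.0, 0
    for v, w in grid:
        ax, ay = field_a(v, w)
        bx, by = field_b(v, w)
        norm = math.hypot(ax, ay) * math.hypot(bx, by)
        if norm < eps:  # direction is undefined where either field vanishes
            continue
        total += (ax * bx + ay * by) / norm
        count += 1
    return total / count


if __name__ == "__main__":
    grid = make_grid()
    print(mean_cosine_similarity(true_field, true_field, grid))
    print(mean_cosine_similarity(true_field, rotated_field, grid))
```

Cosine similarity compares direction only, so it is insensitive to errors in the magnitude of $f$; pairing it with a trajectory-level metric such as extrapolation MSE covers that gap.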