
Deconstructing Liquid Neural Networks (LNNs)

Part 1: Neural ODEs and Liquid Time-Constants

Introduction

RNNs and LSTMs process sequences step-by-step, updating a hidden state to carry forward memory. They work well enough when data arrives at regular intervals and the environment is stable.

But they operate in discrete time with fixed weight matrices. The real world is neither discrete nor fixed. What if a network's hidden state evolved according to continuous differential equations that adapt their own structure based on the incoming data?

That is the core idea behind Liquid Neural Networks (LNNs). This series deconstructs Liquid Time-Constant (LTC) networks: the math (Part 1), a PyTorch implementation (Part 2), and a head-to-head benchmark against an LSTM (Part 3).

The Problem with Discrete Time

An LSTM applies the transition $h_t = f(h_{t-1}, x_t)$ exactly once per input, regardless of whether the gap between steps was 1 millisecond or 1 hour. If a sensor drops a packet, the network has no mechanism to account for that temporal gap. It just applies the same update rule as if nothing happened.
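To make the problem concrete, here is a minimal sketch of that blindness to time, using a hypothetical single-layer tanh RNN cell (the weights and timestamps are illustrative, not from any trained model):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3)) * 0.1   # hidden-to-hidden weights (illustrative)
U = rng.normal(size=(3, 2)) * 0.1   # input-to-hidden weights (illustrative)

def rnn_step(h, x):
    """Discrete transition h_t = f(h_{t-1}, x_t): a toy tanh RNN cell."""
    return np.tanh(W @ h + U @ x)

h = np.zeros(3)
# The timestamps are wildly irregular, but the cell never sees them:
timestamps = [0.001, 0.002, 3600.0]   # 1 ms gap, then a one-hour gap
for t, x in zip(timestamps, rng.normal(size=(3, 2))):
    h = rnn_step(h, x)   # identical update rule regardless of the gap
```

The loop applies the same transition whether one millisecond or one hour elapsed, which is exactly the failure mode described above.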

On top of that, the weight matrices are static after training. The same set of matrices handles every input, whether the environment is calm or chaotic. In domains like autonomous driving or high-frequency trading, the network needs to shift how it processes information depending on what it is seeing right now.

Neural Ordinary Differential Equations (Neural ODEs)

The first step is replacing discrete transitions entirely. Instead of defining the hidden state at step $t+1$, we define its derivative--the continuous rate of change.

In a Neural ODE, the hidden state $x(t)$ evolves according to a neural network $f$:

$$ \frac{dx(t)}{dt} = f(x(t), I(t), \theta) $$

$I(t)$ is the input stream, $\theta$ are learned weights. To get the state at any time $T$, we integrate from $t=0$ to $t=T$ with a standard ODE solver (Euler, Runge-Kutta, etc.). This handles irregular sampling directly--we can evaluate the network at any point in continuous time.
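A minimal forward-Euler sketch shows how a Neural ODE absorbs irregular sampling: the step size is simply the (variable) gap between timestamps. The dynamics function here is a hypothetical single tanh layer, not the architecture from any specific paper:

```python
import numpy as np

def f(x, I, theta):
    """Hypothetical dynamics network: a single tanh layer.
    theta = (W, U, b) plays the role of the learned weights."""
    W, U, b = theta
    return np.tanh(W @ x + U @ I + b)

def integrate_neural_ode(x0, inputs, times, theta):
    """Forward Euler from times[0] to times[-1].
    times[k] is the (possibly irregular) timestamp of inputs[k]."""
    x = x0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]          # the gap can vary per step
        x = x + dt * f(x, inputs[k], theta)   # Euler step: x += dt * dx/dt
    return x

# Irregular sampling: gaps of 0.1, then 1.0, then 0.05 time units.
rng = np.random.default_rng(0)
theta = (rng.normal(size=(4, 4)) * 0.1,
         rng.normal(size=(4, 2)) * 0.1,
         np.zeros(4))
times = np.array([0.0, 0.1, 1.1, 1.15])
inputs = rng.normal(size=(4, 2))
x_T = integrate_neural_ode(np.zeros(4), inputs, times, theta)
```

In practice one would use an adaptive solver (e.g. Runge-Kutta with step-size control) rather than fixed Euler steps, but the principle is the same: the temporal gap enters the update explicitly through $dt$.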

The Liquid Time-Constant (LTC) Architecture

Neural ODEs still use static weights $\theta$ to govern the derivative. Hasani et al. (2021) went further, drawing on the neural wiring of C. elegans to design the LTC model.

In an LTC, the rate of change of a neuron's state is determined by a leak term and synaptic input:

$$ \frac{dx(t)}{dt} = - \left[ \frac{1}{\tau} + f(x(t), I(t), \theta) \right] x(t) + A \cdot f(x(t), I(t), \theta) $$

Rearranging, the bracketed term acts as an effective decay rate $1/\tau_{sys} = 1/\tau + f(x(t), I(t), \theta)$, which we can invert to isolate a dynamic, state-dependent time constant $\tau_{sys}$:

$$ \tau_{sys} = \frac{\tau}{1 + \tau \cdot f(x(t), I(t), \theta)} $$

The effective speed and memory of the neuron, set by $\tau_{sys}$, now depend on both the input $I(t)$ and the current state $x(t)$. $\tau_{sys}$ is not a fixed hyperparameter--it is a learned, input-driven quantity.
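Plugging a few numbers into the formula makes the behavior tangible. Assuming $f$ produces a non-negative synaptic activation (the values below are illustrative):

```python
def tau_sys(tau, f_val):
    """Effective time constant: tau_sys = tau / (1 + tau * f)."""
    return tau / (1.0 + tau * f_val)

# With tau = 1.0: low synaptic activity -> slow dynamics (long memory),
# high activity -> fast dynamics (short memory).
print(tau_sys(1.0, 0.0))   # 1.0  (quiet input: full base time constant)
print(tau_sys(1.0, 9.0))   # 0.1  (strong input: ten times faster)
```

The same base $\tau$ yields a tenfold range of effective time constants depending on activity, which is the mechanism the next section exploits.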

Why "Liquid"?

The differential equation governing the system reshapes itself in response to the data. On smooth, slow-moving inputs, $\tau_{sys}$ stays high and the network integrates information over longer windows. On a sudden spike or discontinuity, $f$ jumps, $\tau_{sys}$ drops, and the network flushes old state to react immediately.
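This adaptive behavior can be seen in a toy simulation of a single LTC neuron. The sigmoid synapse, weights, and input trace below are all illustrative choices, not trained values:

```python
import numpy as np

def ltc_step(x, I, dt, tau=1.0, A=1.0, w=4.0):
    """One forward-Euler step of a single LTC neuron.
    For clarity, f is a hypothetical sigmoid synapse of the input alone."""
    f_val = 1.0 / (1.0 + np.exp(-w * I))
    dxdt = -(1.0 / tau + f_val) * x + A * f_val
    return x + dt * dxdt

x = 0.0
# Calm phase (I = 0) for 50 steps, then a sudden spike (I = 3):
for step in range(100):
    I = 0.0 if step < 50 else 3.0
    x = ltc_step(x, I, dt=0.05)
    # During the calm phase the effective decay rate is 1/tau + 0.5 = 1.5;
    # after the spike the synapse saturates and the rate jumps to ~2.0,
    # so the neuron converges toward its new equilibrium faster.
```

When the input spikes, $f$ saturates, the effective decay rate $1/\tau + f$ increases, and the state is pulled toward its new equilibrium more quickly, exactly the "flush and react" behavior described above.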

Because the dynamics live in the ODE structure rather than in large weight matrices, LTCs can match the performance of much larger models on control tasks with orders of magnitude fewer parameters.