Model Architecture

HYDRA is a hybrid GRU-Transformer designed to correct National Water Model (NWM) errors in real time. Here's how the four-stage pipeline works.

Model Pipeline

Stage 1

NWM Forecast

Hourly short-range discharge predictions

Stage 2

Residual Calculator

Compute model error against observations

Stage 3

HYDRA Transformer

Temporal attention with hydrologic constraints

Stage 4

Corrected Streamflow

Bias-corrected hydrograph at target sites

Legend: Inputs and intermediate residuals are transformed into a site-specific corrected discharge signal while preserving physical plausibility.
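
The pipeline stages above can be sketched in a few lines of Python. This is a minimal illustration, not HYDRA's implementation: it assumes the residual is defined as observation minus NWM forecast, that the correction is additive, and that clamping at zero is an acceptable stand-in for "physical plausibility". All function names are hypothetical, and `mean_residual_model` is a deliberately trivial placeholder for the Stage 3 network.

```python
from typing import Callable, List, Sequence


def compute_residuals(nwm: Sequence[float], observed: Sequence[float]) -> List[float]:
    """Stage 2: per-timestep model error (assumed sign: observed - forecast)."""
    if len(nwm) != len(observed):
        raise ValueError("forecast and observation series must have equal length")
    return [obs - fc for fc, obs in zip(nwm, observed)]


def mean_residual_model(past_residuals: Sequence[float]) -> Callable[[int], List[float]]:
    """Hypothetical stand-in for Stage 3: always predicts the mean past residual."""
    mean = sum(past_residuals) / len(past_residuals)
    return lambda horizon: [mean] * horizon


def correct_forecast(nwm: Sequence[float], predicted_residuals: Sequence[float]) -> List[float]:
    """Stage 4: add the predicted residual and clamp so discharge is never negative."""
    return [max(0.0, fc + r) for fc, r in zip(nwm, predicted_residuals)]


# Stage 1 is represented by the forecast series themselves (made-up values).
past_nwm = [10.0, 12.0, 15.0]
past_obs = [8.0, 11.0, 12.0]
residuals = compute_residuals(past_nwm, past_obs)

new_nwm = [14.0, 1.0]
predict = mean_residual_model(residuals)
corrected = correct_forecast(new_nwm, predict(len(new_nwm)))
```

With these toy numbers the mean residual is -2.0, so the second corrected value would go negative without the clamp.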

Total Parameters

998,648

~1 million trainable parameters

Model Size

3.81 MB

Float32 precision weights

Input Features

NWM forecasts + ERA5 meteorological variables

Sequence Length

168

7 days of hourly timesteps
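
The size and sequence-length figures above follow from simple arithmetic, shown here as a quick check. The only assumption is that the reported megabyte figure uses binary megabytes (2^20 bytes); with decimal megabytes the same weights would come to roughly 3.99 MB.

```python
TOTAL_PARAMS = 998_648
BYTES_PER_FLOAT32 = 4  # float32 stores each weight in 4 bytes

size_bytes = TOTAL_PARAMS * BYTES_PER_FLOAT32
size_mib = size_bytes / 2**20   # binary megabytes (assumed convention)
size_mb = size_bytes / 10**6    # decimal megabytes, for comparison

HOURS_PER_DAY = 24
SEQ_LEN = 7 * HOURS_PER_DAY     # 7 days of hourly timesteps

print(f"{size_bytes:,} bytes = {size_mib:.2f} MiB ({size_mb:.2f} MB decimal)")
print(f"sequence length = {SEQ_LEN} timesteps")
```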

Parameter Distribution

The model's 998,648 parameters are distributed across specialized components, with the transformer encoder accounting for just over half (53.1%) of the total.

Transformer Encoder (4 layers)

529,920 (53.1%)

Multi-Scale Temporal Convolutions

255,363 (25.6%)

Fusion Network

82,048 (8.2%)

Attention Pooling

66,048 (6.6%)

GRU Encoder

56,064 (5.6%)

Regime-Conditioned Bias

4,161 (0.4%)

Feature Importance Gate

4,240 (0.4%)

Prediction Heads & Other

804 (0.1%)
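
As a consistency check, the component counts listed above can be summed and converted to percentage shares. The snippet below uses only the numbers given in this section; the dictionary layout is just a convenient way to hold them.

```python
# Parameter counts per component, copied from the breakdown above.
components = {
    "Transformer Encoder (4 layers)": 529_920,
    "Multi-Scale Temporal Convolutions": 255_363,
    "Fusion Network": 82_048,
    "Attention Pooling": 66_048,
    "GRU Encoder": 56_064,
    "Regime-Conditioned Bias": 4_161,
    "Feature Importance Gate": 4_240,
    "Prediction Heads & Other": 804,
}

total = sum(components.values())

# Share of the total for each component, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in components.items()}

for name, count in components.items():
    print(f"{name:<36} {count:>8,}  {shares[name]:>5.1f}%")
print(f"{'Total':<36} {total:>8,}")
```

The counts add up to the stated 998,648, and the rounded shares match the listed percentages.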

Configuration

Hidden Dimension: d_model = 128
Attention Heads: num_heads = 4
Transformer Layers: num_layers = 4
Dropout Rate: dropout = 0.1
Multi-Scale Branches: 3 (short, mid, long)
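
The configuration above can be captured in a small Python dataclass, and it is enough to roughly reconstruct the transformer encoder's parameter count. This is a sketch under stated assumptions: `HydraConfig` and `encoder_layer_params` are hypothetical names, each layer is assumed to be a standard encoder layer (multi-head self-attention, a two-layer feed-forward block, two layer norms, all with biases), and `dim_feedforward = 256` is not given in the source. It is chosen here only because it reproduces the listed 529,920 parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HydraConfig:
    d_model: int = 128
    num_heads: int = 4
    num_layers: int = 4
    dropout: float = 0.1
    num_branches: int = 3        # short, mid, long
    dim_feedforward: int = 256   # ASSUMPTION: not stated in the source

    @property
    def head_dim(self) -> int:
        """Width of each attention head; d_model must split evenly across heads."""
        if self.d_model % self.num_heads != 0:
            raise ValueError("d_model must be divisible by num_heads")
        return self.d_model // self.num_heads


def encoder_layer_params(cfg: HydraConfig) -> int:
    """Parameter count of one standard transformer encoder layer (assumed layout)."""
    d, f = cfg.d_model, cfg.dim_feedforward
    attention = 4 * (d * d + d)             # Q, K, V and output projections, with biases
    feed_forward = (d * f + f) + (f * d + d)
    layer_norms = 2 * (2 * d)               # scale and shift for each of two norms
    return attention + feed_forward + layer_norms


cfg = HydraConfig()
per_layer = encoder_layer_params(cfg)
encoder_total = per_layer * cfg.num_layers
print(f"head_dim={cfg.head_dim}, per layer={per_layer:,}, encoder={encoder_total:,}")
```

Under these assumptions each head is 32 wide and each layer holds 132,480 parameters, so four layers give 529,920.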

Key Features

→ Feature importance gating for precipitation emphasis
→ Multi-scale temporal convolutions (1-7 day ranges)
→ Regime-conditioned bias correction
→ Attention pooling with learnable query token
→ Hybrid GRU-Transformer architecture
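
To illustrate one of the features listed above, here is a simplified sketch of attention pooling with a learnable query token. It is a single-head version in plain Python with no key, value, or output projections, so it shows the idea rather than HYDRA's actual layer: a query vector scores every timestep, the scores are normalized with a softmax, and the sequence is collapsed into one weighted-average vector. In a real model the query would be a trained parameter; here it is passed in as an ordinary list.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden_states: Sequence[Sequence[float]], query: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Collapse a (timesteps x d) sequence into one d-vector using a query token.

    Returns the pooled vector and the attention weight given to each timestep.
    """
    d = len(query)
    # Scaled dot-product score between the query and each timestep.
    scores = [
        sum(q * h for q, h in zip(query, state)) / math.sqrt(d)
        for state in hidden_states
    ]
    weights = softmax(scores)
    # Weighted average of the hidden states, one output value per feature.
    pooled = [
        sum(w * state[j] for w, state in zip(weights, hidden_states))
        for j in range(d)
    ]
    return pooled, weights


# Toy sequence with two timesteps and two features (made-up values).
states = [[1.0, 0.0], [3.0, 2.0]]
uniform_pooled, uniform_weights = attention_pool(states, query=[0.0, 0.0])
focused_pooled, focused_weights = attention_pool(states, query=[5.0, 5.0])
```

A zero query weights every timestep equally, which reduces pooling to a plain mean; a query aligned with the second timestep shifts most of the weight onto it.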
