Model Architecture
HYDRA is a hybrid GRU-Transformer designed to correct National Water Model (NWM) errors in real time. Here's how the four-stage pipeline works.
Model Pipeline
- Stage 1: NWM Forecast (hourly short-range discharge predictions)
- Stage 2: Residual Calculator (compute model error against observations)
- Stage 3: HYDRA Transformer (temporal attention with hydrologic constraints)
- Stage 4: Corrected Streamflow (bias-corrected hydrograph at target sites)
Legend: Inputs and intermediate residuals are transformed into a site-specific corrected discharge signal while preserving physical plausibility.
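The data flow through the residual and correction stages can be sketched in a few lines. This is a minimal illustration, not HYDRA's implementation: the sign convention (observation minus forecast), the function names, and the non-negativity floor used as a stand-in for "physical plausibility" are all assumptions made for this example.

```python
from typing import List, Sequence


def compute_residuals(nwm: Sequence[float], obs: Sequence[float]) -> List[float]:
    """Stage 2 (assumed convention): residual = observed - forecast discharge."""
    if len(nwm) != len(obs):
        raise ValueError("forecast and observation series must have equal length")
    return [o - f for f, o in zip(nwm, obs)]


def apply_correction(
    nwm: Sequence[float],
    predicted_residual: Sequence[float],
    floor: float = 0.0,  # assumption: discharge cannot be negative
) -> List[float]:
    """Stage 4 (assumed): add the predicted residual back to the NWM forecast."""
    if len(nwm) != len(predicted_residual):
        raise ValueError("forecast and residual series must have equal length")
    return [max(floor, f + r) for f, r in zip(nwm, predicted_residual)]


# Hypothetical discharge values, for illustration only.
nwm_forecast = [10.0, 12.0, 2.0]
observed = [11.0, 9.0, 2.5]

history_residuals = compute_residuals(nwm_forecast, observed)
corrected = apply_correction(nwm_forecast, [1.5, -1.0, -5.0])
```

In the real pipeline, Stage 3 would supply the predicted residuals; here they are hard-coded so that the example stays self-contained.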
- Total Parameters: 998,648 (~1 million trainable parameters)
- Model Size: 3.81 MB (float32 precision weights)
- Input Features: 16 (NWM + ERA5 meteorological)
- Sequence Length: 168 (7 days of hourly timesteps)
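The size and sequence-length figures follow directly from the parameter count and the hourly resolution. The short check below reproduces them; the variable names are illustrative. It also suggests that the 3.81 figure is measured in binary megabytes (MiB), since the same weights come to about 3.99 decimal megabytes.

```python
TOTAL_PARAMS = 998_648
BYTES_PER_FLOAT32 = 4

# Raw weight storage, ignoring any file-format overhead.
size_bytes = TOTAL_PARAMS * BYTES_PER_FLOAT32
size_mib = size_bytes / 2**20      # binary megabytes
size_mb_decimal = size_bytes / 1e6  # decimal megabytes

# One week of hourly timesteps, each carrying 16 input features.
HOURS_PER_DAY = 24
SEQ_LEN = 7 * HOURS_PER_DAY
N_FEATURES = 16
values_per_sample = SEQ_LEN * N_FEATURES

print(f"{size_bytes:,} bytes = {size_mib:.2f} MiB ({size_mb_decimal:.2f} MB decimal)")
print(f"sequence length = {SEQ_LEN}, values per input window = {values_per_sample}")
```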
Parameter Distribution
The model's 998,648 parameters are distributed across specialized components, with the transformer encoder comprising over half of the total capacity.
- Transformer Encoder (4 layers): 529,920 (53.1%)
- Multi-Scale Temporal Convolutions: 255,363 (25.6%)
- Fusion Network: 82,048 (8.2%)
- Attention Pooling: 66,048 (6.6%)
- GRU Encoder: 56,064 (5.6%)
- Regime-Conditioned Bias: 4,161 (0.4%)
- Feature Importance Gate: 4,240 (0.4%)
- Prediction Heads & Other: 804 (0.1%)
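As a consistency check, the component counts listed above can be summed and converted to shares of the total. The sketch below uses only the numbers given in this section; the dictionary layout is just a convenient way to hold them.

```python
# Parameter counts per component, as listed in this section.
components = {
    "Transformer Encoder (4 layers)": 529_920,
    "Multi-Scale Temporal Convolutions": 255_363,
    "Fusion Network": 82_048,
    "Attention Pooling": 66_048,
    "GRU Encoder": 56_064,
    "Regime-Conditioned Bias": 4_161,
    "Feature Importance Gate": 4_240,
    "Prediction Heads & Other": 804,
}

total = sum(components.values())

# Share of total capacity, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in components.items()}

for name, count in components.items():
    print(f"{name:<36} {count:>8,}  {shares[name]:>5.1f}%")
print(f"{'Total':<36} {total:>8,}")
```

The counts add up to the stated 998,648 parameters, and the rounded shares match the percentages shown.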
Configuration
- Hidden Dimension: d_model = 128
- Attention Heads: num_heads = 4
- Transformer Layers: num_layers = 4
- Dropout Rate: dropout = 0.1
- Multi-Scale Branches: 3 (short, mid, long)
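These settings can be related to the reported parameter counts with standard layer formulas. The sketch below is an inference, not a description of the actual code. It assumes the layer layouts conventionally used by PyTorch's built-in modules (packed Q/K/V projection with biases, two LayerNorms per encoder layer, two bias vectors per GRU gate), a feed-forward width of 256, and a single-layer GRU with hidden size 128 reading the 16 input features. None of those assumptions is stated in the source; they are simply one combination that happens to reproduce the listed counts, and other layouts could match as well.

```python
D_MODEL = 128
NUM_HEADS = 4
NUM_LAYERS = 4
N_FEATURES = 16
D_FF = 256  # assumption: feed-forward width, not given in the source


def linear(n_in: int, n_out: int) -> int:
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out


def multihead_attention(d: int) -> int:
    """Packed Q/K/V projection plus output projection (head count does not matter)."""
    return linear(d, 3 * d) + linear(d, d)


def encoder_layer(d: int, d_ff: int) -> int:
    """Self-attention, two-layer feed-forward block, and two LayerNorms."""
    layer_norms = 2 * (2 * d)  # scale and shift for each norm
    return multihead_attention(d) + linear(d, d_ff) + linear(d_ff, d) + layer_norms


def gru_layer(n_in: int, hidden: int) -> int:
    """Three gates, each with input weights, recurrent weights, and two biases."""
    return 3 * (hidden * n_in + hidden * hidden + 2 * hidden)


head_dim = D_MODEL // NUM_HEADS
encoder_params = NUM_LAYERS * encoder_layer(D_MODEL, D_FF)
pooling_params = multihead_attention(D_MODEL)
gru_params = gru_layer(N_FEATURES, D_MODEL)
gate_params = linear(N_FEATURES, D_MODEL) + linear(D_MODEL, N_FEATURES)

print(head_dim, encoder_params, pooling_params, gru_params, gate_params)
```

Under these assumptions each attention head works on 32 dimensions, and the computed totals coincide with the reported counts for the transformer encoder (529,920), attention pooling (66,048), GRU encoder (56,064), and feature importance gate (4,240).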
Key Features
- Feature importance gating for precipitation emphasis
- Multi-scale temporal convolutions (1-7 day ranges)
- Regime-conditioned bias correction
- Attention pooling with learnable query token
- Hybrid GRU-Transformer architecture
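To make "attention pooling with learnable query token" concrete, here is a minimal single-head sketch in plain Python. It is a simplification under stated assumptions: there are no key/value projections and only one head, and the query is passed in as a fixed vector, whereas in a trained model it would be a learned parameter. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    sequence: Sequence[Sequence[float]],  # shape: (timesteps, d)
    query: Sequence[float],               # shape: (d,), learned in a real model
) -> Tuple[List[float], List[float]]:
    """Collapse a sequence into one vector using scaled dot-product attention."""
    d = len(query)
    scores = [
        sum(q * x for q, x in zip(query, step)) / math.sqrt(d) for step in sequence
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * step[i] for w, step in zip(weights, sequence)) for i in range(d)
    ]
    return pooled, weights


# Toy 2-dimensional sequence with two timesteps (illustrative values).
steps = [[1.0, 0.0], [3.0, 2.0]]

# A zero query scores every timestep equally, so pooling reduces to the mean.
mean_pooled, uniform_weights = attention_pool(steps, [0.0, 0.0])

# A query aligned with the first dimension favors the timestep with the larger value.
focused_pooled, focused_weights = attention_pool(steps, [10.0, 0.0])
```

The appeal of this design over mean pooling is that the model can learn which of the 168 hourly timesteps matter most for the correction, rather than weighting them all equally.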