Gradient
Git commits for ML training
Stop restarting from scratch. Snapshot, rewind, and fork your training runs so iteration becomes commit + branch + compare.
The problem with ML training today
Training a model is a long, expensive, stochastic process. When your team discovers a late-stage issue such as bad data, leakage, or a buggy transform, iteration breaks down: the only way forward is to start over.
Days of compute wasted
Start from scratch every time you need to test a change.
Restart and pray
No way to recover or fork from a specific training state.
Blocked iteration
Cannot quickly experiment with different approaches.
The missing version-control layer
With Gradient, you can take a state-complete snapshot at step t, then rewind and fork from that exact moment, transforming iteration into a git-like workflow.
Before Gradient
With Gradient
State-complete snapshots
The technical core is what we mean by "snapshot." It is not just weights.
A Gradient snapshot includes the full training state that actually determines the trajectory:
Model weights
Current parameters of your neural network.
Optimizer state
Momentum buffers, second-moment estimates, and per-parameter adaptive learning rates.
Scheduler position
Current learning rate schedule state.
Dataloader and sampler state
Exact position in your training data.
RNG state
Random number generator state for reproducible forks.
This is what makes forks meaningfully comparable: you change one variable at a time from the exact same starting point, turning training into a controlled experiment rather than a prayer.
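As a rough illustration of what that state includes, here is a minimal sketch in plain PyTorch. The `save_snapshot` and `load_snapshot` helpers are hypothetical names used for clarity, not the Gradient API.

```python
# Minimal sketch of a "state-complete" snapshot in plain PyTorch.
# save_snapshot / load_snapshot are hypothetical helpers, not the Gradient API.
import random
import numpy as np
import torch


def save_snapshot(path, step, model, optimizer, scheduler):
    """Persist everything that determines the training trajectory at `step`."""
    torch.save({
        "step": step,                          # dataloader / sampler position
        "model": model.state_dict(),           # model weights
        "optimizer": optimizer.state_dict(),   # momentum and adaptive-LR buffers
        "scheduler": scheduler.state_dict(),   # learning rate schedule position
        "rng": {                               # RNG state for reproducible forks
            "python": random.getstate(),
            "numpy": np.random.get_state(),
            "torch": torch.get_rng_state(),
            "cuda": torch.cuda.get_rng_state_all() if torch.cuda.is_available() else None,
        },
    }, path)


def load_snapshot(path, model, optimizer, scheduler):
    """Restore the exact training state; returns the step to resume from."""
    ckpt = torch.load(path, map_location="cpu", weights_only=False)
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    scheduler.load_state_dict(ckpt["scheduler"])
    random.setstate(ckpt["rng"]["python"])
    np.random.set_state(ckpt["rng"]["numpy"])
    torch.set_rng_state(ckpt["rng"]["torch"])
    if ckpt["rng"]["cuda"] is not None and torch.cuda.is_available():
        torch.cuda.set_rng_state_all(ckpt["rng"]["cuda"])
    return ckpt["step"]
```

Forking is then just restoring a snapshot, changing exactly one thing (a transform, a hyperparameter, a data filter), and resuming from the returned step.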
Stop burning compute on restarts
Join ML teams who are iterating faster with version-controlled training runs.