Gradient Descent Simulation 1.0.0
WHAT IS IT?
This model visualizes gradient descent optimization - the fundamental algorithm used to train neural networks and other machine learning models. Agents represent different optimization algorithms searching for the minimum of a loss landscape (the “error surface” that ML models try to minimize during training).
The model demonstrates how different optimizer types (SGD, Momentum with different parameters) behave on various loss landscapes, from simple bowls to the notoriously difficult Rosenbrock “banana valley” function. This helps build intuition about why certain optimization algorithms work better than others for different problem geometries.
HOW IT WORKS
Agents (Optimizers):
- Each agent represents an optimizer instance trying to find the global minimum (located at coordinates 0,0)
- Agents move according to gradient descent rules: they calculate the gradient (slope) of the loss function at their current position and move “downhill”
Three Optimizer Types:
1. SGD (Red) - Standard Stochastic Gradient Descent with no momentum (momentum = 0)
2. Momentum (Yellow) - Uses momentum = 0.9 to accelerate in consistent directions
3. Momentum 0.95 (Green) - Higher momentum (0.95) with 2x learning rate for faster convergence
Movement Rules:
1. Calculate gradient at current position (numerical derivative)
2. Apply gradient clipping to prevent extreme steps
3. Update velocity using momentum: new_velocity = momentum × old_velocity - learning_rate × gradient
4. Add optional stochastic noise (simulating mini-batch effects)
5. Move to new position
6. Check if converged (loss below threshold for 20+ steps)
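The movement rules above can be sketched in Python. This is a minimal translation of the described math, not the model’s NetLogo source; the clipping threshold, noise scale, and helper names are illustrative assumptions:

```python
import random

def numerical_gradient(loss, x, y, h=1e-4):
    """Step 1: central-difference estimate of the gradient."""
    gx = (loss(x + h, y) - loss(x - h, y)) / (2 * h)
    gy = (loss(x, y + h) - loss(x, y - h)) / (2 * h)
    return gx, gy

def step(loss, x, y, vx, vy, lr=0.1, momentum=0.9, clip=5.0, noise=0.0):
    """One optimizer update (steps 1-5 above)."""
    gx, gy = numerical_gradient(loss, x, y)
    # Step 2: clip each gradient component to prevent extreme steps
    gx = max(-clip, min(clip, gx))
    gy = max(-clip, min(clip, gy))
    # Step 3: velocity update, matching
    # new_velocity = momentum * old_velocity - learning_rate * gradient
    vx = momentum * vx - lr * gx
    vy = momentum * vy - lr * gy
    # Step 4: optional stochastic noise (mini-batch effect)
    vx += random.gauss(0, noise)
    vy += random.gauss(0, noise)
    # Step 5: move to the new position
    return x + vx, y + vy, vx, vy

# Plain SGD (momentum = 0) on a simple bowl, starting away from the minimum
bowl = lambda x, y: x * x + y * y
x, y, vx, vy = 3.0, 4.0, 0.0, 0.0
for _ in range(100):
    x, y, vx, vy = step(bowl, x, y, vx, vy, lr=0.1, momentum=0.0)
# (x, y) converges toward the global minimum at (0, 0)
```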
Loss Landscapes:
The patch colors represent the loss function value (darker blue = lower loss). Four landscapes are available:
- Simple Bowl: Smooth quadratic function - easiest to optimize
- Ravine: Elongated valley (10x steeper in one direction) - tests handling of ill-conditioned problems
- Rosenbrock: The famous “banana valley” with a curved, narrow valley - very challenging
- Complex: Multiple local minima created by sinusoidal oscillations overlaid on a bowl
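The four landscapes can be sketched as Python functions matching the descriptions above. The exact coefficients in the model may differ; here the Rosenbrock valley is shifted so its minimum sits at the origin (the classic function has its minimum at (1, 1)), and the ripple amplitude of the complex landscape is illustrative:

```python
import math

def simple_bowl(x, y):
    """Smooth quadratic - the gradient always points at the minimum."""
    return x * x + y * y

def ravine(x, y):
    """Elongated valley: 10x steeper along y than along x."""
    return x * x + 10 * y * y

def rosenbrock(x, y):
    """Banana valley, shifted so the minimum is at (0, 0)."""
    u, v = x + 1, y + 1
    return (1 - u) ** 2 + 100 * (v - u * u) ** 2

def complex_landscape(x, y):
    """Bowl plus sinusoidal ripples that create local minima away
    from the origin (amplitude/frequency chosen for illustration)."""
    return x * x + y * y + 3 * (2 - math.cos(3 * x) - math.cos(3 * y))

# All four evaluate to zero at the global minimum (0, 0)
print(simple_bowl(0, 0), ravine(0, 0), rosenbrock(0, 0), complex_landscape(0, 0))
```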
HOW TO USE IT
Buttons:
- setup - Initializes the model: creates the loss landscape, places optimizers randomly, marks the global minimum (red patch at center)
- go - Runs the simulation continuously until all optimizers converge
Sliders:
- num-optimizers (0-100) - Number of optimizer agents to create
- base-learning-rate (0.01-1) - Step size for gradient descent. Smaller = slower but more stable.
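The trade-off behind base-learning-rate can be seen on a one-dimensional bowl (a hypothetical sketch, independent of the model code): a small rate makes slow but steady progress, a moderate rate converges quickly, and a rate at the top of the range can oscillate without making any progress at all.

```python
def bowl_step(x, lr):
    # gradient of f(x) = x^2 is 2x, so the update is x - lr * 2x
    return x - lr * 2 * x

results = {}
for lr in (0.01, 0.3, 1.0):
    x = 5.0
    for _ in range(50):
        x = bowl_step(x, lr)
    results[lr] = x

# lr = 0.01 -> still far from 0 after 50 steps (slow but stable)
# lr = 0.3  -> essentially at the minimum
# lr = 1.0  -> bounces between +5 and -5 forever (no progress)
```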
Choosers:
- optimizer-type - Select which optimizer to use:
  - “SGD” - All agents use standard gradient descent (red)
  - “Momentum” - All agents use 0.9 momentum (yellow)
  - “Momentum (0.95)” - All agents use 0.95 momentum (green)
  - “Mixed” - Random mix of all three types
- landscape-type - Select the loss function:
  - “Simple Bowl” - Smooth quadratic (easiest)
  - “Ravine” - Elongated valley (tests ill-conditioning)
  - “Rosenbrock” - Curved banana valley (very hard)
  - “Complex” - Multiple local minima (tests exploration)
Switches:
- show-trails - When ON, agents leave colored trails showing their optimization path
- add-noise - When ON, adds stochastic noise to gradients (simulates mini-batch learning)
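The effect of add-noise can be modeled as perturbing each gradient with zero-mean noise, the way mini-batch sampling perturbs the true gradient (the noise scale here is an illustrative assumption, not the model’s value):

```python
import random

def noisy_gradient(x, sigma=0.5):
    """True gradient of f(x) = x^2 plus zero-mean Gaussian noise,
    mimicking a mini-batch gradient estimate."""
    return 2 * x + random.gauss(0, sigma)

random.seed(42)
x = 5.0
for _ in range(200):
    x -= 0.1 * noisy_gradient(x)
# x hovers near, but never settles exactly at, the minimum -
# which is why convergence is judged against a loss threshold
# rather than an exact position
```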
Monitors:
- converged-count - Shows how many optimizers have converged to the minimum
RELATED MODELS
- Distill.pub’s “Momentum” visualization: https://distill.pub/2017/momentum/
- Sebastian Ruder’s optimization overview: https://ruder.io/optimizing-gradient-descent/
CREDITS AND REFERENCES
Model Created By: AZOUANI Ilyes for DSTI School of Engineering - ABM Module
Mathematical References:
- Rosenbrock, H.H. (1960). “An automatic method for finding the greatest or least value of a function.” The Computer Journal, 3(3).
- Cauchy, A.-L. (1847). “Méthode générale pour la résolution des systèmes d’équations simultanées.” Comptes Rendus de l’Académie des Sciences, 25.
- Polyak, B.T. (1964). “Some methods of speeding up the convergence of iteration methods.” USSR Computational Mathematics and Mathematical Physics, 4(5).
Machine Learning Context:
- Ruder, S. (2016). “An overview of gradient descent optimization algorithms.” arXiv:1609.04747
License:
This model is provided for educational purposes. Feel free to modify and extend it for learning about optimization algorithms and machine learning concepts.
Version: 1.0.0
Date: 2025
Release Notes
Initial release.