Thursday, 15 May 2025

AI Algorithms: Gradient Descent


🧠 Gradient Descent: Summary

🔹 1. Basic Idea:

  • Gradient Descent is an optimization technique used to minimize a loss function.

  • We update parameters in the opposite direction of the gradient: the gradient points toward steepest ascent, so stepping against it gives steepest descent.


🔹 2. Simple Example:

  • Function: f(x) = (x - 3)^2

  • Gradient: f'(x) = 2(x - 3)

  • Using gradient descent, we iteratively update x to move closer to 3 (the minimum).


🔹 3. Code Example:

  • Implemented gradient descent in Python for the function f(x)

  • Showed iterative improvement of x and decreasing values of f(x)
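
The Python implementation mentioned above isn't reproduced here, so here is a minimal sketch of what it might look like (starting point x = 0 and learning rate alpha = 0.1 are assumed values, not from the original post):

```python
# Gradient descent on f(x) = (x - 3)^2, whose minimum is at x = 3.

def f(x):
    return (x - 3) ** 2

def f_prime(x):
    return 2 * (x - 3)

x = 0.0        # starting guess (assumed)
alpha = 0.1    # learning rate (assumed)

for i in range(50):
    x = x - alpha * f_prime(x)   # step against the gradient
    if i % 10 == 0:
        print(f"iter {i:2d}: x = {x:.4f}, f(x) = {f(x):.6f}")
```

Each printed row shows x creeping toward 3 while f(x) shrinks toward 0, which is the "iterative improvement" the bullet describes.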


🔹 4. Real-World Analogy:

  • In machine learning, we don’t know the true function f(x) that maps inputs to outputs.

  • Instead, we define a loss function to measure how bad our model is.

  • Gradient descent minimizes this loss, not the unknown real-world function.


🔹 5. Multiple Parameters:

  • For models with multiple parameters (e.g., weights and bias), we compute partial derivatives for each.

  • All parameters are updated simultaneously using their respective gradients.
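
As a concrete sketch of the simultaneous update, here is gradient descent on a tiny linear model y = w·x + b with mean-squared-error loss (the data, learning rate, and iteration count are illustrative assumptions):

```python
# Fit y_hat = w * x + b by gradient descent on mean squared error.
# Both parameters are updated simultaneously from their partial derivatives.

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]   # generated by y = 2x + 1 (assumed toy data)

w, b = 0.0, 0.0
alpha = 0.05
n = len(xs)

for _ in range(2000):
    # Partial derivative of MSE with respect to each parameter
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    # Tuple assignment updates w and b at the same time,
    # both using gradients computed from the *old* values
    w, b = w - alpha * grad_w, b - alpha * grad_b

print(f"w = {w:.3f}, b = {b:.3f}")  # approaches w = 2, b = 1
```

The tuple assignment is the key detail: both gradients are evaluated at the old (w, b) before either parameter changes, which is what "updated simultaneously" means.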


🔹 6. When to Stop:

You stop gradient descent when one or more of the following conditions are met:

  1. Maximum iterations reached.

  2. Change in loss becomes very small.

  3. Change in parameters is negligible.

  4. Gradient becomes close to zero.
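
The four stopping conditions above can be combined in one loop. This sketch reuses the earlier f(x) = (x - 3)^2 example; the tolerance and iteration cap are illustrative choices, not prescribed values:

```python
# Gradient descent with all four stopping conditions:
# 1. max iterations, 2. small loss change,
# 3. small parameter change, 4. near-zero gradient.

def f(x):
    return (x - 3) ** 2

def f_prime(x):
    return 2 * (x - 3)

x, alpha = 0.0, 0.1
max_iters, tol = 10_000, 1e-8   # assumed tolerances
prev_loss = f(x)

for i in range(max_iters):            # 1. maximum iterations reached
    grad = f_prime(x)
    if abs(grad) < tol:               # 4. gradient close to zero
        break
    x_new = x - alpha * grad
    if abs(x_new - x) < tol:          # 3. parameter change negligible
        x = x_new
        break
    x = x_new
    loss = f(x)
    if abs(prev_loss - loss) < tol:   # 2. change in loss very small
        break
    prev_loss = loss

print(f"stopped at iter {i}, x = {x:.6f}")
```

In practice you rarely need all four checks at once; the iteration cap is the safety net, and the others let the loop exit early once progress stalls.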


🧭 Key Formula (update rule):

For each parameter θ:

θ := θ − α · ∇_θ Loss

where α is the learning rate and ∇_θ Loss is the gradient of the loss with respect to θ.

