Thursday, 15 May 2025

AI Algorithms: Gradient Descent


🧠 Gradient Descent: Summary

🔹 1. Basic Idea:

  • Gradient Descent is an optimization technique used to minimize a loss function.

  • We update parameters in the opposite direction of the gradient: the gradient points toward steepest ascent, so stepping against it gives steepest descent.


🔹 2. Simple Example:

  • Function: f(x) = (x - 3)^2

  • Gradient: f'(x) = 2(x - 3)

  • Using gradient descent, we iteratively update x to move closer to 3 (the minimum).


🔹 3. Code Example:

  • Implemented gradient descent in Python for the function f(x)

  • Showed iterative improvement of x and decreasing values of f(x)
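
The Python implementation mentioned above isn't reproduced here, so here is a minimal sketch of what it might look like (starting point x = 0 and learning rate alpha = 0.1 are assumed values, not from the original post):

```python
# Gradient descent on f(x) = (x - 3)^2, whose minimum is at x = 3.

def f(x):
    return (x - 3) ** 2

def f_prime(x):
    return 2 * (x - 3)

x = 0.0        # starting guess (assumed)
alpha = 0.1    # learning rate (assumed)

for i in range(50):
    x = x - alpha * f_prime(x)   # step against the gradient
    if i % 10 == 0:
        print(f"iter {i:2d}: x = {x:.4f}, f(x) = {f(x):.6f}")
```

Each printed row shows x creeping toward 3 while f(x) shrinks toward 0, which is the "iterative improvement" the bullet describes.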


🔹 4. Real-World Analogy:

  • In machine learning, we don’t know the true function f(x) that maps inputs to outputs.

  • Instead, we define a loss function to measure how bad our model is.

  • Gradient descent minimizes this loss, not the unknown real-world function.


🔹 5. Multiple Parameters:

  • For models with multiple parameters (e.g., weights and bias), we compute partial derivatives for each.

  • All parameters are updated simultaneously using their respective gradients.
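
As a concrete sketch of the simultaneous update, here is gradient descent on a tiny linear model y = w·x + b with mean-squared-error loss (the data, learning rate, and iteration count are illustrative assumptions):

```python
# Fit y_hat = w * x + b by gradient descent on mean squared error.
# Both parameters are updated simultaneously from their partial derivatives.

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]   # generated by y = 2x + 1 (assumed toy data)

w, b = 0.0, 0.0
alpha = 0.05
n = len(xs)

for _ in range(2000):
    # Partial derivative of MSE with respect to each parameter
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    # Tuple assignment updates w and b at the same time,
    # both using gradients computed from the *old* values
    w, b = w - alpha * grad_w, b - alpha * grad_b

print(f"w = {w:.3f}, b = {b:.3f}")  # approaches w = 2, b = 1
```

The tuple assignment is the key detail: both gradients are evaluated at the old (w, b) before either parameter changes, which is what "updated simultaneously" means.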


🔹 6. When to Stop:

You stop gradient descent when one or more of the following conditions are met:

  1. Maximum iterations reached.

  2. Change in loss becomes very small.

  3. Change in parameters is negligible.

  4. Gradient becomes close to zero.
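
The four stopping conditions above can be combined in one loop. This sketch reuses the earlier f(x) = (x - 3)^2 example; the tolerance and iteration cap are illustrative choices, not prescribed values:

```python
# Gradient descent with all four stopping conditions:
# 1. max iterations, 2. small loss change,
# 3. small parameter change, 4. near-zero gradient.

def f(x):
    return (x - 3) ** 2

def f_prime(x):
    return 2 * (x - 3)

x, alpha = 0.0, 0.1
max_iters, tol = 10_000, 1e-8   # assumed tolerances
prev_loss = f(x)

for i in range(max_iters):            # 1. maximum iterations reached
    grad = f_prime(x)
    if abs(grad) < tol:               # 4. gradient close to zero
        break
    x_new = x - alpha * grad
    if abs(x_new - x) < tol:          # 3. parameter change negligible
        x = x_new
        break
    x = x_new
    loss = f(x)
    if abs(prev_loss - loss) < tol:   # 2. change in loss very small
        break
    prev_loss = loss

print(f"stopped at iter {i}, x = {x:.6f}")
```

In practice you rarely need all four checks at once; the iteration cap is the safety net, and the others let the loop exit early once progress stalls.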


🧭 Key Formula (update rule):

For each parameter θ:

θ := θ − α · ∇_θ Loss

where α is the learning rate and ∇_θ Loss is the gradient of the loss with respect to θ.

