We are going to discuss not only how but also why gradient descent and stochastic gradient descent are used. We already know about linear regression, and I recently wrote about it in the context of vectors and projections. Now we will try to understand gradient descent with the help of a linear regression problem.

Before that, I want to briefly recall what we already know about linear regression and the math behind it, so that anyone starting out can follow along. If you already know the basic math behind linear regression, you can skip directly to the section titled "Why Do We Need Gradient Descent?"

Let’s say we have just started our machine learning journey, and the first thing we did was implement a linear regression model in Python. We implemented it successfully and got the best values for the slope and intercept. Now we have a question: what is actually happening behind this algorithm? We want to understand the math behind it.…
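
To make that starting point concrete, here is a minimal sketch of what such a first model might look like. The data points and the helper name `fit_line` are made up for illustration, and the function uses the standard closed-form least-squares formulas for a single feature, which a library would normally hide from us.

```python
# Toy data, invented for this sketch: the points lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points.

    Hypothetical helper: uses the closed-form solution for simple
    linear regression with one input feature.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations (numerator of the slope).
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sum of squared deviations of x (denominator of the slope).
    s_xx = sum((x - mean_x) ** 2 for x in xs)

    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

Because the toy points sit exactly on the line y = 2x + 1, the fit recovers a slope of 2 and an intercept of 1. Real data is noisy, so the fitted line would only approximate the points.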