Understanding Reinforcement Learning with Neural Networks Part 2: Why Backpropagation Is Not Enough


DEV Community·Rijul Rajesh·22 days ago


#ai #machinelearning #software #coding #output #reinforcement


In the previous article, we explored an example where reinforcement learning is required and standard methods do not work. In this article, we will understand why policy gradients are needed, and why the standard backpropagation method does not work in certain situations.

How Backpropagation Normally Works

Assume we have the following training data, where the desired outputs are already known:

Input (Hunger)    Output p(B)
0.0               0
1.0               1
0.1               0
0.9               1

With this data, we can feed the input values into the neural network one at a time. The neural network produces an output, and we compare it with the ideal output value from the training data. Using this difference, we can measure how wrong the network is.

Using Derivatives to Update the Bias

We can calculate these differences for different values of the bias and visualize how the error changes as the bias changes. From this graph, we can calculate the derivative. …
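To make the supervised case concrete, here is a minimal sketch of the bias update described above, using the four training rows from the table. The network shape is an assumption, since the article does not specify one: a single sigmoid neuron with a fixed, hypothetical weight (`WEIGHT`) and one trainable bias. The squared-error measure, the learning rate, and the function names are likewise illustrative choices, not the author's code.

```python
import math

# Training data from the table above: (hunger, desired p(B)).
DATA = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0)]

# Assumption: a fixed weight, so only the bias is trained.
WEIGHT = 10.0


def predict(hunger, bias):
    """Hypothetical one-neuron network: sigmoid(WEIGHT * hunger + bias)."""
    return 1.0 / (1.0 + math.exp(-(WEIGHT * hunger + bias)))


def error(bias):
    """Sum of squared differences between network outputs and desired outputs."""
    return sum((predict(x, bias) - y) ** 2 for x, y in DATA)


def d_error_d_bias(bias):
    """Derivative of error() with respect to the bias, via the chain rule."""
    total = 0.0
    for x, y in DATA:
        p = predict(x, bias)
        # d/d_bias of (p - y)^2 is 2 * (p - y) * p * (1 - p) for a sigmoid output.
        total += 2.0 * (p - y) * p * (1.0 - p)
    return total


def train(bias=0.0, learning_rate=0.5, steps=200):
    """Gradient descent: repeatedly step the bias against the derivative."""
    for _ in range(steps):
        bias -= learning_rate * d_error_d_bias(bias)
    return bias


if __name__ == "__main__":
    trained = train()
    print("error at bias 0.0:", error(0.0))
    print("trained bias:", trained)
    print("error after training:", error(trained))
```

Every step in this sketch depends on the desired output `y` being known for each input. That is the ingredient a reinforcement learning setting lacks, which is why the derivative cannot be computed this way there.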
