
Understanding Reinforcement Learning with Neural Networks Part 4: Positive and Negative Rewards

DEV Community·Rijul Rajesh·19 days ago

In the previous article, we began the process of guessing the ideal output. Let us continue with the same example.

Suppose we receive a small number of fries. Since our hunger level is 0, this is actually a good outcome. In this case, we should assign a reward of 1.

Now consider the opposite situation. Suppose we receive a large order of fries. Since we are not hungry enough to eat all the fries, this means we made a poor decision. In that case, we assign a reward of -1.

In general:

- Any positive reward indicates a good decision
- Any negative reward indicates a bad decision

Updating the Derivative with the Reward

We now use this reward to update the derivative. To do this, we simply multiply the derivative by the reward.

Case 1: Correct Decision

If the reward is 1, then the derivative remains unchanged. This means the derivative is already pointing in the correct direction.

Case 2: Incorrect Decision

If the reward is -1, then the derivative changes sign.…
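The reward-scaling step described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `apply_reward` and the derivative value `0.5` are assumptions made for the example and do not come from the article.

```python
def apply_reward(derivative: float, reward: float) -> float:
    """Multiply a derivative by a reward (hypothetical helper for illustration)."""
    return derivative * reward


# Assumed derivative value, chosen only to make the effect visible.
derivative = 0.5

# Reward of 1 (good decision): the derivative is left unchanged.
kept = apply_reward(derivative, 1)

# Reward of -1 (bad decision): the derivative changes sign.
flipped = apply_reward(derivative, -1)

print(kept, flipped)
```

With a reward of 1 the value comes back as it went in, while a reward of -1 returns the same magnitude with the opposite sign, which matches the two cases described in the text.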
