Understanding Transformers Part 17: Generating the Output Word

DEV Community·Rijul Rajesh·about 1 month ago

In the previous article, we set up the residual connections to get the final output values from the decoder. In this article, we begin by passing these two output values through a fully connected layer.

This layer has:

- One input for each value representing the current token (in this case, 2 inputs)
- One output for each word in the output vocabulary

Since our vocabulary has 4 tokens, this gives us 4 output values.

Selecting the Output Word

Next, we pass these 4 output values through a softmax function, which turns them into probabilities. We then select the most likely output word, which in this case is “vamos”.

So far, the translation is correct. However, the process does not stop here.

Continuing the Decoding Process

The decoder continues generating words until it produces an <EOS> token, which indicates the end of the sentence.

To generate the next word, we feed the predicted word back into the decoder. We will explore this step in the next article.
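To make the fully connected layer and softmax steps concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the article's actual model: the weights, biases, and the two decoder output values are made up, and only “vamos” and the <EOS> token come from the text, so the other two vocabulary entries are placeholders.

```python
import math

# Hypothetical 4-token output vocabulary. Only "vamos" and "<EOS>" are named
# in the article; "ir" and "y" are placeholder entries for illustration.
VOCAB = ["ir", "vamos", "y", "<EOS>"]

# Hypothetical weights: one row per vocabulary word, one column per decoder
# output value (2 inputs -> 4 outputs). A trained model would learn these.
WEIGHTS = [
    [-0.5, 0.3],
    [1.2, 0.8],
    [0.1, -0.4],
    [-0.9, 0.2],
]
BIASES = [0.0, 0.1, 0.0, -0.1]


def fully_connected(values):
    """Map the 2 decoder output values to 4 scores, one per vocabulary word."""
    return [
        sum(w * v for w, v in zip(row, values)) + b
        for row, b in zip(WEIGHTS, BIASES)
    ]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_word(values):
    """Return the most likely word and the full probability list."""
    probs = softmax(fully_connected(values))
    best = max(range(len(probs)), key=probs.__getitem__)
    return VOCAB[best], probs


# Made-up decoder output values for the current token.
decoder_output = [1.5, 0.7]
word, probs = pick_word(decoder_output)
print(word, [round(p, 3) for p in probs])
```

With these made-up numbers the highest probability lands on “vamos”. Picking the single most likely word at each step is often called greedy decoding; during generation, this selection would be repeated until the chosen word is the <EOS> token.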
