#Embeddingsize

2 posts


Chapter 11: The Full GPT - Assembling the Model

DEV Community · Gary Jackson · about 1 month ago

Pull everything into a GptModel class, package Adam as a reusable optimiser, and run the real 10,000-step training loop end-to-end.


DEV Community · Gary Jackson · about 1 month ago

Run several attention heads in parallel on embedding slices, add a two-layer MLP for per-position computation, and assemble a transformer block.
