#DeCodeR

20 posts

Transformer Neural Network Architecture Diagram — A Visual Guide for Engineers

DEV Community·Mia·20 days ago
#dfeZLTf0

From Dev.to - machinelearning: Transformer Neural Network Architecture Diagram — A Visual Guide for Engineers

How I rebuilt the Codable migration pattern across 4 iOS apps in 2 hours

DEV Community·孫昊·26 days ago
#q4y8UgsB
#swift #ios #codable #self #four #decoder

I was adding a single new feature to DaysUntil: yearly-recurring events. Twenty lines of product...

Dynamic batching for Encoder-Decoder MT training or generation when long sequence caps the batch size [P]

Reddit r/MachineLearning·u/Leather_Loan5314·about 1 month ago
#wN8KYA43

I built a small PyTorch sampler called **dynabatch** after running into this specific batching issue while fine-tuning an NLLB-200 600M model. Training on an RTX 5090, the largest fixed batch size I could use was 8; anything bigger led to OOM.…

T5Gemma: A new collection of encoder-decoder Gemma models

deepmind.google·Biao Zhang, Paul Suganthan, Ben Hora·about 1 month ago
#Pw6vvrNM
#arrow #chevron #post #menu #decoder #models

T5Gemma is a new family of encoder-decoder LLMs developed by converting and adapting pretrained decoder-only models based on the Gemma 2 framework, offering superior performance and efficiency compared to its decoder-only counterparts, particularly for…
