#Dynabatch

Dynamic batching for encoder-decoder MT training or generation when long sequences cap the batch size [P]

Reddit r/MachineLearning·u/Leather_Loan5314·about 1 month ago

I built a small PyTorch sampler called **dynabatch** after running into this specific batching issue while fine-tuning an NLLB-200 600M model. Training on an RTX 5090, the largest fixed batch size I could use was 8; anything bigger led to OOM.…
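To make the problem concrete: with a fixed batch size, the few longest sequences decide how large every batch can be, even though most batches are far shorter. The preview above is cut off, so what follows is not dynabatch's actual code or API. It is only a minimal sketch of the general idea of length-aware batching, assuming a padded-token budget. The names `token_budget_batches`, `max_tokens`, and `max_batch_size` are hypothetical, and using a single length per example (rather than separate source and target lengths) is a simplification.

```python
from typing import Iterator, List, Sequence


def token_budget_batches(
    lengths: Sequence[int],
    max_tokens: int,
    max_batch_size: int = 64,
) -> Iterator[List[int]]:
    """Yield lists of dataset indices whose padded size fits in max_tokens.

    Hypothetical helper for illustration; not part of dynabatch or PyTorch.
    """
    # Sort longest first so similar lengths end up together and the most
    # memory-hungry batch is produced first (an OOM shows up early).
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)

    batch: List[int] = []
    longest = 0
    for idx in order:
        new_longest = max(longest, lengths[idx])
        # Padded cost = (number of sequences) * (longest sequence in the batch).
        over_budget = (len(batch) + 1) * new_longest > max_tokens
        if batch and (over_budget or len(batch) >= max_batch_size):
            yield batch
            batch, new_longest = [], lengths[idx]
        batch.append(idx)
        longest = new_longest
    if batch:
        yield batch


# Two long sequences and six short ones, with a budget of 1024 padded tokens.
lengths = [512, 480, 60, 50, 40, 30, 20, 10]
batches = list(token_budget_batches(lengths, max_tokens=1024))
print(batches)  # [[0, 1], [2, 3, 4, 5, 6, 7]]
```

Here the two long sequences form a batch of 2, while the short ones are packed into a batch of 6 instead of being held to the size the long ones allow. A list built this way can be passed to `torch.utils.data.DataLoader` through its `batch_sampler` argument. A sequence longer than `max_tokens` still gets a batch of its own in this sketch, so the budget is not a hard guarantee.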
