Dynamic batching for Encoder-Decoder MT training or generation when long sequences cap the batch size [P]

Reddit r/MachineLearning · u/Leather_Loan5314 · about 1 month ago

I built a small PyTorch sampler called **dynabatch** after running into this specific batching issue while fine-tuning an NLLB-200 600M model. Training on an RTX 5090, the largest fixed batch size I could use was 8; anything bigger led to OOM.…
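
To make the problem concrete: with a fixed batch size, the longest sequences in the dataset set the memory ceiling, so batches of short sequences leave most of the GPU idle. The excerpt above is cut off before it describes how dynabatch actually works, so the following is only a rough sketch of the general idea of length-aware batching, not dynabatch's API or algorithm. It assumes a precomputed per-example length (for an encoder-decoder model, something like the larger of the source and target token counts) and a padded-token budget that would be tuned by hand; `TokenBudgetBatchSampler`, `max_tokens`, and `max_batch_size` are hypothetical names chosen for illustration.

```python
from typing import Iterator, List, Sequence


class TokenBudgetBatchSampler:
    """Yield batches of indices whose padded size stays under a token budget.

    Hypothetical illustration of length-aware batching; not the dynabatch API.
    """

    def __init__(self, lengths: Sequence[int], max_tokens: int, max_batch_size: int = 64):
        if max_tokens < max(lengths, default=0):
            raise ValueError("max_tokens is smaller than the longest sequence")
        self.lengths = lengths
        self.max_tokens = max_tokens
        self.max_batch_size = max_batch_size

    def __iter__(self) -> Iterator[List[int]]:
        # Sort longest-first, so the first index in a batch sets its padded length.
        order = sorted(range(len(self.lengths)), key=lambda i: self.lengths[i], reverse=True)
        batch: List[int] = []
        for idx in order:
            if batch:
                padded_len = self.lengths[batch[0]]
                # Cost of the batch if idx were added: rows * padded length.
                over_budget = padded_len * (len(batch) + 1) > self.max_tokens
                if over_budget or len(batch) >= self.max_batch_size:
                    yield batch
                    batch = []
            batch.append(idx)
        if batch:
            yield batch


# Assumed example lengths: two long sequences and eight short ones.
lengths = [512, 480, 64, 60, 58, 32, 30, 28, 16, 12]
batches = list(TokenBudgetBatchSampler(lengths, max_tokens=1024, max_batch_size=8))
for b in batches:
    print(len(b), "examples, padded to", lengths[b[0]], "tokens")
```

Because `torch.utils.data.DataLoader` accepts any iterable of index lists as `batch_sampler`, an object like this could be passed there directly. For training, the order of the resulting batches would normally be shuffled each epoch, since iterating strictly from longest to shortest is deterministic.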