My high-res image-to-video kept OOMing — turns out I was decoding outside no_grad

DEV Community: pytorch·shinji shimizu·4 days ago

TL;DR

I run LTX-2.3 image-to-video (I2V) locally on a 96 GB GPU. At 1024×768 / 97 frames it peaked at 83.5 GiB, so close to the ceiling that it OOM'd whenever my image-generation server was co-resident, and 1280×768 OOM'd outright. I assumed I'd hit a hardware wall. I hadn't. 54 of those gigabytes were an autograd graph.

The pipeline returns a lazy decode iterator; the real VAE decode runs when you encode the output. In my harness that happened outside the with torch.no_grad(): block, so every conv activation in the decoder was retained for a backward pass that never comes.

Moving one call inside the no_grad block:

I2V 1024×768/97f    before      after
peak                83.5 GiB    29.5 GiB (−65%)
time                151.6 s     135.2 s (slightly faster)

And the peak goes nearly flat across resolution: 2048×1536 (3.1 MP) tops out at 33.6 GiB. The "I need a bigger GPU" conclusion was a measurement artifact.

The lever I tried first, finer VAE decode tiling, barely moved the number. That dead end is part of the story.…
