How I cut speech-bubble retries from 70% to 0% with 200 lines of Pillow code

DEV Community·qcrao·23 days ago
#ai #pillow #sideprojects #font #fullscreen #comic

If you've ever asked Stable Diffusion or DALL-E to render readable text inside a comic panel, you know the pain. It almost works. The letters look like letters. Until you read them: "WHAT ARE YOU DONIG", "HEILP", "BLEAH BLAH". About 70% of my generations needed a regen just because the dialogue was garbled, and every regen burned ~$0.04 in GPU time.

For Comicory I gave up trying to make the model render text and moved typography into a deterministic post-processing step. The model now draws empty speech bubbles. Pillow draws the words. Retry rate for text-related issues: zero. Total post-processing code: ~200 lines. Here's the pipeline.

Step 1: Bubble shape detection

The model is told (via prompt + LoRA) to draw an empty white speech bubble with a black outline somewhere in the panel. I find it with classic CV (no ML, no models, no surprises):

```python
from PIL import Image
import numpy as np
import cv2

def find_bubble(panel: Image.Image) -> tuple[int, int, int, int] | None:
    arr = np.…
```
