How I Built a Document Detector in the Browser

DEV Community·Paul·21 days ago

Scanning a document with your phone is one of those small tasks that comes up all the time. You need a decent photo of a page, and you need to send it somewhere quickly - by email, in a messenger, wherever. Usually that means reaching for an app. Modern browsers can run fairly serious code with WASM, so in some cases opening a site is easier than installing yet another app.

It turned out to be a good computer vision problem. Real photos of documents are messy in all the usual ways: perspective distortion, poor lighting, glare, shadows, and cluttered backgrounds. Sometimes the page is partly out of frame. Sometimes the background itself is full of straight lines and rectangles that are easy to mistake for the document.

Overall Approach

I wanted to keep the whole thing relatively simple and run everything on the client. So instead of going down the neural network route, I built the detector with classical computer vision methods. The core idea is simple: don't trust any single detection method.…
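One way to read "don't trust any single detection method" is to let several independent detectors each propose candidate quadrilaterals, then score all candidates with the same geometric checks and keep the best one. The sketch below illustrates only that selection step. It is not taken from the article: the names `Quad`, `scoreQuad`, and `pickBestQuad`, the convexity check, and the 10% minimum-area threshold are all assumptions made for illustration.

```typescript
// Hypothetical types: corners are listed in order around the shape.
type Point = { x: number; y: number };
type Quad = [Point, Point, Point, Point];

// Area via the shoelace formula (absolute value, so winding order does not matter).
function quadArea(q: Quad): number {
  let sum = 0;
  for (let i = 0; i < 4; i++) {
    const a = q[i];
    const b = q[(i + 1) % 4];
    sum += a.x * b.y - b.x * a.y;
  }
  return Math.abs(sum) / 2;
}

// A quad is treated as convex when every corner turns in the same direction.
function isConvex(q: Quad): boolean {
  let sign = 0;
  for (let i = 0; i < 4; i++) {
    const a = q[i];
    const b = q[(i + 1) % 4];
    const c = q[(i + 2) % 4];
    const cross = (b.x - a.x) * (c.y - b.y) - (b.y - a.y) * (c.x - b.x);
    if (cross === 0) return false; // collinear corner: degenerate quad
    const s = Math.sign(cross);
    if (sign === 0) sign = s;
    else if (s !== sign) return false;
  }
  return true;
}

// Assumed scoring rule: reject non-convex or tiny quads, otherwise prefer
// the candidate that covers more of the frame.
function scoreQuad(q: Quad, frameW: number, frameH: number): number {
  if (!isConvex(q)) return 0;
  const ratio = quadArea(q) / (frameW * frameH);
  if (ratio < 0.1) return 0; // assumed threshold, not from the article
  return Math.min(ratio, 1);
}

// Candidates may come from any number of detectors; none is trusted on its own.
function pickBestQuad(candidates: Quad[], frameW: number, frameH: number): Quad | null {
  let best: Quad | null = null;
  let bestScore = 0;
  for (const q of candidates) {
    const s = scoreQuad(q, frameW, frameH);
    if (s > bestScore) {
      bestScore = s;
      best = q;
    }
  }
  return best;
}
```

Because every candidate goes through the same scoring function, adding another detector only means adding more quads to the list. A real scorer would likely weigh more signals, such as edge contrast or aspect ratio, than this sketch does.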
