Team Members
@k_sidharthareddy_15 | @k-deepak-544 | @nupur_madhrey_07 | @avika_kashyap | @dheerajkumar08 | @chanda_rajkumar

Introduction
So here's the thing: when we started working on MediSimplify, a project that takes medical reports and converts them into patient-friendly language, we thought the hard part would be the NLP simplification. Turns out, just getting the text out of the document was already a mini-nightmare.

Medical reports come as everything: clean PDFs, scanned images, ancient faxed documents that someone scanned and emailed. OCR tools are finicky. Tesseract might not be installed on the deployment machine. A "PDF" might be a text-selectable document or a rasterized scan, and you can't tell which until you open it. We needed something that handled all of this gracefully, without crashing or silently returning garbage.…
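To make "gracefully" a bit more concrete, here is a minimal sketch of the kind of fallback chain that requirement suggests. It is not MediSimplify's actual code. Every name in it (`ExtractionResult`, `looks_like_real_text`, `extract_with_fallback`) is hypothetical, the 20-character and 70% thresholds are arbitrary assumptions, and the two reader callables stand in for whatever PDF and OCR libraries you actually use. The only real library call is `shutil.which`, which checks whether a `tesseract` binary is on the PATH.

```python
import shutil
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ExtractionResult:
    text: str
    method: str                    # "text-layer", "ocr", or "none"
    warning: Optional[str] = None  # set when nothing usable was extracted


def looks_like_real_text(text: str, min_chars: int = 20) -> bool:
    """Rough heuristic (an assumption): long enough and mostly readable characters."""
    stripped = text.strip()
    if len(stripped) < min_chars:
        return False
    readable = sum(ch.isalnum() or ch.isspace() for ch in stripped)
    return readable / len(stripped) >= 0.7


def extract_with_fallback(
    read_text_layer: Callable[[], str],
    run_ocr: Callable[[], str],
    tesseract_available: Optional[bool] = None,
) -> ExtractionResult:
    """Try the embedded text layer first, then OCR, and never raise or return garbage."""
    if tesseract_available is None:
        # Detect the OCR binary at runtime instead of assuming it is installed.
        tesseract_available = shutil.which("tesseract") is not None

    try:
        text = read_text_layer()
    except Exception:
        text = ""  # an unreadable or rasterized PDF simply has no usable text layer
    if looks_like_real_text(text):
        return ExtractionResult(text.strip(), "text-layer")

    if not tesseract_available:
        return ExtractionResult(
            "", "none", "No usable text layer and Tesseract is not installed."
        )

    try:
        ocr_text = run_ocr()
    except Exception as exc:
        return ExtractionResult("", "none", f"OCR failed: {exc}")
    if looks_like_real_text(ocr_text):
        return ExtractionResult(ocr_text.strip(), "ocr")
    return ExtractionResult("", "none", "OCR output did not look like readable text.")
```

The point of the design is that the caller always gets back a result object with an explicit `method` and `warning`, so a missing Tesseract install or a rasterized scan shows up as a clear message and not as a crash or an empty string that looks like success.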