PDF to Markdown Chapters

This tool converts a long PDF document into organized Markdown files, splitting the content into chapters based on heading structure. It is ideal for converting technical documents, books, or reports into a structured format suitable for documentation websites, wikis, or static site generators.

Features

- Extracts text from PDF using layout-aware parsing
- Identifies chapter headings using font size, style, and positional heuristics
- Splits content into separate Markdown files per chapter
- Preserves basic formatting such as bold, italic, lists, and code blocks
- Creates a table of contents (_toc.md) for easy navigation
- Lightweight and dependency-managed using standard Python libraries

Usage

Run the script from the command line:

    python main.py input.pdf --output-dir chapters/

This will create a directory (default: chapters/) containing individual .md files for each detected chapter and a _toc.md file.…
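The heading detection and chapter splitting described under Features can be illustrated with a minimal sketch. This is not the code in main.py, which is not shown here. It assumes the text has already been extracted into (text, font_size) pairs, treats a line as a chapter heading when its font size is at least 1.3 times the median size, and uses a numbered-slug file naming scheme. The function names (`slugify`, `split_into_chapters`, `render`), the `ratio` threshold, and the file names are all hypothetical.

```python
import re
from statistics import median


def slugify(title):
    """Turn a heading into a filesystem-friendly file stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def split_into_chapters(lines, ratio=1.3):
    """Group (text, font_size) pairs into chapters.

    Assumed heuristic: a line is a chapter heading when its font size
    is at least `ratio` times the median font size of all lines.
    """
    if not lines:
        return []
    body_size = median(size for _, size in lines)
    chapters = []
    for text, size in lines:
        if size >= ratio * body_size:
            # A large line starts a new chapter.
            chapters.append({"title": text.strip(), "body": []})
        elif chapters:
            chapters[-1]["body"].append(text)
        else:
            # Text before the first heading goes into a front-matter chapter.
            chapters.append({"title": "Front Matter", "body": [text]})
    return chapters


def render(chapters):
    """Return a {filename: markdown} mapping, including a _toc.md index."""
    files = {}
    toc = ["# Table of Contents", ""]
    for index, chapter in enumerate(chapters, start=1):
        name = f"{index:02d}-{slugify(chapter['title'])}.md"
        body = "\n\n".join(chapter["body"])
        files[name] = f"# {chapter['title']}\n\n{body}\n"
        toc.append(f"{index}. [{chapter['title']}]({name})")
    files["_toc.md"] = "\n".join(toc) + "\n"
    return files


# Hypothetical input: lines already extracted from a PDF with font sizes.
sample = [
    ("Introduction", 18.0),
    ("This report covers the basics.", 11.0),
    ("It has two chapters.", 11.0),
    ("Getting Started", 18.0),
    ("Install the tool first.", 11.0),
]
files = render(split_into_chapters(sample))
for name in sorted(files):
    print(name)
```

The sketch keeps everything in memory so the splitting logic can be checked without touching the file system. Writing each entry of `files` into the output directory would be a separate step. A real tool would likely combine the font-size test with the style and position cues mentioned in the feature list, since font size alone can misclassify pull quotes or table captions.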