Page Index - Vermont-Complex-Systems/pdf-zoo GitHub Wiki
83 page(s) in this GitHub Wiki:
- Home
- 📄 PDF Zoo Wiki
- 🧩 The PDF Parsing Challenge
- ⭐ My Faves
- 🚀 Quick Start
- 📚 OCR Solutions
- 🔧 Processing Tools
- ☁️ Cloud Services
- 📋 Entry Format
- 🏷️ Tag System (TAGxonomy)
- Primary Categories
- Functionality Tags
- 🤝 Contributing
- AWS Textract
- Azure Document Intelligence
- docling
- donut
- easyOCR
- florence
- Google Document AI
- GOT ocr2
- grobid
- kosmos 2.5
- langextract
- marker
- MinerU
- molmo
- nougat
- NuExtract
- OCRmyPDF
- olmocr
- Other ressources
- PaddleOCR
- PDF Extract Kit
- pdfium
- pdfminer.six
- pdfplumber
- publaynet
- PyMuPDF
- pypdf2
- pypdfium2
- Relevant Models
- s2orc doc2json
- spacylayout
- surya
- tesseract
- textra
- The PDF Parsing Challenge