tags : Machine Learning, Image Compression, Computer Vision
Comparison
| Type | Name | Description |
|---|---|---|
| Service | Claude/OpenAI/AWS | Hosted services accessed through their APIs. |
| LSTM-CNN | Tesseract | |
| PP-OCR (DB+CRNN) | PaddleOCR | Works with rotated text. |
| | EasyOCR | |
| Toolbox, modular models | doctr | Some people report it works better than PaddleOCR and Tesseract. |
| PyTorch + OpenMMLab | MMOCR | Might be a good fit if you already use MMDetection. |
| | surya | Documents only; does not work on handwriting. Faster than Tesseract, with multi-language support. Tries to infer the correct reading order. |
| VLM | TrOCR | |
| VLM | DONUT | |
| VLM | InternVL | |
| VLM | Idefics2 | |
Resources
- Show HN: Gogosseract, a Go Lib for CGo-Free Tesseract OCR via Wazero | Hacker News
- Qwen2-VL-7B Instruct model gets 100% accuracy extracting text from this handwritten document