tags : Machine Learning, Image Compression, Computer Vision, Deploying ML applications (applied ML)

Comparison

| Type | Name | Description |
|---|---|---|
| Service | Claude/OpenAI/AWS | Hosted services with APIs |
| LSTM-CNN | Tesseract | |
| PP-OCR (DB+CRNN) | PaddleOCR | Works with rotated text |
| | EasyOCR | |
| Toolbox, modular models | docTR | Some people report it works better than PaddleOCR and Tesseract. |
| PyTorch + OpenMMLab | MMOCR | Might be nice if you already use MMDetection |
| | surya | Only for documents; doesn't work on handwriting. Faster than Tesseract, has language support, and tries to guess the proper reading order. |
| VLM | MGP-STR | new kid (2024) |
| VLM | GOT | new kid (2024) |
| VLM | olmOCR | olmOCR – Open-Source OCR for Accurate Document Conversion (has a comparison to GOT) |
| VLM | RolmOCR | better and faster olmOCR |
| VLM | TrOCR | |
| VLM | DONUT | |
| VLM | InternVL | |
| VLM | Idefics2 | |
  • olmOCR introduces a technique it calls “Document Anchoring”: any text and metadata already present in the PDF file are passed to the model alongside the page image, which improves the quality of the extracted text.
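
To make the Document Anchoring idea concrete, here is a minimal sketch of how embedded PDF text could be turned into an "anchor" string that accompanies the page image in a VLM prompt. This is not olmOCR's actual code: `TextBlock`, `build_anchor_prompt`, the `[XxY]text` line format, and the character budget are all assumptions made for illustration, and the text blocks are assumed to have been extracted from the PDF already.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    """Hypothetical container for text already embedded in a PDF page."""
    text: str
    x: float  # left edge, in PDF points
    y: float  # top edge, in PDF points, measured from the top of the page


def build_anchor_prompt(blocks, page_width, page_height, max_chars=4000):
    """Build an anchor string from embedded text blocks (illustrative format).

    Blocks are ordered top-to-bottom, then left-to-right, and emitted one per
    line with their coordinates until the character budget is used up.
    """
    ordered = sorted(blocks, key=lambda b: (b.y, b.x))
    lines = [f"Page dimensions: {page_width:.1f}x{page_height:.1f}"]
    used = len(lines[0])
    for block in ordered:
        text = " ".join(block.text.split())  # collapse internal whitespace
        if not text:
            continue  # skip blocks that contain only whitespace
        line = f"[{block.x:.0f}x{block.y:.0f}]{text}"
        if used + len(line) + 1 > max_chars:
            break  # stay within the prompt budget
        lines.append(line)
        used += len(line) + 1  # +1 for the newline separator
    return "\n".join(lines)


if __name__ == "__main__":
    blocks = [
        TextBlock("World", 50, 100),
        TextBlock("Hello", 10, 100),
        TextBlock("Title", 10, 20),
    ]
    # The anchor text would be sent to the model together with the page image.
    print(build_anchor_prompt(blocks, 612, 792))
```

The character budget matters because the anchor text competes with the image and the instructions for prompt space; dense pages may need truncation or sampling of blocks.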

Resources