AI Tools.

Image-to-text models

11 models · ranked by HuggingFace downloads

GLM-OCR

GLM-OCR is a multilingual OCR and document understanding model from ZhipuAI, built on the GLM architecture. It supports text recognition in Chinese, English, French, Spanish, Russian, German, Japanese, and Korean. It treats OCR as a sequence generation task, which enables structured text extraction from document images and screenshots. It is released under the MIT license.

8,366,555 ↓ · 1,700 ♡

blip-image-captioning-base

blip-image-captioning-base is the base-sized BLIP image captioning model from Salesforce, available on HuggingFace. It generates natural-language captions for input images.

2,260,611 ↓ · 851 ♡

blip-image-captioning-large

blip-image-captioning-large is the large-sized BLIP image captioning model from Salesforce, available on HuggingFace. It generates natural-language captions for input images.

804,981 ↓ · 1,473 ♡

trocr-base-printed

trocr-base-printed is the base-sized TrOCR model from Microsoft, fine-tuned for printed text and available on HuggingFace. It is a Transformer encoder-decoder that transcribes images of single text lines.

619,561 ↓ · 206 ♡

PP-OCRv5_server_det

PP-OCRv5_server_det is the server-grade text detection model of the PP-OCRv5 pipeline from PaddleOCR, available on HuggingFace. It locates text regions in an image and is paired with a separate recognition model to produce text.

602,114 ↓ · 61 ♡

blip2-opt-2.7b-coco

blip2-opt-2.7b-coco is a BLIP-2 model from Salesforce that uses OPT-2.7b as its language model and is fine-tuned on COCO, available on HuggingFace. It is used for image captioning.

585,916 ↓ · 11 ♡

pix2text-mfr

pix2text-mfr is the mathematical formula recognition (MFR) model from the Pix2Text project, available on HuggingFace. It converts images of formulas into LaTeX.

451,632 ↓ · 54 ♡

UVDoc

UVDoc is a document image unwarping model available on HuggingFace. It flattens photographed pages that are curved or distorted, typically as a preprocessing step before OCR.

417,047 ↓ · 8 ♡

PP-LCNet_x1_0_doc_ori

PP-LCNet_x1_0_doc_ori is a document image orientation classifier built on the PP-LCNet backbone, available on HuggingFace. It predicts how a page is rotated so that the rotation can be corrected before OCR.

366,785 ↓ · 12 ♡

en_PP-OCRv5_mobile_rec

en_PP-OCRv5_mobile_rec is the lightweight (mobile) English text recognition model of the PP-OCRv5 pipeline from PaddleOCR, available on HuggingFace. It transcribes cropped text-line images into text.

342,559 ↓ · 2 ♡

nougat-base

nougat-base is the base-sized Nougat model from Meta AI, available on HuggingFace. It converts images of academic document pages into structured markup text.

308,646 ↓ · 189 ♡