AI Tools.

feature extraction models

62 models · ranked by HuggingFace downloads

bge-small-en-v1.5

Small English dense embedding model from BAAI's BGE (BAAI General Embedding) series, producing 384-dimensional vectors and released under the MIT license. Optimized for MTEB retrieval benchmarks through a retrieval-focused training strategy, it achieves competitive scores relative to its parameter count. Suited to embedding workflows where throughput and cost matter more than peak accuracy.

34,386,222 ↓ · 451 ♡

bge-large-en-v1.5

BGE-Large-EN-v1.5 is BAAI's highest-capacity English embedding model in the v1.5 series, producing 1024-dimensional vectors. It achieves top MTEB retrieval scores among English-only embedding models of its generation, at the cost of higher compute and storage than BGE-small or BGE-base. MIT licensed with ONNX export support.

14,929,062 ↓ · 657 ♡

bge-base-en-v1.5

BGE-Base-EN-v1.5 is BAAI's mid-tier English embedding model in the v1.5 series, producing 768-dimensional vectors. It balances accuracy and compute cost between the small (384d) and large (1024d) variants, making it a practical default for English retrieval tasks where the storage and inference overhead of the large model is undesirable. MIT licensed with ONNX export.

8,365,829 ↓ · 414 ♡
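
The storage trade-off between the three BGE v1.5 sizes can be estimated with simple arithmetic. The sketch below is illustrative only: it assumes uncompressed float32 vectors (4 bytes per value), ignores any index overhead, and uses a hypothetical corpus of one million embeddings. The function and variable names are made up for this example; only the dimensions come from the listings above.

```python
# Dimensions as stated in the listings above.
BGE_V15_DIMS = {
    "bge-small-en-v1.5": 384,
    "bge-base-en-v1.5": 768,
    "bge-large-en-v1.5": 1024,
}


def embedding_storage_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw size of num_vectors dense vectors (float32 assumed by default)."""
    return num_vectors * dim * bytes_per_value


def storage_table(num_vectors: int) -> dict[str, float]:
    """Raw storage in GiB per model for a corpus of num_vectors embeddings."""
    return {
        name: embedding_storage_bytes(num_vectors, dim) / 2**30
        for name, dim in BGE_V15_DIMS.items()
    }


if __name__ == "__main__":
    # Hypothetical corpus size, chosen only for illustration.
    for name, gib in storage_table(1_000_000).items():
        print(f"{name}: {gib:.2f} GiB")
```

Under these assumptions the large model needs roughly 2.7 times the raw vector storage of the small one (1024 / 384), before any quantization or index structures are taken into account.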

multilingual-e5-large

Multilingual-E5-Large is a 560-million-parameter multilingual embedding model from Microsoft Research, supporting 100+ languages via an XLM-RoBERTa backbone. It follows E5's prefix convention, in which each input is prepended with 'query: ' or 'passage: ' to mark its role (free-form task instructions belong to the separate multilingual-e5-large-instruct variant), and it achieves strong MTEB multilingual retrieval scores. MIT licensed with ONNX and OpenVINO export.

7,225,099 ↓ · 1,186 ♡
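
The 'query:' / 'passage:' prefixing mentioned for Multilingual-E5-Large is plain string preprocessing applied before the text reaches the model. A minimal sketch follows, assuming the prefix is followed by a single space; the helper name `add_e5_prefix` is hypothetical and not part of any library.

```python
def add_e5_prefix(texts: list[str], kind: str) -> list[str]:
    """Prepend the E5-style role prefix to each text.

    kind must be "query" (search queries) or "passage" (documents to index).
    The "<kind>: " format with one trailing space is an assumption here.
    """
    if kind not in ("query", "passage"):
        raise ValueError(f"kind must be 'query' or 'passage', got {kind!r}")
    return [f"{kind}: {text}" for text in texts]


if __name__ == "__main__":
    queries = add_e5_prefix(["how do dense embeddings work"], "query")
    passages = add_e5_prefix(["Dense embeddings map text to vectors."], "passage")
    print(queries, passages)
```

The prefixed strings would then be passed to whatever embedding runtime is in use. Queries and indexed documents should use their respective prefixes consistently, since the model was trained with them.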

Qwen3-Embedding-0.6B

Qwen3-Embedding-0.6B is Alibaba Cloud's compact embedding model from the Qwen3 series, fine-tuned from Qwen3-0.6B-Base for text embedding tasks. At 0.6B parameters it provides instruction-following embedding capability at a size deployable without dedicated GPU infrastructure. Apache 2.0 licensed.

5,804,187 ↓ · 1,008 ♡

mxbai-embed-large-v1

mxbai-embed-large-v1 is Mixedbread AI's English embedding model producing 1024-dimensional vectors, trained for retrieval and ranking tasks using angle-optimized contrastive learning (AnglE). It achieves strong MTEB retrieval scores among English embedding models. Apache 2.0 licensed.

4,428,685 ↓ · 793 ♡

all-MiniLM-L6-v2

all-MiniLM-L6-v2 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

3,325,641 ↓ · 116 ♡

w2v-bert-2.0

w2v-bert-2.0 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

3,295,565 ↓ · 214 ♡

jina-embeddings-v3

jina-embeddings-v3 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

2,790,515 ↓ · 1,141 ♡

bge-small-zh-v1.5

bge-small-zh-v1.5 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

2,363,385 ↓ · 113 ♡

multilingual-e5-small

multilingual-e5-small is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

1,895,024 ↓ · 10 ♡

Qwen3-Embedding-8B

Qwen3-Embedding-8B is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

1,890,124 ↓ · 673 ♡

Qwen3-Embedding-4B

Qwen3-Embedding-4B is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

1,845,055 ↓ · 262 ♡

SapBERT-from-PubMedBERT-fulltext

SapBERT-from-PubMedBERT-fulltext is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

1,659,792 ↓ · 67 ♡

bge-base-en-v1.5

bge-base-en-v1.5 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

1,621,048 ↓ · 9 ♡

UAE-Large-V1

UAE-Large-V1 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

1,416,906 ↓ · 237 ♡

multilingual-e5-large-instruct

multilingual-e5-large-instruct is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

1,362,959 ↓ · 620 ♡

1

1 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

1,334,054 ↓ · 1 ♡

repeat

repeat is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

1,241,456 ↓ · 0 ♡

bge-multilingual-gemma2

bge-multilingual-gemma2 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

1,180,937 ↓ · 200 ♡

bge-reranker-large

bge-reranker-large is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

1,118,253 ↓ · 457 ♡

granite-embedding-small-english-r2

granite-embedding-small-english-r2 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

1,085,468 ↓ · 68 ♡

bge-large-zh-v1.5

bge-large-zh-v1.5 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

925,757 ↓ · 622 ♡

mimi

mimi is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

871,974 ↓ · 299 ♡

clap-htsat-unfused

clap-htsat-unfused is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

856,341 ↓ · 71 ♡

indobert-base-p1

indobert-base-p1 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

842,279 ↓ · 47 ♡

llama-nemotron-embed-1b-v2

llama-nemotron-embed-1b-v2 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

829,960 ↓ · 55 ♡

jina-embeddings-v2-small-en

jina-embeddings-v2-small-en is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

814,960 ↓ · 141 ♡

bart-base

bart-base is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

731,918 ↓ · 205 ♡

biobert-v1.1

biobert-v1.1 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

722,758 ↓ · 111 ♡

FRIDA

FRIDA is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

699,189 ↓ · 136 ♡

other

other is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

634,268 ↓ · 0 ♡

specter2_base

specter2_base is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

623,623 ↓ · 44 ♡

wavlm-base-plus

wavlm-base-plus is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

610,825 ↓ · 36 ♡

opensearch-neural-sparse-encoding-doc-v2-distill

opensearch-neural-sparse-encoding-doc-v2-distill is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

608,922 ↓ · 19 ♡

wavlm-large

wavlm-large is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

598,031 ↓ · 105 ♡

conv-bert-base

conv-bert-base is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

597,872 ↓ · 10 ♡

splade-cocondenser-ensembledistil

splade-cocondenser-ensembledistil is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

595,975 ↓ · 61 ♡

vram-16

vram-16 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

564,192 ↓ · 0 ♡

SapBERT-from-PubMedBERT-fulltext-mean-token

SapBERT-from-PubMedBERT-fulltext-mean-token is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

564,117 ↓ · 2 ♡

OTel-Embedding-300M

OTel-Embedding-300M is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

554,398 ↓ · 0 ♡

OTel-Embedding-33M

OTel-Embedding-33M is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

513,581 ↓ · 0 ♡

TinyBERT_L-4_H-312_v2

TinyBERT_L-4_H-312_v2 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

497,323 ↓ · 1 ♡

OTel-Embedding-109M

OTel-Embedding-109M is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

496,440 ↓ · 1 ♡

e5-base-sts-en-de

e5-base-sts-en-de is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

495,215 ↓ · 17 ♡

jina-embeddings-v2-base-de

jina-embeddings-v2-base-de is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

483,861 ↓ · 83 ♡

bge-base-zh-v1.5

bge-base-zh-v1.5 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

431,976 ↓ · 105 ♡

bge-small-en

bge-small-en is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

421,841 ↓ · 92 ♡

MedCPT-Query-Encoder

MedCPT-Query-Encoder is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

416,485 ↓ · 61 ♡

bge-small-en-v1.5

bge-small-en-v1.5 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

405,253 ↓ · 14 ♡

lambda

lambda is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

390,394 ↓ · 0 ♡

e5-mistral-7b-instruct

e5-mistral-7b-instruct is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

382,399 ↓ · 564 ♡

rubert-base-cased

rubert-base-cased is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

371,896 ↓ · 127 ♡

distilbert-base-nli-mean-tokens

distilbert-base-nli-mean-tokens is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

349,715 ↓ · 13 ♡

opensearch-neural-sparse-encoding-v2-distill

opensearch-neural-sparse-encoding-v2-distill is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

342,318 ↓ · 10 ♡

OTel-Embedding-34M

OTel-Embedding-34M is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

333,330 ↓ · 0 ♡

bge-base-zh

bge-base-zh is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

332,507 ↓ · 58 ♡

paraphrase-albert-small-v2

paraphrase-albert-small-v2 is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

306,091 ↓ · 2 ♡

OTel-Embedding-22M

OTel-Embedding-22M is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

299,119 ↓ · 0 ♡

SFR-Embedding-2_R

SFR-Embedding-2_R is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

297,280 ↓ · 94 ♡

dac_44khz

dac_44khz is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

292,706 ↓ · 11 ♡

codebert-base

codebert-base is an open-source feature-extraction model available on HuggingFace. Details are sourced from the public model registry.

287,705 ↓ · 285 ♡