Fill-mask

bert-base-uncased

Google's original BERT base model in uncased form, pre-trained on BookCorpus and English Wikipedia via masked language modeling. Tokens are lowercased before processing, making it insensitive to capitalization. It remains a standard fine-tuning base for classification, NER, and extractive QA, though newer encoders outperform it on most benchmarks.
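A minimal sketch of the model's native masked-language-modeling task using the HuggingFace pipeline API (the example sentence is illustrative):

```python
from transformers import pipeline

# Load the fill-mask pipeline backed by bert-base-uncased.
# The model predicts the token hidden behind [MASK].
unmasker = pipeline("fill-mask", model="bert-base-uncased")

# Input is lowercased by the tokenizer, so capitalization does not matter.
predictions = unmasker("The capital of France is [MASK].")
for p in predictions:
    print(f"{p['token_str']}: {p['score']:.3f}")
```

By default the pipeline returns the five highest-scoring candidate tokens with their probabilities.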

Use cases

  • Fine-tuning for text classification (sentiment, topic, intent)
  • Named entity recognition with a token classification head
  • Extractive question answering on short passages
  • Sentence embedding via mean pooling of hidden states
  • Transfer learning starting point for domain-specific NLP tasks
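The sentence-embedding use case above can be sketched with a mask-aware mean pool over the final hidden states (sentences here are illustrative):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentences = ["the cat sat on the mat", "a dog slept on the rug"]
batch = tokenizer(sentences, padding=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq_len, 768)

# Mean-pool over real tokens only, using the attention mask so that
# padding positions do not dilute the average.
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # torch.Size([2, 768])
```

Purpose-built sentence encoders usually outperform raw BERT embeddings, but this remains a common and cheap baseline.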

Pros

  • Extensively benchmarked — failure modes and quirks well documented
  • Multi-framework support: PyTorch, TensorFlow, JAX, CoreML, ONNX, Rust
  • Apache 2.0 license; large ecosystem of domain-specific fine-tuned checkpoints
  • Low barrier for integration in HuggingFace-based pipelines

Cons

  • Lowercase tokenization breaks case-sensitive tasks like proper noun NER
  • 512-token context window insufficient for long documents without chunking
  • Encoder-only architecture cannot generate free-form text
  • Outperformed by DeBERTa and more recent encoders on most NLU benchmarks
  • No multilingual capability in the base checkpoint
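The 512-token limit noted above is usually worked around by sliding-window chunking, which the fast tokenizer supports directly; a sketch with an illustrative stride of 64 overlapping tokens:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

long_text = "word " * 2000  # well past the 512-token window

# Split into overlapping 512-token windows; `stride` sets how many
# tokens consecutive windows share, so no context is lost at a cut.
chunks = tokenizer(
    long_text,
    max_length=512,
    truncation=True,
    stride=64,
    return_overflowing_tokens=True,
)

print(len(chunks["input_ids"]))     # number of windows produced
print(len(chunks["input_ids"][0]))  # 512 tokens each (incl. [CLS]/[SEP]);
                                    # the final window may be shorter
```

Per-window predictions then have to be aggregated (e.g. max or mean over windows), which the model itself does not do for you.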

FAQ

What is bert-base-uncased used for?

It is primarily a fine-tuning base: text classification (sentiment, topic, intent), named entity recognition with a token classification head, and extractive question answering on short passages. It is also used for sentence embeddings via mean pooling of hidden states, and more generally as a transfer learning starting point for domain-specific NLP tasks.

Is bert-base-uncased free to use?

Yes. bert-base-uncased is released under the Apache 2.0 license, which permits free commercial and research use. Fine-tuned derivatives published by third parties may carry different licenses, so check each checkpoint's model card.

How do I run bert-base-uncased locally?

bert-base-uncased loads directly with the HuggingFace transformers library in PyTorch, TensorFlow, or JAX. At roughly 110M parameters it runs comfortably on CPU for inference; a GPU is only needed for fine-tuning at scale. See the model card for framework-specific instructions.
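A minimal local-inference sketch using the Auto classes (the example sentence is illustrative; the first run downloads the weights to the local HuggingFace cache):

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

inputs = tokenizer("Paris is the [MASK] of France.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Find the [MASK] position and decode the highest-scoring token there.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
predicted = tokenizer.decode(logits[0, mask_pos].argmax())
print(predicted)
```

For classification or NER you would swap in `AutoModelForSequenceClassification` or `AutoModelForTokenClassification`, which add a freshly initialized head on top of the same encoder for fine-tuning.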

Tags

transformers, pytorch, tf, jax, rust, coreml, onnx, safetensors, bert, fill-mask, exbert, en, dataset:bookcorpus, dataset:wikipedia, arxiv:1810.04805, license:apache-2.0, endpoints_compatible, deploy:azure, region:us