The Hugging Face Model Hub

Browsing models, model cards, Spaces, Gradio, and licensing


The [Hugging Face Hub](https://huggingface.co) is the largest repository of pre-trained machine learning models, hosting over 400,000 models across NLP, computer vision, audio, and multimodal tasks. Think of it as GitHub for ML models.

Browsing the Hub

Search and Filters

The Hub provides powerful filtering to find the right model:

  • Task: text-classification, summarization, translation, image-classification, etc.
  • Library: Transformers, spaCy, TensorFlow, PyTorch, ONNX, etc.
  • Language: Filter by training language (English, multilingual, etc.)
  • Dataset: Filter by training dataset
  • License: apache-2.0, MIT, cc-by-4.0, etc.
  • Sort by: Downloads, likes, trending, recently updated

Using the Hub API

```python
from huggingface_hub import HfApi

api = HfApi()

# Search for sentiment analysis models, most downloaded first
models = api.list_models(
    task="text-classification",
    sort="downloads",
    direction=-1,
    limit=5,
)

for model in models:
    print(f"{model.id:50s} Downloads: {model.downloads:,}")
```

Model Cards

Every well-maintained model has a Model Card (README.md) that describes:

  • **Intended Use**: What the model was designed for
  • **Training Data**: What data it was trained on
  • **Performance Metrics**: Benchmark results
  • **Limitations & Biases**: Known shortcomings
  • **How to Use**: Code examples
  • **Carbon Footprint**: Environmental impact of training

Always read the model card before using a model in production!
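Model cards can also be inspected programmatically: huggingface_hub parses the YAML front matter of a card into structured metadata. A minimal sketch (the card text below is a made-up example, not a real repo):

```python
from huggingface_hub import ModelCard

# A made-up model card: YAML front matter plus markdown body,
# as it would appear in a repo's README.md
content = """\
---
license: apache-2.0
language: en
tags:
- sentiment-analysis
---

# My Model

Fine-tuned for sentiment analysis.
"""

card = ModelCard(content)
print(card.data.license)  # license tag parsed from the metadata
print(card.data.tags)     # list of tags
```

For a real repository, `ModelCard.load("repo-id")` fetches and parses the card straight from the Hub.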

Downloading and Using Models

From the Hub

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Downloads automatically on first use, then cached locally
model = AutoModelForSequenceClassification.from_pretrained(
    "cardiffnlp/twitter-roberta-base-sentiment-latest"
)
tokenizer = AutoTokenizer.from_pretrained(
    "cardiffnlp/twitter-roberta-base-sentiment-latest"
)
```

Specifying Revisions

```python
from transformers import AutoModel

# Use a specific version/commit
model = AutoModel.from_pretrained(
    "bert-base-uncased",
    revision="main",        # branch name
    # revision="v1.0",      # tag
    # revision="abc123...", # commit hash
)
```

Cache Management

```python
import os

# Default cache location: ~/.cache/huggingface/hub

# Override with an environment variable
# (set this before importing transformers, or in your shell)
os.environ["HF_HOME"] = "/my/custom/cache/path"

# Or specify per call
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased", cache_dir="/my/cache")
```
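To see where a given configuration will put downloads, the resolution order can be sketched in plain Python. Note that `resolve_hf_cache` is an illustrative helper, not part of any library, and the sketch is simplified (the real library also honors variables like HF_HUB_CACHE):

```python
import os
from pathlib import Path

def resolve_hf_cache(env=None):
    """Illustrative sketch of hub cache resolution: HF_HOME wins,
    otherwise fall back to the default under the user's home."""
    env = os.environ if env is None else env
    hf_home = env.get("HF_HOME")
    if hf_home:
        return str(Path(hf_home) / "hub")
    return str(Path.home() / ".cache" / "huggingface" / "hub")

print(resolve_hf_cache({"HF_HOME": "/my/custom/cache/path"}))
# /my/custom/cache/path/hub
```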

Pushing Models to the Hub

```python
from huggingface_hub import login

# Authenticate (get a token from huggingface.co/settings/tokens)
login(token="hf_...")

# Push a model and tokenizer
model.push_to_hub("my-username/my-fine-tuned-model")
tokenizer.push_to_hub("my-username/my-fine-tuned-model")

# Or, when training with Trainer: set hub_model_id in TrainingArguments,
# then push model, tokenizer, and training card in one call
trainer.push_to_hub()
```

Hugging Face Spaces

Spaces are hosted demo applications powered by models on the Hub. They support:

  • Gradio: Python-based UI library (most popular for ML demos)
  • Streamlit: Python web app framework
  • Docker: Full custom applications
  • Static HTML: Simple static pages
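Whichever SDK you pick, a Space is just a Git repository with an entry point. For a Gradio Space, the minimal layout looks like this (the package list is an example; by convention Spaces run `app.py`, though the entry file can be changed via the `app_file` field in the Space's README metadata):

```
my-space/
├── app.py            # defines and launches the Gradio demo
└── requirements.txt  # extra packages, e.g. transformers and torch
```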

Creating a Gradio Demo

Gradio is the easiest way to create interactive ML demos:

```python
import gradio as gr
from transformers import pipeline

# Load model
classifier = pipeline("sentiment-analysis")

def analyze_sentiment(text):
    """Analyze sentiment of input text."""
    result = classifier(text)[0]
    label = result["label"]
    score = result["score"]
    return f"{label} (confidence: {score:.2%})"

# Create the interface
demo = gr.Interface(
    fn=analyze_sentiment,
    inputs=gr.Textbox(label="Enter text", placeholder="Type something to analyze..."),
    outputs=gr.Textbox(label="Sentiment"),
    title="Sentiment Analyzer",
    description="Analyze the sentiment of any text using DistilBERT.",
    examples=[
        ["I love sunny days!"],
        ["This product is terrible."],
        ["The movie was okay, nothing special."],
    ],
)

# Launch locally
demo.launch()

# Or launch with a temporary public link
# demo.launch(share=True)
```

    Advanced Gradio Features

    import gradio as gr
    from transformers import pipeline

    summarizer = pipeline("summarization") ner = pipeline("ner", aggregation_strategy="simple")

    def summarize(text, max_len, min_len): result = summarizer(text, max_length=max_len, min_length=min_len) return result[0]["summary_text"]

    def extract_entities(text): entities = ner(text) # Format as a table rows = [] for e in entities: rows.append([e["word"], e["entity_group"], f"{e['score']:.3f}"]) return rows

    Tabbed interface with multiple tools

    with gr.Blocks() as demo: gr.Markdown("# NLP Toolkit")

    with gr.Tab("Summarization"): text_input = gr.Textbox(label="Input Text", lines=5) max_slider = gr.Slider(20, 200, value=50, label="Max Length") min_slider = gr.Slider(5, 50, value=15, label="Min Length") summary_output = gr.Textbox(label="Summary") summarize_btn = gr.Button("Summarize") summarize_btn.click( summarize, inputs=[text_input, max_slider, min_slider], outputs=summary_output )

    with gr.Tab("NER"): ner_input = gr.Textbox(label="Input Text", lines=3) ner_output = gr.Dataframe( headers=["Entity", "Type", "Score"], label="Entities" ) ner_btn = gr.Button("Extract Entities") ner_btn.click(extract_entities, inputs=ner_input, outputs=ner_output)

    demo.launch()

Licensing Matters

Always check the license before using a model in production:

  • **apache-2.0 / MIT**: Very permissive, commercial use allowed
  • **cc-by-4.0**: Must give attribution
  • **cc-by-nc-4.0**: Non-commercial only
  • **openrail / bigscience-openrail-m**: Use restrictions on harmful applications
  • **llama2 / llama3**: Meta's custom licenses with specific terms

Some models have no license specified; treat these with caution for commercial use.
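These categories can be turned into a quick pre-flight check. The helper below is a hypothetical sketch: `commercial_use_status` and its license sets mirror the list above and are deliberately incomplete, so when in doubt, read the actual license text.

```python
# Hypothetical pre-flight check on a model's license tag.
# The sets mirror the categories above and are not exhaustive.
PERMISSIVE = {"apache-2.0", "mit"}
ATTRIBUTION = {"cc-by-4.0"}
NON_COMMERCIAL = {"cc-by-nc-4.0"}

def commercial_use_status(license_tag):
    if license_tag is None:
        return "no license: manual review required"
    tag = license_tag.lower()
    if tag in PERMISSIVE:
        return "allowed"
    if tag in ATTRIBUTION:
        return "allowed with attribution"
    if tag in NON_COMMERCIAL:
        return "not allowed"
    return "custom/other: read the license text"

print(commercial_use_status("apache-2.0"))  # allowed
print(commercial_use_status(None))          # no license: manual review required
```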

Model Comparison Strategies

When selecting a model for your project, consider:

| Factor | What to Check |
| --- | --- |
| Accuracy | Benchmark scores on your target task |
| Speed | Model size, inference time, latency requirements |
| Size | Parameter count, disk/memory footprint |
| License | Commercial use allowed? |
| Maintenance | When was it last updated? Active community? |
| Data | What was it trained on? Any data contamination risks? |
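One way to make these trade-offs explicit is a simple weighted score. The sketch below is purely illustrative: `rank_models`, the factor scores, and the weights are all made up for the example, not benchmark data.

```python
# Illustrative decision matrix: factor scores are normalized to 0-1
# by hand, weights reflect project priorities. All numbers are made up.
def rank_models(candidates, weights):
    """candidates: {name: {factor: score}}, weights: {factor: weight}."""
    totals = {
        name: sum(weights[factor] * score for factor, score in scores.items())
        for name, scores in candidates.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

candidates = {
    "model-a": {"accuracy": 0.9, "speed": 0.4, "size": 0.3},
    "model-b": {"accuracy": 0.7, "speed": 0.9, "size": 0.8},
}
weights = {"accuracy": 0.5, "speed": 0.3, "size": 0.2}

for name, total in rank_models(candidates, weights):
    print(f"{name}: {total:.2f}")
# model-b: 0.78
# model-a: 0.63
```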

Practical Comparison

```python
import time

from transformers import pipeline

models = [
    "distilbert-base-uncased-finetuned-sst-2-english",
    "nlptown/bert-base-multilingual-uncased-sentiment",
    "cardiffnlp/twitter-roberta-base-sentiment-latest",
]

test_texts = [
    "This is absolutely wonderful!",
    "Terrible experience, would not recommend.",
    "It's okay, nothing special.",
]

for model_name in models:
    pipe = pipeline("sentiment-analysis", model=model_name)
    start = time.time()
    results = pipe(test_texts)
    elapsed = time.time() - start

    print(f"\nModel: {model_name}")
    print(f"  Time: {elapsed:.3f}s")
    for text, result in zip(test_texts, results):
        print(f"  {text[:30]:30s} -> {result['label']} ({result['score']:.3f})")
```