The Hugging Face Model Hub

Browsing models, model cards, Spaces, Gradio, and licensing


The [Hugging Face Hub](https://huggingface.co) is the largest repository of pre-trained machine learning models, hosting over 400,000 models across NLP, computer vision, audio, and multimodal tasks. Think of it as GitHub for ML models.

Browsing the Hub

Search and Filters

The Hub provides powerful filtering to find the right model:

  • Task: text-classification, summarization, translation, image-classification, etc.
  • Library: Transformers, spaCy, TensorFlow, PyTorch, ONNX, etc.
  • Language: Filter by training language (English, multilingual, etc.)
  • Dataset: Filter by training dataset
  • License: apache-2.0, MIT, cc-by-4.0, etc.
  • Sort by: Downloads, likes, trending, recently updated

Using the Hub API

```python
from huggingface_hub import HfApi

api = HfApi()

# Search for sentiment analysis models, most downloaded first
models = api.list_models(
    task="text-classification",
    sort="downloads",
    direction=-1,
    limit=5,
)

for model in models:
    print(f"{model.id:50s} Downloads: {model.downloads:,}")
```

Model Cards

Every well-maintained model has a Model Card (README.md) that describes:

  • **Intended Use**: What the model was designed for
  • **Training Data**: What data it was trained on
  • **Performance Metrics**: Benchmark results
  • **Limitations & Biases**: Known shortcomings
  • **How to Use**: Code examples
  • **Carbon Footprint**: Environmental impact of training

Always read the model card before using a model in production!
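Model cards can also be inspected programmatically: huggingface_hub parses the YAML front matter of a card into structured metadata. A minimal sketch (the card text below is a made-up example, not a real repo):

```python
from huggingface_hub import ModelCard

# A made-up model card: YAML front matter plus markdown body,
# as it would appear in a repo's README.md
content = """\
---
license: apache-2.0
language: en
tags:
- sentiment-analysis
---

# My Model

Fine-tuned for sentiment analysis.
"""

card = ModelCard(content)
print(card.data.license)  # license tag parsed from the metadata
print(card.data.tags)     # list of tags
```

For a real repository, `ModelCard.load("repo-id")` fetches and parses the card straight from the Hub.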

Downloading and Using Models

From the Hub

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Downloads automatically on first use, then cached locally
model = AutoModelForSequenceClassification.from_pretrained(
    "cardiffnlp/twitter-roberta-base-sentiment-latest"
)
tokenizer = AutoTokenizer.from_pretrained(
    "cardiffnlp/twitter-roberta-base-sentiment-latest"
)
```

Specifying Revisions

```python
from transformers import AutoModel

# Use a specific version/commit
model = AutoModel.from_pretrained(
    "bert-base-uncased",
    revision="main",        # branch name
    # revision="v1.0",      # tag
    # revision="abc123...", # commit hash
)
```

Cache Management

```python
import os

# Default cache location: ~/.cache/huggingface/hub

# Override with an environment variable
# (set this before importing transformers, or in your shell)
os.environ["HF_HOME"] = "/my/custom/cache/path"

# Or specify per call
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased", cache_dir="/my/cache")
```
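To see where a given configuration will put downloads, the resolution order can be sketched in plain Python. Note that `resolve_hf_cache` is an illustrative helper, not part of any library, and the sketch is simplified (the real library also honors variables like HF_HUB_CACHE):

```python
import os
from pathlib import Path

def resolve_hf_cache(env=None):
    """Illustrative sketch of hub cache resolution: HF_HOME wins,
    otherwise fall back to the default under the user's home."""
    env = os.environ if env is None else env
    hf_home = env.get("HF_HOME")
    if hf_home:
        return str(Path(hf_home) / "hub")
    return str(Path.home() / ".cache" / "huggingface" / "hub")

print(resolve_hf_cache({"HF_HOME": "/my/custom/cache/path"}))
# /my/custom/cache/path/hub
```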

Pushing Models to the Hub

```python
from huggingface_hub import login

# Authenticate (get a token from huggingface.co/settings/tokens)
login(token="hf_...")

# Push a model and tokenizer
model.push_to_hub("my-username/my-fine-tuned-model")
tokenizer.push_to_hub("my-username/my-fine-tuned-model")

# Or, when training with Trainer: set hub_model_id in TrainingArguments,
# then push model, tokenizer, and training card in one call
trainer.push_to_hub()
```

Hugging Face Spaces

Spaces are hosted demo applications powered by models on the Hub. They support:

  • Gradio: Python-based UI library (most popular for ML demos)
  • Streamlit: Python web app framework
  • Docker: Full custom applications
  • Static HTML: Simple static pages
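Whichever SDK you pick, a Space is just a Git repository with an entry point. For a Gradio Space, the minimal layout looks like this (the package list is an example; by convention Spaces run `app.py`, though the entry file can be changed via the `app_file` field in the Space's README metadata):

```
my-space/
├── app.py            # defines and launches the Gradio demo
└── requirements.txt  # extra packages, e.g. transformers and torch
```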

Creating a Gradio Demo

Gradio is the easiest way to create interactive ML demos:

```python
import gradio as gr
from transformers import pipeline

# Load model
classifier = pipeline("sentiment-analysis")

def analyze_sentiment(text):
    """Analyze sentiment of input text."""
    result = classifier(text)[0]
    label = result["label"]
    score = result["score"]
    return f"{label} (confidence: {score:.2%})"

# Create the interface
demo = gr.Interface(
    fn=analyze_sentiment,
    inputs=gr.Textbox(label="Enter text", placeholder="Type something to analyze..."),
    outputs=gr.Textbox(label="Sentiment"),
    title="Sentiment Analyzer",
    description="Analyze the sentiment of any text using DistilBERT.",
    examples=[
        ["I love sunny days!"],
        ["This product is terrible."],
        ["The movie was okay, nothing special."],
    ],
)

# Launch locally
demo.launch()

# Or launch with a temporary public link
# demo.launch(share=True)
```

    Advanced Gradio Features

    import gradio as gr
    from transformers import pipeline

    summarizer = pipeline("summarization") ner = pipeline("ner", aggregation_strategy="simple")

    def summarize(text, max_len, min_len): result = summarizer(text, max_length=max_len, min_length=min_len) return result[0]["summary_text"]

    def extract_entities(text): entities = ner(text) # Format as a table rows = [] for e in entities: rows.append([e["word"], e["entity_group"], f"{e['score']:.3f}"]) return rows

    Tabbed interface with multiple tools

    with gr.Blocks() as demo: gr.Markdown("# NLP Toolkit")

    with gr.Tab("Summarization"): text_input = gr.Textbox(label="Input Text", lines=5) max_slider = gr.Slider(20, 200, value=50, label="Max Length") min_slider = gr.Slider(5, 50, value=15, label="Min Length") summary_output = gr.Textbox(label="Summary") summarize_btn = gr.Button("Summarize") summarize_btn.click( summarize, inputs=[text_input, max_slider, min_slider], outputs=summary_output )

    with gr.Tab("NER"): ner_input = gr.Textbox(label="Input Text", lines=3) ner_output = gr.Dataframe( headers=["Entity", "Type", "Score"], label="Entities" ) ner_btn = gr.Button("Extract Entities") ner_btn.click(extract_entities, inputs=ner_input, outputs=ner_output)

    demo.launch()

Licensing Matters

Always check the license before using a model in production:

  • **apache-2.0 / MIT**: Very permissive, commercial use allowed
  • **cc-by-4.0**: Must give attribution
  • **cc-by-nc-4.0**: Non-commercial only
  • **openrail / bigscience-openrail-m**: Use restrictions on harmful applications
  • **llama2 / llama3**: Meta's custom licenses with specific terms

Some models have no license specified; treat these with caution for commercial use.
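These categories can be turned into a quick pre-flight check. The helper below is a hypothetical sketch: `commercial_use_status` and its license sets mirror the list above and are deliberately incomplete, so when in doubt, read the actual license text.

```python
# Hypothetical pre-flight check on a model's license tag.
# The sets mirror the categories above and are not exhaustive.
PERMISSIVE = {"apache-2.0", "mit"}
ATTRIBUTION = {"cc-by-4.0"}
NON_COMMERCIAL = {"cc-by-nc-4.0"}

def commercial_use_status(license_tag):
    if license_tag is None:
        return "no license: manual review required"
    tag = license_tag.lower()
    if tag in PERMISSIVE:
        return "allowed"
    if tag in ATTRIBUTION:
        return "allowed with attribution"
    if tag in NON_COMMERCIAL:
        return "not allowed"
    return "custom/other: read the license text"

print(commercial_use_status("apache-2.0"))  # allowed
print(commercial_use_status(None))          # no license: manual review required
```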

Model Comparison Strategies

When selecting a model for your project, consider:

| Factor | What to Check |
| --- | --- |
| Accuracy | Benchmark scores on your target task |
| Speed | Model size, inference time, latency requirements |
| Size | Parameter count, disk/memory footprint |
| License | Commercial use allowed? |
| Maintenance | When was it last updated? Active community? |
| Data | What was it trained on? Any data contamination risks? |
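One way to make these trade-offs explicit is a simple weighted score. The sketch below is purely illustrative: `rank_models`, the factor scores, and the weights are all made up for the example, not benchmark data.

```python
# Illustrative decision matrix: factor scores are normalized to 0-1
# by hand, weights reflect project priorities. All numbers are made up.
def rank_models(candidates, weights):
    """candidates: {name: {factor: score}}, weights: {factor: weight}."""
    totals = {
        name: sum(weights[factor] * score for factor, score in scores.items())
        for name, scores in candidates.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

candidates = {
    "model-a": {"accuracy": 0.9, "speed": 0.4, "size": 0.3},
    "model-b": {"accuracy": 0.7, "speed": 0.9, "size": 0.8},
}
weights = {"accuracy": 0.5, "speed": 0.3, "size": 0.2}

for name, total in rank_models(candidates, weights):
    print(f"{name}: {total:.2f}")
# model-b: 0.78
# model-a: 0.63
```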

Practical Comparison

```python
import time

from transformers import pipeline

models = [
    "distilbert-base-uncased-finetuned-sst-2-english",
    "nlptown/bert-base-multilingual-uncased-sentiment",
    "cardiffnlp/twitter-roberta-base-sentiment-latest",
]

test_texts = [
    "This is absolutely wonderful!",
    "Terrible experience, would not recommend.",
    "It's okay, nothing special.",
]

for model_name in models:
    pipe = pipeline("sentiment-analysis", model=model_name)
    start = time.time()
    results = pipe(test_texts)
    elapsed = time.time() - start

    print(f"\nModel: {model_name}")
    print(f"  Time: {elapsed:.3f}s")
    for text, result in zip(test_texts, results):
        print(f"  {text[:30]:30s} -> {result['label']} ({result['score']:.3f})")
```