AI System Design Patterns
Building a production AI system is much more than training a model. It requires designing data pipelines, serving infrastructure, monitoring systems, and governance frameworks that all work together. In this lesson, you'll learn the design patterns and architectural decisions behind real-world AI systems.
The 80/20 Rule of ML Systems
Build vs Buy vs Fine-Tune
The first architectural decision is whether to build a custom model, buy an off-the-shelf solution, or fine-tune a pre-trained model.
| Factor | Build from Scratch | Fine-Tune Pre-Trained | Buy / API |
|---|---|---|---|
| Data required | Large (10K-1M+ labeled) | Moderate (100-10K labeled) | None or few-shot |
| Time to deploy | Months | Weeks | Days |
| Cost | High (compute + team) | Moderate | Per-request pricing |
| Customization | Full control | Moderate | Limited |
| Maintenance | Full responsibility | Moderate | Vendor handles |
| Data privacy | Data stays in-house | Data stays in-house (if fine-tuning is self-hosted) | Data sent to vendor |
| Best when | Unique problem, large data, competitive edge | Good pre-trained base exists | Commodity task, fast time-to-market |
Decision Framework
Is this a commodity task (translation, OCR, sentiment)?
  YES → Use an API / buy
  NO  → Does a strong pre-trained model exist for your domain?
          YES → Fine-tune it
          NO  → Build from scratch
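The decision framework above can be written as a small function. This is only a sketch of the two questions in the order they are asked; the names `Approach` and `choose_approach` are illustrative and not part of any library.

```python
from enum import Enum


class Approach(Enum):
    BUY = "Use an API / buy"
    FINE_TUNE = "Fine-tune a pre-trained model"
    BUILD = "Build from scratch"


def choose_approach(is_commodity_task: bool, has_strong_pretrained_model: bool) -> Approach:
    """Apply the two questions of the decision framework, in order."""
    if is_commodity_task:
        # Translation, OCR, sentiment and similar tasks: buy first.
        return Approach.BUY
    if has_strong_pretrained_model:
        return Approach.FINE_TUNE
    return Approach.BUILD


# Example: a domain-specific task with a good base model available.
print(choose_approach(is_commodity_task=False, has_strong_pretrained_model=True).value)
```

In practice the factors in the table (data volume, privacy, time to deploy) would also feed into the answer; the function only captures the two branching questions.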
Inference Patterns: Online vs Batch vs Near-Real-Time
| Pattern | Latency | Throughput | Cost | Use Case |
|---|---|---|---|---|
| Online (Real-Time) | <100ms | Per-request | High (always-on) | Fraud detection, recommendations, chatbots |
| Batch | Hours | Millions at once | Low (scheduled) | Credit scoring, report generation, ETL |
| Near-Real-Time (Streaming) | Seconds to minutes | Continuous stream | Medium | Anomaly detection, IoT, live dashboards |
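To make the near-real-time row concrete, here is a minimal sketch of streaming anomaly detection using a rolling window and a z-score test. The class name, window size, and threshold are assumptions chosen for illustration; a production system would typically sit behind a stream processor rather than a Python loop.

```python
from collections import deque
from statistics import mean, pstdev


class RollingAnomalyDetector:
    """Flags a value that is far from the mean of the last `window` values."""

    def __init__(self, window: int = 5, threshold: float = 3.0):
        self.values = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value: float) -> bool:
        is_anomaly = False
        # Only score once the window is full, so early values are never flagged.
        if len(self.values) == self.values.maxlen:
            mu = mean(self.values)
            sigma = pstdev(self.values)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        self.values.append(value)
        return is_anomaly


detector = RollingAnomalyDetector(window=5, threshold=3.0)
stream = [10, 11, 10, 12, 11, 50, 11]
flags = [detector.observe(v) for v in stream]
print(flags)  # only the reading of 50 is flagged
```

Each event is scored as it arrives, which is what separates this pattern from batch scoring: latency is bounded by the window logic, not by a schedule.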
Online Inference Architecture
Client → Load Balancer → Model Server (GPU/CPU)
                              ↓
              Feature Store (online) → cached features
                              ↓
                    Prediction → Response
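The request path in this diagram can be sketched as a single handler that looks up cached features, calls the model, and checks the latency budget. Everything here is a stand-in: `OnlineFeatureStore` is a hypothetical in-memory dictionary, `toy_model` is a placeholder for a real model server call, and the 100 ms budget is taken from the table above.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class OnlineFeatureStore:
    """Hypothetical in-memory stand-in for an online feature store."""
    features: Dict[str, List[float]]

    def get(self, entity_id: str) -> List[float]:
        # Fall back to a zero vector when nothing is cached for this entity.
        return self.features.get(entity_id, [0.0, 0.0])


def toy_model(features: List[float]) -> float:
    """Placeholder model: the mean of the feature vector."""
    return sum(features) / len(features)


def handle_request(
    entity_id: str,
    store: OnlineFeatureStore,
    model: Callable[[List[float]], float],
    budget_ms: float = 100.0,
) -> dict:
    start = time.perf_counter()
    features = store.get(entity_id)   # Feature Store (online) → cached features
    score = model(features)           # Model Server
    latency_ms = (time.perf_counter() - start) * 1000
    return {
        "entity_id": entity_id,
        "score": score,
        "latency_ms": latency_ms,
        "within_budget": latency_ms < budget_ms,
    }


store = OnlineFeatureStore({"user-1": [0.2, 0.4]})
print(handle_request("user-1", store, toy_model))
```

Measuring latency inside the handler is a design choice worth keeping in a real service, since the per-request budget is the defining constraint of online inference.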
Batch Inference Architecture
Scheduler → Data Warehouse → Feature Pipeline → Model
                                                  ↓
                          Prediction Store → Downstream Systems
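A batch job following this flow might look like the sketch below: rows are read in chunks, turned into features, scored together, and written to a prediction store. The function names, the dictionary used as a prediction store, and the toy scoring rule are all assumptions for illustration; a scheduler such as cron or a workflow engine would trigger `run_batch_job`.

```python
from typing import Callable, Dict, Iterator, List


def chunked(rows: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of `rows` with at most `size` items each."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]


def run_batch_job(
    rows: List[dict],
    featurize: Callable[[dict], List[float]],
    model: Callable[[List[List[float]]], List[float]],
    batch_size: int = 2,
) -> Dict[str, float]:
    """Score every row and return a mapping of row id to prediction."""
    prediction_store: Dict[str, float] = {}
    for batch in chunked(rows, batch_size):
        features = [featurize(row) for row in batch]   # Feature Pipeline
        scores = model(features)                       # Model scores a whole batch
        for row, score in zip(batch, scores):
            prediction_store[row["id"]] = score        # Prediction Store
    return prediction_store


# Toy data standing in for a warehouse query result.
rows = [
    {"id": "a", "amount": 50},
    {"id": "b", "amount": 150},
    {"id": "c", "amount": 500},
]
predictions = run_batch_job(
    rows,
    featurize=lambda row: [row["amount"]],
    model=lambda feats: [1 if f[0] > 100 else 0 for f in feats],
)
print(predictions)
```

Scoring whole batches at once is what makes this pattern cheap: the model call is amortized over many rows and the job can run on scheduled, short-lived compute.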
Reference Architecture: Document Intelligence System
Let's design a complete system for classifying, extracting, and routing government documents.
                  ┌─────────────────────────────────────────────┐
                  │            Document Intelligence            │
                  ├─────────────────────────────────────────────┤
                  │                                             │
Document Upload   │  ┌──────────┐    ┌──────────────┐           │
──────────────►   │  │   OCR    │───►│  Text Clean  │           │
                  │  └──────────┘    └──────┬───────┘           │
                  │                         │                   │
                  │              ┌──────────▼──────────┐        │
                  │              │    Feature Store    │        │
                  │              │ (embeddings, meta)  │        │
                  │              └──────────┬──────────┘        │
                  │                   ┌─────┴─────┐             │
                  │                   │           │             │
                  │          ┌────────▼───┐  ┌────▼────────┐    │
                  │          │ Classifier │  │ NER/Entity  │    │
                  │          │ (type/dept)│  │ Extraction  │    │
                  │          └────────┬───┘  └────┬────────┘    │
                  │                   └─────┬─────┘             │
                  │              ┌──────────▼──────────┐        │
                  │              │   Decision Engine   │        │
                  │              │ (routing + priority)│        │
                  │              └──────────┬──────────┘        │
                  │                         │                   │
                  │              ┌──────────▼──────────┐        │
                  │              │    Monitoring &     │        │
                  │              │     Audit Trail     │        │
                  │              └─────────────────────┘        │
                  └─────────────────────────────────────────────┘
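The flow in the diagram can be sketched end to end with stub stages. This is a toy, not a reference implementation: OCR is assumed to have already produced text, the feature store is skipped, the classifier is a keyword match, entity extraction is a date regex, and the department names in `ROUTES` are invented for the example.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical routing table: document type → receiving office.
ROUTES = {
    "permit_application": "permits_office",
    "benefits_claim": "benefits_office",
}


@dataclass
class DocumentResult:
    doc_type: str
    entities: Dict[str, List[str]]
    route: str
    priority: str
    audit_trail: List[str] = field(default_factory=list)


def clean_text(raw: str) -> str:
    """Collapse runs of whitespace left behind by OCR."""
    return re.sub(r"\s+", " ", raw).strip()


def classify(text: str) -> str:
    """Keyword stand-in for a trained document classifier."""
    lowered = text.lower()
    if "permit" in lowered:
        return "permit_application"
    if "benefit" in lowered:
        return "benefits_claim"
    return "unknown"


def extract_entities(text: str) -> Dict[str, List[str]]:
    """Regex stand-in for an NER model: ISO-style dates only."""
    return {"dates": re.findall(r"\b\d{4}-\d{2}-\d{2}\b", text)}


def process_document(ocr_text: str) -> DocumentResult:
    audit: List[str] = []
    text = clean_text(ocr_text)
    audit.append("text_cleaned")
    doc_type = classify(text)
    audit.append(f"classified:{doc_type}")
    entities = extract_entities(text)
    audit.append(f"entities:{len(entities['dates'])} dates")
    # Decision engine: unknown documents go to a human and get high priority.
    route = ROUTES.get(doc_type, "manual_review")
    priority = "high" if doc_type == "unknown" else "normal"
    audit.append(f"routed:{route}")
    return DocumentResult(doc_type, entities, route, priority, audit)


result = process_document("  Permit   application\nfiled 2024-03-01 ")
print(result.route, result.entities, result.audit_trail)
```

Recording an audit entry at every stage mirrors the Monitoring & Audit Trail box: each decision the system makes about a document leaves a trace that can be reviewed later.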
Scaling layers in this architecture:
1. Data Layer: Scalable storage (S3/GCS), data versioning, partitioning
2. Training Layer: Distributed training, experiment tracking, model registry
3. Serving Layer: Auto-scaling model servers, load balancing, caching
4. Monitoring Layer: Drift detection, performance metrics, alerting
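As one example of what the monitoring layer's drift detection could compute, the sketch below implements the Population Stability Index (PSI) between a training-time sample and a live sample of a single feature. PSI is one common choice, not the only one; the bin edges are supplied by the caller, and the 0.25 alert threshold is a widely used rule of thumb rather than a standard.

```python
import math
from typing import List, Sequence


def bin_fractions(values: Sequence[float], edges: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Fraction of values per bin; `edges` are sorted upper bounds between bins."""
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * (len(edges) + 1)
    for v in values:
        # Index of the bin = number of edges the value exceeds.
        counts[sum(v > e for e in edges)] += 1
    # Floor at eps so empty bins do not cause log(0) or division by zero.
    return [max(c / len(values), eps) for c in counts]


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], edges: Sequence[float]
) -> float:
    """PSI = sum over bins of (actual - expected) * ln(actual / expected)."""
    e = bin_fractions(expected, edges)
    a = bin_fractions(actual, edges)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


training_sample = [0.2, 0.4, 0.6, 0.8]
live_sample = [0.6, 0.7, 0.8, 0.9]   # shifted upward
psi = population_stability_index(training_sample, live_sample, edges=[0.5])
if psi > 0.25:  # rule-of-thumb threshold, an assumption
    print(f"Drift alert: PSI={psi:.2f}")
```

A check like this would run per feature on a schedule, with alerts feeding the same channel as the performance metrics.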
AI Governance & Ethics for Government
Government AI systems have unique requirements beyond typical commercial applications. These are not optional — they are legal, ethical, and operational necessities.
Key Concerns
| Concern | Why It Matters | Example |
|---|---|---|
| Bias & Fairness | Government decisions affect citizens' lives | A benefits-eligibility model that disadvantages certain demographics |
| Explainability | Decisions must be justifiable and auditable | A citizen denied a permit deserves to know why |
| Privacy | Government holds sensitive personal data | PII in training data must be protected (Privacy Act of 1974, FISMA; FedRAMP for cloud services) |
| Security | Models can be attacked (adversarial inputs, data poisoning) | An adversarial document crafted to fool a classifier |
| Compliance | Federal regulations require specific safeguards | NIST AI RMF, OMB AI guidance, Section 508 |
| Accessibility | Services must be accessible to all citizens | Section 508 requires AI-powered tools to be usable by people with disabilities |
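For the Bias & Fairness row, a first-pass check is to compare approval rates across groups. The sketch below computes per-group rates and their ratio (often called the disparate impact ratio). The group labels and data are made up, and the 0.8 cutoff borrows the "four-fifths" rule of thumb from US employment guidelines; whether that threshold is appropriate for a given program is an assumption to confirm with counsel and policy staff.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def approval_rates(decisions: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Approval rate per group from (group, approved) pairs."""
    totals: Dict[str, int] = defaultdict(int)
    approved: Dict[str, int] = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += int(ok)
    return {group: approved[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates: Dict[str, float]) -> float:
    """Lowest group rate divided by the highest; 1.0 means equal rates."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody approved in any group, so rates are equal
    return min(rates.values()) / highest


# Synthetic decisions: group A approved 8 of 10, group B approved 4 of 10.
decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 4 + [("B", False)] * 6
)
rates = approval_rates(decisions)
ratio = disparate_impact_ratio(rates)
if ratio < 0.8:  # four-fifths rule of thumb, an assumption here
    print(f"Review needed: rates={rates}, ratio={ratio:.2f}")
```

A low ratio does not prove the model is unfair, and a high one does not prove it is fair; it is a trigger for the deeper review that explainability and audit requirements call for.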
The Government AI Checklist
Before deploying an AI system in a government context: