
TensorFlow & Keras Fundamentals

Learn the TensorFlow ecosystem: tensors, model-building APIs, training loops, and production patterns

~60 min

Now that you understand what neural networks compute under the hood, let's use TensorFlow — Google's production-grade framework — to build and train them efficiently.

TensorFlow handles:

  • Automatic differentiation (no manual backprop!)
  • GPU/TPU acceleration (seamless hardware scaling)
  • High-level APIs (Keras) for rapid prototyping
  • Low-level APIs (GradientTape) for full control

    TensorFlow Tensors

    Tensors are the fundamental data structure — like NumPy arrays, but with GPU support and automatic differentiation.

    python
    import tensorflow as tf
    import numpy as np

    # --- Creating Tensors ---

    # Constants: immutable tensors
    a = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
    print("Constant:\n", a)

    # Variables: mutable tensors (used for weights)
    w = tf.Variable(tf.random.normal([3, 2]), name="weights")
    print("\nVariable:\n", w)

    # From NumPy (the data is copied into a new tensor)
    np_array = np.array([1.0, 2.0, 3.0])
    tensor = tf.constant(np_array)
    print("\nFrom NumPy:", tensor)

    # Random tensors (common for initialization)
    uniform = tf.random.uniform([2, 3], minval=-1, maxval=1)
    normal = tf.random.normal([2, 3], mean=0.0, stddev=0.05)
    print("\nUniform:\n", uniform)
    print("\nNormal:\n", normal)

    # --- Basic Operations ---
    x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    y = tf.constant([[5.0, 6.0], [7.0, 8.0]])

    print("\nAdd:", x + y)
    print("Multiply:", x * y)           # Element-wise
    print("MatMul:", x @ y)             # Matrix multiplication
    print("Reduce sum:", tf.reduce_sum(x))
    print("Reduce mean:", tf.reduce_mean(x, axis=1))  # Per-row mean

    # --- GPU Check ---
    print("\nGPU available:", len(tf.config.list_physical_devices('GPU')) > 0)
    print("TF version:", tf.__version__)
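
    Because tensors integrate with automatic differentiation, TensorFlow can compute gradients of plain tensor computations, not just models. A minimal sketch (the function y = x² + 2x is illustrative):

    ```python
    import tensorflow as tf

    # Record operations on the variable x so gradients can be computed
    x = tf.Variable(3.0)
    with tf.GradientTape() as tape:
        y = x ** 2 + 2.0 * x

    # dy/dx = 2x + 2, which is 8 at x = 3
    dy_dx = tape.gradient(y, x)
    print(float(dy_dx))  # 8.0
    ```

    This is the same mechanism the training loops later in this lesson rely on.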

    Three Ways to Build Keras Models

    Keras (bundled with TensorFlow) provides three APIs for building models. Each trades simplicity for flexibility.

    1. Sequential API — Simplest, for Linear Stacks

    python
    from tensorflow import keras
    from tensorflow.keras import layers

    # Sequential: layers stacked one after another
    model_seq = keras.Sequential([
        layers.Dense(128, activation="relu", input_shape=(784,)),
        layers.Dense(64, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])

    model_seq.summary()
    # Total params: 109,386
    # This is perfect for simple feedforward networks.

    2. Functional API — For Complex Architectures

    Use this when you need multiple inputs/outputs, skip connections, or shared layers.

    python
    # Functional API: explicit input/output graph
    inputs = keras.Input(shape=(784,))
    x = layers.Dense(128, activation="relu")(inputs)
    x = layers.Dropout(0.3)(x)
    x = layers.Dense(64, activation="relu")(x)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(10, activation="softmax")(x)

    model_func = keras.Model(inputs=inputs, outputs=outputs, name="mnist_classifier")
    model_func.summary()

    # The Functional API can also handle multiple inputs:
    # input_a = keras.Input(shape=(32,), name="text_features")
    # input_b = keras.Input(shape=(128,), name="image_features")
    # combined = layers.Concatenate()([input_a, input_b])
    # ...
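
    As a hedged sketch, the commented multi-input idea above could be completed like this (the layer sizes, input names, and merge strategy are illustrative, not prescribed by the lesson):

    ```python
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    # Two named inputs, merged into a single classifier head
    input_a = keras.Input(shape=(32,), name="text_features")
    input_b = keras.Input(shape=(128,), name="image_features")
    combined = layers.Concatenate()([input_a, input_b])  # shape (None, 160)
    x = layers.Dense(64, activation="relu")(combined)
    outputs = layers.Dense(10, activation="softmax")(x)

    multi_model = keras.Model(inputs=[input_a, input_b], outputs=outputs)

    # Call with one tensor per Input layer
    preds = multi_model([tf.zeros([4, 32]), tf.zeros([4, 128])])
    print(preds.shape)  # (4, 10)
    ```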

    3. Model Subclassing — Maximum Flexibility

    Use this when you need dynamic behavior (e.g., different forward passes for training vs. inference).

    python
    class CustomModel(keras.Model):
        def __init__(self):
            super().__init__()
            self.dense1 = layers.Dense(128, activation="relu")
            self.dropout1 = layers.Dropout(0.3)
            self.dense2 = layers.Dense(64, activation="relu")
            self.dropout2 = layers.Dropout(0.3)
            self.classifier = layers.Dense(10, activation="softmax")

        def call(self, inputs, training=False):
            x = self.dense1(inputs)
            x = self.dropout1(x, training=training)
            x = self.dense2(x)
            x = self.dropout2(x, training=training)
            return self.classifier(x)


    model_custom = CustomModel()
    # Build the model by passing data through it
    model_custom(tf.zeros([1, 784]))
    model_custom.summary()
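
    The training flag is what makes subclassing useful for layers like Dropout, which behave differently at train and inference time. A minimal sketch (the class name and layer sizes are illustrative):

    ```python
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    class DropNet(keras.Model):
        def __init__(self):
            super().__init__()
            self.dense = layers.Dense(8, activation="relu")
            self.dropout = layers.Dropout(0.5)

        def call(self, inputs, training=False):
            # Dropout zeroes activations only when training=True
            return self.dropout(self.dense(inputs), training=training)

    net = DropNet()
    x = tf.ones([1, 4])

    # At inference, dropout is a no-op, so repeated calls agree exactly
    a = net(x, training=False)
    b = net(x, training=False)
    same = bool(tf.reduce_all(a == b))
    print(same)  # True
    ```

    With training=True, the same two calls would generally differ, because dropout masks are sampled randomly per call.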

    Complete MNIST Training Example

    Let's put everything together with a real training pipeline. This example includes all the best practices you'll use in production.

    python
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    # --- 1. Load and preprocess data ---
    (X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()

    # Normalize pixel values to [0, 1]
    X_train = X_train.reshape(-1, 784).astype("float32") / 255.0
    X_test = X_test.reshape(-1, 784).astype("float32") / 255.0

    # Reserve 10,000 samples for validation
    X_val, y_val = X_train[:10000], y_train[:10000]
    X_train, y_train = X_train[10000:], y_train[10000:]

    print(f"Train: {X_train.shape}, Val: {X_val.shape}, Test: {X_test.shape}")

    # --- 2. Build model with BatchNorm and Dropout ---
    model = keras.Sequential([
        layers.Dense(256, input_shape=(784,)),
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.Dropout(0.3),

        layers.Dense(128),
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.Dropout(0.3),

        layers.Dense(10, activation="softmax"),
    ])

    # --- 3. Compile with optimizer, loss, and metrics ---
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )

    # --- 4. Set up callbacks ---
    callbacks = [
        # Stop training when validation loss stops improving
        keras.callbacks.EarlyStopping(
            monitor="val_loss",
            patience=5,
            restore_best_weights=True,
        ),
        # Reduce learning rate when validation loss plateaus
        keras.callbacks.ReduceLROnPlateau(
            monitor="val_loss",
            factor=0.5,
            patience=3,
            min_lr=1e-6,
        ),
    ]

    # --- 5. Train ---
    history = model.fit(
        X_train, y_train,
        epochs=50,
        batch_size=128,
        validation_data=(X_val, y_val),
        callbacks=callbacks,
        verbose=1,
    )

    # --- 6. Evaluate ---
    test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
    print(f"\nTest accuracy: {test_acc:.4f}")
    print(f"Test loss:     {test_loss:.4f}")
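
    In production, a trained model is typically persisted for later serving. A minimal sketch using Keras's native .keras format on a small stand-in model (the filename and architecture here are illustrative; in practice you would save the trained classifier above):

    ```python
    import os
    import tempfile

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    # Small stand-in model for demonstration
    model = keras.Sequential([
        keras.Input(shape=(784,)),
        layers.Dense(16, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # Save architecture + weights + optimizer state, then reload
    path = os.path.join(tempfile.mkdtemp(), "mnist_model.keras")
    model.save(path)
    restored = keras.models.load_model(path)

    # The restored model produces identical predictions
    x = np.zeros((1, 784), dtype="float32")
    same = np.allclose(model.predict(x, verbose=0),
                       restored.predict(x, verbose=0))
    print(same)  # True
    ```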

    Custom Training with GradientTape

    model.fit() handles everything automatically. But sometimes you need full control — for custom losses, complex training schedules, GANs, or reinforcement learning. That's where tf.GradientTape comes in.

    python
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    # Build a simple model
    model = keras.Sequential([
        layers.Dense(128, activation="relu", input_shape=(784,)),
        layers.Dense(10),  # No softmax — we'll use from_logits=True
    ])

    # Setup
    optimizer = keras.optimizers.Adam(learning_rate=1e-3)
    loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    train_acc_metric = keras.metrics.SparseCategoricalAccuracy()

    # Load data
    (X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()
    X_train = X_train.reshape(-1, 784).astype("float32") / 255.0

    # Create a tf.data pipeline for efficient batching
    train_dataset = (
        tf.data.Dataset.from_tensor_slices((X_train, y_train))
        .shuffle(buffer_size=10000)
        .batch(128)
        .prefetch(tf.data.AUTOTUNE)  # Overlap data loading with training
    )

    # --- Custom training loop ---
    @tf.function  # Compile to graph for speed
    def train_step(x_batch, y_batch):
        with tf.GradientTape() as tape:
            logits = model(x_batch, training=True)
            loss_value = loss_fn(y_batch, logits)

        # Compute gradients
        gradients = tape.gradient(loss_value, model.trainable_variables)
        # Update weights
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))

        train_acc_metric.update_state(y_batch, logits)
        return loss_value

    # Training
    for epoch in range(5):
        train_acc_metric.reset_state()

        for step, (x_batch, y_batch) in enumerate(train_dataset):
            loss = train_step(x_batch, y_batch)

        acc = train_acc_metric.result()
        print(f"Epoch {epoch+1}: loss={float(loss):.4f}, accuracy={float(acc):.4f}")

    When to Use Custom Training Loops

    Use model.fit() for 90% of cases: it's simpler, and it handles callbacks, logging, and distribution strategies automatically. Use GradientTape when you need:

    • Custom loss functions that depend on intermediate layers
    • Multiple models that interact (GANs, actor-critic RL)
    • Gradient accumulation for very large batches
    • Non-standard training procedures (meta-learning, etc.)

    Start with model.fit() and switch to GradientTape only when you hit its limits.
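
    As one concrete instance of these cases, here is a hedged sketch of gradient accumulation with GradientTape: gradients from several micro-batches are summed, averaged, and applied in a single optimizer step, simulating a larger effective batch. The model, data, and step count are illustrative:

    ```python
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    # Tiny illustrative model and data
    model = keras.Sequential([keras.Input(shape=(8,)), layers.Dense(4), layers.Dense(1)])
    optimizer = keras.optimizers.SGD(learning_rate=0.01)
    loss_fn = keras.losses.MeanSquaredError()

    ACCUM_STEPS = 4  # number of micro-batches per optimizer step
    x = tf.random.normal([8, 8])
    y = tf.random.normal([8, 1])

    w_before = [v.numpy().copy() for v in model.trainable_variables]
    # One zero-initialized accumulator per trainable variable
    accum = [tf.zeros_like(v) for v in model.trainable_variables]

    for _ in range(ACCUM_STEPS):
        with tf.GradientTape() as tape:
            loss = loss_fn(y, model(x, training=True))
        grads = tape.gradient(loss, model.trainable_variables)
        accum = [a + g for a, g in zip(accum, grads)]

    # Average the accumulated gradients, then apply a single update
    accum = [a / ACCUM_STEPS for a in accum]
    optimizer.apply_gradients(zip(accum, model.trainable_variables))

    changed = any((v.numpy() != w).any()
                  for v, w in zip(model.trainable_variables, w_before))
    print(changed)  # weights were updated exactly once
    ```

    In a real loop each micro-batch would come from the dataset iterator rather than reusing the same (x, y).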

    tf.data: Efficient Data Pipelines

    The tf.data API lets you build input pipelines that load and preprocess data efficiently, overlapping data preparation with training.

    python
    import tensorflow as tf

    # Basic pipeline pattern
    # (features, labels, and preprocess_fn stand in for your own data)
    dataset = (
        tf.data.Dataset.from_tensor_slices((features, labels))
        .shuffle(buffer_size=10000)      # Randomize order
        .batch(64)                       # Group into batches
        .map(preprocess_fn,              # Apply transformations (to whole
             num_parallel_calls=tf.data.AUTOTUNE)  # batches, since map follows batch)
        .prefetch(tf.data.AUTOTUNE)      # Overlap with training
    )

    # AUTOTUNE lets TensorFlow automatically determine the optimal
    # number of parallel operations based on your hardware.

    # For large datasets that don't fit in memory:
    # dataset = tf.data.TFRecordDataset("data.tfrecord")
    # dataset = tf.data.experimental.CsvDataset("data.csv", ...)
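
    A tiny runnable instance of the pattern above, with in-memory stand-ins for features, labels, and preprocess_fn (shuffle is omitted to keep the output deterministic):

    ```python
    import numpy as np
    import tensorflow as tf

    # Stand-in data: 6 examples with 2 features each
    features = np.arange(12, dtype="float32").reshape(6, 2)
    labels = np.array([0, 1, 0, 1, 0, 1])

    def preprocess_fn(x, y):
        # Because map follows batch, x is a whole batch here
        return x / 10.0, y

    dataset = (
        tf.data.Dataset.from_tensor_slices((features, labels))
        .batch(3)
        .map(preprocess_fn, num_parallel_calls=tf.data.AUTOTUNE)
        .prefetch(tf.data.AUTOTUNE)
    )

    for x_batch, y_batch in dataset:
        print(x_batch.shape, y_batch.shape)  # (3, 2) (3,)
    ```

    Moving map before batch would instead apply preprocess_fn per example; mapping after batch lets TensorFlow vectorize the transformation.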