
TensorFlow & Keras Fundamentals

Learn the TensorFlow ecosystem: tensors, model-building APIs, training loops, and production patterns

~60 min

Now that you understand what neural networks compute under the hood, let's use TensorFlow — Google's production-grade framework — to build and train them efficiently.

TensorFlow handles:

  • Automatic differentiation (no manual backprop!)
  • GPU/TPU acceleration (seamless hardware scaling)
  • High-level APIs (Keras) for rapid prototyping
  • Low-level APIs (GradientTape) for full control

    TensorFlow Tensors

    Tensors are the fundamental data structure — like NumPy arrays, but with GPU support and automatic differentiation.

    python
    import tensorflow as tf
    import numpy as np

    # --- Creating Tensors ---

    # Constants: immutable tensors
    a = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
    print("Constant:\n", a)

    # Variables: mutable tensors (used for weights)
    w = tf.Variable(tf.random.normal([3, 2]), name="weights")
    print("\nVariable:\n", w)

    # From NumPy (the data is copied into a new tensor)
    np_array = np.array([1.0, 2.0, 3.0])
    tensor = tf.constant(np_array)
    print("\nFrom NumPy:", tensor)

    # Random tensors (common for initialization)
    uniform = tf.random.uniform([2, 3], minval=-1, maxval=1)
    normal = tf.random.normal([2, 3], mean=0.0, stddev=0.05)
    print("\nUniform:\n", uniform)
    print("\nNormal:\n", normal)

    # --- Basic Operations ---
    x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    y = tf.constant([[5.0, 6.0], [7.0, 8.0]])

    print("\nAdd:", x + y)
    print("Multiply:", x * y)           # Element-wise
    print("MatMul:", x @ y)             # Matrix multiplication
    print("Reduce sum:", tf.reduce_sum(x))
    print("Reduce mean:", tf.reduce_mean(x, axis=1))  # Per-row mean

    # --- GPU Check ---
    print("\nGPU available:", len(tf.config.list_physical_devices('GPU')) > 0)
    print("TF version:", tf.__version__)
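
    Because tensors integrate with automatic differentiation, TensorFlow can compute gradients of plain tensor computations, not just models. A minimal sketch (the function y = x² + 2x is illustrative):

    ```python
    import tensorflow as tf

    # Record operations on the variable x so gradients can be computed
    x = tf.Variable(3.0)
    with tf.GradientTape() as tape:
        y = x ** 2 + 2.0 * x

    # dy/dx = 2x + 2, which is 8 at x = 3
    dy_dx = tape.gradient(y, x)
    print(float(dy_dx))  # 8.0
    ```

    This is the same mechanism the training loops later in this lesson rely on.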

    Three Ways to Build Keras Models

    Keras (bundled with TensorFlow) provides three APIs for building models. Each trades simplicity for flexibility.

    1. Sequential API — Simplest, for Linear Stacks

    python
    from tensorflow import keras
    from tensorflow.keras import layers

    # Sequential: layers stacked one after another
    model_seq = keras.Sequential([
        layers.Dense(128, activation="relu", input_shape=(784,)),
        layers.Dense(64, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])

    model_seq.summary()
    # Total params: 109,386
    # This is perfect for simple feedforward networks.

    2. Functional API — For Complex Architectures

    Use this when you need multiple inputs/outputs, skip connections, or shared layers.

    python
    # Functional API: explicit input/output graph
    inputs = keras.Input(shape=(784,))
    x = layers.Dense(128, activation="relu")(inputs)
    x = layers.Dropout(0.3)(x)
    x = layers.Dense(64, activation="relu")(x)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(10, activation="softmax")(x)

    model_func = keras.Model(inputs=inputs, outputs=outputs, name="mnist_classifier")
    model_func.summary()

    # The Functional API can also handle multiple inputs:
    # input_a = keras.Input(shape=(32,), name="text_features")
    # input_b = keras.Input(shape=(128,), name="image_features")
    # combined = layers.Concatenate()([input_a, input_b])
    # ...
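
    As a hedged sketch, the commented multi-input idea above could be completed like this (the layer sizes, input names, and merge strategy are illustrative, not prescribed by the lesson):

    ```python
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    # Two named inputs, merged into a single classifier head
    input_a = keras.Input(shape=(32,), name="text_features")
    input_b = keras.Input(shape=(128,), name="image_features")
    combined = layers.Concatenate()([input_a, input_b])  # shape (None, 160)
    x = layers.Dense(64, activation="relu")(combined)
    outputs = layers.Dense(10, activation="softmax")(x)

    multi_model = keras.Model(inputs=[input_a, input_b], outputs=outputs)

    # Call with one tensor per Input layer
    preds = multi_model([tf.zeros([4, 32]), tf.zeros([4, 128])])
    print(preds.shape)  # (4, 10)
    ```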

    3. Model Subclassing — Maximum Flexibility

    Use this when you need dynamic behavior (e.g., different forward passes for training vs. inference).

    python
    class CustomModel(keras.Model):
        def __init__(self):
            super().__init__()
            self.dense1 = layers.Dense(128, activation="relu")
            self.dropout1 = layers.Dropout(0.3)
            self.dense2 = layers.Dense(64, activation="relu")
            self.dropout2 = layers.Dropout(0.3)
            self.classifier = layers.Dense(10, activation="softmax")

        def call(self, inputs, training=False):
            x = self.dense1(inputs)
            x = self.dropout1(x, training=training)
            x = self.dense2(x)
            x = self.dropout2(x, training=training)
            return self.classifier(x)


    model_custom = CustomModel()
    # Build the model by passing data through it
    model_custom(tf.zeros([1, 784]))
    model_custom.summary()
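
    The training flag is what makes subclassing useful for layers like Dropout, which behave differently at train and inference time. A minimal sketch (the class name and layer sizes are illustrative):

    ```python
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    class DropNet(keras.Model):
        def __init__(self):
            super().__init__()
            self.dense = layers.Dense(8, activation="relu")
            self.dropout = layers.Dropout(0.5)

        def call(self, inputs, training=False):
            # Dropout zeroes activations only when training=True
            return self.dropout(self.dense(inputs), training=training)

    net = DropNet()
    x = tf.ones([1, 4])

    # At inference, dropout is a no-op, so repeated calls agree exactly
    a = net(x, training=False)
    b = net(x, training=False)
    same = bool(tf.reduce_all(a == b))
    print(same)  # True
    ```

    With training=True, the same two calls would generally differ, because dropout masks are sampled randomly per call.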

    Complete MNIST Training Example

    Let's put everything together with a real training pipeline. This example includes all the best practices you'll use in production.

    python
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    # --- 1. Load and preprocess data ---
    (X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()

    # Normalize pixel values to [0, 1]
    X_train = X_train.reshape(-1, 784).astype("float32") / 255.0
    X_test = X_test.reshape(-1, 784).astype("float32") / 255.0

    # Reserve 10,000 samples for validation
    X_val, y_val = X_train[:10000], y_train[:10000]
    X_train, y_train = X_train[10000:], y_train[10000:]

    print(f"Train: {X_train.shape}, Val: {X_val.shape}, Test: {X_test.shape}")

    # --- 2. Build model with BatchNorm and Dropout ---
    model = keras.Sequential([
        layers.Dense(256, input_shape=(784,)),
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.Dropout(0.3),

        layers.Dense(128),
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.Dropout(0.3),

        layers.Dense(10, activation="softmax"),
    ])

    # --- 3. Compile with optimizer, loss, and metrics ---
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )

    # --- 4. Set up callbacks ---
    callbacks = [
        # Stop training when validation loss stops improving
        keras.callbacks.EarlyStopping(
            monitor="val_loss",
            patience=5,
            restore_best_weights=True,
        ),
        # Reduce learning rate when validation loss plateaus
        keras.callbacks.ReduceLROnPlateau(
            monitor="val_loss",
            factor=0.5,
            patience=3,
            min_lr=1e-6,
        ),
    ]

    # --- 5. Train ---
    history = model.fit(
        X_train, y_train,
        epochs=50,
        batch_size=128,
        validation_data=(X_val, y_val),
        callbacks=callbacks,
        verbose=1,
    )

    # --- 6. Evaluate ---
    test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
    print(f"\nTest accuracy: {test_acc:.4f}")
    print(f"Test loss:     {test_loss:.4f}")
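
    In production, a trained model is typically persisted for later serving. A minimal sketch using Keras's native .keras format on a small stand-in model (the filename and architecture here are illustrative; in practice you would save the trained classifier above):

    ```python
    import os
    import tempfile

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    # Small stand-in model for demonstration
    model = keras.Sequential([
        keras.Input(shape=(784,)),
        layers.Dense(16, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # Save architecture + weights + optimizer state, then reload
    path = os.path.join(tempfile.mkdtemp(), "mnist_model.keras")
    model.save(path)
    restored = keras.models.load_model(path)

    # The restored model produces identical predictions
    x = np.zeros((1, 784), dtype="float32")
    same = np.allclose(model.predict(x, verbose=0),
                       restored.predict(x, verbose=0))
    print(same)  # True
    ```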

    Custom Training with GradientTape

    model.fit() handles everything automatically. But sometimes you need full control — for custom losses, complex training schedules, GANs, or reinforcement learning. That's where tf.GradientTape comes in.

    python
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    # Build a simple model
    model = keras.Sequential([
        layers.Dense(128, activation="relu", input_shape=(784,)),
        layers.Dense(10),  # No softmax — we'll use from_logits=True
    ])

    # Setup
    optimizer = keras.optimizers.Adam(learning_rate=1e-3)
    loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    train_acc_metric = keras.metrics.SparseCategoricalAccuracy()

    # Load data
    (X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()
    X_train = X_train.reshape(-1, 784).astype("float32") / 255.0

    # Create a tf.data pipeline for efficient batching
    train_dataset = (
        tf.data.Dataset.from_tensor_slices((X_train, y_train))
        .shuffle(buffer_size=10000)
        .batch(128)
        .prefetch(tf.data.AUTOTUNE)  # Overlap data loading with training
    )

    # --- Custom training loop ---
    @tf.function  # Compile to graph for speed
    def train_step(x_batch, y_batch):
        with tf.GradientTape() as tape:
            logits = model(x_batch, training=True)
            loss_value = loss_fn(y_batch, logits)

        # Compute gradients
        gradients = tape.gradient(loss_value, model.trainable_variables)
        # Update weights
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))

        train_acc_metric.update_state(y_batch, logits)
        return loss_value

    # Training
    for epoch in range(5):
        train_acc_metric.reset_state()

        for step, (x_batch, y_batch) in enumerate(train_dataset):
            loss = train_step(x_batch, y_batch)

        acc = train_acc_metric.result()
        print(f"Epoch {epoch+1}: loss={float(loss):.4f}, accuracy={float(acc):.4f}")

    When to Use Custom Training Loops

    Use model.fit() for 90% of cases: it's simpler, and it handles callbacks, logging, and distribution strategies automatically. Use GradientTape when you need:

    • Custom loss functions that depend on intermediate layers
    • Multiple models that interact (GANs, actor-critic RL)
    • Gradient accumulation for very large batches
    • Non-standard training procedures (meta-learning, etc.)

    Start with model.fit() and switch to GradientTape only when you hit its limits.
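
    As one concrete instance of these cases, here is a hedged sketch of gradient accumulation with GradientTape: gradients from several micro-batches are summed, averaged, and applied in a single optimizer step, simulating a larger effective batch. The model, data, and step count are illustrative:

    ```python
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    # Tiny illustrative model and data
    model = keras.Sequential([keras.Input(shape=(8,)), layers.Dense(4), layers.Dense(1)])
    optimizer = keras.optimizers.SGD(learning_rate=0.01)
    loss_fn = keras.losses.MeanSquaredError()

    ACCUM_STEPS = 4  # number of micro-batches per optimizer step
    x = tf.random.normal([8, 8])
    y = tf.random.normal([8, 1])

    w_before = [v.numpy().copy() for v in model.trainable_variables]
    # One zero-initialized accumulator per trainable variable
    accum = [tf.zeros_like(v) for v in model.trainable_variables]

    for _ in range(ACCUM_STEPS):
        with tf.GradientTape() as tape:
            loss = loss_fn(y, model(x, training=True))
        grads = tape.gradient(loss, model.trainable_variables)
        accum = [a + g for a, g in zip(accum, grads)]

    # Average the accumulated gradients, then apply a single update
    accum = [a / ACCUM_STEPS for a in accum]
    optimizer.apply_gradients(zip(accum, model.trainable_variables))

    changed = any((v.numpy() != w).any()
                  for v, w in zip(model.trainable_variables, w_before))
    print(changed)  # weights were updated exactly once
    ```

    In a real loop each micro-batch would come from the dataset iterator rather than reusing the same (x, y).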

    tf.data: Efficient Data Pipelines

    The tf.data API lets you build input pipelines that load and preprocess data efficiently, overlapping data preparation with training.

    python
    import tensorflow as tf

    # Basic pipeline pattern
    # (features, labels, and preprocess_fn stand in for your own data)
    dataset = (
        tf.data.Dataset.from_tensor_slices((features, labels))
        .shuffle(buffer_size=10000)      # Randomize order
        .batch(64)                       # Group into batches
        .map(preprocess_fn,              # Apply transformations (to whole
             num_parallel_calls=tf.data.AUTOTUNE)  # batches, since map follows batch)
        .prefetch(tf.data.AUTOTUNE)      # Overlap with training
    )

    # AUTOTUNE lets TensorFlow automatically determine the optimal
    # number of parallel operations based on your hardware.

    # For large datasets that don't fit in memory:
    # dataset = tf.data.TFRecordDataset("data.tfrecord")
    # dataset = tf.data.experimental.CsvDataset("data.csv", ...)
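
    A tiny runnable instance of the pattern above, with in-memory stand-ins for features, labels, and preprocess_fn (shuffle is omitted to keep the output deterministic):

    ```python
    import numpy as np
    import tensorflow as tf

    # Stand-in data: 6 examples with 2 features each
    features = np.arange(12, dtype="float32").reshape(6, 2)
    labels = np.array([0, 1, 0, 1, 0, 1])

    def preprocess_fn(x, y):
        # Because map follows batch, x is a whole batch here
        return x / 10.0, y

    dataset = (
        tf.data.Dataset.from_tensor_slices((features, labels))
        .batch(3)
        .map(preprocess_fn, num_parallel_calls=tf.data.AUTOTUNE)
        .prefetch(tf.data.AUTOTUNE)
    )

    for x_batch, y_batch in dataset:
        print(x_batch.shape, y_batch.shape)  # (3, 2) (3,)
    ```

    Moving map before batch would instead apply preprocess_fn per example; mapping after batch lets TensorFlow vectorize the transformation.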