NumPy: The Foundation of ML in Python
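NumPy (Numerical Python) underpins much of the ML ecosystem in Python. scikit-learn is built directly on NumPy arrays, while TensorFlow and PyTorch use their own tensor types that closely mirror NumPy's array semantics and convert to and from NumPy arrays. If you want to do ML in Python, you need to be fluent in NumPy.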
Why NumPy?
Creating Arrays
NumPy arrays (ndarray) are the fundamental data structure. Here are the most common ways to create them:
import numpy as np

# From Python lists
a = np.array([1, 2, 3, 4, 5])
print(a)        # [1 2 3 4 5]
print(a.dtype)  # int64 on most platforms (the default integer type can differ)
print(a.shape)  # (5,)

# 2D array (matrix)
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
print(matrix.shape)  # (2, 3): 2 rows, 3 columns

# Common creation functions
zeros = np.zeros((3, 4))         # 3x4 matrix of zeros
ones = np.ones((2, 5))           # 2x5 matrix of ones
full = np.full((3, 3), 7)        # 3x3 matrix filled with 7
eye = np.eye(4)                  # 4x4 identity matrix
rand = np.random.randn(3, 4)     # 3x4 matrix of standard normal samples
arange = np.arange(0, 10, 2)     # [0, 2, 4, 6, 8]
linspace = np.linspace(0, 1, 5)  # [0.0, 0.25, 0.5, 0.75, 1.0]

Reshaping Arrays
Reshaping is one of the most critical skills in ML. You'll constantly reshape data to match what models expect.
import numpy as np

a = np.arange(12)
print(a)        # [ 0  1  2  3  4  5  6  7  8  9 10 11]
print(a.shape)  # (12,)

# Reshape to 3 rows x 4 columns
b = a.reshape(3, 4)
print(b)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# Using -1 lets NumPy infer the dimension
c = a.reshape(2, -1)  # 2 rows, NumPy infers 6 columns
print(c.shape)        # (2, 6)

d = a.reshape(-1, 3)  # 3 columns, NumPy infers 4 rows
print(d.shape)        # (4, 3)

# Flatten back to 1D
flat = b.flatten()   # Always returns a copy
raveled = b.ravel()  # Returns a view when possible (avoids copying memory)

# Add a dimension (critical for ML)
x = np.array([1, 2, 3])     # shape: (3,)
row_vec = x[np.newaxis, :]  # shape: (1, 3), a row vector
col_vec = x[:, np.newaxis]  # shape: (3, 1), a column vector
# Equivalent: x.reshape(1, -1) and x.reshape(-1, 1)

Image Tensors
In computer vision, images are represented as NumPy arrays. Understanding their shape is essential.
import numpy as np

# A single RGB image: (height, width, channels)
image = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)
print(image.shape)  # (224, 224, 3)
print(image.dtype)  # uint8 (values 0 to 255)

# A batch of images: (batch_size, height, width, channels)
batch = np.random.randint(0, 256, size=(32, 224, 224, 3), dtype=np.uint8)
print(batch.shape)  # (32, 224, 224, 3)

# Access the 5th image in the batch
fifth_image = batch[4]  # shape: (224, 224, 3)

# Get the red channel of the first image
red_channel = batch[0, :, :, 0]  # shape: (224, 224)

# Normalize pixel values to [0, 1] for neural networks
normalized = batch.astype(np.float32) / 255.0
print(normalized.dtype)  # float32
print(normalized.max())  # 1.0

Indexing and Slicing
NumPy provides powerful ways to access and modify array elements.
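Indexing also works on the left-hand side of an assignment, which is how elements get modified. One detail worth knowing up front: basic slices are views that share memory with the original array, while boolean and integer-array indexing return copies. A minimal sketch with made-up values (the array `grid` and its contents are illustrative assumptions, not taken from the examples in this section):

```python
import numpy as np

grid = np.zeros((3, 4), dtype=int)

# Assign through a slice: sets the whole first row in place
grid[0, :] = 1

# A basic slice is a view, so writing to it changes `grid` too
view = grid[1:, :2]
view[:] = 5
print(np.shares_memory(view, grid))    # True

# Boolean indexing on the right-hand side returns a copy
picked = grid[grid > 0]
picked[:] = -1                         # does not touch `grid`
print(np.shares_memory(picked, grid))  # False

# Boolean indexing on the left-hand side modifies the array in place
grid[grid == 5] = 9
print(grid)
# [[1 1 1 1]
#  [9 9 0 0]
#  [9 9 0 0]]
```

If an independent array is needed from a slice, calling `.copy()` on it avoids accidental writes to the original.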
import numpy as np

a = np.array([[10, 20, 30, 40],
              [50, 60, 70, 80],
              [90, 100, 110, 120]])

# Basic indexing (row, column)
print(a[0, 1])   # 20: first row, second column
print(a[2, -1])  # 120: last row, last column

# Slicing: a[row_start:row_end, col_start:col_end]
print(a[0:2, 1:3])
# [[20 30]
#  [60 70]]

# All rows, specific columns
print(a[:, 0])   # [10 50 90], the first column
print(a[:, -1])  # [ 40  80 120], the last column

# Boolean indexing (filtering)
mask = a > 50
print(mask)
# [[False False False False]
#  [False  True  True  True]
#  [ True  True  True  True]]
print(a[mask])  # [ 60  70  80  90 100 110 120]

# Fancy indexing (index with arrays)
rows = np.array([0, 2])
cols = np.array([1, 3])
print(a[rows, cols])  # [ 20 120], the elements at (0, 1) and (2, 3)

# Boolean indexing in a single expression, without a named mask
scores = np.array([85, 42, 91, 67, 55, 99])
passing = scores[scores >= 60]
print(passing)  # [85 91 67 99]

Broadcasting
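Broadcasting is how NumPy combines arrays of different shapes without explicit loops or manual tiling. The rule, as described in the NumPy documentation, is that shapes are compared from the trailing dimension backward, and two dimensions are compatible when they are equal or one of them is 1. A short sketch of that rule using `np.broadcast_shapes` (available in NumPy 1.20 and later); the specific shapes and the `per_row` values are arbitrary examples:

```python
import numpy as np

# Shapes are aligned on the right; missing leading dimensions count as 1
print(np.broadcast_shapes((100, 5), (5,)))  # (100, 5)
print(np.broadcast_shapes((3, 1), (1, 3)))  # (3, 3)
print(np.broadcast_shapes((2, 3), ()))      # (2, 3), a scalar fits any shape

# Incompatible: trailing dimensions are 3 and 2, and neither is 1
try:
    np.ones((2, 3)) + np.ones((2,))
except ValueError as err:
    print("cannot broadcast:", err)

# Fix by making the intended axis explicit: (2, 3) with (2, 1)
per_row = np.array([10.0, 20.0])
shifted = np.ones((2, 3)) + per_row[:, np.newaxis]
print(shifted)
# [[11. 11. 11.]
#  [21. 21. 21.]]
```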
import numpy as np

# Scalar broadcast: operates on every element
a = np.array([[1, 2, 3],
              [4, 5, 6]])
print(a * 10)
# [[10 20 30]
#  [40 50 60]]

# Column-vector broadcast: a (2, 1) array is stretched across the columns
row_means = a.mean(axis=1, keepdims=True)  # shape (2, 1)
centered = a - row_means  # subtracts each row's mean from that row

# Common ML pattern: normalize features (columns)
data = np.random.randn(100, 5)    # 100 samples, 5 features
mean = data.mean(axis=0)          # shape (5,), the mean of each feature
std = data.std(axis=0)            # shape (5,), the std of each feature
normalized = (data - mean) / std  # broadcasting keeps the shape (100, 5)

# Outer product via broadcasting
x = np.array([1, 2, 3])[:, np.newaxis]     # shape (3, 1)
y = np.array([10, 20, 30])[np.newaxis, :]  # shape (1, 3)
outer = x * y                              # shape (3, 3)
print(outer)
# [[10 20 30]
#  [20 40 60]
#  [30 60 90]]

Vectorization: Why NumPy is Fast
The #1 rule of NumPy: avoid Python loops. Use vectorized operations instead. The difference is dramatic.
import numpy as np
import time

size = 1_000_000
a = np.random.randn(size)
b = np.random.randn(size)

# --- SLOW: Python loop ---
start = time.perf_counter()
result_loop = []
for i in range(size):
    result_loop.append(a[i] + b[i])
loop_time = time.perf_counter() - start
print(f"Python loop: {loop_time:.4f} seconds")

# --- FAST: Vectorized NumPy ---
start = time.perf_counter()
result_vec = a + b
vec_time = time.perf_counter() - start
print(f"NumPy vectorized: {vec_time:.6f} seconds")

print(f"Speedup: {loop_time / vec_time:.0f}x faster!")
# Example output (exact timings vary by machine):
# Python loop: 0.2500 seconds
# NumPy vectorized: 0.001200 seconds
# Speedup: 208x faster!

Vectorization Mindset
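In practice, vectorizing usually means asking what operation applies to the whole array at once, instead of what happens to one element. A minimal sketch of that translation, using a hypothetical task (the distance from each sample to a reference point) and made-up data; the function names `distances_loop` and `distances_vectorized` are illustrative and not part of NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(1000, 3))  # 1000 samples, 3 features (made-up data)
center = np.array([0.5, -1.0, 2.0])  # arbitrary reference point

def distances_loop(points, center):
    """Element-at-a-time version: one Python iteration per sample."""
    out = []
    for p in points:
        total = 0.0
        for value, c in zip(p, center):
            total += (value - c) ** 2
        out.append(total ** 0.5)
    return np.array(out)

def distances_vectorized(points, center):
    """Whole-array version: broadcast the subtraction, reduce over features."""
    diff = points - center                   # (1000, 3) - (3,) -> (1000, 3)
    return np.sqrt((diff ** 2).sum(axis=1))  # shape (1000,)

slow = distances_loop(points, center)
fast = distances_vectorized(points, center)
print(np.allclose(slow, fast))  # True
print(fast.shape)               # (1000,)
```

Both versions compute the same values; the second moves both loops into NumPy's compiled code.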
Essential Operations for ML
Here are the NumPy operations you'll reach for constantly in ML work:
import numpy as np

data = np.random.randn(5, 3)

# Aggregation along axes
print(data.sum(axis=0))   # sum of each column, shape (3,)
print(data.sum(axis=1))   # sum of each row, shape (5,)
print(data.mean(axis=0))  # mean of each feature
print(data.std(axis=0))   # std of each feature

# Matrix operations
A = np.random.randn(3, 4)
B = np.random.randn(4, 2)
C = A @ B  # matrix multiply, shape (3, 2)
# Equivalent for 2D arrays: np.dot(A, B) or np.matmul(A, B)

# Transpose
print(A.T.shape)  # (4, 3)

# Stacking arrays
x1 = np.array([1, 2, 3])
x2 = np.array([4, 5, 6])
vertical = np.vstack([x1, x2])    # shape (2, 3)
horizontal = np.hstack([x1, x2])  # shape (6,)

# Argmax / Argmin (critical for classification)
predictions = np.array([0.1, 0.7, 0.2])
predicted_class = np.argmax(predictions)  # 1
print(predicted_class)

# Where (conditional selection)
scores = np.array([85, 42, 91, 67])
result = np.where(scores >= 60, "pass", "fail")
print(result)  # ['pass' 'fail' 'pass' 'pass']
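These operations are typically combined. As a rough sketch, the following runs a single linear layer over a small batch and scores the predictions. The inputs, weights, and labels are random placeholders (assumptions for illustration only), so the accuracy value itself is meaningless; the point is the pattern of matrix multiply, per-row `argmax`, and `np.where`:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(8, 4))          # batch of 8 samples, 4 features (placeholder data)
W = rng.normal(size=(4, 3))          # weights for 3 classes (random, untrained)
b = np.zeros(3)                      # bias, broadcast across the batch
labels = rng.integers(0, 3, size=8)  # placeholder ground-truth classes

logits = X @ W + b                   # shape (8, 3)
predicted = logits.argmax(axis=1)    # shape (8,), one class index per sample

correct = predicted == labels        # boolean array, shape (8,)
accuracy = correct.mean()            # fraction of True values
verdict = np.where(correct, "hit", "miss")

print(logits.shape)                  # (8, 3)
print(predicted.shape)               # (8,)
print(f"accuracy: {accuracy:.2f}")
print(verdict)
```

Note the `axis=1` in `argmax`: with a batch, the class scores run along the second axis, so the reduction has to happen per row.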