
Bias & Fairness

Types of bias (selection, measurement, algorithmic), fairness metrics (demographic parity, equalized odds, calibration), bias detection (AIF360, Fairlearn), and mitigation strategies (pre/in/post-processing)

~45 min

Bias & Fairness in Machine Learning

Machine learning models can perpetuate and amplify societal biases present in training data. A hiring model trained on historical decisions might discriminate by gender; a recidivism prediction model might have different error rates by race. Understanding, detecting, and mitigating these biases is both an ethical imperative and, increasingly, a legal requirement.

This lesson covers the types of bias, mathematical fairness definitions, detection tools, and mitigation strategies.

The Impossibility Theorem

When base rates differ between groups, no imperfect classifier can simultaneously satisfy all three major fairness criteria (demographic parity, equalized odds, and calibration). Fairness therefore requires making explicit value judgments about which criterion matters most in each application context. There is no purely technical solution to fairness.
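A toy illustration of the tension (all numbers invented): a score that simply outputs each group's base rate is perfectly calibrated by construction, yet any single decision threshold then produces wildly different selection rates across groups, violating demographic parity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
base_rate = {"A": 0.6, "B": 0.3}  # invented base rates for two groups

for group, p in base_rate.items():
    y = rng.binomial(1, p, n)          # true outcomes
    score = np.full(n, p)              # perfectly calibrated: score == P(Y=1)
    pred = (score >= 0.5).astype(int)  # one shared decision threshold
    print(f"Group {group}: P(Y=1 | score={p}) = {y.mean():.2f}, "
          f"selection rate = {pred.mean():.2f}")
```

Group A is selected at rate 1.0 and group B at rate 0.0, even though the score is calibrated in both groups: satisfying one criterion forces violating the other.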

Types of Bias

Data Bias

| Type | Description | Example |
|---|---|---|
| Selection bias | Training data is not representative | Medical data mostly from one demographic |
| Measurement bias | Features are measured differently across groups | Wealthier areas have more sensors |
| Label bias | Labels reflect historical discrimination | Historical hiring decisions that excluded women |
| Representation bias | Some groups are underrepresented | Few elderly users in tech product data |

Algorithmic Bias

| Type | Description | Example |
|---|---|---|
| Optimization bias | Model optimizes for majority group | Accuracy-maximizing model ignores minorities |
| Feature bias | Proxy features encode protected attributes | Zip code as a proxy for race |
| Feedback loops | Model predictions affect future data | Predictive policing increases arrests in targeted areas |
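The feedback-loop row can be made concrete with a toy simulation (all numbers invented): two areas have identical true incident rates, but patrol allocation is driven by previously observed arrests. The initial measurement gap never self-corrects, because the model's output determines which data gets collected.

```python
import numpy as np

true_rate = np.array([0.10, 0.10])   # identical true incident rates in both areas
arrests = np.array([12.0, 8.0])      # arbitrary initial measurement gap

for step in range(50):
    patrol = arrests / arrests.sum()              # allocate patrols by observed arrests
    arrests = arrests + 100 * patrol * true_rate  # more patrol -> more observed arrests

print(f"Patrol allocation after 50 rounds: {patrol}")  # stays ~[0.6, 0.4]
```

Despite equal underlying rates, the allocation stays locked at the initial 60/40 split: the system keeps "confirming" its own skewed measurements.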

Fairness Metrics

Group Fairness Metrics

Demographic Parity (Statistical Parity): The selection rate should be equal across groups. P(Y_hat=1 | A=0) = P(Y_hat=1 | A=1)

Equalized Odds: True positive rate and false positive rate should be equal across groups. P(Y_hat=1 | Y=1, A=0) = P(Y_hat=1 | Y=1, A=1) AND P(Y_hat=1 | Y=0, A=0) = P(Y_hat=1 | Y=0, A=1)

Equal Opportunity: A relaxation of equalized odds — only requires equal true positive rates. P(Y_hat=1 | Y=1, A=0) = P(Y_hat=1 | Y=1, A=1)

Calibration: Among individuals who receive the same predicted probability p, the actual positive rate should be equal across groups (and match p). P(Y=1 | Y_hat=p, A=0) = P(Y=1 | Y_hat=p, A=1)

The Four-Fifths Rule

A practical guideline from US employment law: the selection rate for any protected group should be at least 80% of the rate for the group with the highest selection rate. Also known as the "80% rule" or "disparate impact ratio."

python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

np.random.seed(42)

# --- Create a biased dataset ---
# Simulate a hiring scenario with gender bias
n = 3000
gender = np.random.binomial(1, 0.5, n)  # 0=female, 1=male
education = np.random.normal(5, 1.5, n)
experience = np.random.normal(5, 2, n)

# Bias: males get a boost in hiring probability
score = 0.3 * education + 0.4 * experience + 0.8 * gender
noise = np.random.normal(0, 1, n)
hired = (score + noise > 4).astype(int)

X = np.column_stack([gender, education, experience])
y = hired
feature_names = ["gender", "education", "experience"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# --- Train model ---
gbc = GradientBoostingClassifier(
    n_estimators=100, max_depth=3, random_state=42
)
gbc.fit(X_train, y_train)
y_pred = gbc.predict(X_test)
y_proba = gbc.predict_proba(X_test)[:, 1]

print(f"Overall accuracy: {(y_pred == y_test).mean():.4f}")

# --- Compute fairness metrics ---
female_mask = X_test[:, 0] == 0
male_mask = X_test[:, 0] == 1

def fairness_metrics(y_true, y_pred, group_mask, group_name):
    """Compute fairness metrics for a subgroup."""
    cm = confusion_matrix(y_true[group_mask], y_pred[group_mask])
    tn, fp, fn, tp = cm.ravel()
    tpr = tp / (tp + fn) if (tp + fn) > 0 else 0
    fpr = fp / (fp + tn) if (fp + tn) > 0 else 0
    selection_rate = y_pred[group_mask].mean()
    accuracy = (y_pred[group_mask] == y_true[group_mask]).mean()
    return {
        "group": group_name,
        "n": group_mask.sum(),
        "selection_rate": selection_rate,
        "tpr": tpr,
        "fpr": fpr,
        "accuracy": accuracy,
    }

female_metrics = fairness_metrics(y_test, y_pred, female_mask, "Female")
male_metrics = fairness_metrics(y_test, y_pred, male_mask, "Male")

print("\n=== Fairness Metrics ===")
print(f"{'Metric':<20} {'Female':>10} {'Male':>10} {'Ratio':>10}")
print("-" * 55)
# Note: the 80% threshold formally applies only to selection_rate;
# it is shown for the other metrics purely for illustration.
for metric in ["selection_rate", "tpr", "fpr", "accuracy"]:
    f_val = female_metrics[metric]
    m_val = male_metrics[metric]
    ratio = f_val / m_val if m_val > 0 else float("inf")
    flag = " FAIL" if ratio < 0.8 else " PASS"
    print(f"{metric:<20} {f_val:>10.4f} {m_val:>10.4f} {ratio:>9.2f}{flag}")

# Demographic parity
dp_diff = abs(female_metrics["selection_rate"] - male_metrics["selection_rate"])
print(f"\nDemographic Parity Difference: {dp_diff:.4f}")

# Equalized odds
eo_tpr_diff = abs(female_metrics["tpr"] - male_metrics["tpr"])
eo_fpr_diff = abs(female_metrics["fpr"] - male_metrics["fpr"])
print(f"Equalized Odds (TPR gap): {eo_tpr_diff:.4f}")
print(f"Equalized Odds (FPR gap): {eo_fpr_diff:.4f}")

# Four-fifths rule
ratio_4_5 = female_metrics["selection_rate"] / male_metrics["selection_rate"]
print(f"\nFour-Fifths Rule: {ratio_4_5:.4f} "
      f"({'PASS' if ratio_4_5 >= 0.8 else 'FAIL - disparate impact detected'})")

Bias Mitigation Strategies

Pre-Processing (before training)

Modify the training data to remove bias:
  • Resampling: Over/under-sample to equalize group representation
  • Reweighting: Assign higher weights to underrepresented groups
  • Disparate Impact Remover: Transform features to remove correlation with protected attributes
  • Fair representation learning: Learn a latent representation that encodes task-relevant information but not protected attributes
In-Processing (during training)

Modify the learning algorithm:
  • Adversarial debiasing: Add an adversary that tries to predict the protected attribute from model outputs; the main model learns to fool the adversary
  • Fairness constraints: Add fairness metrics as constraints or regularization terms in the loss function
  • Exponentiated Gradient: Reduce fair classification to a sequence of cost-sensitive classification problems

Post-Processing (after training)

Modify the model's predictions:
  • Threshold adjustment: Use different decision thresholds for each group to equalize metrics
  • Calibrated equalized odds: Find the threshold combination that satisfies equalized odds while minimizing accuracy loss
  • Reject option classification: In the uncertainty region, favor the underprivileged group
python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

np.random.seed(42)

# Recreate biased dataset
n = 3000
gender = np.random.binomial(1, 0.5, n)
education = np.random.normal(5, 1.5, n)
experience = np.random.normal(5, 2, n)
score = 0.3 * education + 0.4 * experience + 0.8 * gender
noise = np.random.normal(0, 1, n)
hired = (score + noise > 4).astype(int)
X = np.column_stack([gender, education, experience])
y = hired

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# --- Mitigation 1: Remove protected attribute ---
print("=== Mitigation 1: Remove Gender Feature ===")
X_train_no_gender = X_train[:, 1:]  # Drop gender column
X_test_no_gender = X_test[:, 1:]

gbc_no_gender = GradientBoostingClassifier(
    n_estimators=100, max_depth=3, random_state=42
)
gbc_no_gender.fit(X_train_no_gender, y_train)
y_pred_ng = gbc_no_gender.predict(X_test_no_gender)

female = X_test[:, 0] == 0
male = X_test[:, 0] == 1
sr_f = y_pred_ng[female].mean()
sr_m = y_pred_ng[male].mean()
print(f"Selection rate: Female={sr_f:.4f}, Male={sr_m:.4f}, "
      f"Ratio={sr_f/sr_m:.4f}")
print(f"Accuracy: {(y_pred_ng == y_test).mean():.4f}")

# --- Mitigation 2: Reweighting ---
print("\n=== Mitigation 2: Sample Reweighting ===")
# Compute weights to balance group-label combinations
groups = X_train[:, 0]
weights = np.ones(len(y_train))

for g in [0, 1]:
    for label in [0, 1]:
        mask = (groups == g) & (y_train == label)
        expected = len(y_train) / 4
        actual = mask.sum()
        weights[mask] = expected / actual if actual > 0 else 1.0

gbc_reweight = GradientBoostingClassifier(
    n_estimators=100, max_depth=3, random_state=42
)
gbc_reweight.fit(X_train, y_train, sample_weight=weights)
y_pred_rw = gbc_reweight.predict(X_test)

sr_f_rw = y_pred_rw[female].mean()
sr_m_rw = y_pred_rw[male].mean()
print(f"Selection rate: Female={sr_f_rw:.4f}, Male={sr_m_rw:.4f}, "
      f"Ratio={sr_f_rw/sr_m_rw:.4f}")
print(f"Accuracy: {(y_pred_rw == y_test).mean():.4f}")

# --- Mitigation 3: Threshold adjustment (post-processing) ---
print("\n=== Mitigation 3: Threshold Adjustment ===")
gbc_full = GradientBoostingClassifier(
    n_estimators=100, max_depth=3, random_state=42
)
gbc_full.fit(X_train, y_train)
y_proba = gbc_full.predict_proba(X_test)[:, 1]

# Find thresholds that equalize selection rates
target_rate = y_proba.mean()  # Use overall mean as target

best_thresh = {"female": 0.5, "male": 0.5}
for name, mask in [("female", female), ("male", male)]:
    for t in np.arange(0.1, 0.9, 0.01):
        sr = (y_proba[mask] >= t).mean()
        if abs(sr - target_rate) < abs(
            (y_proba[mask] >= best_thresh[name]).mean() - target_rate
        ):
            best_thresh[name] = t

y_pred_thresh = np.zeros(len(y_test), dtype=int)
y_pred_thresh[female] = (y_proba[female] >= best_thresh["female"]).astype(int)
y_pred_thresh[male] = (y_proba[male] >= best_thresh["male"]).astype(int)

sr_f_t = y_pred_thresh[female].mean()
sr_m_t = y_pred_thresh[male].mean()
print(f"Thresholds: Female={best_thresh['female']:.2f}, "
      f"Male={best_thresh['male']:.2f}")
print(f"Selection rate: Female={sr_f_t:.4f}, Male={sr_m_t:.4f}, "
      f"Ratio={sr_f_t/(sr_m_t+1e-8):.4f}")
print(f"Accuracy: {(y_pred_thresh == y_test).mean():.4f}")

# --- Summary ---
print("\n=== Comparison ===")
print(f"{'Method':<25} {'DP Ratio':>10} {'Accuracy':>10}")
print("-" * 45)
methods = [
    ("Baseline (with gender)", female, male, gbc_full.predict(X_test)),
    ("Remove gender", female, male, y_pred_ng),
    ("Reweighting", female, male, y_pred_rw),
    ("Threshold adjustment", female, male, y_pred_thresh),
]
for name, f_m, m_m, preds in methods:
    sr_f = preds[f_m].mean()
    sr_m = preds[m_m].mean()
    ratio = sr_f / sr_m if sr_m > 0 else 0
    acc = (preds == y_test).mean()
    flag = " *" if ratio >= 0.8 else ""
    print(f"{name:<25} {ratio:>10.4f} {acc:>10.4f}{flag}")

Removing the Protected Attribute Is Often Not Enough

Simply removing the protected attribute (e.g., gender) from the features does not eliminate bias. Other features (zip code, name, purchasing patterns) can serve as proxies for the protected attribute; this is called "redundant encoding." True bias mitigation requires measuring fairness metrics and applying systematic mitigation techniques.
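A minimal sketch of redundant encoding (dataset and coefficients invented): gender is dropped from the features, but a hypothetical correlated proxy lets the model reconstruct the disparity anyway.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(42)
n = 5000
gender = rng.binomial(1, 0.5, n)         # 0=female, 1=male
proxy = gender + rng.normal(0, 0.3, n)   # invented proxy feature (e.g., a zip-code signal)
skill = rng.normal(0, 1, n)
hired = ((0.5 * skill + 0.8 * gender + rng.normal(0, 1, n)) > 0.4).astype(int)

X = np.column_stack([proxy, skill])      # gender itself is NOT a feature
clf = GradientBoostingClassifier(random_state=0).fit(X, hired)
pred = clf.predict(X)

print(f"Selection rate, female: {pred[gender == 0].mean():.3f}")
print(f"Selection rate, male:   {pred[gender == 1].mean():.3f}")
```

Even though the model never sees gender, the selection-rate gap persists because the proxy carries nearly the same information.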

Fairness Toolkits

Fairlearn (Microsoft)

Python library for assessing and improving fairness. Provides:
  • MetricFrame: Compute any metric disaggregated by group
  • ThresholdOptimizer: Post-processing threshold adjustment
  • ExponentiatedGradient: In-processing constrained optimization
  • GridSearch: Find the fairness-accuracy Pareto frontier
AIF360 (IBM)

Comprehensive toolkit with 70+ fairness metrics and 10+ algorithms:
  • Pre-processing: Reweighting, Disparate Impact Remover, Optimized Preprocessing
  • In-processing: Adversarial Debiasing, Prejudice Remover
  • Post-processing: Equalized Odds, Calibrated Equalized Odds, Reject Option

Choosing a Strategy

1. Start with measurement: You cannot improve what you do not measure
2. Try post-processing first: It is the easiest and does not require retraining
3. Use pre-processing for data issues: If the problem is in the data, fix the data
4. Use in-processing for algorithmic issues: If the model itself introduces bias
5. Document and monitor: Fairness is not a one-time check; monitor in production
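Step 5 can be sketched as a minimal production check (function name and data are illustrative; the 0.8 cutoff follows the four-fifths rule above):

```python
import numpy as np

def demographic_parity_ratio(y_pred, groups):
    """Min/max selection-rate ratio across groups; 1.0 means perfect parity."""
    rates = [y_pred[groups == g].mean() for g in np.unique(groups)]
    return min(rates) / max(rates)

# Hypothetical batch of production predictions
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
preds = np.array([1, 0, 1, 0, 1, 1, 1, 1])

ratio = demographic_parity_ratio(preds, groups)
print(f"Disparate impact ratio: {ratio:.2f}")  # 0.50
if ratio < 0.8:
    print("ALERT: below the four-fifths threshold -- investigate before proceeding")
```

Running a check like this on every scored batch turns fairness from a one-time audit into an ongoing monitoring signal.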