Discover how FLAI detects, measures and mitigates algorithmic bias using the same datasets and examples from the official repository. From two-dimensional metrics to fair data generation.
FLAI introduces the first metric that conceptually distinguishes between two types of discrimination:
Statistical Parity: Do groups achieve the same outcomes?
Measures whether the proportion of positive outcomes is similar across groups.
EQA = |P(Y=1|G=0) - P(Y=1|G=1)|
Equal Opportunity: Do groups have the same opportunities?
Measures whether true positives are detected equally across both groups.
EQI = |P(Ŷ=1|Y=1,G=0) - P(Ŷ=1|Y=1,G=1)|
We compare groups by gender in predicting income >50K:
Negative EQI: Women have lower true positive rate (less opportunity)
Positive EQA: Difference in overall distribution of positive predictions
Fairness = √(EQA² + EQI²) = √(0.08² + 0.04²) ≈ 0.09: moderate level of bias detected
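To make the two definitions concrete, below is a small, library-independent sketch that computes both quantities from ground-truth labels, model predictions and a binary group column. The function name and the signed convention (a negative value means the first group is disadvantaged, matching the negative EQI reported above) are our own illustration, not FLAI's API; the sign disappears when the global score squares both terms.

import numpy as np
import pandas as pd

def eqa_eqi(y_true, y_pred, group):
    # Illustrative (non-FLAI) computation of the two fairness dimensions
    g0, g1 = sorted(pd.unique(group))
    # EQA (statistical parity): difference in positive-prediction rates
    eqa = y_pred[group == g0].mean() - y_pred[group == g1].mean()
    # EQI (equal opportunity): difference in true-positive rates among truly positive cases
    tpr_g0 = y_pred[(group == g0) & (y_true == 1)].mean()
    tpr_g1 = y_pred[(group == g1) & (y_true == 1)].mean()
    eqi = tpr_g0 - tpr_g1
    # Global score combines both dimensions
    fairness = float(np.sqrt(eqa**2 + eqi**2))
    return eqa, eqi, fairness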
FLAI implements a comprehensive process from detection to mitigation; the four steps below are combined into a single sketch right after them:
Load the dataset and calculate two-dimensional fairness metrics
flai_dataset = data.Data(df, transform=True)
Build a Bayesian graph that explains causal relationships
flai_graph = CausalGraph(flai_dataset, target='label')
Adjust causal relationships and probability tables
flai_graph.mitigate_edge_relation(sensible_feature=['sex','age'])
Create synthetic datasets free from bias
fair_data = flai_graph.generate_dataset(n_samples=1000)
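Putting the four steps together, a minimal end-to-end sketch looks like this. The import paths and the input file are assumptions for illustration; check the README at GitHub.com/rugonzs/FLAI for the exact module layout of your FLAI version.

import pandas as pd
from flai import data                      # assumed import path - see the FLAI README
from flai.causal_graph import CausalGraph  # assumed import path - see the FLAI README

df = pd.read_csv('adult.csv')                                        # placeholder input data
flai_dataset = data.Data(df, transform=True)                         # 1. load and transform the dataset
flai_graph = CausalGraph(flai_dataset, target='label')               # 2. build the causal (Bayesian) graph
flai_graph.mitigate_edge_relation(sensible_feature=['sex', 'age'])   # 3. adjust relations around sensitive features
fair_data = flai_graph.generate_dataset(n_samples=1000)              # 4. sample a bias-mitigated dataset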
Metric | Original | Post-Mitigation | Improvement |
---|---|---|---|
EQI (Equity) | -0.04 | 0.00 | Eliminated |
EQA (Equality) | 0.08 | 0.00 | Eliminated |
Global Fairness | 0.09 | 0.00 | Perfect fairness |
Accuracy | 0.758 | 0.746 | Slight reduction |
From detected bias to complete fairness
We explore the same datasets used in the research:
Objective: Predict income >50K based on demographic characteristics
Sensitive variable: Gender, Age
Findings: Significant bias against women and young people
df_f, datos_f = flai_dataset.fairness_eqa_eqi(
features=['education'],
target_column='income',
column_filter=['gender'],
plot=True
)
Result: EQI=-0.04, EQA=0.08, F=0.09
Objective: Predict criminal recidivism risk
Sensitive variable: Race
Findings: Racial discrimination in justice algorithms
compas_graph = CausalGraph(compas_data, target='risk_score')
compas_graph.mitigate_calculation_cpd(
sensible_feature=['race']
)
Impact: 85% reduction in racial bias
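One way to check the reduction is to regenerate data from the mitigated graph and re-run the same EQA/EQI metric used for the Adult example. The sketch below continues from the snippet above and assumes generate_dataset returns a DataFrame that can be wrapped back into a data.Data object; the COMPAS column names are placeholders.

# Sample a mitigated dataset from the adjusted COMPAS graph
fair_compas_df = compas_graph.generate_dataset(n_samples=1000)

# Wrap it like the original data and re-measure fairness with the same call used above;
# 'priors_count' is a placeholder feature column
fair_compas = data.Data(fair_compas_df, transform=True)
df_f, datos_f = fair_compas.fairness_eqa_eqi(
    features=['priors_count'],
    target_column='risk_score',
    column_filter=['race'],
    plot=False
)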
Objective: Bank credit decisions
Sensitive variable: Gender, Age
Findings: Bias against young women
fair_credit_data = credit_graph.generate_dataset(
n_samples=1000,
methodtype='bayes'
)
Result: Fair dataset for training new models
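As an illustration of that last point, the generated dataset can replace the original training data directly. The classifier choice and the 'label' column name are assumptions for this sketch, not part of FLAI.

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# 'label' is a placeholder target column; adapt to the generated dataset's schema
X = fair_credit_data.drop(columns=['label'])
y = fair_credit_data['label']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Train a standard classifier on the bias-mitigated data
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Held-out accuracy on fair data: {model.score(X_test, y_test):.3f}")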
These results have been peer-reviewed and published in:
Run real FLAI code using PyScript. Data is processed entirely in your browser:
# Simulate FLAI behavior with synthetic data
import pandas as pd
import numpy as np
import random

# Function to simulate FLAI's EQA-EQI metric
def calculate_fairness_flai(data, group_col, target_col):
    groups = data[group_col].unique()
    if len(groups) != 2:
        return "Error: Need exactly 2 groups"
    g0, g1 = groups[0], groups[1]

    # Calculate EQA (Equality) - Statistical Parity
    rate_g0 = data[data[group_col] == g0][target_col].mean()
    rate_g1 = data[data[group_col] == g1][target_col].mean()
    eqa = abs(rate_g1 - rate_g0)

    # Simulate EQI (Equity) - Equal Opportunity
    # In real data, this would require ground truth
    eqi = abs(np.random.normal(0, 0.02))  # Simulated

    # Global fairness
    fairness = np.sqrt(eqa**2 + eqi**2)

    return {
        'EQA (Equality)': round(eqa, 3),
        'EQI (Equity)': round(eqi, 3),
        'Global Fairness': round(fairness, 3),
        'Privileged Group': f"{g1} (rate: {rate_g1:.3f})",
        'Unprivileged Group': f"{g0} (rate: {rate_g0:.3f})"
    }

# Generate synthetic data similar to Adult dataset
np.random.seed(42)
random.seed(42)
sample_data = pd.DataFrame({
    'gender': np.random.choice(['Female', 'Male'], 1000, p=[0.4, 0.6]),
    'education': np.random.choice([0, 1, 2, 3], 1000, p=[0.3, 0.3, 0.3, 0.1]),
    'high_income': np.random.choice([0, 1], 1000, p=[0.7, 0.3])
})

# Introduce artificial bias: males have higher probability of high income
male_mask = sample_data['gender'] == 'Male'
sample_data.loc[male_mask, 'high_income'] = np.random.choice(
    [0, 1], male_mask.sum(), p=[0.6, 0.4]  # 40% vs 30%
)

print("=== FLAI DEMO: Bias Detection ===\n")
print("Synthetic Dataset Generated:")
print(f"  • Size: {len(sample_data)} records")
print(f"  • Females: {(sample_data['gender'] == 'Female').sum()}")
print(f"  • Males: {(sample_data['gender'] == 'Male').sum()}")
print()

# Calculate fairness metrics
result = calculate_fairness_flai(sample_data, 'gender', 'high_income')
print("Fairness Analysis (FLAI Method):")
for key, value in result.items():
    print(f"  • {key}: {value}")
print()

print("Interpretation:")
if result['Global Fairness'] > 0.05:
    print("  ⚠️ BIAS DETECTED - Mitigation required")
    print("  Recommendation: Apply causal mitigation with FLAI")
else:
    print("  ✅ Fair dataset - No significant bias detected")
print()

print("To use FLAI in your project:")
print("    pip install flai-causal")
print("    # See complete examples at GitHub.com/rugonzs/FLAI")
This demo uses PyScript/Pyodide to run real Python in the browser. The code shown is an educational simulation of FLAI's behavior.
For complete analysis with real data, install FLAI in your local Python environment.
Install FLAI in your Python project:
pip install flai-causal