# Project 08: Uncertainty and Probabilistic Inference
## 1. Project Overview
This project focuses on quantifying uncertainty in deep learning models using either Bayesian sampling with Hamiltonian Monte Carlo (HMC) or distribution-free conformal prediction.
## 2. Option A: Uncertainty in Medical Cost Prediction (Bayesian)
### Goal
Build a robust regression model with rigorous uncertainty bounds using Hamiltonian Monte Carlo (HMC).
### Dataset
- Medical Insurance Cost Prediction: Predict individual medical costs billed by health insurance.
- Alternative: Diabetes Dataset.
### Implementation Steps
- Model Definition: Define a Bayesian Neural Network with 1-2 hidden layers. Use a Gaussian prior for the weights.
- Hamiltonian Monte Carlo:
    - Use Pyro or Hamiltorch (PyTorch) to implement the HMC sampler (a Pyro sketch follows this list).
    - Run the chain for at least 1000 samples after burn-in.
- Diagnostics: Plot the trace of the weights and check the R-hat (\(\hat{R}\)) statistic to ensure convergence.
- Prediction:
    - For a test input \(x^*\), compute the predictive distribution \(p(y^*|x^*, D) \approx \frac{1}{M} \sum_{m=1}^{M} p(y^*|x^*, w^{(m)})\), where \(w^{(m)}\) are the \(M\) posterior weight samples.
    - Calculate the mean and the \(95\%\) credible interval.
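The following is a minimal sketch of these four steps in Pyro. The layer width, prior scales, warm-up length, and the variable names `x_train`, `y_train`, `x_test` (standardized tensors) are illustrative assumptions rather than part of the spec; NUTS is used here as Pyro's adaptive HMC variant.

```python
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import MCMC, NUTS, Predictive

def bnn_model(x, y=None, hidden=16):
    d_in = x.shape[1]
    # Step 1: Gaussian priors over all weights and biases (widths are placeholders)
    w1 = pyro.sample("w1", dist.Normal(0.0, 1.0).expand([d_in, hidden]).to_event(2))
    b1 = pyro.sample("b1", dist.Normal(0.0, 1.0).expand([hidden]).to_event(1))
    w2 = pyro.sample("w2", dist.Normal(0.0, 1.0).expand([hidden, 1]).to_event(2))
    b2 = pyro.sample("b2", dist.Normal(0.0, 1.0).expand([1]).to_event(1))
    sigma = pyro.sample("sigma", dist.HalfNormal(1.0))  # observation noise scale
    mean = (torch.tanh(x @ w1 + b1) @ w2 + b2).squeeze(-1)
    with pyro.plate("data", x.shape[0]):
        pyro.sample("obs", dist.Normal(mean, sigma), obs=y)

# Step 2: run the chain (NUTS = adaptive HMC); 1000 samples after 500 warm-up steps
mcmc = MCMC(NUTS(bnn_model), num_samples=1000, warmup_steps=500)
mcmc.run(x_train, y_train)  # x_train, y_train: assumed standardized tensors
mcmc.summary()              # step 3: reports r_hat and effective sample size per site

# Step 4: Monte Carlo approximation of the predictive distribution
predictive = Predictive(bnn_model, posterior_samples=mcmc.get_samples())
y_star = predictive(x_test)["obs"]  # shape [M, n_test]
pred_mean = y_star.mean(0)
lo, hi = y_star.quantile(0.025, dim=0), y_star.quantile(0.975, dim=0)  # 95% CI
```

Trace plots for the diagnostics step can be drawn directly from `mcmc.get_samples()`.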
### Expected Results
- The BNN should provide wider uncertainty intervals for regions with sparse data.
- Analysis: Compare the calibration of the BNN against a standard MLP. A well-calibrated model's \(95\%\) interval should contain the true value about \(95\%\) of the time (see the coverage check below).
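One direct way to measure this (assuming `lo`, `hi`, and a target tensor `y_test` from the sketch above) is the empirical coverage of the band:

```python
# Calibration check: fraction of held-out targets inside the 95% credible band.
# `lo`, `hi`, `y_test` are assumed from the HMC sketch; well-calibrated gives ~0.95.
coverage = ((y_test >= lo) & (y_test <= hi)).float().mean().item()
print(f"Empirical 95% coverage: {coverage:.3f}")
```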
## 3. Option B: Conformal Prediction for Safe Classification (Frequentist)
### Goal
Create a classification system that outputs a set of labels, guaranteed to contain the true label with \(99\%\) confidence.
### Dataset
- Credit Card Fraud Detection: Detect fraudulent transactions.
- Alternative: MNIST or CIFAR-10.
### Implementation Steps
- Base Model: Train a standard classifier (e.g., Logistic Regression, Random Forest, or a CNN).
- Data Splitting: Split the available training data into a proper training set (\(80\%\)) and a held-out calibration set (\(20\%\)).
- Non-conformity Scores: For the calibration set, compute \(S_i = 1 - \hat{f}(x_i)_{y_i}\), where \(\hat{f}(x_i)_{y_i}\) is the predicted probability of the true class.
- Threshold Calculation: Calculate \(\hat{q}\), the \((1-\alpha)\) quantile of the scores \(S_i\) (e.g., \(\alpha=0.01\) for \(99\%\) coverage); for finite-sample validity, use the \(\lceil (n+1)(1-\alpha) \rceil / n\) empirical quantile of the \(n\) calibration scores.
- Set Generation: For a new \(x\), include all classes \(y\) such that \(1 - \hat{f}(x)_y \le \hat{q}\) (a sketch of the full pipeline follows this list).
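A minimal split-conformal sketch of these four steps, assuming NumPy arrays `X_train`, `y_train`, `X_test` with integer class labels; the random forest is just a stand-in for whichever base model you train:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Step 2: carve a calibration set out of the training data
X_fit, X_cal, y_fit, y_cal = train_test_split(
    X_train, y_train, test_size=0.2, random_state=0)

# Step 1: any base classifier with predict_proba works
clf = RandomForestClassifier().fit(X_fit, y_fit)

# Step 3: non-conformity scores S_i = 1 - f_hat(x_i)[y_i] on the calibration set
# (assumes integer class labels 0..K-1)
probs_cal = clf.predict_proba(X_cal)
scores = 1.0 - probs_cal[np.arange(len(y_cal)), y_cal]

# Threshold: finite-sample-corrected (1 - alpha) quantile of the scores
alpha = 0.01
n = len(scores)
level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
q_hat = np.quantile(scores, level, method="higher")

# Step 4: prediction set = every class whose score is at most q_hat
probs_test = clf.predict_proba(X_test)
pred_sets = [np.where(1.0 - p <= q_hat)[0] for p in probs_test]
```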
### Expected Results
- For ambiguous images (e.g., a digit that resembles both a 1 and a 7), the model should return the set \(\{1, 7\}\) rather than a single incorrect guess.
- Analysis: Plot the average set size against the confidence level \((1-\alpha)\) and show how the set size grows as you demand higher confidence (see the snippet below).
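A sketch of this analysis, reusing `scores`, `probs_test`, and `n` from the split-conformal code above:

```python
import numpy as np
import matplotlib.pyplot as plt

# Sweep the confidence level and recompute the threshold and set sizes each time
confidences = np.array([0.80, 0.90, 0.95, 0.99])
avg_sizes = []
for conf in confidences:
    level = min(np.ceil((n + 1) * conf) / n, 1.0)
    q = np.quantile(scores, level, method="higher")
    avg_sizes.append(np.mean([(1.0 - p <= q).sum() for p in probs_test]))

plt.plot(confidences, avg_sizes, marker="o")
plt.xlabel("Confidence level (1 - alpha)")
plt.ylabel("Average prediction set size")
plt.show()
```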
## 4. Analysis & Deliverables
### Technical Report Requirements
- Uncertainty Plot: For regression, show a plot of \(y\) vs. \(x\) with the \(95\%\) uncertainty band. For classification, show examples of prediction sets that contain multiple labels.
- Reliability Diagram: Plot observed coverage vs. nominal coverage; the points should lie on the \(y=x\) line (see the sketch after this list).
- Comparison: Contrast Option A (Bayesian) and Option B (Conformal).
    - Which one is more computationally expensive?
    - Which one makes more assumptions about the data?
- Kaggle Link: Uncertainty Estimation Datasets.
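For the reliability diagram, one possible sketch for Option A reuses `y_star` (posterior predictive samples) and the target tensor `y_test` from the HMC code, sweeping the nominal level:

```python
import numpy as np
import matplotlib.pyplot as plt

# Observed coverage of the BNN's credible intervals at several nominal levels
nominal = np.linspace(0.5, 0.99, 10)
observed = []
for level in nominal:
    tail = float((1.0 - level) / 2.0)
    lo = y_star.quantile(tail, dim=0)
    hi = y_star.quantile(1.0 - tail, dim=0)
    observed.append(((y_test >= lo) & (y_test <= hi)).float().mean().item())

plt.plot(nominal, observed, marker="o", label="BNN")
plt.plot([0.5, 1.0], [0.5, 1.0], "k--", label="ideal (y = x)")
plt.xlabel("Nominal coverage")
plt.ylabel("Observed coverage")
plt.legend()
plt.show()
```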
### Tips
- HMC: Start with a small network. Sampling from a deep ResNet with HMC is extremely difficult.
- Conformal: Ensure your calibration set is truly representative of your test set (exchangeability).