2026 · Statistical Modeling
Benchmarking In-Context Learning against Chained Equations: A Simulation Study on Item Nonresponse Adjustment
Statistical Modeling · Machine Learning · Monte Carlo Simulation
University of Maryland | Jan 2026 – May 2026
Python · Monte Carlo Methods · MICE · TabPFN · Survey Methods
Project Overview
This Monte Carlo simulation study benchmarks missing data imputation pipelines for item nonresponse adjustment in survey settings. It ran 720 simulation iterations across 24 distinct missingness conditions, covering Missing Completely at Random (MCAR) and Missing at Random (MAR) mechanisms at missingness rates of up to 30 percent, and evaluated the results from a Total Survey Error perspective.
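To make the two mechanisms concrete, here is a minimal sketch of how MCAR and MAR missingness could be injected into a simulated variable at a target rate. It is an illustration under stated assumptions, not the study's code: the function names `mcar_mask` and `mar_mask`, the logistic missingness model, the `slope` parameter, and the simulated covariate are all hypothetical.

```python
import math
import random


def mcar_mask(n, rate, rng):
    """MCAR: every value is missing with the same probability,
    independent of any observed or unobserved data."""
    return [rng.random() < rate for _ in range(n)]


def mar_mask(x, rate, rng, slope=1.0):
    """MAR: missingness in the target depends only on a fully observed
    covariate x (assumed logistic model, hypothetical parameters).

    P(missing_i) = sigmoid(intercept + slope * z_i), with z the standardized
    covariate. The intercept is found by bisection so that the average
    missingness probability matches the target rate.
    """
    n = len(x)
    mean_x = sum(x) / n
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    if sd_x == 0:
        raise ValueError("covariate must vary for a MAR mechanism")
    z = [(v - mean_x) / sd_x for v in x]

    def mean_prob(intercept):
        return sum(1 / (1 + math.exp(-(intercept + slope * zi))) for zi in z) / n

    lo, hi = -20.0, 20.0
    for _ in range(60):  # mean_prob is increasing in the intercept
        mid = (lo + hi) / 2
        if mean_prob(mid) < rate:
            lo = mid
        else:
            hi = mid
    intercept = (lo + hi) / 2
    return [
        rng.random() < 1 / (1 + math.exp(-(intercept + slope * zi))) for zi in z
    ]


# Usage: mask a simulated covariate at the 30 percent ceiling used in the study.
rng = random.Random(0)
x = [rng.gauss(50, 10) for _ in range(5000)]  # simulated, fully observed covariate
mcar = mcar_mask(len(x), 0.30, rng)
mar = mar_mask(x, 0.30, rng)
```

Under the MAR mask, units with larger `x` are more likely to be missing, so a complete-case analysis of the target would be biased whenever the target is related to `x`. That is the situation imputation is meant to address.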
Key Contributions
Simulation Architecture: Engineered a Monte Carlo simulation framework executing 720 total iterations to benchmark four missing data imputation pipelines (MICE-PMM, MICE-LASSO, MICE-RF, and TabPFN-based in-context learning) across 24 distinct missingness conditions.
Coefficient Bias Reduction: Achieved a 5.3 percent reduction in downstream coefficient bias relative to traditional MICE-PMM by implementing a gradient-boosted TabPFN surrogate under complex Missing at Random (MAR) scenarios.
Inferential Error Quantification: Identified a 16.7 percent degradation in the validity of 95 percent confidence intervals when machine learning imputations were used without Rubin's combining rules, highlighting a critical inferential risk for applied survey workflows.
Optimal Regularized Imputation: Found MICE-LASSO to be the strongest regularized approach for continuous variable imputation, with RMSE between 5.14 and 5.40 on a 0-100 response scale across the full simulation grid.
Downstream Inference Evaluation: Evaluated how well inference was retained after imputation, using a logistic regression classifier for high-stress indicators (baseline AUC of 0.978) as the reference, directly linking imputation quality to downstream model performance.
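The confidence interval finding above turns on Rubin's combining rules, which pool a coefficient across m imputed datasets. The following is a minimal sketch of the standard rules for a scalar estimate, not the study's implementation. The function name `pool_rubin` and the example numbers are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool a scalar estimate across m imputed datasets with Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need matching estimates and variances from m >= 2 imputations")

    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (m - 1 denominator)
    total = w + (1 + 1 / m) * b  # total variance

    if b == 0:
        df = math.inf            # no between-imputation variability
    else:
        r = (1 + 1 / m) * b / w  # relative increase in variance due to missingness
        df = (m - 1) * (1 + 1 / r) ** 2
    return q_bar, total, df


# Usage with made-up numbers for m = 5 imputations of one coefficient.
q_bar, total, df = pool_rubin([0.50, 0.54, 0.46, 0.52, 0.48], [0.01] * 5)
```

A 95 percent interval is then the pooled estimate plus or minus the t quantile at `df` times the square root of `total`. An interval built from a single imputed dataset uses only the within-imputation variance, so it omits the between-imputation term and tends to be too narrow.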
Technical Stack
- Methods: Statistical Modeling, Monte Carlo Simulation, Multiple Imputation by Chained Equations, Rubin’s Rules
- Machine Learning: TabPFN, Gradient-Boosted Surrogates, Logistic Regression
- Programming: Python
- Application Area: Missing Data, Survey Methods, Total Survey Error
Key Results
| Metric | Value |
|---|---|
| Simulation Iterations | 720 |
| Missingness Conditions | 24 |
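| Coefficient Bias Reduction (TabPFN surrogate vs. MICE-PMM, MAR) | 5.3% |
| 95% CI Validity Degradation (without Rubin's rules) | 16.7% |
| MICE-LASSO RMSE Range (0-100 scale) | 5.14-5.40 |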
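The RMSE and confidence interval figures in this table are standard simulation summaries. The page does not state exactly how they were computed, so the sketch below is an assumption: RMSE is taken over the artificially masked cells, and interval validity is read as empirical coverage across replications. The helper names `rmse` and `coverage` are hypothetical.

```python
import math


def rmse(true_values, imputed_values):
    """Root mean squared error between held-out true values and their imputations."""
    if not true_values or len(true_values) != len(imputed_values):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - i) ** 2 for t, i in zip(true_values, imputed_values)]
    return math.sqrt(sum(squared) / len(squared))


def coverage(intervals, true_value):
    """Share of replications whose (lower, upper) interval contains the true value."""
    if not intervals:
        raise ValueError("need at least one interval")
    hits = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return hits / len(intervals)


# Usage with made-up numbers: two masked cells and three replications.
error = rmse([60.0, 42.0], [57.0, 46.0])
cover = coverage([(0.0, 1.0), (2.0, 3.0), (0.4, 0.6)], 0.5)
```

With a nominal 95 percent interval, coverage well below 0.95 across replications is the usual sign that the variance estimate is too small.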
Associated With
University of Maryland | Jan 2026 – May 2026
