2026 · Statistical Modeling

Benchmarking In-Context Learning against Chained Equations: A Simulation Study on Item Nonresponse Adjustment

Statistical Modeling · Machine Learning · Monte Carlo Simulation

University of Maryland | Jan 2026 – May 2026

Python · Monte Carlo Methods · MICE · TabPFN · Survey Methods

Project Overview

A Monte Carlo simulation study benchmarking missing data imputation pipelines for item nonresponse adjustment in survey settings. The study ran 720 simulation iterations across 24 distinct missingness conditions, covering Missing Completely at Random (MCAR) and Missing at Random (MAR) mechanisms at missingness rates of up to 30 percent. Results were evaluated from a Total Survey Error perspective.
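As an illustration of the two mechanisms, the sketch below generates MCAR and MAR missingness on a synthetic survey item. It is not the study's code: the function names `mcar_mask` and `mar_mask`, the logistic propensity model, the `slope` value, the covariate `age`, and the synthetic `stress` item are all hypothetical choices made for this example.

```python
import numpy as np

def mcar_mask(n, rate, rng):
    """MCAR: every unit has the same probability of being missing."""
    return rng.random(n) < rate

def mar_mask(x_obs, rate, rng, slope=1.5):
    """MAR: missingness probability depends on a fully observed covariate."""
    z = (x_obs - x_obs.mean()) / x_obs.std()
    # Logistic propensity; the intercept is found by bisection so that
    # the average missingness probability is close to `rate`.
    lo, hi = -20.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        p = 1 / (1 + np.exp(-(mid + slope * z)))
        if p.mean() < rate:
            lo = mid
        else:
            hi = mid
    return rng.random(len(x_obs)) < p

rng = np.random.default_rng(0)
n = 5000
age = rng.normal(45, 12, n)  # hypothetical fully observed covariate
# Hypothetical continuous item, clipped to a 0-100 response scale
stress = np.clip(50 + 0.4 * (age - 45) + rng.normal(0, 10, n), 0, 100)

# Apply roughly 30 percent missingness under each mechanism
y_mcar = np.where(mcar_mask(n, 0.30, rng), np.nan, stress)
y_mar = np.where(mar_mask(age, 0.30, rng), np.nan, stress)
```

Under the MAR mask, units with higher `age` are more likely to be missing, so the observed cases are no longer a random subsample. That is the situation in which the choice of imputation model starts to matter for downstream estimates.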

Key Contributions

Simulation Architecture: Engineered a Monte Carlo simulation framework executing 720 total iterations to benchmark four missing data imputation pipelines (MICE-PMM, MICE-LASSO, MICE-RF, and TabPFN-based in-context learning) across 24 distinct missingness conditions.
Coefficient Bias Reduction: Achieved a 5.3 percent reduction in downstream coefficient bias relative to traditional MICE-PMM by using a gradient-boosted TabPFN surrogate under complex Missing at Random scenarios.
Inferential Error Quantification: Identified a 16.7 percent degradation in 95 percent confidence interval validity when machine learning imputations were used without Rubin's combining rules, highlighting a critical inferential risk for applied survey workflows.
Optimal Regularized Imputation: Found MICE-LASSO to be the strongest regularized approach for continuous variable imputation, with RMSE between 5.14 and 5.40 on a 0-100 response scale across the full simulation grid.
Downstream Inference Evaluation: Evaluated inference retention against a baseline logistic regression classifier for high-stress indicators (baseline AUC of 0.978), directly linking imputation quality to downstream model performance.
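The inferential risk described under Inferential Error Quantification follows from how Rubin's rules build the pooled variance: they add a between-imputation component to the average within-imputation variance, so skipping them understates standard errors. The sketch below shows the standard pooling formulas for a single coefficient. The function name `pool_rubin` and the input numbers are made up for illustration and are not taken from the study.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors."""
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = len(q)
    q_bar = q.mean()              # pooled point estimate
    u_bar = u.mean()              # within-imputation variance
    b = q.var(ddof=1)             # between-imputation variance
    t = u_bar + (1 + 1 / m) * b   # total variance
    # Rubin's (1987) degrees of freedom for the reference t distribution
    if b > 0:
        df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2
    else:
        df = np.inf
    return q_bar, t, df

# Hypothetical coefficient estimates from m = 5 imputed datasets
estimates = [0.50, 0.54, 0.46, 0.52, 0.48]
variances = [0.010, 0.010, 0.010, 0.010, 0.010]

q_bar, total_var, df = pool_rubin(estimates, variances)
naive_se = np.sqrt(np.mean(variances))  # ignores between-imputation spread
pooled_se = np.sqrt(total_var)          # what Rubin's rules report
```

In this toy input the pooled standard error is larger than the naive one, which is the general pattern whenever the estimates vary across imputations. An interval built from `naive_se` would be too narrow, and a 95 percent interval would then use a t quantile with `df` degrees of freedom rather than a fixed 1.96.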

Technical Stack

  • Methods: Statistical Modeling, Monte Carlo Simulation, Multiple Imputation by Chained Equations, Rubin’s Rules
  • Machine Learning: TabPFN, Gradient-Boosted Surrogates, Logistic Regression
  • Programming: Python
  • Application Area: Missing Data, Survey Methods, Total Survey Error

Key Results

Metric                     Value
Simulation Iterations      720
Missingness Conditions     24
Bias Reduction             5.3%
CI Validity Degradation    16.7%
Minimum RMSE               5.14-5.40

Associated With

University of Maryland | Jan 2026 – May 2026