Benchmarking In-Context Learning against Chained Equations: A Simulation Study on Item Nonresponse Adjustment

2026 · Statistical Modeling

Statistical Modeling · Machine Learning · Monte Carlo Simulation

University of Maryland | Jan 2026 – May 2026

Python · Monte Carlo Methods · MICE · TabPFN · Survey Methods

Project Overview

This Monte Carlo simulation study benchmarks missing data imputation pipelines for item nonresponse adjustment in survey settings. It ran 720 simulation iterations across 24 distinct missingness conditions, covering Missing Completely at Random (MCAR) and Missing at Random (MAR) mechanisms at missingness rates up to 30 percent, and frames the evaluation through the Total Survey Error perspective.
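
To make the two mechanisms concrete, the sketch below draws MCAR and MAR missingness masks at a target rate. It is an illustration only: the logistic form of the MAR model, the `slope` parameter, and the function names `mcar_mask` and `mar_mask` are assumptions, not the study's actual missingness model.

```python
import math
import random


def mcar_mask(n, rate, rng):
    """MCAR: every cell is missing with the same probability `rate`."""
    return [rng.random() < rate for _ in range(n)]


def mar_mask(x, rate, slope, rng):
    """MAR: missingness depends on a fully observed covariate `x`.

    Assumed model: P(missing_i) = sigmoid(intercept + slope * x_i), with the
    intercept tuned by bisection so the average probability equals `rate`.
    """
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def mean_prob(intercept):
        return sum(sigmoid(intercept + slope * xi) for xi in x) / len(x)

    lo, hi = -20.0, 20.0
    for _ in range(60):  # mean_prob is increasing in the intercept
        mid = (lo + hi) / 2.0
        if mean_prob(mid) < rate:
            lo = mid
        else:
            hi = mid
    intercept = (lo + hi) / 2.0
    return [rng.random() < sigmoid(intercept + slope * xi) for xi in x]
```

Under MCAR the mask is independent of the data, while under this MAR sketch respondents with larger values of the observed covariate are more likely to have the target item missing.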

Key Contributions

Simulation Architecture: Engineered a Monte Carlo simulation framework that executed 720 total iterations to benchmark four missing data imputation pipelines (MICE-PMM, MICE-LASSO, MICE-RF, and TabPFN-based in-context learning) across 24 distinct missingness conditions.
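
A harness of this kind can be sketched as a loop over conditions, replications, and pipelines. The example below is a minimal skeleton under stated assumptions: `run_grid`, the condition dictionaries, and the two stand-in imputers (`mean_impute`, `median_impute`) are hypothetical placeholders, not the MICE or TabPFN pipelines used in the study, and the source does not state how the 720 iterations are divided among the 24 conditions.

```python
import random
import statistics
from itertools import product


def mean_impute(values, mask):
    """Stand-in imputer: fill missing cells with the observed mean."""
    fill = statistics.mean([v for v, m in zip(values, mask) if not m])
    return [fill if m else v for v, m in zip(values, mask)]


def median_impute(values, mask):
    """Stand-in imputer: fill missing cells with the observed median."""
    fill = statistics.median([v for v, m in zip(values, mask) if not m])
    return [fill if m else v for v, m in zip(values, mask)]


def run_grid(conditions, pipelines, n_reps, n_obs=200, seed=0):
    """Run every pipeline on every (condition, replication) pair.

    Each condition is a dict with a "label" and a missingness "rate".
    Returns one result row per (condition, replication, pipeline).
    """
    rng = random.Random(seed)
    rows = []
    for cond, rep in product(conditions, range(n_reps)):
        # Simulated complete data; the distribution is an arbitrary choice.
        truth = [rng.gauss(50.0, 10.0) for _ in range(n_obs)]
        mask = [rng.random() < cond["rate"] for _ in range(n_obs)]
        if not any(mask) or all(mask):
            continue  # degenerate draw: nothing to impute or nothing observed
        for name, impute in pipelines.items():
            filled = impute(truth, mask)
            sq_err = [(f - t) ** 2 for f, t, m in zip(filled, truth, mask) if m]
            rows.append({
                "condition": cond["label"],
                "rep": rep,
                "pipeline": name,
                "rmse": (sum(sq_err) / len(sq_err)) ** 0.5,
            })
    return rows
```

Applying all pipelines to the same simulated dataset and mask within a replication keeps the comparison paired, so differences between pipelines are not confounded with sampling noise.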

Coefficient Bias Reduction: Achieved a 5.3 percent reduction in downstream coefficient bias relative to traditional MICE-PMM by using a gradient-boosted TabPFN surrogate under complex Missing at Random (MAR) scenarios.
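
The source does not spell out how the bias reduction is computed. One common definition, assumed here, is the relative drop in absolute Monte Carlo bias of a coefficient estimate; the function names and any numbers passed to them are illustrative only.

```python
import statistics


def coefficient_bias(estimates, true_value):
    """Monte Carlo bias: mean estimate across replications minus the truth."""
    return statistics.mean(estimates) - true_value


def percent_bias_reduction(baseline_estimates, candidate_estimates, true_value):
    """Relative reduction (in percent) of absolute bias versus a baseline."""
    base = abs(coefficient_bias(baseline_estimates, true_value))
    cand = abs(coefficient_bias(candidate_estimates, true_value))
    if base == 0:
        raise ValueError("baseline bias is zero; relative reduction is undefined")
    return 100.0 * (base - cand) / base
```

Here `baseline_estimates` would hold the per-replication coefficient estimates after the baseline imputation method, and `candidate_estimates` those after the method being compared.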

Inferential Error Quantification: Identified a 16.7 percent degradation in 95 percent confidence interval validity when machine learning imputations were analyzed without Rubin’s combining rules, highlighting a critical inferential risk for applied survey workflows.
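
For reference, Rubin's rules pool the estimates from m imputed datasets by averaging the point estimates and combining within-imputation and between-imputation variance. The sketch below is a minimal implementation of the standard formulas; the function name `pool_rubin` is an assumption and it is not the study's code.

```python
import statistics


def pool_rubin(estimates, variances):
    """Pool m completed-data estimates with Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    q_bar = statistics.mean(estimates)        # pooled point estimate
    within = statistics.mean(variances)       # average within-imputation variance
    between = statistics.variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1.0 + 1.0 / m) * between
    if between == 0:
        return q_bar, total, float("inf")
    # Rubin's degrees of freedom for the t reference distribution.
    df = (m - 1) * (1.0 + within / ((1.0 + 1.0 / m) * between)) ** 2
    return q_bar, total, df
```

Analyzing a single imputed dataset as if it were complete drops the between-imputation term, so standard errors are understated and confidence intervals are too narrow.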

Optimal Regularized Imputation: Found MICE-LASSO to be the strongest regularized approach for continuous variable imputation, with RMSE between 5.14 and 5.40 on a 0-100 response scale across the full simulation grid.
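
As a point of reference, imputation RMSE in a simulation is usually scored against the known true values. The sketch below assumes the error is computed only over the cells that were set to missing; the source does not say whether the study scored imputed cells only or all cells, and the function name is hypothetical.

```python
import math


def imputation_rmse(true_values, imputed_values, missing_mask):
    """RMSE between imputed and true values, restricted to missing cells."""
    sq_err = [
        (imp - true) ** 2
        for true, imp, missing in zip(true_values, imputed_values, missing_mask)
        if missing
    ]
    if not sq_err:
        raise ValueError("no missing cells to score")
    return math.sqrt(sum(sq_err) / len(sq_err))
```

Because RMSE is in the units of the variable, a value near 5 on a 0-100 scale corresponds to a typical imputation error of roughly five scale points.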

Downstream Inference Evaluation: Evaluated how well each imputation pipeline retained downstream inference, using a logistic regression classifier for high-stress indicators (baseline AUC of 0.978) as the reference, which directly links imputation quality to downstream model performance.
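
The AUC comparison can be illustrated with a small rank-based computation. The sketch below computes AUC as the probability that a positive case scores higher than a negative one, and expresses retention as the ratio of post-imputation AUC to baseline AUC. That ratio is an assumed definition, since the source does not define "inference retention", and both function names are hypothetical.

```python
def auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 0/1 class labels; scores: predicted probabilities or scores.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_retention(auc_imputed, auc_baseline):
    """Assumed retention measure: share of baseline AUC kept after imputation."""
    return auc_imputed / auc_baseline
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; a sort-based computation would be preferable for large samples.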

Technical Stack

Methods: Statistical Modeling, Monte Carlo Simulation, Multiple Imputation by Chained Equations, Rubin’s Rules
Machine Learning: TabPFN, Gradient-Boosted Surrogates, Logistic Regression
Programming: Python
Application Area: Missing Data, Survey Methods, Total Survey Error

Key Results

Metric	Value
Simulation Iterations	720
Missingness Conditions	24
Coefficient Bias Reduction (TabPFN surrogate vs. MICE-PMM, MAR)	5.3%
95% CI Validity Degradation (without Rubin’s Rules)	16.7%
MICE-LASSO RMSE (0-100 scale)	5.14-5.40

Namit Shrivastava
