About me
I'm an assistant professor in Operations and Logistics at UBC's Sauder School of Business. I’m interested in experimentation and control problems in marketplace settings, specifically challenges relating to interference, sequential designs, policy evaluation, and policy optimization. I'm also involved with AI-for-ecommerce startup Cimulate (we're hiring!).
I completed my PhD at the Operations Research Center at MIT, advised by Prof. Vivek Farias. Previously, I was a data scientist at Uber, working on match optimization for UberPOOL, and I did my undergrad at Northwestern University. You can find my full CV here.
In my spare time I play jazz, look for birds, and procrastinate by tweaking my Emacs config.
Publications
- Speeding up Policy Simulation in Supply Chain RL
with Vivek Farias, Joren Gijsbrechts, Aryan Khojandi, and Tianyi Peng
Details
Simulating a single trajectory of a dynamical system under some state-dependent policy is a core bottleneck in policy optimization algorithms. The many inherently serial policy evaluations that must be performed in a single simulation constitute the bulk of this bottleneck. To wit, in applying policy optimization to supply chain optimization (SCO) problems, simulating a single month of a supply chain can take several hours. We present an iterative algorithm for policy simulation, which we dub Picard Iteration. This scheme carefully assigns policy evaluation tasks to independent processes. Within an iteration, a single process evaluates the policy only on its assigned tasks while assuming a certain 'cached' evaluation for other tasks; the cache is updated at the end of the iteration. Implemented on GPUs, this scheme admits batched evaluation of the policy on a single trajectory. We prove that the structure afforded by many SCO problems allows convergence in a small number of iterations, independent of the horizon. We demonstrate practical speedups of 400x on large-scale SCO problems even with a single GPU, and also demonstrate practical efficacy in other RL environments.
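The core loop is easy to sketch. Below is a minimal NumPy illustration of the fixed-point idea described above: keep a cache of actions for the whole horizon, roll the (cheap) dynamics forward under the cached actions, then re-evaluate the (expensive) policy on all resulting states in one batch, and repeat until the cache stops changing. The toy inventory dynamics and base-stock policy are invented for illustration and are not taken from the paper.

```python
import numpy as np

def simulate_picard(f, policy, s0, T, max_iters=None):
    """Picard-style trajectory simulation.

    f      : (state, action) -> next state          (assumed cheap)
    policy : batch of states (T, d) -> actions      (assumed expensive, batched)
    """
    max_iters = max_iters or T
    actions = np.zeros((T,) + policy(s0[None, :])[0].shape)   # action cache
    for _ in range(max_iters):
        # Roll the dynamics forward using the *cached* actions (cheap, serial).
        states = [s0]
        for t in range(T - 1):
            states.append(f(states[-1], actions[t]))
        states = np.stack(states)
        # One batched policy evaluation over the whole horizon (the expensive part).
        new_actions = policy(states)
        if np.allclose(new_actions, actions):                  # fixed point reached
            break
        actions = new_actions
    return actions

# Toy single-item inventory example (hypothetical): order up to a base-stock of 10
# against a deterministic demand of 3 per period.
f = lambda s, a: np.maximum(s + a - 3.0, 0.0)
policy = lambda S: np.maximum(10.0 - S, 0.0)
print(simulate_picard(f, policy, s0=np.array([5.0]), T=12))
```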
- Correcting for Interference in Experiments: A Case Study at Douyin
with Vivek Farias, Hao Li, Tianyi Peng, Xinyuyang Ren, and Huawei Zhang
Preliminary: RecSys 2023.
Details
Interference is a ubiquitous problem in experiments conducted on two-sided content marketplaces, such as Douyin (China’s analog of TikTok). In many cases, creators are the natural unit of experimentation, but creators interfere with each other through competition for viewers’ limited time and attention. “Naive” estimators currently used in practice simply ignore the interference, but in doing so incur bias on the order of the treatment effect. We formalize the problem of inference in such experiments as one of policy evaluation. Off-policy estimators, while unbiased, are impractically high variance. We introduce a novel Monte-Carlo estimator, based on “Differences-in-Qs” (DQ) techniques, which achieves bias that is second-order in the treatment effect, while remaining sample-efficient to estimate. On the theoretical side, our contribution is to develop a generalized theory of Taylor expansions for policy evaluation, which extends DQ theory to all major MDP formulations. On the practical side, we implement our estimator on Douyin’s experimentation platform, and in the process develop DQ into a truly “plug-and-play” estimator for interference in real-world settings: one which provides robust, low-bias, low-variance treatment effect estimates; admits computationally cheap, asymptotically exact uncertainty quantification; and reduces MSE by 99% compared to the best existing alternatives in our applications.
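As a rough illustration of the "Differences-in-Qs" idea, here is a minimal tabular sketch: fit the behavior policy's Q-function from the experiment's own trajectory, then report the difference of average Q-values on treated versus control steps rather than the difference of average rewards. The toy chain, the SARSA-style fit, and the discounting are illustration choices only; the paper's exact (average-reward) construction, variance analysis, and uncertainty quantification are more involved.

```python
import numpy as np

def naive_and_dq(states, actions, rewards, n_states, gamma=0.99, lr=0.05):
    """Naive and DQ-style estimates from one trajectory of a Bernoulli(1/2) A/B test."""
    # Step 1: tabular TD estimate of the behavior policy's Q-function (SARSA-style).
    Q = np.zeros((n_states, 2))
    for t in range(len(rewards) - 1):
        s, a, r = states[t], actions[t], rewards[t]
        s2, a2 = states[t + 1], actions[t + 1]
        Q[s, a] += lr * (r + gamma * Q[s2, a2] - Q[s, a])
    treated, control = actions == 1, actions == 0
    # Naive: difference of average rewards (biased on the order of the effect).
    naive = rewards[treated].mean() - rewards[control].mean()
    # DQ: difference of average Q-values on treated vs. control steps.
    dq = Q[states[treated], 1].mean() - Q[states[control], 0].mean()
    return naive, dq

# Toy 3-state chain (hypothetical): treatment slightly boosts both reward and upward drift.
rng = np.random.default_rng(0)
T, n_states = 200_000, 3
states, actions, rewards = np.zeros(T, dtype=int), rng.integers(0, 2, T), np.zeros(T)
for t in range(T - 1):
    rewards[t] = states[t] + 0.1 * actions[t]
    up = rng.random() < 0.3 + 0.2 * actions[t]
    states[t + 1] = min(states[t] + 1, n_states - 1) if up else max(states[t] - 1, 0)
print(naive_and_dq(states, actions, rewards, n_states))
```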
- Markovian Interference in Experiments
with Vivek Farias, Andrew A. Li, and Tianyi Peng.
Preliminary: NeurIPS 2022. Under preparation for Management Science.
- Winner, 2022 RMP Jeff McGill Student Paper Award.
- Winner, 2022 APS Best Student Paper Award.
- Oral presentation at NeurIPS 2022.
Details
We model experimentation in marketplace settings as an off-policy evaluation (OPE) problem. On one hand, “naive” A/B testing suffers bias on the order of the treatment effect; on the other, unbiased OPE estimators suffer from extremely high variance. We develop a bias-corrected estimator with bias second-order in the treatment effect, and variance exponentially smaller than unbiased OPE. This estimator dramatically improves over alternatives in our experiments on a large-scale ridesharing simulator. The correction is derived from a Taylor-like expansion of the off-policy objective value, which is of independent interest.
- Synthetically Controlled Bandits
with Vivek Farias, Ciamac Moallemi, and Tianyi Peng.
Preliminary: MSOM Service Management SIG 2022. Major revision, submitted to Management Science.
Details
Synthetic controls are commonly used to control for non-stationarity, but the fixed experimental designs typical in such settings are both costly and fragile to non-stationarity in the controls. By instead designing experiments adaptively, we can robustly identify the optimal treatment while incurring low regret.
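For context, the static building block being adapted looks roughly like the sketch below: regress the treated unit's pre-treatment outcomes on the donor units to get synthetic-control weights, then read the effect off the post-treatment gap. The least-squares variant and the toy data are illustrative; the paper's contribution is the adaptive (bandit) design wrapped around this idea, which is not shown here.

```python
import numpy as np

def synthetic_control_weights(pre_treated, pre_donors):
    """Weights on donor units that best reproduce the treated unit's pre-treatment
    outcomes (plain least squares here; the classic method also constrains the
    weights to the simplex)."""
    w, *_ = np.linalg.lstsq(pre_donors, pre_treated, rcond=None)
    return w

def effect_estimate(post_treated, post_donors, w):
    """Average post-treatment gap between the treated unit and its synthetic control."""
    return (post_treated - post_donors @ w).mean()

# Hypothetical example: 20 pre-treatment and 10 post-treatment periods, 5 donor units.
rng = np.random.default_rng(3)
donors = rng.standard_normal((30, 5)).cumsum(axis=0)      # non-stationary donor series
treated = donors @ np.array([0.4, 0.3, 0.2, 0.1, 0.0]) + rng.normal(0, 0.1, 30)
treated[20:] += 1.0                                        # effect kicks in after period 20
w = synthetic_control_weights(treated[:20], donors[:20])
print(effect_estimate(treated[20:], donors[20:], w))
```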
- The Limits to Learning a Diffusion Process
with Jackie Baek, Vivek Farias, Andreea Georgescu, Retsef Levi, Tianyi Peng, Deeksha Sinha, and Joshua Wilde.
Preliminary: EC 2021. Major revision, submitted to Management Science.
Details
Diffusion processes model viral transmission in a population. Two classic examples are the SIR model for disease transmission, and the Bass model for product adoption. We present a Cramer-Rao bound characterizing the difficulty of learning such models in a stochastic setting. Key parameters cannot be learned with reasonable variance until roughly two-thirds of the way to the peak in infections (or product adoptions), a major barrier to timely pandemic response. In practice, we work around this via a regularization approach based on geographic heterogeneity, and apply this approach to predicting the spread of COVID-19.
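For concreteness, a stochastic SIR curve of the kind one would try to fit looks like the following chain-binomial sketch; the discretization and parameter values are illustrative, not the paper's exact model.

```python
import numpy as np

def sir_simulate(beta, gamma, N, I0, T, rng):
    """Chain-binomial stochastic SIR: each period, new infections are
    Binomial(S, 1 - exp(-beta * I / N)) and recoveries are Binomial(I, gamma)."""
    S, I = N - I0, I0
    new_infections = []
    for _ in range(T):
        n_inf = rng.binomial(S, 1.0 - np.exp(-beta * I / N))
        n_rec = rng.binomial(I, gamma)
        S, I = S - n_inf, I + n_inf - n_rec
        new_infections.append(n_inf)
    return np.array(new_infections)

# A synthetic daily-infection curve from which one would try to estimate beta and gamma.
curve = sir_simulate(beta=0.3, gamma=0.1, N=1_000_000, I0=10, T=150,
                     rng=np.random.default_rng(1))
```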
- Evaluation of individual and ensemble probabilistic forecasts of COVID-19 mortality in the US
with Estee Cramer et al.
PNAS 2022.
- Optimizing Offer Sets in Sublinear Time
with Vivek Farias, Andrew A. Li, and Deeksha Sinha.
Preliminary: EC 2020. Minor revision, submitted to Management Science.
Details
Online retailers must be able to serve high-quality product assortments within milliseconds of a request. At a scale of millions of products and users, this necessitates approaches to assortment optimization that run in time sub-linear in the number of products. We design a sampling scheme, based on locality-sensitive hashing, which samples a set of candidate products that is 1) sub-linear in size, and 2) contains a provable approximation to the optimal assortment. Experiments show a much better tradeoff between response time and optimality, relative to existing heuristics.
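A generic random-hyperplane LSH index conveys the candidate-sampling idea: only products that collide with the query's hash bucket are passed to the downstream assortment optimizer. The embedding setup and hashing details below are illustrative and carry none of the paper's guarantees.

```python
import numpy as np

class HyperplaneLSH:
    """SimHash-style index: a product is a candidate if it lands in the same
    random-hyperplane bucket as the query."""

    def __init__(self, product_vecs, n_bits=12, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((product_vecs.shape[1], n_bits))
        self.buckets = {}
        for i, code in enumerate(self._hash(product_vecs)):
            self.buckets.setdefault(code, []).append(i)

    def _hash(self, X):
        # One bit per hyperplane: which side of the hyperplane each vector falls on.
        return [row.tobytes() for row in (X @ self.planes > 0)]

    def candidates(self, query_vec):
        """Indices of colliding products; typically a small fraction of the catalog."""
        return self.buckets.get(self._hash(query_vec[None, :])[0], [])

# Hypothetical usage: 100k products with 32-dimensional embeddings.
products = np.random.default_rng(2).standard_normal((100_000, 32))
index = HyperplaneLSH(products)
print(len(index.candidates(products[0])))     # candidate set, far smaller than 100k
```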
- Non-parametric Approximate Dynamic Programming via the Kernel Method
with Nikhil Bhat, Vivek Farias, and Ciamac Moallemi.
Accepted for publication in Stochastic Systems.
Details
We propose a novel, non-parametric approximate dynamic programming (ADP) algorithm that enjoys dimension-independent approximation and sample-complexity guarantees. We obtain this algorithm by ‘kernelizing’ a recent mathematical program for ADP, the “smoothed approximate LP”.
Teaching
- UBC BAIT 509: Business Applications of Machine Learning
Winter 2025, Instructor
Introduction to machine learning with a focus on applied regression / classification, and deep learning.
- UBC BAMS 521: Analytics Leadership
Winter 2025, Instructor
Project management and problem solving for business applications of analytics.
- MIT 15.778: Introduction to Operations Management
Summer 2020, 2021, 2022, TA
Inventory management, queueing, capacity analysis. Core class for the Sloan Fellows MBA program for mid-career professionals. Developed an interactive revenue management game, now also used at several other universities.
- MIT 15.003: Analytics Tools
Fall 2019, 2021, Instructor
Data science tools in R and Python, for the Master of Business Analytics program.
- MIT Computing in Optimization and Statistics
IAP 2018, Instructor
Data science tools in R and Python, for MIT graduate and undergraduate students.