5B  Coding for Experiments

~10 hours | Power, Surveys, Randomization | Intermediate-Advanced

Learning Objectives

  • Calculate statistical power and determine required sample sizes
  • Program surveys and integrate with platforms like Qualtrics
  • Implement randomization with proper stratification
  • Distribute experiments via Prolific and other platforms
  • Verify randomization and check balance of covariates

Running experiments requires careful planning and precise implementation. This module covers the programming aspects of experimental research—from determining how many participants you need, to building your survey, randomizing treatments, and verifying that everything worked correctly.

Essential Reference

This module draws heavily on:
Duflo, E., Glennerster, R., & Kremer, M. (2007). "Using Randomization in Development Economics Research: A Toolkit." Handbook of Development Economics, Vol. 4.
Available at: J-PAL Resources

Module Overview

This module is organized into four subpages, each covering a critical aspect of experimental implementation:

The Experimental Pipeline

A well-executed experiment follows this workflow:

  1. Power Analysis (Pre-registration)
    Before data collection: Determine how many participants you need to detect your effect of interest. Write a pre-analysis plan.
  2. Survey/Instrument Design
    Before launch: Program your survey, set up treatment arms, test thoroughly with pilot participants.
  3. Randomization
    At recruitment: Randomly assign participants to treatment and control groups, potentially stratifying on key variables.
  4. Data Collection
    During experiment: Monitor response rates, check for technical issues, manage panel recruitment.
  5. Balance Verification
    After collection: Confirm that randomization produced comparable groups before analyzing outcomes.
  6. Analysis
    Final stage: Estimate treatment effects following your pre-analysis plan. See Module 6.

Key Concepts Preview

Statistical Power

Power is the probability of detecting an effect when it truly exists. The standard target is 80% power at a 5% significance level. The minimum detectable effect (MDE) depends on:

  • Sample size (N): More participants = smaller detectable effects
  • Outcome variance (σ²): More noise = need more data
  • Treatment allocation (P): 50-50 split is optimal for simple designs
  • Significance level (α): Usually 0.05

Power Formula (Simple RCT)

$$\mathrm{MDE} = \left(t_{1-\kappa} + t_{\alpha/2}\right) \times \sqrt{\frac{\sigma^{2}}{P\,(1-P)\,N}}$$

Where:

  • $t_{1-\kappa} \approx 0.84$ for 80% power
  • $t_{\alpha/2} \approx 1.96$ for 5% significance (two-sided)
  • $P$ = proportion treated (often 0.5)
  • $\sigma^{2}$ = outcome variance
  • $N$ = total sample size
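
The formula above can be evaluated directly, and inverted to give the total sample size needed for a target MDE. The following is a minimal sketch, not a full power calculator: it uses the normal approximation for both critical values, and the function names (`critical_values`, `mde`, `required_n`) and the example inputs are illustrative assumptions rather than part of any package.

```python
from math import ceil, sqrt
from statistics import NormalDist


def critical_values(power=0.80, alpha=0.05):
    """Return (t_{1-kappa}, t_{alpha/2}) using the normal approximation."""
    z = NormalDist()
    return z.inv_cdf(power), z.inv_cdf(1 - alpha / 2)


def mde(n, sigma2, p=0.5, power=0.80, alpha=0.05):
    """Minimum detectable effect for a simple RCT with total sample size n."""
    t_power, t_alpha = critical_values(power, alpha)
    return (t_power + t_alpha) * sqrt(sigma2 / (p * (1 - p) * n))


def required_n(target_mde, sigma2, p=0.5, power=0.80, alpha=0.05):
    """Total N needed for a target MDE (the same formula solved for N)."""
    t_power, t_alpha = critical_values(power, alpha)
    n = (t_power + t_alpha) ** 2 * sigma2 / (p * (1 - p) * target_mde ** 2)
    return ceil(n)  # round up: a fractional participant cannot be recruited


if __name__ == "__main__":
    # Hypothetical inputs: outcome standardized to variance 1, 50-50 allocation.
    print(round(mde(n=1000, sigma2=1.0), 3))       # MDE in standard deviations
    print(required_n(target_mde=0.2, sigma2=1.0))  # total N across both arms
```

With these assumed inputs, detecting an effect of 0.2 standard deviations requires roughly 785 participants in total. Moving `p` away from 0.5 shrinks P(1-P), so the required N rises for the same MDE.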

Randomization Methods

Method     | Description                       | Use When
Simple     | Coin flip for each unit           | Large samples, no important stratifying variables
Stratified | Randomize within subgroups        | Want balance on key variables (gender, region)
Block      | Fixed number per block            | Want exact proportions in each stratum
Cluster    | Randomize groups, not individuals | Treatment at group level (schools, villages)
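
To make the distinctions concrete, here is a minimal sketch of three of these methods using only the Python standard library. It is an illustration under stated assumptions, not the behavior of randtreat or randomizr: the function names, the record layout (a list of dicts with an `id` field), and the default seed are all hypothetical. The stratified version fixes the number treated within each stratum, which is the block approach applied stratum by stratum.

```python
import random
from collections import defaultdict


def simple_randomize(unit_ids, p_treat=0.5, seed=12345):
    """Independent coin flip per unit; realized group sizes vary by chance."""
    rng = random.Random(seed)
    return {u: int(rng.random() < p_treat) for u in unit_ids}


def stratified_block_randomize(units, strata_keys, seed=12345):
    """Within each stratum, shuffle the units and treat exactly half of them."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for u in units:
        strata[tuple(u[k] for k in strata_keys)].append(u["id"])
    assignment = {}
    for key in sorted(strata):  # fixed iteration order keeps the draw reproducible
        members = sorted(strata[key])
        rng.shuffle(members)
        n_treat = len(members) // 2
        if len(members) % 2 == 1:
            n_treat += rng.randint(0, 1)  # odd stratum: leftover unit by coin flip
        for i, unit_id in enumerate(members):
            assignment[unit_id] = int(i < n_treat)
    return assignment


def cluster_randomize(units, cluster_key, seed=12345):
    """Assign whole clusters, so every unit in a cluster shares one arm."""
    rng = random.Random(seed)
    clusters = sorted({u[cluster_key] for u in units})
    rng.shuffle(clusters)
    treated = set(clusters[: len(clusters) // 2])
    return {u["id"]: int(u[cluster_key] in treated) for u in units}


if __name__ == "__main__":
    # Hypothetical sample: 12 participants, 2 genders, 4 schools.
    sample = [
        {"id": i, "gender": "f" if i % 2 else "m", "school": f"s{i % 4}"}
        for i in range(12)
    ]
    print(simple_randomize([u["id"] for u in sample]))
    print(stratified_block_randomize(sample, ["gender"]))
    print(cluster_randomize(sample, "school"))
```

Each function builds its own seeded generator, so rerunning the script reproduces the same assignment. Recording the seed alongside the assignment file is a common way to make the randomization auditable.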

Balance Tables

A balance table compares treatment and control groups on baseline characteristics. It typically shows:

  • Mean (or proportion) in each group
  • Difference between groups
  • P-value testing if difference is statistically significant
  • Standardized difference (effect size)

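Good randomization should produce no systematic differences between groups. Across many baseline covariates, p-values should be roughly uniformly distributed, so about 5% of tests will fall below 0.05 by chance alone. The warning sign is seeing clearly more significant differences than chance would predict.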
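
The four columns listed above can be computed with a few lines of standard-library Python. This is a minimal sketch under stated assumptions, not a reimplementation of iebaltab or cobalt: the function names and the record layout are hypothetical, the p-value uses a Welch standard error with a normal approximation (reasonable only for large samples), and the standardized difference divides by the square root of the average of the two group variances.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def balance_row(treated, control):
    """Compare one baseline covariate across the two arms."""
    m_t, m_c = mean(treated), mean(control)
    v_t, v_c = variance(treated), variance(control)  # sample variances
    diff = m_t - m_c
    se = sqrt(v_t / len(treated) + v_c / len(control))  # Welch standard error
    if se == 0:
        # Covariate is constant within both arms: no test is possible,
        # so report the raw difference with placeholder test values.
        return {"mean_t": m_t, "mean_c": m_c, "diff": diff,
                "p_value": 1.0, "std_diff": 0.0}
    # Two-sided p-value from the normal approximation (an assumption here).
    p_value = 2 * (1 - NormalDist().cdf(abs(diff) / se))
    std_diff = diff / sqrt((v_t + v_c) / 2)
    return {"mean_t": m_t, "mean_c": m_c, "diff": diff,
            "p_value": p_value, "std_diff": std_diff}


def balance_table(rows, treat_key, covariates):
    """Build one balance row per covariate from a list of dict records."""
    table = {}
    for cov in covariates:
        treated = [r[cov] for r in rows if r[treat_key] == 1]
        control = [r[cov] for r in rows if r[treat_key] == 0]
        table[cov] = balance_row(treated, control)
    return table


if __name__ == "__main__":
    import random

    # Hypothetical data: 400 participants with randomly assigned treatment.
    rng = random.Random(2024)
    rows = [
        {"treat": rng.randint(0, 1), "age": rng.gauss(35, 10),
         "female": rng.randint(0, 1)}
        for _ in range(400)
    ]
    for name, row in balance_table(rows, "treat", ["age", "female"]).items():
        print(name, {k: round(v, 3) for k, v in row.items()})
```

For a binary covariate such as `female`, the group mean is the proportion, so the same function covers both cases in the list above.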

Essential Tools

  • Stata: randtreat (randomization), iebaltab (balance tables)
  • R: randomizr package, cobalt package
  • Python: random module (standard library) or numpy.random, custom implementations
  • Survey platforms: Qualtrics, SurveyMonkey, Google Forms
  • Panels: Prolific, MTurk, CloudResearch