GeoLift-SDID Command Line Interface Documentation

This document is a reference for the command-line tools in the GeoLift-SDID toolset. Each tool covers one stage of the workflow: donor evaluation, power analysis, single-cell or multi-cell analysis, and report generation.

Core Command-Line Tools

1. `donor_evaluator.py`

Evaluates potential donor locations (control units) for Synthetic Difference-in-Differences analysis based on pre-treatment similarity metrics.

python recipes/donor_evaluator.py [OPTIONS]

OPTIONS:
  --config CONFIG         Path to configuration YAML file (REQUIRED)
  --output-dir OUTPUT_DIR Output directory for results (overrides config)

The configuration file should include:

data_path: Path to input data CSV file (REQUIRED)
treatment_locations: List of treatment location IDs (REQUIRED)
treatment_date: Intervention date (YYYY-MM-DD) (REQUIRED)
pre_treatment_periods: Number of pre-treatment periods to analyze
min_donors: Minimum number of donors to recommend (default: 5)
max_donors: Maximum number of donors to recommend (default: 10)
min_correlation_threshold: Minimum correlation threshold for donors (default: 0.7)
max_rmse_threshold: Maximum RMSE threshold for donors (default: 5000)
shapemap_file: Path to GeoJSON file for map visualization (optional)
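
As a sanity check before running the evaluator, the keys above can be verified in a few lines of Python. The sketch below is not part of the toolset: `REQUIRED_KEYS`, `DEFAULTS`, and `check_donor_config` are hypothetical names, the sample values are placeholders, and the real script may validate its configuration differently.

```python
# Hypothetical pre-flight check for a donor_evaluator.py config.
# Key names and defaults come from the list above; the function name and
# error handling are illustrative assumptions.

REQUIRED_KEYS = ("data_path", "treatment_locations", "treatment_date")
DEFAULTS = {
    "min_donors": 5,
    "max_donors": 10,
    "min_correlation_threshold": 0.7,
    "max_rmse_threshold": 5000,
}


def check_donor_config(config: dict) -> dict:
    """Return a copy of config with the documented defaults filled in.

    Raises ValueError if a required key is missing or the donor bounds
    are inconsistent.
    """
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"missing required keys: {', '.join(missing)}")

    merged = {**DEFAULTS, **config}  # explicit values win over defaults
    if merged["min_donors"] > merged["max_donors"]:
        raise ValueError("min_donors must not exceed max_donors")
    return merged


if __name__ == "__main__":
    # Placeholder values, shaped like a parsed YAML file.
    sample = {
        "data_path": "data/GeoLift_Singlecell.csv",
        "treatment_locations": [501],
        "treatment_date": "2024-03-01",
        "max_donors": 8,
    }
    print(check_donor_config(sample))
```

A check like this fails fast on a missing `data_path` instead of partway through an evaluation run.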

2. `power_calculator.py`

Calculates the statistical power of a GeoLift-SDID analysis for different combinations of effect sizes and post-treatment durations.

python recipes/power_calculator.py [OPTIONS]

OPTIONS:
  --mode {power,selection}  Analysis mode: 'power' to calculate statistical power,
                           or 'selection' to select optimal markets (REQUIRED)
  --config CONFIG          Path to configuration YAML file (REQUIRED)
  --output OUTPUT          Output directory path (overrides config)

The configuration file should include:

data_path: Path to input data CSV file
treatment_units: List of treatment unit IDs
donor_units: List of control unit IDs (recommended donors)
intervention_date: Intervention date (YYYY-MM-DD)
min_expected_lift: Minimum expected lift percentage (default: 5)
max_expected_lift: Maximum expected lift percentage (default: 20)
lift_step: Step size for lift percentage calculations (default: 5)
min_post_periods: Minimum number of post-treatment periods (default: 7)
max_post_periods: Maximum number of post-treatment periods (default: 42)
post_periods_step: Step size for post-treatment period calculations (default: 7)
n_sims: Number of simulations to run (default: 30)
confidence_level: Confidence level for power calculation (default: 0.9)
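
To get a feel for how much work the default settings imply, the grid of lift and duration combinations can be enumerated directly. This is a rough sketch under two assumptions that the list above does not confirm: both ranges include their maximum value, and `n_sims` simulations run for every combination. `inclusive_range` and `power_grid` are hypothetical helpers.

```python
# Enumerate the (lift, post-period) grid implied by the documented defaults.
# Assumptions: ranges are inclusive of their max values, and n_sims
# simulations run per grid cell. The actual script may behave differently.

DEFAULTS = {
    "min_expected_lift": 5,
    "max_expected_lift": 20,
    "lift_step": 5,
    "min_post_periods": 7,
    "max_post_periods": 42,
    "post_periods_step": 7,
    "n_sims": 30,
}


def inclusive_range(start: int, stop: int, step: int) -> list:
    """Integers from start to stop (inclusive) in increments of step."""
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(start, stop + 1, step))


def power_grid(config: dict) -> list:
    """All (lift_percent, post_periods) pairs described by config."""
    lifts = inclusive_range(
        config["min_expected_lift"],
        config["max_expected_lift"],
        config["lift_step"],
    )
    durations = inclusive_range(
        config["min_post_periods"],
        config["max_post_periods"],
        config["post_periods_step"],
    )
    return [(lift, periods) for lift in lifts for periods in durations]


grid = power_grid(DEFAULTS)
print(len(grid))                       # 4 lifts x 6 durations = 24 cells
print(len(grid) * DEFAULTS["n_sims"])  # 720 runs under the stated assumptions
```

Widening either range or shrinking a step size grows the grid multiplicatively, which is worth keeping in mind before raising `n_sims`.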

3. `geolift_single_cell.py`

Runs a GeoLift-SDID analysis for a single treatment unit (location).

python recipes/geolift_single_cell.py [OPTIONS]

OPTIONS:
  --config CONFIG                   Path to configuration YAML file
  --data DATA                       Path to the input data file (overrides config)
  --treatment TREATMENT             Treatment unit ID (overrides config)
  --intervention-date DATE          Intervention date YYYY-MM-DD (overrides config)
  --end-date DATE                   End date for analysis YYYY-MM-DD (overrides config)
  --output OUTPUT                   Output directory (overrides config)
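
Several options above are marked "overrides config". One common way to get that behavior is to give every override flag a default of `None` and copy only the flags that were actually passed on top of the loaded configuration. The sketch below illustrates the pattern; it is an assumption about how such precedence is typically implemented, not the script's actual code. `merge_overrides` is a hypothetical helper, the config keys are named after the flags for simplicity, and the treatment ID `502` is a placeholder.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flag names mirror the options listed above. A default of None marks
    # "not given on the command line".
    parser = argparse.ArgumentParser(description="override-precedence sketch")
    parser.add_argument("--config")
    parser.add_argument("--data")
    parser.add_argument("--treatment")
    parser.add_argument("--intervention-date", dest="intervention_date")
    parser.add_argument("--end-date", dest="end_date")
    parser.add_argument("--output")
    return parser


def merge_overrides(config: dict, args: argparse.Namespace) -> dict:
    """Return config with any explicitly passed CLI values layered on top."""
    merged = dict(config)
    for key, value in vars(args).items():
        if key != "config" and value is not None:
            merged[key] = value
    return merged


# Stand-in for a parsed YAML config; these key names are hypothetical.
file_config = {
    "data": "data/GeoLift_Singlecell.csv",
    "treatment": "501",
    "output": "outputs/from_config",
}
args = build_parser().parse_args(
    ["--treatment", "502", "--output", "outputs/singlecell_analysis"]
)
print(merge_overrides(file_config, args))
```

Here `data` keeps its config value because `--data` was not passed, while `treatment` and `output` take the command-line values.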

4. `geolift_multi_cell.py`

Runs a GeoLift-SDID analysis for multiple treatment units (locations) with heterogeneous effects.

python recipes/geolift_multi_cell.py [OPTIONS]

OPTIONS:
  --config CONFIG                   Path to analysis config file (default: configs/geolift_analysis_config.yaml)
  --viz-config CONFIG               Path to visualization config file (default: configs/visuals_config.yaml)
  --shapemap SHAPEMAP               Path to shapemap file for location visualization
  --data DATA                       Path to input data file (overrides config)
  --treatments TREATMENTS [...]     Names of treatment locations
  --donor-recommendations FILE      Path to donor_recommendations.yaml file to use for control units
  --intervention-date DATE          Start date of intervention (YYYY-MM-DD)
  --end-date DATE                   End date of intervention (YYYY-MM-DD)
  --output OUTPUT                   Output directory (overrides config)

5. `generate_analysis_report.py`

Generates an AI-powered interpretation of GeoLift-SDID analysis results using large language models.

python recipes/generate_analysis_report.py [OPTIONS]

OPTIONS:
  --outputs OUTPUTS               Path to the outputs directory containing analysis results (REQUIRED)
  --api-key API_KEY               API key for the model (optional, can also be set via environment variables)
  --model MODEL                   Model to use for interpretation (default: "deepseek-r1")
  --verbose                       Enable verbose logging

Supported models include deepseek-r1, deepseek-coder, llama3, mistral, and gpt-4. If `--api-key` is not given, the key can be supplied through the corresponding environment variable (e.g., DEEPSEEK_API_KEY, OPENAI_API_KEY).
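
The lookup of an API key can be sketched as follows. Only the two environment variable names come from this document; matching models by name prefix, preferring an explicit `--api-key` value over the environment, and the helper name `resolve_api_key` are all assumptions made for illustration.

```python
import os
from typing import Optional

# Only these two variable names appear in this document. The variables for
# other models are not stated, so they are deliberately left out.
ENV_VAR_BY_PREFIX = {
    "deepseek": "DEEPSEEK_API_KEY",
    "gpt": "OPENAI_API_KEY",
}


def resolve_api_key(model: str, cli_key: Optional[str] = None) -> Optional[str]:
    """Prefer an explicit key, then the environment variable for the model."""
    if cli_key:
        return cli_key
    for prefix, env_var in ENV_VAR_BY_PREFIX.items():
        if model.startswith(prefix):
            return os.environ.get(env_var)
    return None  # unknown model family: no documented variable to consult


if __name__ == "__main__":
    key = resolve_api_key("deepseek-r1")
    print("key found" if key else "no key configured")
```

Keeping the key in an environment variable avoids leaving it in shell history, which is the usual reason to prefer it over a command-line flag.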

Example Usage

Complete Analysis Workflow

# 1. Evaluate donors for treatment locations
# Single-cell donor evaluation
python recipes/donor_evaluator.py --config configs/donor_eval_config_singlecell.yaml

# Multi-cell donor evaluation
python recipes/donor_evaluator.py --config configs/donor_eval_config_multicell.yaml

# 2. Run power analysis with recommended donors
# Using directly configured files
python recipes/power_calculator.py --mode power --config configs/power_analysis_config_singlecell.yaml
python recipes/power_calculator.py --mode power --config configs/power_analysis_config_multicell.yaml

# Alternatively, using donor evaluator output
python recipes/power_calculator.py --mode power --config outputs/donor_eval_YYYYMMDD_HHMMSS/power_analysis_config.yaml

# 3. Perform GeoLift-SDID analysis
python recipes/geolift_single_cell.py --data data/GeoLift_Singlecell.csv --treatment 501 --intervention-date 2024-03-01 --output outputs/singlecell_analysis

# 4. Generate AI-powered report
python recipes/generate_analysis_report.py --outputs outputs/singlecell_analysis
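
Because the donor evaluator writes to a timestamped directory, the path passed to the power calculator changes with every run. A small helper can locate the most recent one. This sketch assumes the `outputs/donor_eval_YYYYMMDD_HHMMSS/power_analysis_config.yaml` layout shown in step 2; `latest_power_config` is a hypothetical name and not part of the toolset.

```python
from pathlib import Path
from typing import Optional


def latest_power_config(outputs_root: str = "outputs") -> Optional[Path]:
    """Return power_analysis_config.yaml from the newest donor_eval_* run.

    Relies on the YYYYMMDD_HHMMSS suffix sorting chronologically as text.
    Returns None if no run directory contains the file.
    """
    root = Path(outputs_root)
    if not root.is_dir():
        return None
    runs = sorted(
        (p for p in root.glob("donor_eval_*") if p.is_dir()),
        key=lambda p: p.name,
    )
    for run in reversed(runs):  # newest first
        candidate = run / "power_analysis_config.yaml"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    config = latest_power_config()
    if config is None:
        print("no donor evaluator output found")
    else:
        print(f"python recipes/power_calculator.py --mode power --config {config}")
```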

Notes

File paths can be relative or absolute
For date parameters, use YYYY-MM-DD format
The donor_evaluator.py script will generate a recommended donors file and power analysis configuration
The GeoLift-SDID implementation is in the synthdid package directory (not src)
Matplotlib 3.6 renamed the seaborn styles, so `seaborn-whitegrid` became `seaborn-v0_8-whitegrid`. If the old name raises a style error on your version, use `seaborn-v0_8-whitegrid`; on matplotlib older than 3.6, keep `seaborn-whitegrid`.
Different configuration files exist for single-cell and multi-cell analyses
Make sure all required parameters (especially data_path) are included in your configuration files
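
For the matplotlib style note above, a script can choose whichever whitegrid style name the installed version provides instead of hard-coding one. This is a minimal sketch; `pick_style` and `PREFERRED_STYLES` are hypothetical names and nothing here is taken from the GeoLift-SDID code.

```python
# Pick whichever whitegrid style name the installed matplotlib provides.
# pick_style is a hypothetical helper, not part of GeoLift-SDID.

PREFERRED_STYLES = ("seaborn-v0_8-whitegrid", "seaborn-whitegrid")


def pick_style(available, preferred=PREFERRED_STYLES, fallback="default"):
    """Return the first preferred style present in available, else fallback."""
    for name in preferred:
        if name in available:
            return name
    return fallback


if __name__ == "__main__":
    try:
        import matplotlib.pyplot as plt
    except ImportError:
        print("matplotlib is not installed")
    else:
        # plt.style.available lists the style names this version ships with.
        plt.style.use(pick_style(plt.style.available))
```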
