Skip to content

Quick Start Guide

This guide will get you up and running with sortition algorithms in just a few minutes.

Installation

pip install sortition-algorithms

# Optional: Install CLI support
pip install 'sortition-algorithms[cli]'

# Optional: Install leximin algorithm support (requires commercial/academic license)
pip install 'sortition-algorithms[gurobi]'

Basic Concepts

Before diving in, understand these key concepts:

  • Sortition: Random selection that maintains demographic representativeness
  • Features: Demographic characteristics (e.g., Gender, Age, Location)
  • Quotas: Min/max targets for each demographic group
  • Stratified Selection: Random selection that respects quotas

Your First Selection

1. Prepare Your Data

You'll need two CSV files:

demographics.csv (features with quotas):

feature,value,min,max
Gender,Male,45,55
Gender,Female,45,55
Age,18-30,20,30
Age,31-50,30,40
Age,51+,30,50

candidates.csv (people to select from):

id,Gender,Age,Location
person1,Male,18-30,Urban
person2,Female,31-50,Rural
person3,Male,51+,Urban
...

2. Run Your First Selection

from sortition_algorithms import (
    run_stratification,
    read_in_features,
    read_in_people,
    Settings
)

# Load your data
settings = Settings()
features = read_in_features("demographics.csv")
people = read_in_people("candidates.csv", settings, features)

# Select a panel of 50 people
success, selected_panels, messages = run_stratification(
    features=features,
    people=people,
    number_people_wanted=50,
    settings=settings
)

if success:
    selected_people = selected_panels[0]  # frozenset of person IDs
    print(f"✅ Successfully selected {len(selected_people)} people")
    print("Selected IDs:", list(selected_people)[:5], "...")
else:
    print("❌ Selection failed")
    for msg in messages:
        print(msg)

3. Export Results

from sortition_algorithms import selected_remaining_tables

# Get formatted tables for export
selected_table, remaining_table, info = selected_remaining_tables(
    people, selected_panels[0], features, settings
)

# Save to CSV
import csv

with open("selected.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(selected_table)

with open("remaining.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(remaining_table)

Using the Command Line

For quick operations, use the CLI:

# CSV workflow
python -m sortition_algorithms csv \
  --settings settings.toml \
  --features-csv demographics.csv \
  --people-csv candidates.csv \
  --selected-csv selected.csv \
  --remaining-csv remaining.csv \
  --number-wanted 50

Configuration with Settings

Customize behavior with a settings file:

settings.toml:

# Random seed for reproducible results (optional)
random_number_seed = 42

# Ensure household diversity
check_same_address = true
check_same_address_columns = ["Address", "Postcode"]

# Selection algorithm: "maximin", "leximin", "nash", or "legacy"
selection_algorithm = "maximin"

# Maximum selection attempts
max_attempts = 10

# Output columns to include
columns_to_keep = ["Name", "Email", "Phone"]
settings = Settings.load_from_file("settings.toml")

Common Patterns

Working with Google Sheets

from sortition_algorithms import GSheetAdapter
from pathlib import Path

adapter = GSheetAdapter(Path("credentials.json"))
features, msgs = adapter.load_features("My Spreadsheet", "Demographics")
people, msgs = adapter.load_people("Candidates", settings, features)

Address Checking for Household Diversity

# Ensure only one person per household is selected
settings = Settings(
    check_same_address=True,
    check_same_address_columns=["Address", "Postcode"]
)

Multiple Selection Algorithms

# Maximin: Maximize the minimum probability
settings.selection_algorithm = "maximin"

# Nash: Maximize the product of probabilities
settings.selection_algorithm = "nash"

# Leximin: Lexicographic maximin (requires Gurobi)
settings.selection_algorithm = "leximin"

What's Next?

Troubleshooting

"Selection failed" errors: Check that your quotas are achievable given your candidate pool. The sum of minimum quotas shouldn't exceed your target panel size.

Import errors: Ensure you've installed the package correctly. For Gurobi features, you need a valid license.

Empty results: Verify your CSV files have the correct format and column names match between demographics and candidates files.