API Reference¶
Complete documentation for all public functions and classes in the sortition-algorithms library.
Core Functions¶
run_stratification()¶
Main function for running stratified random selection with retry logic.
def run_stratification(
    features: FeatureCollection,
    people: People,
    number_people_wanted: int,
    settings: Settings,
    test_selection: bool = False,
    number_selections: int = 1,
) -> tuple[bool, list[frozenset[str]], list[str]]:
Parameters:
features
: FeatureCollection with min/max quotas for each feature value
people
: People object containing the pool of candidates
number_people_wanted
: Desired size of the panel
settings
: Settings object containing configuration
test_selection
: If True, don't randomize (for testing only)
number_selections
: Number of panels to return (usually 1)
Returns:
success
: Whether selection succeeded within max attempts
selected_committees
: List of committees (frozensets of person IDs)
output_lines
: Debug and status messages
Raises:
InfeasibleQuotasError
: If quotas cannot be satisfied
SelectionError
: For various failure cases
ValueError
: For invalid parameters
RuntimeError
: If required solver is not available
Example:
success, panels, messages = run_stratification(
    features, people, 100, Settings()
)
if success:
    selected_people = panels[0]  # frozenset of IDs
find_random_sample()¶
Lower-level algorithm function for finding random committees.
def find_random_sample(
    features: FeatureCollection,
    people: People,
    number_people_wanted: int,
    settings: Settings,
    selection_algorithm: str = "maximin",
    test_selection: bool = False,
    number_selections: int = 1,
) -> tuple[list[frozenset[str]], list[str]]:
Parameters:
selection_algorithm
: One of "maximin", "leximin", "nash", or "legacy"
All other parameters are the same as for run_stratification().
Returns:
committee_lottery
: List of committees (may contain duplicates)
output_lines
: Debug strings
Example:
committees, messages = find_random_sample(
    features, people, 50, settings, "nash"
)
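Because committee_lottery may contain duplicates, one way to read it is as an empirical distribution over panels. The sketch below uses made-up IDs in place of a real find_random_sample() result and estimates how often each person appears. The helper name selection_frequencies is hypothetical and not part of the library.

```python
from collections import Counter


def selection_frequencies(committee_lottery: list[frozenset[str]]) -> dict[str, float]:
    """Return the share of committees in which each person ID appears."""
    if not committee_lottery:
        return {}
    # Count every appearance of every ID across all committees.
    counts = Counter(pid for committee in committee_lottery for pid in committee)
    total = len(committee_lottery)
    return {pid: n / total for pid, n in counts.items()}


# Stand-in data; a real list would come from find_random_sample().
lottery = [
    frozenset({"p001", "p002"}),
    frozenset({"p001", "p003"}),
    frozenset({"p001", "p002"}),  # duplicates are allowed
    frozenset({"p002", "p003"}),
]
frequencies = selection_frequencies(lottery)
for person_id in sorted(frequencies):
    print(person_id, frequencies[person_id])
```

Duplicates are kept on purpose here: a committee that appears twice in the list counts twice toward each member's frequency.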
selected_remaining_tables()¶
Format selection results for export to CSV or other formats.
def selected_remaining_tables(
    full_people: People,
    people_selected: frozenset[str],
    features: FeatureCollection,
    settings: Settings,
) -> tuple[list[list[str]], list[list[str]], list[str]]:
Parameters:
full_people
: Original People object
people_selected
: Single frozenset of selected person IDs
features
: FeatureCollection used for selection
settings
: Settings object
Returns:
selected_rows
: Table with selected people data
remaining_rows
: Table with remaining people data
output_lines
: Additional information messages
Example:
selected_table, remaining_table, info = selected_remaining_tables(
    people, selected_panel, features, settings
)
# Write to CSV
import csv
with open("selected.csv", "w", newline="") as f:
    csv.writer(f).writerows(selected_table)
Data Loading Functions¶
read_in_features()¶
Load feature definitions from a CSV file.
def read_in_features(features_file: str | Path) -> FeatureCollection:
Parameters:
features_file
: Path to CSV file with feature definitions
Expected CSV format:
feature,value,min,max
Gender,Male,45,55
Gender,Female,45,55
Age,18-30,20,30
Returns:
FeatureCollection
: Object containing all features and quotas
Example:
features = read_in_features("demographics.csv")
read_in_people()¶
Load candidate pool from a CSV file.
def read_in_people(
    people_file: str | Path,
    settings: Settings,
    features: FeatureCollection
) -> People:
Parameters:
people_file
: Path to CSV file with candidate data
settings
: Settings object for configuration
features
: FeatureCollection for validation
Expected CSV format:
id,Name,Gender,Age,Email
p001,Alice,Female,18-30,alice@example.com
p002,Bob,Male,31-50,bob@example.com
Returns:
People
: Object containing candidate pool
Example:
people = read_in_people("candidates.csv", settings, features)
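The features argument is described as being used for validation, but the exact checks are not spelled out here. As a rough illustration of one plausible check, the sketch below uses only the standard csv module to flag candidates whose feature values are not defined in the features file. The sample rows (including the extra "31-50" quota and the third candidate) and the helper names allowed_values and find_problems are invented for this example.

```python
import csv
import io

# Sample data in the two documented CSV layouts (contents are made up).
FEATURES_CSV = """feature,value,min,max
Gender,Male,45,55
Gender,Female,45,55
Age,18-30,20,30
Age,31-50,30,40
"""

PEOPLE_CSV = """id,Name,Gender,Age,Email
p001,Alice,Female,18-30,alice@example.com
p002,Bob,Male,31-50,bob@example.com
p003,Carol,Female,51+,carol@example.com
"""


def allowed_values(features_text: str) -> dict[str, set[str]]:
    """Map each feature name to the set of values that have quotas."""
    allowed: dict[str, set[str]] = {}
    for row in csv.DictReader(io.StringIO(features_text)):
        allowed.setdefault(row["feature"], set()).add(row["value"])
    return allowed


def find_problems(
    people_text: str, allowed: dict[str, set[str]], id_column: str = "id"
) -> list[str]:
    """List every candidate value that is missing or not defined as a feature value."""
    problems = []
    for row in csv.DictReader(io.StringIO(people_text)):
        for feature, values in allowed.items():
            if row.get(feature) not in values:
                problems.append(
                    f"{row[id_column]}: {feature}={row.get(feature)!r} is not a defined value"
                )
    return problems


problems = find_problems(PEOPLE_CSV, allowed_values(FEATURES_CSV))
print(problems)
```

Running a check like this before loading can make data-entry errors easier to spot, whatever the library itself reports.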
Settings Class¶
Configuration object for customizing selection behavior.
class Settings:
    def __init__(
        self,
        random_number_seed: int | None = None,
        check_same_address: bool = False,
        check_same_address_columns: list[str] | None = None,
        selection_algorithm: str = "maximin",
        max_attempts: int = 10,
        columns_to_keep: list[str] | None = None,
        id_column: str = "id",
    ):
Parameters:
random_number_seed
: Fixed seed for reproducible results (None = random)
check_same_address
: Enable household diversity checking
check_same_address_columns
: Columns that define an address
selection_algorithm
: "maximin", "leximin", "nash", or "legacy"
max_attempts
: Maximum selection retry attempts
columns_to_keep
: Additional columns to include in output
id_column
: Name of the ID column in people data
Class Methods:
Settings.load_from_file()¶
@classmethod
def load_from_file(
    cls,
    settings_file_path: Path
) -> tuple[Settings, str]:
Load settings from a TOML file.
Example settings.toml:
random_number_seed = 42
check_same_address = true
check_same_address_columns = ["Address", "Postcode"]
selection_algorithm = "maximin"
max_attempts = 10
columns_to_keep = ["Name", "Email", "Phone"]
Returns:
Settings
: Configured settings object
str
: Status message
Example:
settings, msg = Settings.load_from_file(Path("config.toml"))
print(msg) # "Settings loaded from config.toml"
Data Adapters¶
CSVAdapter¶
Handles CSV file input and output operations.
class CSVAdapter:
    def load_features_from_file(
        self, features_file: Path
    ) -> tuple[FeatureCollection, list[str]]:
    def load_people_from_file(
        self, people_file: Path, settings: Settings, features: FeatureCollection
    ) -> tuple[People, list[str]]:
    def output_selected_remaining(
        self, selected_rows: list[list[str]], remaining_rows: list[list[str]]
    ) -> None:
Example:
adapter = CSVAdapter()
features, msgs = adapter.load_features_from_file(Path("features.csv"))
people, msgs = adapter.load_people_from_file(Path("people.csv"), settings, features)
# Set output files
adapter.selected_file = open("selected.csv", "w", newline="")
adapter.remaining_file = open("remaining.csv", "w", newline="")
adapter.output_selected_remaining(selected_table, remaining_table)
GSheetAdapter¶
Handles Google Sheets input and output operations.
class GSheetAdapter:
    def __init__(
        self,
        credentials_file: Path,
        gen_rem_tab: str = "on"
    ):
    def load_features(
        self, gsheet_name: str, tab_name: str
    ) -> tuple[FeatureCollection | None, list[str]]:
    def load_people(
        self, tab_name: str, settings: Settings, features: FeatureCollection
    ) -> tuple[People | None, list[str]]:
    def output_selected_remaining(
        self, selected_rows: list[list[str]], remaining_rows: list[list[str]], settings: Settings
    ) -> None:
Parameters:
credentials_file
: Path to Google API credentials JSON
gen_rem_tab
: "on" or "off" to control remaining tab generation
Example:
adapter = GSheetAdapter(Path("credentials.json"))
features, msgs = adapter.load_features("My Spreadsheet", "Demographics")
people, msgs = adapter.load_people("Candidates", settings, features)
adapter.selected_tab_name = "Selected"
adapter.remaining_tab_name = "Remaining"
adapter.output_selected_remaining(selected_table, remaining_table, settings)
Core Data Classes¶
FeatureCollection¶
Container for demographic features and their quotas.
Key Methods:
def check_desired(self, number_people_wanted: int) -> None:
    # Validates that quotas are achievable for the desired panel size
    # Raises exception if infeasible
def feature_names(self) -> list[str]:
    # Returns list of all feature names
def feature_values_counts(self) -> Iterator[tuple[str, str, FeatureValueCounts]]:
    # Iterate over all feature values and their count objects
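What check_desired() tests is not detailed here. One necessary (though not sufficient) condition for quotas to fit a panel is that, within each feature, the minimums sum to no more than the panel size and the maximums sum to at least the panel size. The sketch below illustrates that condition on plain dictionaries. The quota_problems helper and the "31-50" quota are hypothetical, and this is not the library's implementation.

```python
def quota_problems(
    quotas: dict[str, dict[str, tuple[int, int]]], number_people_wanted: int
) -> list[str]:
    """Report features whose (min, max) quotas cannot fit the panel size."""
    problems = []
    for feature, values in quotas.items():
        total_min = sum(low for low, _ in values.values())
        total_max = sum(high for _, high in values.values())
        if total_min > number_people_wanted:
            problems.append(
                f"{feature}: minimums sum to {total_min}, above {number_people_wanted}"
            )
        if total_max < number_people_wanted:
            problems.append(
                f"{feature}: maximums sum to {total_max}, below {number_people_wanted}"
            )
    return problems


# feature -> value -> (min, max); the numbers are illustrative only.
quotas = {
    "Gender": {"Male": (45, 55), "Female": (45, 55)},
    "Age": {"18-30": (20, 30), "31-50": (30, 40)},
}
print(quota_problems(quotas, 100))
print(quota_problems(quotas, 60))
```

Passing this test does not guarantee a feasible selection, since quotas across different features can still conflict given who is actually in the pool.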
People¶
Container for the candidate pool.
Key Methods:
def __len__(self) -> int:
    # Number of people in the pool
def __iter__(self) -> Iterator[str]:
    # Iterate over person IDs
def get_person_dict(self, person_id: str) -> dict[str, str]:
    # Get all data for a specific person
def matching_address(
    self, person_id: str, address_columns: list[str]
) -> list[str]:
    # Find people with matching address to given person
def remove(self, person_id: str) -> None:
    # Remove person from pool
def remove_many(self, person_ids: list[str]) -> None:
    # Remove multiple people from pool
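To make the idea behind matching_address() concrete, the sketch below implements a free-standing version over a plain dictionary of person records. It assumes that two people match when every listed address column is exactly equal, and that the person being queried is excluded from the result. Neither detail is confirmed by this reference, and the data is invented.

```python
def matching_address(
    people: dict[str, dict[str, str]], person_id: str, address_columns: list[str]
) -> list[str]:
    """Return IDs of other people whose address columns all equal the given person's."""
    target = tuple(people[person_id][column] for column in address_columns)
    return [
        other_id
        for other_id, data in people.items()
        if other_id != person_id
        and tuple(data[column] for column in address_columns) == target
    ]


# Stand-in pool keyed by person ID.
pool = {
    "p001": {"Address": "1 High St", "Postcode": "AB1 2CD"},
    "p002": {"Address": "1 High St", "Postcode": "AB1 2CD"},
    "p003": {"Address": "1 High St", "Postcode": "ZZ9 9ZZ"},
}
housemates = matching_address(pool, "p001", ["Address", "Postcode"])
print(housemates)
```

Comparing on several columns together is what separates p003 here: the street address matches, but the postcode does not.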
Error Classes¶
InfeasibleQuotasError¶
Raised when quotas cannot be satisfied with the available candidate pool.
class InfeasibleQuotasError(Exception):
    def __init__(self, output: list[str]):
Attributes:
output
: List of diagnostic messages explaining the infeasibility
SelectionError¶
General error for selection process failures.
class SelectionError(Exception):
    pass
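A caller will typically want to treat the two error classes differently, since InfeasibleQuotasError carries diagnostic lines in its output attribute. The sketch below shows one possible handling pattern. So that it runs without the library installed, it defines stand-in exception classes that mirror the signatures documented above. The names describe_failure and fake_run, and the diagnostic message, are invented for this example.

```python
class InfeasibleQuotasError(Exception):
    """Stand-in mirroring the documented signature; not the library class."""

    def __init__(self, output: list[str]):
        super().__init__("\n".join(output))
        self.output = output


class SelectionError(Exception):
    """Stand-in for the library's general selection failure."""


def describe_failure(run) -> list[str]:
    """Call run() and turn the documented exceptions into report lines."""
    try:
        run()
    except InfeasibleQuotasError as err:
        # The diagnostics explain which quotas could not be met.
        return ["Quotas are infeasible:", *err.output]
    except SelectionError as err:
        return [f"Selection failed: {err}"]
    return ["Selection succeeded"]


def fake_run():
    # Simulates a selection call that fails on quotas.
    raise InfeasibleQuotasError(["Age 18-30: need 20, only 12 in pool"])


report = describe_failure(fake_run)
print("\n".join(report))
```

In real code, run would wrap a call such as run_stratification(), and the two exception classes would be imported from the library rather than defined locally.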
Utility Functions¶
set_random_provider()¶
Configure the random number generator for reproducible results.
def set_random_provider(seed: int | None) -> None:
Parameters:
seed
: Random seed (None for secure random)
Example:
set_random_provider(42) # Reproducible results
set_random_provider(None) # Secure random
Type Hints¶
Common type aliases used throughout the API:
# A committee is a set of person IDs
Committee = frozenset[str]
# Selection results are lists of committees
SelectionResult = list[Committee]
# Tables are lists of rows (lists of strings)
Table = list[list[str]]