Why Sortition Algorithms?

Act 1

The Invitation

Citizens' assemblies, juries, and deliberative panels all start the same way: inviting people to participate. Letters go out to thousands of randomly chosen households.

~200,000 households in the city → 10,000 letters sent

In practice, assemblies also check for multiple respondents from the same household and typically keep only one per address, to avoid over-representing a single home.
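As a rough sketch of that household check, the snippet below groups responses by address and keeps one per household. The field names, the sample responses, and the rule of choosing at random within a household are all assumptions made for illustration; real assemblies may apply a different rule.

```python
import random
from collections import defaultdict

def keep_one_per_household(responses, rng):
    """Group responses by address and keep one response per address."""
    by_address = defaultdict(list)
    for response in responses:
        by_address[response["address"]].append(response)
    # Assumed rule: choose uniformly at random within each household.
    return [rng.choice(group) for group in by_address.values()]

# Made-up responses, for illustration only.
responses = [
    {"name": "Asha", "address": "1 Mill Lane"},
    {"name": "Tom", "address": "1 Mill Lane"},
    {"name": "Wei", "address": "7 Park Road"},
]

deduplicated = keep_one_per_household(responses, random.Random(2024))
print([r["name"] for r in deduplicated])
```

Passing in a seeded random.Random makes the choice repeatable, which helps if the selection ever needs to be audited.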

Some people respond. They fill in a form online, on their phone, or by calling a number.

Many do not respond: they may be too busy or not interested, the letter may have been lost in the post, they may have moved away, or they may not feel qualified.

~400 responses

The assembly only has room for around 50 participants. So from these ~400 responses, we need to choose 50 people in a way that is both representative — reflecting the wider population — and fair — giving everyone a reasonable chance of being selected. What exactly these words mean can vary by context, but both matter.

Respondents: The people who responded — the pool we choose from.
Participants: The respondents who are selected to serve on the assembly.

For the rest of this walkthrough, we'll use a simplified example: 16 respondents, from which we need to select 8 participants.

Act 2

The Skewed Pool

One key issue we have to address is that the candidate pool is self-selected. The people who respond are not a random cross-section of the population — they're the people who chose to respond. This matters.

Here are our 16 respondents. Each has three demographic attributes: Gender, Region, and Age.

How does our pool compare to the population?

Why is the pool skewed? Self-selection bias. Older people may have more time to respond. People in the South — perhaps a poorer area — may feel less able to participate, or less confident that the assembly will listen to them. The reasons vary, but the result is the same: the pool doesn't look like the city.

So if we pick purely at random from this pool, our selection will, on average, inherit the same skew.
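A small simulation can make this concrete. The pool below is hypothetical (6 Young and 10 Senior respondents, not the walkthrough's actual data): drawing 8 people uniformly at random many times gives, on average, the pool's share of Young people rather than the city's 50/50 split.

```python
import random

# Hypothetical skewed pool: 6 Young and 10 Senior respondents.
pool_ages = ["Young"] * 6 + ["Senior"] * 10
PANEL_SIZE = 8
DRAWS = 10_000

rng = random.Random(1)  # fixed seed so the run is repeatable
total_young = 0
for _ in range(DRAWS):
    panel = rng.sample(pool_ages, PANEL_SIZE)  # uniform draw, no targets
    total_young += panel.count("Young")

average_young = total_young / DRAWS
pool_prediction = PANEL_SIZE * 6 / len(pool_ages)  # 3.0
print(f"average Young per panel: {average_young:.2f}")
print(f"pool share predicts:     {pool_prediction:.2f}")
print("a 50/50 city would want: 4.00")
```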

Act 3

What Would “Representative” Mean?

Before we try selecting anyone, let's think about what we actually want. If the panel is supposed to represent the city, what would that look like?

Precise balance

We might decide that some categories should precisely reflect the population. Gender in our city is roughly 50/50, and we're selecting 8 people, so we might want exactly 4 Male and 4 Female selected participants. Same for age: exactly 4 Young and 4 Senior.

Adequate representation

For other categories, exact balance may be less important than ensuring adequate representation. For Region, we might be happy with anywhere from 3 to 5 Northerners (and correspondingly 3 to 5 Southerners). This gives flexibility while still ensuring no region is shut out.

We could in principle set criteria for intersections too — like “at least 1 Young Southerner” — but the maths gets complicated fast, so in practice targets are set per category.

We now have a concrete idea of what a representative panel looks like: a set of criteria for how many people from each category value should be included.
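One way to write such criteria down is as a minimum and maximum count for each category value. The sketch below uses the numbers from this section; the data layout, the function name meets_targets, and the two sample panels are assumptions made for illustration.

```python
from collections import Counter

# (minimum, maximum) number of selected participants per category value.
TARGETS = {
    "gender": {"Male": (4, 4), "Female": (4, 4)},
    "age": {"Young": (4, 4), "Senior": (4, 4)},
    "region": {"North": (3, 5), "South": (3, 5)},
}

def meets_targets(panel, targets):
    """Return True if every category value falls within its (min, max) range."""
    for category, bounds in targets.items():
        counts = Counter(member[category] for member in panel)
        for value, (low, high) in bounds.items():
            if not low <= counts[value] <= high:
                return False
    return True

def make_person(gender, region, age):
    return {"gender": gender, "region": region, "age": age}

# Hypothetical panel with one person of every gender/region/age combination.
balanced_panel = [
    make_person(g, r, a)
    for g in ("Male", "Female")
    for r in ("North", "South")
    for a in ("Young", "Senior")
]

# Hypothetical panel that balances gender and age but is entirely Northern.
all_north_panel = [
    make_person(g, "North", a)
    for g in ("Male", "Female")
    for a in ("Young", "Senior")
    for _ in range(2)
]

print(meets_targets(balanced_panel, TARGETS))   # True
print(meets_targets(all_north_panel, TARGETS))  # False: 8 Northerners exceeds the maximum of 5
```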

Act 5

From Targets to Fairness

We can take our earlier ideas about representativeness and turn them into concrete targets that the panel must meet. These are set by the organisers based on census data and the assembly's purpose.

Targets meet pool

Here are our targets alongside the candidate pool. Some are comfortable to meet, others are tight — there isn't much room for manoeuvre.

Many valid selections

For our 16-person example, meeting the targets is possible. In fact, many different panels of 8 satisfy all of them. The question is which valid panel to pick.
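For a pool this small, the valid panels can simply be enumerated. The sketch below does that by brute force over every group of 8. The 16 respondents listed are a hypothetical pool invented for this sketch, not the walkthrough's actual data; the targets are the ones described earlier.

```python
from collections import Counter
from itertools import combinations
from math import comb

# Hypothetical pool, invented for this sketch: (name, gender, region, age).
POOL = [
    ("Alice", "Female", "North", "Senior"), ("Ben", "Male", "North", "Senior"),
    ("Chloe", "Female", "North", "Senior"), ("Dev", "Male", "North", "Senior"),
    ("Elena", "Female", "North", "Senior"), ("Femi", "Male", "North", "Senior"),
    ("Grace", "Female", "North", "Young"), ("Hugo", "Male", "North", "Young"),
    ("Iris", "Female", "North", "Senior"), ("Jack", "Male", "North", "Senior"),
    ("Kira", "Female", "North", "Young"), ("Liam", "Male", "South", "Senior"),
    ("Maya", "Female", "South", "Senior"), ("Noor", "Male", "South", "Young"),
    ("Owen", "Male", "South", "Young"), ("Pia", "Female", "South", "Young"),
]
PANEL_SIZE = 8

# Targets keyed by tuple position: 1 = gender, 2 = region, 3 = age.
TARGETS = {
    1: {"Male": (4, 4), "Female": (4, 4)},
    2: {"North": (3, 5), "South": (3, 5)},
    3: {"Young": (4, 4), "Senior": (4, 4)},
}

def meets_targets(panel):
    """Check every category value against its (min, max) range."""
    for column, bounds in TARGETS.items():
        counts = Counter(person[column] for person in panel)
        for value, (low, high) in bounds.items():
            if not low <= counts[value] <= high:
                return False
    return True

# Try every possible group of 8 and keep the ones that satisfy all targets.
valid_panels = [p for p in combinations(POOL, PANEL_SIZE) if meets_targets(p)]

total = comb(len(POOL), PANEL_SIZE)
print(f"{len(valid_panels)} of {total} possible panels meet every target")
```

Brute force only works because the example is tiny; it is here to show what "valid panel" means, not as a practical selection method.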

Here are two examples — you can check that they really do satisfy the targets.

Notice that Omar and Priya appear in both panels. That's not a coincidence…

Not everyone has the same chance

Across all valid panels, some people appear far more often than others.
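To see this effect in numbers, we can count how often each person appears across all valid panels. The sketch below uses a hypothetical 16-person pool (invented here, not the walkthrough's data) and the targets from earlier. The share it prints is each person's chance of selection if one valid panel were picked uniformly at random, which is only one possible way of choosing.

```python
from collections import Counter
from itertools import combinations

# Hypothetical pool, invented for this sketch: name -> (gender, region, age).
POOL = {
    "Alice": ("Female", "North", "Senior"), "Ben": ("Male", "North", "Senior"),
    "Chloe": ("Female", "North", "Senior"), "Dev": ("Male", "North", "Senior"),
    "Elena": ("Female", "North", "Senior"), "Femi": ("Male", "North", "Senior"),
    "Grace": ("Female", "North", "Young"), "Hugo": ("Male", "North", "Young"),
    "Iris": ("Female", "North", "Senior"), "Jack": ("Male", "North", "Senior"),
    "Kira": ("Female", "North", "Young"), "Liam": ("Male", "South", "Senior"),
    "Maya": ("Female", "South", "Senior"), "Noor": ("Male", "South", "Young"),
    "Owen": ("Male", "South", "Young"), "Pia": ("Female", "South", "Young"),
}
PANEL_SIZE = 8

def is_valid(names):
    """Exactly 4 Male, exactly 4 Young, and 3 to 5 Northerners.

    With a panel of 8, these imply 4 Female, 4 Senior and 3 to 5 Southerners.
    """
    males = sum(POOL[n][0] == "Male" for n in names)
    north = sum(POOL[n][1] == "North" for n in names)
    young = sum(POOL[n][2] == "Young" for n in names)
    return males == 4 and young == 4 and 3 <= north <= 5

appearances = Counter()
valid_count = 0
for names in combinations(POOL, PANEL_SIZE):
    if is_valid(names):
        valid_count += 1
        appearances.update(names)

# Share of valid panels that include each person, lowest first.
chance = {name: appearances[name] / valid_count for name in POOL}
for name, share in sorted(chance.items(), key=lambda item: item[1]):
    print(f"{name:<6} {POOL[name][1]:<6} {share:.0%}")
```

In this made-up pool only 5 of the 16 respondents are from the South, yet every valid panel needs at least 3 Southerners, so Southerners appear in a much larger share of valid panels than Northerners do.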

People who appear in very few valid panels might feel: “There's no point in registering — I have almost no chance anyway.” That's a problem for the legitimacy of the whole process.

Representativeness vs Fairness

There's a genuine tension here. The panel must look like the population (representativeness), but each individual should also have a fair shot at being selected (fairness). There is a third concern too: explainability.

People understand how voting in an election works, and that juries are selected at random. It may be worth accepting slightly less mathematical fairness in exchange for an algorithm the public can more easily understand. The sortition community is still discussing these trade-offs.

Real assemblies might have 400 respondents, 15 demographic categories, and need to select 50 participants. The maths gets complicated fast — far beyond what a spreadsheet can handle.
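A quick calculation shows the scale. The snippet below counts the ways of choosing a panel before any targets are applied, using the pool and panel sizes quoted above. It is only an illustration of how fast the numbers grow, not part of any selection algorithm.

```python
import math

small_example = math.comb(16, 8)    # ways to choose 8 of 16 respondents
real_assembly = math.comb(400, 50)  # ways to choose 50 of 400 respondents

print(f"16 choose 8   = {small_example}")
print(f"400 choose 50 has {len(str(real_assembly))} digits")
```

Checking every candidate panel one by one is feasible for the small example but hopeless at real-world scale, which is why dedicated algorithms are needed.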

The good news: mathematicians have developed algorithms that handle both representativeness and fairness. Let's see how they work.

The Legacy Algorithm →

An efficient way of selecting a panel that satisfies the targets, meaning it is representative. However, the algorithm does not address fairness.

The Common Algorithm →

The first part of the fairer algorithms (maximin, nash and leximin). These aim to balance both representativeness and fairness — meeting the targets while ensuring no candidate has a particularly low chance of selection.