Sortition Foundation

Why Sortition Algorithms?

How do you select a representative group of people — fairly?

Act 1

The Invitation

Citizens' assemblies, juries, and deliberative panels all start the same way: inviting people to participate. Letters go out to thousands of randomly chosen households.

~200,000 households in the city
10,000 letters sent

Note: In practice, assemblies also check for multiple respondents from the same household and typically keep only one per address, to avoid over-representing a single home.
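
The one-per-address rule can be sketched in a few lines of Python. This is only an illustration, not the procedure any particular assembly follows: the respondent records, the `address` field, and the choice to keep a random respondent from each household are all assumptions made for the example.

```python
import random

# Hypothetical respondent records; the names and addresses are made up.
respondents = [
    {"name": "R1", "address": "1 High St"},
    {"name": "R2", "address": "1 High St"},    # same household as R1
    {"name": "R3", "address": "7 Mill Lane"},
    {"name": "R4", "address": "2 Park Rd"},
    {"name": "R5", "address": "2 Park Rd"},    # same household as R4
]

def one_per_address(people, rng):
    """Keep one respondent per address, picked at random (an assumption)."""
    by_address = {}
    for person in people:
        by_address.setdefault(person["address"], []).append(person)
    # One random pick from each household's list of respondents.
    return [rng.choice(group) for group in by_address.values()]

kept = one_per_address(respondents, random.Random(0))
print([p["name"] for p in kept])
```

Picking at random within a household is one option among several; an organiser could equally keep the first response received.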

Most people don't respond.

Too busy, not interested, letter lost in the post, moved away, doesn't feel qualified.

But some people do. They fill in a form online, on their phone, or by calling a number.

Filled in online, responded on phone, called the number.

~400 responses (about 4%)

The assembly only has room for around 50 panellists. So from these ~400 responses, we need to choose 50 people in a way that is both representative — reflecting the wider population — and fair — giving everyone a reasonable chance of being selected. What exactly these words mean can vary by context, but both matter.

Candidates: the people who responded, the pool we choose from.
Panellists: the people who are selected to serve on the assembly.

For the rest of this walkthrough, we'll use a simplified example: 16 candidates, from which we need to select 8 panellists.

Act 2

The Skewed Pool

One key issue we have to address is that the candidate pool is self-selected. The people who respond are not a random cross-section of the population — they're the people who chose to respond. This matters.

Here are our 16 candidates. Each has three demographic attributes: Gender, Region, and Age.

How does our pool compare to the population?

Why is the pool skewed? Self-selection bias. Older people may have more time to respond. People in the South — perhaps a poorer area — may feel less able to participate, or less confident that the assembly will listen to them. The reasons vary, but the result is the same: the pool doesn't look like the city.

Looking at intersections

Another way to see the skew is to look at combinations of categories. Some intersections have many candidates, others have very few.

Count of candidates in each intersection

The pool doesn't look like the population. Omar is the only Young Southern Male. Priya is the only Young Southern Female. Meanwhile, there are 4 Senior Northern Males. If we pick randomly from this pool, our selection will inherit the same skew.
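
Counting intersections is easy to do in code. The sketch below uses a hypothetical 16-person pool: only Omar, Priya, and the three counts quoted above (one Young Southern Male, one Young Southern Female, four Senior Northern Males) come from this walkthrough. Every other name and count is invented for illustration.

```python
from collections import Counter

# Hypothetical pool of (name, gender, region, age). Apart from Omar, Priya
# and the four Senior Northern Males, the people listed here are made up.
POOL = [
    ("Alan", "Male", "North", "Senior"),
    ("Brian", "Male", "North", "Senior"),
    ("Colin", "Male", "North", "Senior"),
    ("David", "Male", "North", "Senior"),
    ("Ellen", "Female", "North", "Senior"),
    ("Fiona", "Female", "North", "Senior"),
    ("Grace", "Female", "North", "Senior"),
    ("Hugo", "Male", "North", "Young"),
    ("Ivan", "Male", "North", "Young"),
    ("Jade", "Female", "North", "Young"),
    ("Kira", "Female", "North", "Young"),
    ("Liam", "Male", "South", "Senior"),
    ("Mary", "Female", "South", "Senior"),
    ("Nora", "Female", "South", "Senior"),
    ("Omar", "Male", "South", "Young"),
    ("Priya", "Female", "South", "Young"),
]

# Count how many candidates fall into each (gender, region, age) combination.
intersections = Counter((g, r, a) for _, g, r, a in POOL)
for (gender, region, age), n in intersections.most_common():
    print(f"{age} {region} {gender}: {n}")
```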
Act 3

What Would “Representative” Mean?

Before we try selecting anyone, let's think about what we actually want. If the panel is supposed to represent the city, what would that look like?

Precise balance

We might decide that some categories should precisely reflect the population. Gender in our city is roughly 50/50, and we're selecting 8 people, so we might want exactly 4 Male and 4 Female panellists. Same for age: exactly 4 Young and 4 Senior.

Adequate representation

For other categories, exact balance may be less important than ensuring adequate representation. For Region, we might be happy with anywhere from 3 to 5 Northerners (and correspondingly 3 to 5 Southerners). This gives flexibility while still ensuring no region is shut out.

We could in principle set criteria for intersections too — like “at least 1 Young Southerner” — but the maths gets complicated fast, so in practice targets are set per category.

We now have a concrete idea of what a representative panel looks like: a set of criteria for how many people from each category value should be included.
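
These criteria can be written down as minimum and maximum counts per category value. The sketch below is one possible encoding, not the format any real sortition tool uses: `TARGETS`, `meets_targets`, and `make_person` are names invented for this example, and the two test panels are made up.

```python
from collections import Counter

# Allowed (min, max) number of panellists for each category value.
TARGETS = {
    "gender": {"Male": (4, 4), "Female": (4, 4)},
    "age": {"Young": (4, 4), "Senior": (4, 4)},
    "region": {"North": (3, 5), "South": (3, 5)},
}

def meets_targets(panel, targets=TARGETS):
    """Return True if every category value's count lies within its range."""
    for category, ranges in targets.items():
        counts = Counter(p[category] for p in panel)
        for value, (low, high) in ranges.items():
            if not low <= counts[value] <= high:
                return False
    return True

def make_person(gender, region, age):
    return {"gender": gender, "region": region, "age": age}

# One panellist from each of the 8 gender/region/age combinations.
balanced = [
    make_person(g, r, a)
    for g in ("Male", "Female")
    for r in ("North", "South")
    for a in ("Young", "Senior")
]
# Gender is balanced here, but everyone is a Senior Northerner.
northern_seniors = (
    [make_person("Male", "North", "Senior")] * 4
    + [make_person("Female", "North", "Senior")] * 4
)

print(meets_targets(balanced))          # True
print(meets_targets(northern_seniors))  # False
```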
Act 4

Just Pick 8 at Random

So, does a simple random draw meet our criteria? Let's try it a few times.

None of these draws meet our criteria for representativeness. If this assembly is deciding transport policy for the whole city but most members are from the North, Southern communities have little voice in decisions that affect them. Random selection from a skewed pool gives a skewed result, so we need something better.
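
The same experiment can be repeated thousands of times in code. This is a rough simulation under stated assumptions: the pool is hypothetical (only Omar, Priya, and the count of four Senior Northern Males come from this walkthrough), and `pass_rate` is a helper name invented here.

```python
import random
from collections import Counter

# Hypothetical pool grouped by (gender, region, age); most names are made up.
CELLS = {
    ("Male", "North", "Senior"): ["Alan", "Brian", "Colin", "David"],
    ("Female", "North", "Senior"): ["Ellen", "Fiona", "Grace"],
    ("Male", "North", "Young"): ["Hugo", "Ivan"],
    ("Female", "North", "Young"): ["Jade", "Kira"],
    ("Male", "South", "Senior"): ["Liam"],
    ("Female", "South", "Senior"): ["Mary", "Nora"],
    ("Male", "South", "Young"): ["Omar"],
    ("Female", "South", "Young"): ["Priya"],
}
POOL = [
    {"name": name, "gender": g, "region": r, "age": a}
    for (g, r, a), names in CELLS.items()
    for name in names
]
TARGETS = {
    "gender": {"Male": (4, 4), "Female": (4, 4)},
    "age": {"Young": (4, 4), "Senior": (4, 4)},
    "region": {"North": (3, 5), "South": (3, 5)},
}

def meets_targets(panel):
    """Return True if every category value's count lies within its range."""
    for category, ranges in TARGETS.items():
        counts = Counter(p[category] for p in panel)
        if any(not lo <= counts[v] <= hi for v, (lo, hi) in ranges.items()):
            return False
    return True

def pass_rate(trials, seed=0):
    """Fraction of uniform random 8-person draws that meet every target."""
    rng = random.Random(seed)
    hits = sum(meets_targets(rng.sample(POOL, 8)) for _ in range(trials))
    return hits / trials

print(f"{pass_rate(10_000):.1%} of random draws met all targets")
```

With this made-up pool, expect only a small minority of draws to pass. The exact share depends on the pool and on how tight the targets are.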
Act 5

From Targets to Fairness

We can take our earlier ideas about representativeness and turn them into concrete targets that the panel must meet. These are set by the organisers based on census data and the assembly's purpose.

Targets meet pool

Here are our targets alongside the candidate pool. Some are comfortable to meet, others are tight — there isn't much room for manoeuvre.

Many valid selections

It turns out that, for our 16-person example, many different panels of 8 satisfy all the targets. So meeting the targets is possible; the question is which valid panel to pick.
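
For a pool this small, every possible panel can be checked by brute force. The sketch below does that for a hypothetical pool (only Omar, Priya, and the count of four Senior Northern Males come from this walkthrough), so its count of valid panels will not necessarily match the example shown here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical pool grouped by (gender, region, age); most names are made up.
CELLS = {
    ("Male", "North", "Senior"): ["Alan", "Brian", "Colin", "David"],
    ("Female", "North", "Senior"): ["Ellen", "Fiona", "Grace"],
    ("Male", "North", "Young"): ["Hugo", "Ivan"],
    ("Female", "North", "Young"): ["Jade", "Kira"],
    ("Male", "South", "Senior"): ["Liam"],
    ("Female", "South", "Senior"): ["Mary", "Nora"],
    ("Male", "South", "Young"): ["Omar"],
    ("Female", "South", "Young"): ["Priya"],
}
POOL = [
    {"name": name, "gender": g, "region": r, "age": a}
    for (g, r, a), names in CELLS.items()
    for name in names
]
TARGETS = {
    "gender": {"Male": (4, 4), "Female": (4, 4)},
    "age": {"Young": (4, 4), "Senior": (4, 4)},
    "region": {"North": (3, 5), "South": (3, 5)},
}

def meets_targets(panel):
    """Return True if every category value's count lies within its range."""
    for category, ranges in TARGETS.items():
        counts = Counter(p[category] for p in panel)
        if any(not lo <= counts[v] <= hi for v, (lo, hi) in ranges.items()):
            return False
    return True

# Enumerate every way to choose 8 of the 16 candidates, then keep the
# panels that satisfy all targets.
all_panels = list(combinations(POOL, 8))
valid_panels = [panel for panel in all_panels if meets_targets(panel)]

print(len(all_panels), "possible panels of 8")
print(len(valid_panels), "of them meet every target")
```

Brute force only works because the example is tiny. The number of possible panels grows far too quickly for this approach to scale to a real assembly.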

Here are two examples — you can check that they really do satisfy the targets.

Notice that Omar and Priya appear in both panels. That's not a coincidence…

Not everyone has the same chance

Across all valid panels, some people appear far more often than others.

People who appear in very few valid panels might feel: “There's no point in registering — I have almost no chance anyway.” That's a problem for the legitimacy of the whole process.
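
One way to quantify this is to count, for each candidate, the share of valid panels that include them. The sketch below does so for a hypothetical pool (only Omar, Priya, and the count of four Senior Northern Males come from this walkthrough). It treats every valid panel as equally likely, which is an assumption made for illustration and not a description of how a real selection algorithm weights panels.

```python
from collections import Counter
from itertools import combinations

# Hypothetical pool grouped by (gender, region, age); most names are made up.
CELLS = {
    ("Male", "North", "Senior"): ["Alan", "Brian", "Colin", "David"],
    ("Female", "North", "Senior"): ["Ellen", "Fiona", "Grace"],
    ("Male", "North", "Young"): ["Hugo", "Ivan"],
    ("Female", "North", "Young"): ["Jade", "Kira"],
    ("Male", "South", "Senior"): ["Liam"],
    ("Female", "South", "Senior"): ["Mary", "Nora"],
    ("Male", "South", "Young"): ["Omar"],
    ("Female", "South", "Young"): ["Priya"],
}
POOL = [
    {"name": name, "gender": g, "region": r, "age": a}
    for (g, r, a), names in CELLS.items()
    for name in names
]
TARGETS = {
    "gender": {"Male": (4, 4), "Female": (4, 4)},
    "age": {"Young": (4, 4), "Senior": (4, 4)},
    "region": {"North": (3, 5), "South": (3, 5)},
}

def meets_targets(panel):
    """Return True if every category value's count lies within its range."""
    for category, ranges in TARGETS.items():
        counts = Counter(p[category] for p in panel)
        if any(not lo <= counts[v] <= hi for v, (lo, hi) in ranges.items()):
            return False
    return True

valid = [p for p in combinations(POOL, 8) if meets_targets(p)]

# For each candidate, count the valid panels that include them.
appearances = Counter(person["name"] for panel in valid for person in panel)

# Print candidates from most to least frequently included.
for person in sorted(POOL, key=lambda p: -appearances[p["name"]]):
    share = appearances[person["name"]] / len(valid)
    print(f"{person['name']:6} {share:.0%}")
```

In this made-up pool every valid panel contains exactly 4 of the 6 Young candidates but only 4 of the 10 Seniors, so Young candidates appear in two thirds of valid panels on average and Seniors in two fifths.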

Representativeness vs Fairness

There's a genuine tension here. The panel must look like the population (representativeness), but each individual should also have a fair shot at being selected (fairness). There is a third concern too: explainability.

People understand how voting in an election works, and that juries are randomly selected. It may be worth accepting slightly less mathematical fairness in exchange for an algorithm the public can more easily understand. Conversations continue in the sortition community about the trade-offs among these options.

Real assemblies might have 400 candidates and 15 demographic categories, and need to select 50 panellists. The maths gets complicated fast, far beyond what a spreadsheet can handle.
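
A quick calculation gives a sense of the scale. The sketch below counts every possible panel before any targets are applied, so the valid panels are only a subset of these numbers. It uses `math.comb` from Python's standard library.

```python
import math

# Number of ways to choose a panel, ignoring all targets.
small = math.comb(16, 8)     # our walkthrough: 8 panellists from 16 candidates
large = math.comb(400, 50)   # a realistic case: 50 panellists from 400

print(small)  # 12870
print(len(str(large)), "digits in the number of possible 50-person panels")
```

Checking 12,870 panels one by one takes a computer a moment. Checking every 50-person panel from 400 candidates is out of reach, which is why dedicated algorithms are needed.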

The good news: mathematicians have developed algorithms that handle both representativeness and fairness. Let's see how they work.