Benchmark Results¶
This page documents performance comparisons between solver backends for the sortition algorithms.
Test Environment¶
- Date: January 2025
- Platform: Linux 6.8.0-90-generic
- Python: 3.11
- Packages: highspy (HiGHS solver), mip 1.15.0 (CBC solver)
Maximin Algorithm: highspy vs mip¶
Small Pool (150 people, panel size 22)¶
uv run python -m benchmarks.profile_solvers \
--backends highspy mip \
--algorithms maximin \
--runs 5
| Backend | Mean Time | Max Memory |
|---|---|---|
| highspy | 3.25s | 0.5MB |
| mip | 1.40s | 4.6MB |
Result: MIP is 2.3x faster, but uses more memory on first run (JIT warmup).
Medium Pool (300 people, panel size 45)¶
uv run python -m benchmarks.profile_solvers \
--backends highspy mip \
--algorithms maximin \
--sizes 300 \
--runs 3
| Backend | Mean Time | Max Memory |
|---|---|---|
| highspy | 30.3s | 1.1MB |
| mip | 9.6s | 1.8MB |
Result: MIP is 3.2x faster.
Large Pool (500 people, panel size 75)¶
uv run python -m benchmarks.profile_solvers \
--backends highspy mip \
--algorithms maximin \
--sizes 500 \
--runs 3
| Backend | Mean Time | Max Memory |
|---|---|---|
| highspy | 148.1s | 1.8MB |
| mip | 41.8s | 1.9MB |
Result: MIP is 3.5x faster.
Extra Large Pool (750 people, panel size 112)¶
uv run python -m benchmarks.profile_solvers \
--backends highspy mip \
--algorithms maximin \
--sizes 750 \
--runs 3
| Backend | Mean Time | Max Memory |
|---|---|---|
| highspy | 261.1s | 7.2MB |
| mip | 119.1s | 7.4MB |
Result: MIP is 2.2x faster.
Tight Constraints (300 people, panel size 45)¶
Testing with narrow min/max gaps to stress the solvers:
uv run python -m benchmarks.profile_solvers \
--backends highspy mip \
--algorithms maximin \
--sizes 300 \
--runs 3 \
--tight-constraints
| Backend | Mean Time | Max Memory |
|---|---|---|
| highspy | 17.3s | 0.9MB |
| mip | 4.5s | 1.7MB |
Result: MIP is 3.9x faster with tight constraints.
Summary¶
Performance Comparison¶
| Pool Size | Panel Size | highspy | mip | MIP Speedup |
|---|---|---|---|---|
| 150 | 22 | 3.25s | 1.40s | 2.3x |
| 300 | 45 | 30.3s | 9.6s | 3.2x |
| 500 | 75 | 148.1s | 41.8s | 3.5x |
| 750 | 112 | 261.1s | 119.1s | 2.2x |
Key Findings¶
-
MIP (CBC) is consistently faster than highspy (HiGHS) for the maximin algorithm, by a factor of 2-4x depending on problem size.
-
Memory usage is comparable between both backends after initial warmup.
-
Scaling behaviour is similar for both backends (roughly O(n²) with pool size), but MIP maintains its performance advantage across all sizes tested.
-
Tight constraints make problems harder for both solvers, but MIP's advantage increases slightly.
Recommendations¶
-
For production use with maximin algorithm, consider using
solver_backend = "mip"if execution time is critical and the mip package is available. -
For ease of deployment,
solver_backend = "highspy"(the default) requires no additional dependencies. -
These results are specific to the maximin algorithm. Other algorithms (nash, diversimax, leximin) may show different characteristics.
Possible Explanations for Performance Difference¶
The performance gap is unexpected since HiGHS is generally considered a fast solver. Possible factors:
-
Column generation pattern: The maximin algorithm iteratively adds constraints. CBC may handle this pattern more efficiently than HiGHS.
-
Python binding overhead: The highspy bindings may have different overhead characteristics than python-mip.
-
Default parameters: HiGHS and CBC have different default solver parameters that may favour different problem types.
-
Problem structure: The committee selection ILP may have structure that CBC exploits better.
Further Work¶
Areas for additional investigation:
Algorithm Comparison¶
- Nash algorithm: Does mip maintain its advantage for Nash welfare optimization?
- Diversimax algorithm: How do the backends compare for the diversity-maximizing approach?
- Leximin algorithm: Currently requires Gurobi for the dual LP; could benefit from pure HiGHS/mip implementation.
Performance Tuning¶
- HiGHS parameters: Test different solver parameters (presolve settings, cutting planes, etc.) to see if HiGHS can be tuned for better performance on this problem type.
- Warm starting: Investigate whether warm-starting the solver between iterations could improve performance.
- Model formulation: The column generation pattern may benefit from different constraint formulations.
Scaling Analysis¶
- Larger pools: Test with 1000+ people to understand asymptotic scaling behaviour.
- More features: Test with additional demographic features to see how constraint complexity affects performance.
- Panel size ratios: Test different panel-to-pool ratios (e.g., 10%, 20%, 30%).
Memory Profiling¶
- Detailed memory analysis: Use memray to generate flamegraphs and identify memory hotspots.
- Peak vs sustained memory: Understand whether memory spikes are transient or sustained.
Real-World Validation¶
- Production datasets: Test with actual sortition datasets (with appropriate anonymisation).
- Constraint patterns: Understand which constraint patterns are common in practice and optimise for those.
Raw Data¶
Full benchmark results are saved in benchmarks/results/ as CSV and JSON files with timestamps.