benchmarks

Andrew Tulloch authored (commit 3753563c)
This supersedes #2940.

PAVA runs in linear time, but the existing implementation scaled
approximately quadratically because the array pop in its inner loop
is O(N).

I implemented the O(N) version of PAVA using the decreasing-subsequences
trick, and benchmarking shows it is significantly faster.
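For illustration, one common linear-time formulation of PAVA keeps a stack of pooled
blocks and merges adjacent blocks while they violate monotonicity; each block is pushed
and popped at most once, so the total work is O(N). The sketch below is only illustrative
(the function name and block representation are assumptions, not the Cython code in this
change):

```
import numpy as np

def pava_sketch(y, w=None):
    # Illustrative linear-time Pool Adjacent Violators (nondecreasing fit).
    # Not the actual scikit-learn implementation from this commit.
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    blocks = []  # each block: [weighted mean, total weight, length]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Pool while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    # Expand the pooled block means back to the original length.
    out = np.empty_like(y)
    i = 0
    for mean, _, length in blocks:
        out[i:i + length] = mean
        i += length
    return out
```

The key point is that the pops happen at the end of the block stack in O(1),
rather than the O(N) array pop that made the previous version quadratic.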

The benchmarks in this diff (adapted from one of the existing unit
tests) show significant performance improvements: roughly 7x faster for
problems of size 1,000, 30x faster for problems of size 10,000, and
250x faster for problems of size 100,000.

On correctness: the unit tests cover the isotonic regression code fairly
well, and all of them pass before and after the change. It's a fairly
well-known algorithm with many existing implementations, so I believe this
is correct. While coding up the algorithm I made some mistakes that the
unit tests caught, which makes me more confident in its correctness now.
Still, the performance improvements are surprisingly large.

Added a benchmark script. For example usage, run:

```
python benchmarks/bench_isotonic.py --iterations 10 --log_min_problem_size 2 --log_max_problem_size 8 --dataset logistic
```
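For a quick, independent sanity check of the scaling, one could also time scikit-learn's
public isotonic_regression helper directly. This is only a hypothetical sketch; the
problem sizes and the noisy-ramp data below are illustrative and are not the benchmark
script's logistic dataset:

```
import time
import numpy as np
from sklearn.isotonic import isotonic_regression

for n in (1000, 10000, 100000):
    rng = np.random.RandomState(0)
    # Noisy increasing trend: a reasonable input for an isotonic fit.
    y = rng.normal(size=n).cumsum() + rng.normal(scale=5.0, size=n)
    t0 = time.perf_counter()
    isotonic_regression(y)
    print("n=%d: %.4fs" % (n, time.perf_counter() - t0))
```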