Efficient Rolling AUC-PR implementation #1543
|
Hey, great contribution! 😄 Could you provide some benchmarks to illustrate how much the rolling AUC calculation has sped up? |
|
Hey @AdilZouitine, thanks! My team and I are the authors of the paper mentioned. In the paper, we ran several experiments on various stream datasets, comparing our prequential algorithm with the batch version (as well as with scikit-learn's batch implementation). On average, with a window of size 1000, our algorithm was 13 times faster and used 12 times less energy than the batch algorithm. I will implement a simple stream experiment comparing the time taken to calculate the AUC-PR with our prequential algorithm and with the batch version. I'll send the link to the repository when I'm done 😃 |
|
Hey @AdilZouitine, the benchmarks (code and results) comparing the Rolling AUC-PR and the Batch AUC-PR are available in my benchmark-aucpr repository. The Rolling algorithm is the same as in this contribution, with some unused functions removed. In the benchmarks, the algorithms are called directly from C++, i.e., without Cython/Python. |
|
Hey! Just checking in on this PR, since it's been open for a while. Let me know if anything is blocking the review or merge; I'm happy to make any needed changes. @MaxHalford @smastelini @AdilZouitine
MaxHalford
@davidlpgomes let's go! I approve adding this to the library.
Could you just add an entry to unreleased.md before merging?
|
Hey @MaxHalford, thanks for your reply! I made the following changes:
Let me know if you need anything else! |
|
There we go! Thank you for all your patience ❤️ |
This PR adds a C++ implementation of the Prequential/Rolling AUC-PR, compiled with Cython.
It keeps a sliding window of size S and computes the exact (i.e., not approximated) AUC-PR over the last S instances seen.
Based on Gomes, Grégio, Alves, and Almeida, 2023.
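
For context, here is a minimal Python sketch of the naive batch baseline that a rolling metric like this is compared against: it keeps the last S instances and recomputes the score from scratch on every query. This is not the algorithm from this PR or the paper. The names `average_precision` and `NaiveRollingAUCPR` are hypothetical, and the step-wise average precision used to summarise the PR curve is an assumption; the interpolation used by the C++ code may differ.

```python
from collections import deque


def average_precision(pairs):
    """Step-wise area under the PR curve for (label, score) pairs.

    Assumption: the curve is summarised as sum((R_n - R_{n-1}) * P_n),
    with instances that share a score treated as a single threshold.
    """
    n_pos = sum(1 for label, _ in pairs if label)
    if n_pos == 0:
        return 0.0  # no positives in the window: recall is undefined

    ranked = sorted(pairs, key=lambda p: p[1], reverse=True)
    tp = fp = 0
    prev_recall = 0.0
    area = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][1]
        # Consume every instance tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][1] == threshold:
            if ranked[i][0]:
                tp += 1
            else:
                fp += 1
            i += 1
        recall = tp / n_pos
        precision = tp / (tp + fp)
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


class NaiveRollingAUCPR:
    """Hypothetical baseline: re-sorts the whole window on every get()."""

    def __init__(self, window_size):
        # deque(maxlen=S) silently drops the oldest instance once full.
        self.window = deque(maxlen=window_size)

    def update(self, y_true, y_score):
        self.window.append((bool(y_true), float(y_score)))

    def get(self):
        return average_precision(list(self.window))
```

Each `get()` call in this baseline costs O(S log S) because the window is re-sorted from scratch. That repeated work is the overhead a prequential implementation is meant to avoid.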