Rewrite the rfactor scheduling directive #8490
Conversation
Since the failing tests are no longer related to this code, I think it's worth getting a first pass of feedback. |
It would be helpful to add a bit of background about why the rewrite was necessary and how/why it's better now |
I added some to the PR description |
Gentle review ping |
Is there any meaningful fuzzing that we could use here to flush out corner cases? E.g. take a histogram, do some random splits and reorders of the vars and rvars, rfactor out some of the rvars, maybe do some more splits and reorders, compute the intermediate at some loop level of the reducing func, and then realize the whole thing and see if you get the right result. |
I think we'll hit this as I start to implement autoschedulers for PyTorch |
abadams left a comment:
Very nice. This is much more understandable.
Rewrite the rfactor scheduling directive.
The old implementation suffered from several serious issues: it duplicated substantial amounts of the logic in ApplySplit.cpp, it adapted the reduction predicate to the reducing func in an unprincipled way, and it confused dims and vars in a way that could segfault. It also left the order of pure dimensions unspecified; the new implementation follows the existing dims list.
This PR corrects all of these issues while also being shorter, better organized, and hopefully more maintainable.
Fixes #7854