Update buffer reuse to use Columns as the TMEM unit instead of bytes #943
Closed
njriasan wants to merge 1 commit into facebookexperimental:main from
njriasan
commented
Feb 21, 2026
  int m = shapePerCTA[shapePerCTA.size() - 2];
  int numBuffers = shape.size() == 3 ? shape[0] : 1;
- int numColumn = ceil<int>(m, 32) * ceil<int>(k, 4) * numBuffers;
+ int numColumn = getTmemScalesColumnsPerBuffer(m, k) * numBuffers;
Contributor
Author
This refactoring is to share code with the dummy layout.
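As a rough illustration of the refactoring discussed above, the sketch below factors the per-buffer column count into a helper. It assumes the helper's body is simply the expression removed from the call site; the real `getTmemScalesColumnsPerBuffer` in the PR may differ. `ceilDiv` and `totalScaleColumns` are hypothetical names standing in for Triton's `ceil<int>` and for the call site.

```cpp
#include <cassert>

// Hypothetical stand-in for Triton's ceil<int>(a, b): ceiling division
// for positive integers.
constexpr int ceilDiv(int a, int b) { return (a + b - 1) / b; }

// Assumed body: the expression the diff removed from the call site.
// Returns the number of TMEM columns one scales buffer occupies.
constexpr int getTmemScalesColumnsPerBuffer(int m, int k) {
  return ceilDiv(m, 32) * ceilDiv(k, 4);
}

// Hypothetical call-site wrapper: total columns across all buffers.
constexpr int totalScaleColumns(int m, int k, int numBuffers) {
  return getTmemScalesColumnsPerBuffer(m, k) * numBuffers;
}

// Compile-time checks of the arithmetic.
static_assert(getTmemScalesColumnsPerBuffer(128, 4) == 4, "4 * 1");
static_assert(getTmemScalesColumnsPerBuffer(128, 6) == 8, "4 * 2");
static_assert(totalScaleColumns(128, 4, 3) == 12, "4 columns * 3 buffers");
```

Keeping the count in one helper means any other caller, such as the dummy layout mentioned in the comment, computes sizes in the same unit (columns) rather than re-deriving them from bytes.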
Contributor
Author
I hit this issue when trying to share buffers between scales and data in the MXFP8 kernel (it tried to pad too many columns). With this change, the kernel now works.
Contributor
htyu
approved these changes
Feb 23, 2026
  Value element, int64_t currentOffset, int64_t bytesBetweenBufferGroups,
  int64_t alignment, int64_t currentGroupSize,
- DenseMap<Value, std::tuple<int64_t, int64_t, int64_t>> &offsetMap) {
+ DenseMap<Value, std::tuple<int64_t, int64_t, int64_t>> &offsetMap,
Contributor
nit: add a comment about what the fields mean?
Contributor
Author
This is actually in the comment above the function (line 25).
d43e900 to
5975f18
njriasan added a commit to njriasan/fb-experimental-triton that referenced this pull request on Feb 25, 2026:
…facebookexperimental#943)
Summary: Fixes a bug where by failing to use columns as the fundamental unit of TMEM reuse/scaling scales will allocate too many buffers, breaking reuse.
Reviewed By: htyu
Differential Revision: D93986455
Pulled By: njriasan
5975f18 to
9cc003e
Contributor
tissue3 pushed a commit to tissue3/triton that referenced this pull request on Feb 27, 2026:
…acebookexperimental#943)
Summary: Fixes a bug where by failing to use columns as the fundamental unit of TMEM reuse/scaling scales will allocate too many buffers, breaking reuse.
Pull Request resolved: facebookexperimental#943
Reviewed By: htyu
Differential Revision: D93986455
Pulled By: njriasan
fbshipit-source-id: 4c7a68d6428b3143ba708f7332082ba4f35cde5b
Fixes a bug in which, because columns were not used as the fundamental unit of TMEM reuse/scaling, scales would allocate too many buffers, breaking reuse.