Add WriteOnly type to replace giving &mut [u8] access to mapped buffers.#9042
Add WriteOnly type to replace giving &mut [u8] access to mapped buffers.#9042cwfitzgerald merged 2 commits intogfx-rs:trunkfrom
WriteOnly type to replace giving &mut [u8] access to mapped buffers.#9042Conversation
|
Ok I'm going off the generated docs so these won't be comments on the code but I have some questions. Overall trust your opinion more than mine.
|
As I mentioned in the PR description, there are a couple already. I think that if we put effort into separate versioning it would make sense to see if we can contribute to one of those efforts instead of publishing yet another variation — and if not, we get to tweak it to be maximally ergonomic for writing into
I’ve never seen such a situation, and if there was one, I’d expect it to be acting on a slice rather than a
Not just
|
madsmtm
left a comment
There was a problem hiding this comment.
Re in wgpu vs. outref/wref, an option in between could be to make a new crate (possibly in a separate gfx-rs repo?) - that'd make it possible to tune the API exactly for wgpu's needs, while also making it easier to test with Miri.
cwfitzgerald
left a comment
There was a problem hiding this comment.
This is insanely high quality stuff. I definitely think this is the right approach, have one nit. For miri, we could also enable just the tests from naga/wgpu/wgpu-core/wgpu-hal initially, as those don't require the GPU at all and don't take much time to run.
2f196ef to
448f31e
Compare
|
I stayed up entirely too late and I may regret this, but there are tests now, and I think this is basically ready for review, though definitely not ready for merging. |
8334940 to
0c6e5dc
Compare
| /// | ||
| /// [mapped]: Buffer#mapping-buffers | ||
| #[track_caller] | ||
| pub fn get_mapped_range(&self) -> BufferView { |
There was a problem hiding this comment.
Need to add (if one doesn’t already exist) a test checking that one cannot get a read-access BufferView to a write mapping. Or if that is required to be possible by the spec, then things still might be fine, but some of the documentation I added needs to be changed.
There was a problem hiding this comment.
Per exploratory testing, and discussion in maintainers meeting:
- Currently, you can
get_mapped_range()(read access) on aMapMode::Write - I am told that writes to write-combining memory can effectively "show up randomly later" to reads
- Therefore, it is unsound to expose
&[u8]pointing to write-combining memory- (Unless there is some kind of fence / cache flush that could be used? But we don't need that anyway.)
- Therefore, we must prohibit
get_mapped_range()working on that memory. - The simplest API surface we can provide is to say that
get_mapped_range()never works on aMapMode::Writemapping under any conditions.- This will also simplify the user-facing model because reading a write mapping never was guaranteed to give you meaningful data.
I will update this PR to include a commit that makes this change.
There was a problem hiding this comment.
I think it would be fine if the wgpu API is more restrictive in this case. get_mapped_range_mut doesn't even exist in the spec. This validation would have to only happen in wgpu though not in wgpu-core.
There was a problem hiding this comment.
Resolution from today’s maintainers meeting: This needs looking at, but can be done separately from this PR (which means this PR doesn’t fully fix #8897) so that we can get this API change moving.
It might be that we don’t need stricter validation and can just add a barrier, thus allowing reads as long as the caller transitions explicitly by a separate get_mapped_range(). (But does anyone actually need that?)
|
I don't think this completely fixes the issue. While the CPU can no longer read, writes can still reach the GPU out of order, so |
Would it be sufficient to ensure that, for each byte of the mapped memory, at most one thread writes to that memory? If so, then I think that can be addressed (unfortunately, further complicating the API) by having separate
(On the other hand, such a write mismatch is only a correctness issue and not a Rust UB issue, since the consequences do not reveal themselves to any Rust references, so we could tell the user “if you write the same memory more than once, do it from one thread” and leave it at that. But that’s pretty subtle to comprehend.) |
|
I'm not worried about that case - submit will already validate that the buffer is unmapped (and all mapped ranges are dropped) and creates a happens before relationship between operations on the cpu -> reads on the gpu. |
cwfitzgerald
left a comment
There was a problem hiding this comment.
One small nit and some merge conflicts, but then this is g2g, really amazing work here!
| // FIXME: Now that MAP_WRITE mappings are write-only, | ||
| // the “mut” and “immutable” terminology is incorrect. |
There was a problem hiding this comment.
Do we want to tackle this here? Otherwise should be an issue.
|
Updated and rebased. Should be ready to merge. Changes:
Issues to file for followup work:
|
`WriteOnly<'a, T>` is a custom smart pointer which is stricter than `&'a mut [T]` in that it only allows writing data and not reading it. This will be part of the fix for <gfx-rs#8897>. This commit only introduces the public type and does not use it.
When accessing a buffer for writing, do not expose (or internally create) any `&'a mut [u8]` references to that memory, so as to avoid exposing write-combining memory that might visibly misbehave in atomic operations. Instead, expose the new pointer type `WriteOnly<'a, [u8]>` which offers only write operations. This can be sliced and reborrowed, allowing it to be used in most ways which `&mut [u8]` would be, hopefully with near-zero overhead. This pointer type might also allow users to opt in to uninitialized memory for performance, while confining the consequences to GPU UB rather than Rust UB. However, that is a future possibility and is not implemented in this commit other than as a consideration in defining the API of the `WriteOnly` type. This change is not a complete solution to the problem of write-combining memory because it is still possible to read after writing by calling `get_mapped_range()` after `get_mapped_range_mut()`. However, that is solvable separately, perhaps with a barrier or a validation change, and is not particularly coupled with this part of the solution.
Reopened #8897 and commented for this.
Filed #9202 for this. |
wgpu update for v29. I have tested on macos m1, m5, and windows. Linux testing is appreciated. - [x] before merge, naga_oil and dlss_wgpu need to be published, and the patches referencing their respective PRs removed from the workspace Cargo.toml ##### other PRs - naga_oil: bevyengine/naga_oil#134 - dlss_wgpu: bevyengine/dlss_wgpu#27 ##### Source of relevant changes - `Dx12Compiler::DynamicDxc` no longer has `max_shader_model` - gfx-rs/wgpu#8607 - `Dx12BackendOptions::force_shader_model` comes from: - gfx-rs/wgpu#8984 - Allow optional `RawDisplayHandle` in `InstanceDescriptor` - gfx-rs/wgpu#8012 - Add `GlDebugFns` option to disable OpenGL debug functions - gfx-rs/wgpu#8931 - Add a DX12 backend option to force a certain shader model - gfx-rs/wgpu#8984 - Migrate validation from maxInterStageShaderComponents to maxInterStageShaderVariables - gfx-rs/wgpu#8652 - gaps are now supported in bind group layouts - gfx-rs/wgpu#9034 - depth validation changed to option to match spec - gfx-rs/wgpu#8840 - SHADER_PRIMITIVE_INDEX is now PRIMITIVE_INDEX - gfx-rs/wgpu#9101 - Support for binding arrays of RT acceleration structures - gfx-rs/wgpu#8923 - Make HasDisplayHandle optional in WindowHandle - gfx-rs/wgpu#8782 - `QueueWriteBufferView` can no longer be dereferenced to `&mut [u8]`, so use `WriteOnly`. - gfx-rs/wgpu#9042 - ~bevy_mesh currently has an added dependency on `wgpu`, can we move `WriteOnly` to wgpu-types?~ (it is in wgpu-types now) - Change max_*_buffer_binding_size type to match WebGPU spec (u32 -> u64) - gfx-rs/wgpu#9146 - raw vulkan init `open_with_callback` takes Limits as argument now - gfx-rs/wgpu#8756 ## Known Issues There is currently one known issue with occlusion culling on macos, which we've decided to disable on macos by checking the limits we actually require. This makes it so that if wgpu releases a patch fix, bevy 0.19 users will benefit from occlusion culling re-enabling for them. <details><summary>More details</summary> On macos, the wpgu limits were changed to align with the spec and now put the early and late GPU occlusion culling `StorageBuffer` limit at 8, but we currently use 9. [Filed in wgpu repo](gfx-rs/wgpu#9287) ``` 2026-03-19T01:37:10.771117Z ERROR bevy_render::error_handler: Caught rendering error: Validation Error Caused by: In Device::create_bind_group_layout, label = 'build mesh uniforms GPU late occlusion culling bind group layout' Too many bindings of type StorageBuffers in Stage ShaderStages(COMPUTE), limit is 8, count was 9. Check the limit `max_storage_buffers_per_shader_stage` passed to `Adapter::request_device` ``` </details> solari working on wgpu 29: <img width="1282" height="752" alt="image" src="https://github.com/user-attachments/assets/4744faec-65c0-4a72-93e1-34a721fc26d8" /> --------- Co-authored-by: atlv <email@atlasdostal.com>
Connections
Partially addresses #8897.
Description
When accessing a buffer for writing, do not expose (or internally create) any
&'a mut [u8]references to that memory, so as to avoid exposing write-combining memory that might visibly misbehave in atomic operations. Instead, expose a new pointer typeWriteOnly<'a, [u8]>which offers only write operations. This can be sliced and reborrowed, allowing it to be used in most ways which&mut [u8]would be, hopefully with near-zero overhead.This pointer type might also allow users to opt in to uninitialized memory for performance, while confining the consequences to GPU UB rather than Rust UB. However, that is a future possibility and is not implemented in this PR other than as a consideration in defining the API of the
WriteOnlytype.There are already published libraries that implement a write-only pointer type, such as
wref,outref; however, they are lacking in operations like chunking andIntoIteratorwhich I believe are necessary to provide reasonable ergonomics for our users’ use cases. An alternative to this PR’s unsafe code would be to contribute changes to one of those libraries to meet our needs.This PR does not completely fix #8897 because we still need to handle the case of calling
get_mapped_range_mut()and writing, followed by callingget_mapped_range()and reading. Also, we should consider renaming*_mut()methods to*_write()or something — but that is polish not necessary in this PR.Testing
This PR has near-complete test coverage for the
WriteOnlytype through a mix of doctests and#[test]s. These tests are run under Miri in a new CI job.Squash or Rebase?
Rebase
Checklist
cargo fmt.taplo format.cargo clippy --tests. If applicable, add:--target wasm32-unknown-unknowncargo xtask testto run tests.CHANGELOG.mdentry.