[WIP] Offload DG method to GPUs #1485
Conversation
Codecov Report
```
@@            Coverage Diff             @@
##             main    #1485      +/-   ##
==========================================
+ Coverage   88.81%   94.55%    +5.74%
==========================================
  Files         363      360        -3
  Lines       30172    29980      -192
==========================================
+ Hits        26796    28345     +1549
+ Misses       3376     1635     -1741
```
Flags with carried forward coverage won't be shown.
```toml
EllipsisNotation = "da5c29d0-fa7d-589e-88eb-ea29b0a81949"
FillArrays = "1a297f60-69ca-5386-bcde-b61e274b549b"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
GPUArrays = "0c68f7d7-f131-5f86-a1c3-88cf8149b2d7"
```
Change to GPUArraysCore.jl (see discussion on Julia Slack)
```julia
get_backend(::PtrArray) = CPU()

function get_array_type(backend::CPU)
    return Array
end
```
Those should inline anyways, but this might give the compiler even more motivation to do so
```diff
- get_backend(::PtrArray) = CPU()
- function get_array_type(backend::CPU)
-     return Array
- end
+ @inline get_backend(::PtrArray) = CPU()
+ @inline get_array_type(backend::CPU) = Array
```
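For context, here is a self-contained sketch of the suggested form. The `CPU` and `PtrArray` structs below are stand-ins defined only so the snippet runs on its own; in the PR they come from KernelAbstractions.jl and the StrideArrays ecosystem, respectively.

```julia
# Stand-in types; the real ones live in KernelAbstractions.jl and
# StrideArraysCore.jl.
struct CPU end
struct PtrArray end

# `@inline` is only a hint -- one-line accessors like these usually inline
# anyway, but the annotation makes the intent explicit.
@inline get_backend(::PtrArray) = CPU()
@inline get_array_type(::CPU) = Array
```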
```julia
tmp_u = copyto!(CPU(), allocate(CPU(), eltype(u), size(u)), u)
integrate(cons2cons, tmp_u, semi; normalize = normalize)
```
Just curious: is it not possible (or feasible) to execute integration on the GPU, or is it just not implemented yet?
Another question: if the array u already lives on the CPU, is this still a copy (I assume it is) or is it a no-op? If it forces a copy, we should consider dispatching on u, i.e., if it is our "CPU backend array type", we should keep the original implementation and only do the copy on non-CPU backends.
But this is really just something to keep in mind/store on a TODO list, not something that needs to be done right now
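One possible shape for that TODO item, sketched with hypothetical stand-in types (`CPUBackendArray`, `GPUBackendArray`, and `to_cpu` are illustrative names, not Trixi.jl API): dispatch on the array type so the CPU path stays copy-free.

```julia
abstract type AbstractBackendArray end

# Stand-in for "our CPU backend array type".
struct CPUBackendArray <: AbstractBackendArray
    data::Vector{Float64}
end
# Stand-in for a device array type (placeholder for device memory).
struct GPUBackendArray <: AbstractBackendArray
    data::Vector{Float64}
end

# Data already on the CPU: return it as-is, no copy.
to_cpu(u::CPUBackendArray) = u.data
# Any other backend: force a host copy before integrating.
to_cpu(u::AbstractBackendArray) = copy(u.data)
```

With something like this, the integration wrapper could call `to_cpu(u)` unconditionally and only pay for the copy on non-CPU backends.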
```julia
# jacobian_matrix[1, 2, :, :, element] = node_coordinates[1, :, :, element] * derivative_matrix' # x_η
# jacobian_matrix[2, 2, :, :, element] = node_coordinates[2, :, :, element] * derivative_matrix' # y_η

tmp_derivate_matrix = copyto!(CPU(), allocate(CPU(), eltype(derivative_matrix), size(derivative_matrix)), derivative_matrix)
```
```diff
- tmp_derivate_matrix = copyto!(CPU(), allocate(CPU(), eltype(derivative_matrix), size(derivative_matrix)), derivative_matrix)
+ tmp_derivative_matrix = copyto!(CPU(), allocate(CPU(), eltype(derivative_matrix), size(derivative_matrix)), derivative_matrix)
```
Here and elsewhere?
Probably superseded by #2590. Closing this here for now.
Offload some parts of the DG method to GPU accelerators.
TODO:
- `elixir_advection_basic.jl` with 2D tree mesh (be39bb8)
  - `u` and `du` integration variables (9a81463)
- `elixir_advection_basic.jl` with 2D p4est mesh:
  - Replace `Symbol` arrays with `Integer` arrays for GPU access (92736d8)
    - `node_indices`
    - `node_indices`
    - `node_indices`
  - `surface_flux_values` (2c6443e)
  - `contravariant_vectors` (2c6443e)
  - `inverse_jacobian` (2c6443e)
  - `derivative_dhat` (bd2696d)
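The `Symbol`-to-integer replacement in the list above can be illustrated with a minimal sketch. The `INDEX_CODES` table, the `encode_indices` helper, and the specific symbol names are all hypothetical here: GPU kernels need isbits element types, and `Symbol` is pointer-backed, so symbolic node indices get encoded as plain integers.

```julia
# Hypothetical encoding table; the actual mapping used in the PR may differ.
const INDEX_CODES = Dict(:begin => 1, :end => 2, :i_forward => 3, :i_backward => 4)

# Convert a tuple of symbolic indices into a GPU-friendly tuple of Ints.
encode_indices(syms::NTuple{N, Symbol}) where {N} = map(s -> INDEX_CODES[s], syms)
```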