Update base CUDA image for CI to v10.0 cuDNN 7.3.1#14513
marcoabreu merged 2 commits into apache:master
I'd suggest updating to CUDA 10 instead, as was done in #12850.
@mxnet-label-bot add[pr-work-in-progress, Test, CI]
b0f3580 to e4e31c1
1aba52f to 29578b5
b7e824c to c01d5f7
Could you add more info to the description of the issue, like pointers to line numbers? It's not clear to me what the problem is.
Who uses the variable "CUDNN_VERSION", and when? When CUDA is loaded?
Look here
@larroy I've updated the PR description with the links and examples. Sorry about the confusion.
5480ed5 to 136443e
Trying CUDA 10.1, out of curiosity.
c78796f to 007ae76
Trying to put the failing tests on P3 instances, which should have a higher compute capability and maybe more functionality available to them.
I've split out the CUDNN_VERSION environment variable commit into its own PR (#14595) to see if it would get past CI as it currently is.
007ae76 to 724ea82
Updated the CI images to be based on cuda10-base, and I'm manually installing a lower version of cuDNN to test the theory that we may not be compatible with cuDNN 7.5.
47d3f6b to 7c0b80b
7c0b80b to 58a3be0
@mxnet-label-bot remove[pr-work-in-progress]
Nice work @perdasilva @lebeg @marcoabreu
* Updates Ubuntu GPU CI image base image to cuda10-devel and manually installs cuDNN version 7.3.1.20
* Updates CentOS 7 GPU CI image base image to cuda10-devel and manually installs cuDNN version 7.3.1.20
Description
I believe CI isn't reporting some GPU test issues because both the cuDNN version we set in the environment and the cuDNN version set by the image are lower than the version that some tests (e.g. the linked test) require.
So, as in the example linked above, the test just ensures that an error is raised. I don't know whether (m)any of the cuDNN functions are actually being tested right now, so bugs may be slipping through.
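To illustrate the concern, here is a minimal sketch of a version-gated test. It is not the actual MXNet test code: `UnsupportedCudnnVersion`, `run_cudnn_feature`, `check_feature`, and the version numbers passed in are hypothetical stand-ins. It only shows that when the reported cuDNN version is below the requirement, such a test passes without ever running the feature itself.

```python
class UnsupportedCudnnVersion(RuntimeError):
    """Hypothetical error raised when a feature needs a newer cuDNN."""


def run_cudnn_feature(cudnn_version, required_version):
    # Stand-in for an operator that needs at least `required_version`.
    if cudnn_version < required_version:
        raise UnsupportedCudnnVersion(
            "requires cuDNN >= %s, got %s" % (required_version, cudnn_version)
        )
    return "feature executed"


def check_feature(cudnn_version, required_version):
    """Mimic a version-gated test and report which branch was exercised."""
    if cudnn_version < required_version:
        # Version too low: the test only verifies that an error is raised.
        try:
            run_cudnn_feature(cudnn_version, required_version)
        except UnsupportedCudnnVersion:
            return "only the error path was checked"
        raise AssertionError("expected UnsupportedCudnnVersion")
    # Version high enough: the feature itself is exercised.
    assert run_cudnn_feature(cudnn_version, required_version) == "feature executed"
    return "feature path was checked"
```

With versions written as tuples such as `(7, 0, 3)`, a call with a reported version below the requirement returns the "error path" result, which is why a green CI run alone may say little about the cuDNN code paths.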
This PR bumps the base image for GPU instances to CUDA v10.0. It also updates the test functions to use the cuDNN version given by the CUDNN_VERSION environment variable, falling back to 7.0.3 only if nothing is set.
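The fallback behaviour described above could look roughly like the sketch below. This is an assumption about the shape of the change, not the code from this PR: the helper name `get_cudnn_version`, the optional `environ` parameter, and the choice to compare only the first three version components are all illustrative.

```python
import os

# Default named in the description; used only when the variable is unset or empty.
DEFAULT_CUDNN_VERSION = "7.0.3"


def get_cudnn_version(environ=None):
    """Return the cuDNN version as a tuple of ints, e.g. (7, 3, 1)."""
    environ = os.environ if environ is None else environ
    raw = environ.get("CUDNN_VERSION") or DEFAULT_CUDNN_VERSION
    # Keep the first three components so "7.3.1.20" compares as (7, 3, 1).
    return tuple(int(part) for part in raw.split(".")[:3])
```

Returning a tuple lets tests gate on versions with ordinary comparisons, for example `get_cudnn_version() >= (7, 1, 0)`, instead of comparing version strings.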