[FEA] Support for NVIDIA_TF32_OVERRIDE environment variable + handle

**Is your feature request related to a problem? Please describe.**
I recently ran some brute-force KNN benchmarks with @tfeher. We compared the performance of brute-force KNN using `1 x tf32` versus `3 x tf32`. On a representative benchmark, `1 x tf32` gave a 2.5x speedup (5 seconds -> 2 seconds). This can be significant for certain workloads, but it cannot be made the default because the effects of the reduced numerical accuracy are unknown.

We ran into the problem of how to support this use case in our current pairwise distance API. We already have two distance types for the L2 distance (expanded and unexpanded). Adding variants for every possible way of speeding up the computation could become prohibitive. cuBLAS supports the `NVIDIA_TF32_OVERRIDE` environment variable, which can force `fp32` computations to be performed in `tfloat32` precision.

**Describe the solution you'd like**
Add support for the `NVIDIA_TF32_OVERRIDE` environment variable in the RAFT handle. This way, algorithms can query this option without having to continuously inspect the environment.

In addition, make it possible to set the tf32 override programmatically. For instance, PyTorch supports the [following](https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices): 

```python 
# The flag below controls whether to allow TF32 on matmul. 
torch.backends.cuda.matmul.allow_tf32 = True
```
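To make the request concrete, here is a minimal sketch of the intended behavior, written in Python only for brevity. Everything in it is hypothetical: `Handle`, `allow_tf32`, and `_read_tf32_override` are illustrative names, not part of the RAFT API. The sketch assumes that `1` and `0` are the only meaningful values of the variable, and it lets a programmatic setting win over the environment purely for illustration. Which of the two should take precedence is a design question this issue leaves open.

```python
import os
from typing import Mapping, Optional


def _read_tf32_override(environ: Mapping[str, str]) -> Optional[bool]:
    """Return True/False if NVIDIA_TF32_OVERRIDE is "1"/"0", else None."""
    value = environ.get("NVIDIA_TF32_OVERRIDE")
    if value is None:
        return None
    value = value.strip()
    if value == "1":
        return True
    if value == "0":
        return False
    return None  # assumption: unrecognized values are ignored


class Handle:
    """Hypothetical stand-in for a RAFT handle (not the real API)."""

    def __init__(self, environ: Mapping[str, str] = os.environ) -> None:
        # The environment is inspected once, at construction time.
        self._env_override = _read_tf32_override(environ)
        # None means "no programmatic setting has been made".
        self._allow_tf32: Optional[bool] = None

    @property
    def allow_tf32(self) -> bool:
        # Assumption: a programmatic setting takes precedence.
        if self._allow_tf32 is not None:
            return self._allow_tf32
        # Fall back to the cached environment value; unset means False.
        return bool(self._env_override)

    @allow_tf32.setter
    def allow_tf32(self, value: bool) -> None:
        self._allow_tf32 = bool(value)


# Usage: an explicit mapping stands in for the process environment.
handle = Handle(environ={"NVIDIA_TF32_OVERRIDE": "1"})
use_fast_path = handle.allow_tf32   # taken from the cached environment value
handle.allow_tf32 = False           # programmatic override
use_fast_path = handle.allow_tf32   # now reflects the override
```

With something along these lines, a distance kernel would only need to check the flag on the handle to choose between the `1 x tf32` and `3 x tf32` paths, and the pairwise distance API would not need extra distance types or boolean arguments.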

**Describe alternatives you've considered**
Adding another L2 distance type, which I think is unwise (and would not help in the case of cosine distance). Adding boolean flags to the pairwise distance API would also get messy.
