Description
NVIDIA cuBLASMp is a high-performance, multi-process, GPU-accelerated library for distributed basic dense linear algebra.
cuBLASMp is compatible with the 2D block-cyclic data layout and provides PBLAS-like C APIs.
By downloading and using this package, you accept the terms and conditions of the associated license(s).
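For readers unfamiliar with the 2D block-cyclic layout mentioned above, the following is a minimal sketch of the index arithmetic along one dimension of the process grid. It is a general illustration of the layout, not part of the cuBLASMp API: the function names `bc_owner`, `bc_local_index`, and `bc_local_count` are hypothetical, and the sketch assumes the first block is stored on process 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers illustrating 1D block-cyclic index arithmetic.
 * Assumption: the first block lives on process 0 of the grid dimension. */

/* Process coordinate that owns global index g. */
size_t bc_owner(size_t g, size_t block, size_t nprocs) {
    assert(block > 0 && nprocs > 0);
    return (g / block) % nprocs;
}

/* Position of global index g inside the owning process's local storage. */
size_t bc_local_index(size_t g, size_t block, size_t nprocs) {
    assert(block > 0 && nprocs > 0);
    return (g / (block * nprocs)) * block + g % block;
}

/* How many of the n global indices are stored on process p. */
size_t bc_local_count(size_t n, size_t block, size_t nprocs, size_t p) {
    assert(block > 0 && nprocs > 0 && p < nprocs);
    size_t full_blocks = n / block;                 /* complete blocks overall */
    size_t count = (full_blocks / nprocs) * block;  /* full rounds of the cycle */
    size_t extra = full_blocks % nprocs;            /* leftover complete blocks */
    if (p < extra) {
        count += block;                             /* one more complete block */
    } else if (p == extra) {
        count += n % block;                         /* trailing partial block */
    }
    return count;
}
```

In two dimensions the same arithmetic is applied independently to rows and columns: under these assumptions, element (i, j) of a matrix distributed with row block size `mb` and column block size `nb` over a P x Q process grid is stored on process (`bc_owner(i, mb, P)`, `bc_owner(j, nb, Q)`).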