
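Chapter 11 torch.cuda

This package adds support for CUDA tensor types, which implement the same functions as CPU tensors but run their computations on the GPU. It is lazily initialized, so you can import it at any time and call is_available() to determine whether the system supports CUDA. The CUDA semantics notes contain more details on working with CUDA.

torch.cuda.current_blas_handle()
Returns a cublasHandle_t pointer to the current cuBLAS handle.

torch.cuda.current_device()
Returns the index of the currently selected device.

torch.cuda.current_stream()
Returns the currently selected Stream.

class torch.cuda.device(idx)
Context manager that changes the selected device.
Parameters:
- idx (int) – index of the device to select. If this argument is negative, the context manager is a no-op.

torch.cuda.device_count()
Returns the number of available GPUs.

class torch.cuda.device_of(obj)
Context manager that changes the current device to that of the given object. Both tensors and storages can be passed as arguments. If the given object is not allocated on a GPU, this is a no-op.
Parameters:
- obj (Tensor or Storage) – object allocated on the device to select.

torch.cuda.is_available()
Returns a bool indicating whether CUDA is currently available.

torch.cuda.set_device(device)
Sets the current device. Using this function is discouraged; in most cases it is better to use the CUDA_VISIBLE_DEVICES environment variable.
Parameters:
- device (int) – the device to select. If this argument is negative, the function is a no-op.

torch.cuda.stream(stream)
Context manager that selects a given stream. All CUDA kernels queued within its context are enqueued on the selected stream.
Parameters:
- stream (Stream) – the stream to select. If it is None, the context manager is a no-op.

torch.cuda.synchronize()
Waits for all kernels in all streams on the current device to complete.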
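The query functions, the device context manager, and the stream context manager described above can be combined roughly as follows. This is a minimal sketch, not a reference implementation: the helper name cuda_summary and the dictionary keys are made up for illustration, the guarded import is only there so the snippet also runs where PyTorch is not installed, and the device="cuda" keyword assumes a reasonably recent PyTorch release.

```python
try:
    import torch  # assumption: PyTorch may or may not be installed
except ImportError:
    torch = None


def cuda_summary():
    """Hypothetical helper: collect basic CUDA facts via torch.cuda."""
    if torch is None or not torch.cuda.is_available():
        # No PyTorch or no usable GPU: report that and stop.
        return {"available": False, "device_count": 0, "current_device": None}

    info = {
        "available": True,
        "device_count": torch.cuda.device_count(),
        "current_device": torch.cuda.current_device(),
    }

    # Temporarily select the last device; the previous one is restored on exit.
    last = info["device_count"] - 1
    with torch.cuda.device(last):
        side_stream = torch.cuda.Stream()  # a new stream on the selected device
        with torch.cuda.stream(side_stream):
            # Kernels launched here are queued on side_stream.
            x = torch.ones(3, device="cuda")
            y = x * 2
        # Wait for every stream on the selected device to finish.
        torch.cuda.synchronize()
        info["tensor_device"] = y.get_device()
        info["sum"] = float(y.sum())

    return info


summary = cuda_summary()
print(summary)
```

On a machine without a GPU the helper simply reports that CUDA is unavailable, which mirrors the advice above to check is_available() before touching any device.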

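Communication collectives

torch.cuda.comm.broadcast(tensor, devices)
Broadcasts a tensor to a number of GPUs.
Parameters:
- tensor (Tensor) – the tensor to broadcast.
- devices (Iterable) – an iterable of devices among which to broadcast. Note that it should have the form (src, dst1, dst2, ...), where the first element is the source device of the broadcast.
Returns: a tuple containing copies of the tensor, placed on the devices corresponding to the indices in devices.

torch.cuda.comm.reduce_add(inputs, destination=None)
Sums tensors from multiple GPUs. All inputs should have matching shapes.
Parameters:
- inputs (Iterable[Tensor]) – an iterable of tensors to add.
- destination (int, optional) – the device on which the output will be placed (default: the current device).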
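A broadcast followed by reduce_add might look like the sketch below. The function name broadcast_then_sum is hypothetical, the import is guarded so the snippet still runs without PyTorch, and the calls follow the signatures listed above; details may differ between PyTorch releases, so treat it as an illustration under those assumptions.

```python
try:
    import torch
    import torch.cuda.comm as comm
except ImportError:
    torch = None  # assumption: fall back gracefully when PyTorch is missing


def broadcast_then_sum():
    """Hypothetical helper: copy a tensor to every GPU, then add the copies."""
    if torch is None or not torch.cuda.is_available():
        return None  # nothing to demonstrate without a GPU

    devices = list(range(torch.cuda.device_count()))

    # The source tensor lives on the first device in the list.
    src = torch.arange(4, dtype=torch.float32, device=f"cuda:{devices[0]}")

    # One copy per device, in the same order as `devices`.
    copies = comm.broadcast(src, devices)

    # Sum the copies and place the result on the first device.
    total = comm.reduce_add(copies, destination=devices[0])
    return total.cpu().tolist(), len(devices)


result = broadcast_then_sum()
print(result)
```

Because every copy holds the same values, each element of the sum should equal the original value multiplied by the number of GPUs, which makes the round trip easy to check by eye.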
