Support functions for CUDA-based GPUs.
Classes
class | syten::Cuda::CudaPtr< T > |
Implementation for syten::CudaPtr.
class | syten::Cuda::CudaStream |
Implementation for syten::CudaStream.
Namespaces
namespace | syten |
Syten namespace.
namespace | syten::Cuda |
Support functions (memory allocation etc.) for CUDA-based GPUs.
Typedefs
template<typename T >
using | syten::CudaPtr = Cuda::CudaPtr< T > |
A pointer to a CUDA-allocated region (if SYTEN_USE_CUDA is true) or a host-allocated memory area which knows its CUDA device.
using | syten::CudaStream = Cuda::CudaStream |
Represents a CUDA stream.
Functions
std::uint16_t | syten::Cuda::allocator_get_max_size () |
Returns the log2 of the maximal block size of the CUDA allocator.
std::uint16_t | syten::Cuda::allocator_get_min_size () |
Returns the log2 of the minimal block size of the CUDA allocator.
void | syten::Cuda::allocator_print_status () |
Prints the status of the CUDA allocator.
std::uint16_t | syten::Cuda::allocator_set_max_size (std::uint16_t sz) |
Sets the log2 of the maximal block size of the CUDA allocator.
std::uint16_t | syten::Cuda::allocator_set_min_size (std::uint16_t sz) |
Sets the log2 of the minimal block size of the CUDA allocator.
bool | syten::Cuda::cuda_compiled () |
Returns true if CUDA support is compiled in.
bool | syten::Cuda::cuda_enabled () |
Returns true if the list of allowed devices is not empty.
std::string | syten::Cuda::version () |
Returns a version string describing the current CUDA version/compilation/enablement.
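The allocator getters and setters above exchange block sizes as base-2 exponents rather than as plain sizes. The following is a minimal sketch of the conversion in both directions. `block_bytes` and `log2_ceil` are hypothetical helpers that are not part of syten, and the assumption that block sizes are measured in bytes is ours, since the listing does not state the unit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Number of value bits in std::size_t (64 on common 64-bit platforms).
constexpr int size_bits = std::numeric_limits<std::size_t>::digits;

// Hypothetical helper (not part of syten): turns a log2 block size, such as
// the value returned by allocator_get_max_size(), into a plain size.
// Assumption: the unit is bytes.
inline std::size_t block_bytes(std::uint16_t log2_size) {
    assert(log2_size < size_bits);  // a larger shift would be undefined
    return std::size_t{1} << log2_size;
}

// Hypothetical helper: smallest exponent e with 2^e >= bytes, i.e. a
// candidate argument for allocator_set_min_size() or allocator_set_max_size().
inline std::uint16_t log2_ceil(std::size_t bytes) {
    const std::uint16_t max_exp = size_bits - 1;
    std::uint16_t e = 0;
    while (e < max_exp && (std::size_t{1} << e) < bytes) {
        ++e;
    }
    return e;
}
```

Under these assumptions, an exponent of 20 corresponds to blocks of 1 MiB, and a request of 1000 bytes rounds up to the exponent 10.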
Allowed devices and allocation logic
std::int16_t | syten::Cuda::get_alloc_device () |
Returns the device ID of the next allocation device to use.
Vec< std::int16_t > const & | syten::Cuda::get_allowed_devices () |
Returns a list of allowed devices.
void | syten::Cuda::setup () |
Sets up CUDA to allow all existing devices and generates the associated cuBLAS handles for the calling thread.
void | syten::Cuda::setup (Vec< std::int16_t > const &devices) |
Sets up CUDA to allow the specified devices and generates the associated cuBLAS handles for the calling thread.
Memory management functions.
template<typename T >
CudaPtr< T > | syten::Cuda::alloc (std::size_t num) |
Allocates enough space for num objects of type T on the next CUDA allocation device and returns an appropriate CudaPtr.
CudaPtr< void > | syten::Cuda::alloc (std::size_t sz) |
Allocates sz bytes on the next CUDA allocation device.
CudaPtr< void > | syten::Cuda::alloc_on_device (std::size_t sz, std::int16_t device) |
Allocates sz bytes on the specified CUDA allocation device.
template<typename T >
CudaPtr< T > | syten::Cuda::alloc_on_device (std::size_t sz, std::int16_t device) |
Allocates enough space for sz objects of type T on the specified CUDA device and returns an appropriate CudaPtr.
template<typename T >
void | syten::Cuda::free (CudaPtr< T > ptr) |
Frees the allocation pointed to by ptr.
void | syten::Cuda::free (CudaPtr< void > ptr) |
Frees the allocation pointed to by ptr.
template<typename T >
std::int16_t | syten::Cuda::host_device (CudaPtr< T > v) |
Returns the device hosting the allocation pointed to by v.
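The free overloads above take their CudaPtr by value, and nothing in this listing says whether a CudaPtr releases its allocation on destruction. Assuming it does not, callers may want to pair each alloc with a free through a scope guard. The sketch below shows that pattern in plain C++; `ScopedAllocation` and `make_scoped` are hypothetical names that are not part of syten, and the guard is deliberately generic so that it compiles without the library.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not part of syten): stores a handle and hands it to a
// release callable exactly once, when the active guard is destroyed.
template <typename Handle, typename Release>
class ScopedAllocation {
public:
    ScopedAllocation(Handle h, Release r)
        : handle_(std::move(h)), release_(std::move(r)), active_(true) {}

    // Moving transfers the duty to release; the source becomes inactive.
    ScopedAllocation(ScopedAllocation&& other)
        : handle_(std::move(other.handle_)),
          release_(std::move(other.release_)),
          active_(other.active_) {
        other.active_ = false;
    }

    ScopedAllocation(ScopedAllocation const&) = delete;
    ScopedAllocation& operator=(ScopedAllocation const&) = delete;
    ScopedAllocation& operator=(ScopedAllocation&&) = delete;

    ~ScopedAllocation() {
        if (active_) {
            release_(handle_);
        }
    }

    // Access to the stored handle; using a moved-from guard is a bug.
    Handle const& get() const {
        assert(active_);
        return handle_;
    }

private:
    Handle handle_;
    Release release_;
    bool active_;
};

// Deduces both template arguments from the call.
template <typename Handle, typename Release>
ScopedAllocation<Handle, Release> make_scoped(Handle h, Release r) {
    return ScopedAllocation<Handle, Release>(std::move(h), std::move(r));
}
```

With syten available, one would presumably pass the result of `syten::Cuda::alloc<double>(n)` as the handle and a lambda that forwards to `syten::Cuda::free` as the release callable; that combination is untested here.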
Copying objects and arrays.
template<typename T >
void | syten::Cuda::copy (const T *src, CudaPtr< T > dst, std::size_t num) |
Copies num objects of type T from src to dst.
template<typename T >
void | syten::Cuda::copy (const T *src, CudaPtr< T > dst, std::size_t num, CudaStream const &str) |
Copies num objects of type T from src to dst within stream str.
template<typename T >
void | syten::Cuda::copy (const T *src, T *dst, std::size_t num) |
Copies num objects of type T from src to dst.
template<typename T >
void | syten::Cuda::copy (const T *src, T *dst, std::size_t num, CudaStream const &str) |
Copies num objects of type T from src to dst within stream str.
void | syten::Cuda::copy (CudaPtr< const char > src, CudaPtr< char > dst, std::size_t num) |
Copies num bytes from src to dst.
void | syten::Cuda::copy (CudaPtr< const char > src, CudaPtr< char > dst, std::size_t num, CudaStream const &str) |
Copies num bytes from src to dst within stream str.
template<typename T >
void | syten::Cuda::copy (CudaPtr< const typename IdentityType< T >::type > src, CudaPtr< T > dst, std::size_t num) |
Copies num objects of type T from src to dst.
template<typename T >
void | syten::Cuda::copy (CudaPtr< const typename IdentityType< T >::type > src, CudaPtr< T > dst, std::size_t num, CudaStream const &str) |
Copies num objects of type T from src to dst within stream str.
template<typename T >
void | syten::Cuda::copy (CudaPtr< const typename IdentityType< T >::type > src, T *dst, std::size_t num) |
Copies num objects of type T from src to dst.
template<typename T >
void | syten::Cuda::copy (CudaPtr< const typename IdentityType< T >::type > src, T *dst, std::size_t num, CudaStream const &str) |
Copies num objects of type T from src to dst within stream str.
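Several copy overloads above declare their source as CudaPtr< const typename IdentityType< T >::type >. Assuming IdentityType is the usual identity trait (its definition is not shown in this listing), this places src in a non-deduced context, so T is deduced from dst alone and implicit conversions can still apply to src. The sketch below reproduces the pattern on the host. `Ptr`, `host_copy` and `round_trip` are hypothetical stand-ins that do not use or reimplement the syten classes.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Assumed definition: a trait that simply names its argument again.
template <typename T>
struct IdentityType {
    using type = T;
};

// Hypothetical host-only stand-in for a typed device pointer.
template <typename T>
struct Ptr {
    T* raw = nullptr;

    Ptr() = default;
    explicit Ptr(T* p) : raw(p) {}

    // Implicit Ptr<U> -> Ptr<const U>, mirroring U* -> const U*.
    template <typename U,
              typename = typename std::enable_if<
                  !std::is_const<U>::value &&
                  std::is_same<T, const U>::value>::type>
    Ptr(Ptr<U> const& other) : raw(other.raw) {}
};

// T is deduced from dst only. src sits in a non-deduced context, so a
// Ptr<double> argument is converted to Ptr<const double> after deduction.
template <typename T>
void host_copy(Ptr<const typename IdentityType<T>::type> src, Ptr<T> dst,
               std::size_t num) {
    assert(num == 0 || (src.raw != nullptr && dst.raw != nullptr));
    for (std::size_t i = 0; i != num; ++i) {
        dst.raw[i] = src.raw[i];
    }
}

// Copies three values between two host buffers through the stand-in type.
inline std::vector<double> round_trip() {
    std::vector<double> src{1.0, 2.0, 3.0};
    std::vector<double> dst(src.size(), 0.0);
    Ptr<double> s(src.data());
    Ptr<double> d(dst.data());
    host_copy(s, d, src.size());  // T = double, taken from d alone
    return dst;
}
```

Had src been declared as `Ptr<const T>` directly, the call with two `Ptr<double>` arguments would fail to compile, because deduction from the first argument would not succeed.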
Device management and synchronisation.
Pair< Size, Size > | syten::Cuda::mem_status () |
Returns a pair of free and total device memory.
void | syten::Cuda::select_device (std::int16_t device) |
Selects the specified device for the current thread.
void | syten::Cuda::synchronise () |
Synchronises with the current device.
Support functions for CUDA-based GPUs.
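As a small illustration of how the result of mem_status might be consumed, the sketch below computes the fraction of device memory in use from a (free, total) pair. It assumes that Pair behaves like std::pair, that Size behaves like std::size_t, and that the first element is the free amount, following the order in the description; `used_fraction` is a hypothetical helper that is not part of syten.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper (not part of syten): fraction of memory in use, given
// a (free, total) pair in the order described for mem_status().
inline double used_fraction(std::pair<std::size_t, std::size_t> free_total) {
    const std::size_t free_mem = free_total.first;
    const std::size_t total_mem = free_total.second;
    if (total_mem == 0) {
        return 0.0;  // no device memory reported, nothing can be in use
    }
    assert(free_mem <= total_mem);
    return 1.0 - static_cast<double>(free_mem) / static_cast<double>(total_mem);
}
```

For example, 256 free units out of 1024 give a used fraction of 0.75.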