Implementation for CUDA transposition kernels - proxy file to hide kernel header if unused.
#include "inc/util/cuda_span.h"
#include "inc/util/span.h"
#include "inc/util/scalars.h"
#include "inc/dense/cuda_transpose_impl_cukrn.h"
Namespaces
namespace syten
    Syten namespace.
namespace syten::CudaDenseTensorImpl
    Implementation namespace for CUDA dense tensors.
Functions
template<Rank rank, typename Scalar>
void syten::CudaDenseTensorImpl::cuda_transpose_kernel(CudaConstSpan<Scalar> inp, CudaMutSpan<Scalar> out, ConstSpan<IndexNumber> perm, ConstSpan<Index> dim, Conj do_conj)
    Wrapper around the CUDA transpose kernels which sets everything up so that the functions in cuda_transpose_impl_cukrn.h only have to launch the kernels.
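The following host-side sketch illustrates the kind of index bookkeeping such a wrapper might prepare before a kernel launch. It is not the SyTen implementation: reference_transpose and maybe_conj are hypothetical names, std::vector and std::array stand in for the span types, and a bool stands in for Conj. The sketch assumes that output leg i corresponds to input leg perm[i] and that the first index runs fastest; neither convention is confirmed by this page.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helpers: conjugate complex scalars, pass real scalars through.
template <typename T>
T maybe_conj(const T& x) { return x; }
template <typename T>
std::complex<T> maybe_conj(const std::complex<T>& x) { return std::conj(x); }

// Hypothetical CPU reference for a rank-`rank` transpose.
// Assumptions (not taken from SyTen): output leg i is input leg perm[i],
// and the first index runs fastest (column-major layout).
template <std::size_t rank, typename Scalar>
void reference_transpose(const std::vector<Scalar>& inp,
                         std::vector<Scalar>& out,
                         const std::array<std::size_t, rank>& perm,
                         const std::array<std::size_t, rank>& dim,
                         bool do_conj) {
  // Strides of the input tensor in its own index order.
  std::array<std::size_t, rank> in_stride{};
  std::size_t total = 1;
  for (std::size_t i = 0; i < rank; ++i) {
    in_stride[i] = total;
    total *= dim[i];
  }
  assert(inp.size() == total && out.size() == total);

  // For each output leg: its extent and the input stride it walks along.
  std::array<std::size_t, rank> out_dim{};
  std::array<std::size_t, rank> walk{};
  for (std::size_t i = 0; i < rank; ++i) {
    out_dim[i] = dim[perm[i]];
    walk[i] = in_stride[perm[i]];
  }

  // Decompose every linear output offset into a multi-index and gather
  // the matching input element. A GPU kernel would typically assign one
  // thread per output offset instead of looping.
  for (std::size_t o = 0; o < total; ++o) {
    std::size_t rem = o;
    std::size_t src = 0;
    for (std::size_t i = 0; i < rank; ++i) {
      src += (rem % out_dim[i]) * walk[i];
      rem /= out_dim[i];
    }
    out[o] = do_conj ? maybe_conj(inp[src]) : inp[src];
  }
}
```

Under these assumptions, calling reference_transpose<2> with dim = {2, 3} and perm = {1, 0} turns a 2x3 matrix into its 3x2 transpose. Precomputing out_dim and walk on the host keeps the per-element work down to a few integer operations, which is the sort of setup a wrapper can do once so that the kernel itself stays simple.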