SyTen
cuda_dot_cukrn.h File Reference

A custom, naive CUDA dot product for complex scalars.

#include <cstdint>
#include <complex>

Namespaces

namespace  syten
 Syten namespace.
 
namespace  syten::Cuda
 Support functions (memory allocation etc.) for CUDA-based GPUs.
 

Functions

void syten::Cuda::cuda_dot_conj_kernel_impl (std::size_t sz, const std::complex< double > *to_be_conj_a, const std::complex< double > *b, std::complex< double > *result, void *cuda_stream)
 Calculates the scalar product of two CUDA arrays.
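 The parameter name to_be_conj_a suggests that the first array is complex-conjugated before multiplication, i.e. that the result is the sum over i of conj(a[i]) * b[i]. The following is a minimal host-side sketch of that reading, which may be useful as a reference when checking GPU results. The function reference_dot_conj is hypothetical and not part of SyTen; it runs on the CPU and takes no CUDA stream.

```cpp
#include <complex>
#include <cstddef>

// Hypothetical CPU reference (not part of SyTen): computes
// sum_i conj(to_be_conj_a[i]) * b[i] over sz elements.
// Assumption: this matches the convention implied by the parameter
// name `to_be_conj_a` in cuda_dot_conj_kernel_impl.
std::complex<double> reference_dot_conj(std::size_t sz,
                                        const std::complex<double>* to_be_conj_a,
                                        const std::complex<double>* b)
{
    std::complex<double> acc(0.0, 0.0);
    for (std::size_t i = 0; i < sz; ++i) {
        // Conjugate the first operand, then accumulate the product.
        acc += std::conj(to_be_conj_a[i]) * b[i];
    }
    return acc;
}
```

 Comparing against such a reference should allow for a small floating-point tolerance, since a parallel reduction generally sums the terms in a different order than this sequential loop.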
 

Variables

constexpr std::size_t cukrn_dot_elems_per_worker = 4
 Number of elements each thread adds up in the dot kernel; 4 seems to be the optimum for a Tesla P100 in a real-world test.
 
constexpr std::size_t cukrn_dot_threads = 16
 Number of threads per thread block for the dot kernel; 16 seems to be the optimum for a Tesla P100 in a real-world test.
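 Taken together, the two constants imply that one thread block covers 16 * 4 = 64 elements. The sketch below illustrates the resulting arithmetic under the assumption that the kernel launches just enough blocks to cover all sz elements; the helper blocks_needed is hypothetical, and the actual launch configuration used by SyTen is not documented here.

```cpp
#include <cstddef>

// Values as documented for cuda_dot_cukrn.h.
constexpr std::size_t cukrn_dot_elems_per_worker = 4;
constexpr std::size_t cukrn_dot_threads = 16;

// Elements covered by one thread block: 16 threads * 4 elements each = 64.
constexpr std::size_t elems_per_block =
    cukrn_dot_threads * cukrn_dot_elems_per_worker;

// Hypothetical helper (an assumption, not SyTen's API): number of thread
// blocks needed to cover sz elements, rounding up.
constexpr std::size_t blocks_needed(std::size_t sz)
{
    return (sz + elems_per_block - 1) / elems_per_block;
}
```

 Under this assumption, an array of 1000 elements would need 16 blocks, with the last block only partially filled.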
 

Detailed Description

A custom, naive CUDA dot product for complex scalars.