A custom, naive CUDA dot product for complex scalars.

#include <cstdint>
#include <complex>

[Figure: include dependency graph for cuda_dot_cukrn.h]

[Figure: graph of the files that directly or indirectly include this file]

Namespaces
namespace	syten
	Syten namespace.

namespace	syten::Cuda
	Support functions (memory allocation etc.) for CUDA-based GPUs.

Functions
void	syten::Cuda::cuda_dot_conj_kernel_impl (std::size_t sz, const std::complex< double > *to_be_conj_a, const std::complex< double > *b, std::complex< double > *result, void *cuda_stream)
	Calculates the scalar product of two CUDA arrays.
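
The names `cuda_dot_conj_kernel_impl` and `to_be_conj_a` suggest that the first array is complex-conjugated before the products are summed, i.e. the result is the sum over `i` of `conj(a[i]) * b[i]`. The following is a minimal host-side sketch of that quantity, under this assumption. `dot_conj_reference` is a hypothetical name and is not part of SyTen; it runs on ordinary host memory and only serves as a reference for what the GPU kernel is expected to compute.

```cpp
#include <complex>
#include <cstddef>

// Hypothetical host-side reference (not part of SyTen).
// Assumption: the kernel computes sum_i conj(to_be_conj_a[i]) * b[i].
std::complex<double> dot_conj_reference(std::size_t sz,
                                        const std::complex<double>* to_be_conj_a,
                                        const std::complex<double>* b) {
    std::complex<double> acc{0.0, 0.0};
    for (std::size_t i = 0; i < sz; ++i) {
        // Conjugate the first operand, then multiply and accumulate.
        acc += std::conj(to_be_conj_a[i]) * b[i];
    }
    return acc;
}
```

Such a reference can be used to check a GPU result on small inputs, for example by copying both arrays back to the host and comparing against `dot_conj_reference(sz, a, b)` within a floating-point tolerance.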

Variables
constexpr std::size_t	cukrn_dot_elems_per_worker = 4
	Number of elements each thread adds up in the `dot` kernel. A value of 4 seems to be the optimum for a Tesla P100 in a real-world test.

constexpr std::size_t	cukrn_dot_threads = 16
	Number of threads per thread block for the `dot` kernel. A value of 16 seems to be the optimum for a Tesla P100 in a real-world test.
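
To illustrate how these two constants could determine the work split, the sketch below emulates the partitioning on the host. It assumes that each worker (thread) sums a contiguous chunk of `cukrn_dot_elems_per_worker` elements and that the per-worker partial sums are combined afterwards; the actual kernel's indexing and reduction strategy are not documented here and may differ. The helpers `ceil_div`, `workers_needed`, `blocks_needed` and `dot_conj_partitioned` are hypothetical names introduced for this example.

```cpp
#include <algorithm>
#include <complex>
#include <cstddef>
#include <vector>

// Values as documented above (redeclared here so the sketch is standalone).
constexpr std::size_t cukrn_dot_elems_per_worker = 4;
constexpr std::size_t cukrn_dot_threads = 16;

// Hypothetical helper: integer division, rounding up.
constexpr std::size_t ceil_div(std::size_t n, std::size_t d) {
    return (n + d - 1) / d;
}

// Hypothetical: number of workers (threads) needed to cover sz elements.
constexpr std::size_t workers_needed(std::size_t sz) {
    return ceil_div(sz, cukrn_dot_elems_per_worker);
}

// Hypothetical: number of thread blocks needed for those workers.
constexpr std::size_t blocks_needed(std::size_t sz) {
    return ceil_div(workers_needed(sz), cukrn_dot_threads);
}

// Host emulation of the assumed partitioning: one partial sum per worker,
// followed by a sequential sum over all partial results.
std::complex<double> dot_conj_partitioned(std::size_t sz,
                                          const std::complex<double>* a,
                                          const std::complex<double>* b) {
    std::vector<std::complex<double>> partial(workers_needed(sz));
    for (std::size_t w = 0; w < partial.size(); ++w) {
        const std::size_t begin = w * cukrn_dot_elems_per_worker;
        const std::size_t end = std::min(begin + cukrn_dot_elems_per_worker, sz);
        for (std::size_t i = begin; i < end; ++i) {
            partial[w] += std::conj(a[i]) * b[i];
        }
    }
    std::complex<double> acc{0.0, 0.0};
    for (const auto& p : partial) {
        acc += p;
    }
    return acc;
}
```

Under these assumptions, an input of 1000 elements needs 250 workers and therefore 16 thread blocks of 16 threads each, with the last block only partially occupied.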

Detailed Description

A custom, naive CUDA dot product for complex scalars.

Namespaces