Overall Idea

Configuration takes place via the make.inc file. This file is included by the Makefile, which in turn defines the possible targets.

To get started, copy the file make.inc.example to make.inc and make any changes necessary. You will have to provide three paths to writable external directories which are contained in your LD_LIBRARY_PATH, PATH and PYTHONPATH environment variables respectively.
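
A quick way to confirm this setup before building is to test each directory against its environment variable. The following is a minimal sketch and not part of the toolkit; the prefix $HOME/syten-ext and its lib, bin and python subdirectories are hypothetical placeholders for whatever paths you entered in make.inc.

```shell
#!/bin/sh
# Succeeds if directory $1 is an entry of the colon-separated list $2.
in_path_list() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical install prefix; substitute the directories from your make.inc.
prefix="$HOME/syten-ext"

for pair in "lib:${LD_LIBRARY_PATH:-}" "bin:${PATH:-}" "python:${PYTHONPATH:-}"; do
    dir="$prefix/${pair%%:*}"   # text before the first colon: subdirectory name
    list="${pair#*:}"           # text after the first colon: the variable's value
    if in_path_list "$dir" "$list" && [ -w "$dir" ]; then
        echo "ok:      $dir"
    else
        echo "missing: $dir (not in the variable, or not writable)"
    fi
done
```

Any directory reported as missing should be created, or added to the corresponding variable, before running make.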

You will also have to provide the correct compiler/linker options to use LAPACKE and BLAS. Examples are provided in the file.

You can then call make -jN copy to compile and install all binaries and the Python bindings. On the first call, the Makefile will check whether you have a working installation of Boost and liblz4. If not, it will download and install them. If you have an installation in a nonstandard path, specify this in the make.inc file by adding to the CPPFLAGS_EXTRA or LDFLAGS_EXTRA variables.
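
For a library installed under a nonstandard prefix, the additions to make.inc might look like the output of the sketch below. The -I and -L flags are the usual compiler and linker search-path options. The include/ and lib/ layout and the prefix /opt/boost-1.67 are assumptions for illustration only.

```shell
# Print make.inc lines that add an installation prefix ($1) to the
# compiler and linker search paths. The include/ and lib/ layout below
# the prefix is an assumption; adjust it to your installation.
extra_path_flags() {
    prefix=$1
    printf 'CPPFLAGS_EXTRA+=-I%s/include\n' "$prefix"
    printf 'LDFLAGS_EXTRA+=-L%s/lib\n' "$prefix"
}

# Hypothetical Boost prefix; append the output to make.inc once it looks right.
extra_path_flags /opt/boost-1.67
```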

The example file also offers you some additional toggles which you may find useful.

Available Targets

Among others, the Makefile defines the following targets:

make doc: to generate the documentation
make compile: to compile everything without installing
make comptest: to compile everything and run all tests
make copy: to compile and install all executable binaries and the Python bindings
make bin/syten-dmrg--copy: (example) to compile and copy the syten-dmrg binary
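
The per-binary target in the last entry can be wrapped in a small helper when only a few tools need rebuilding. This is a sketch: the source gives bin/syten-dmrg--copy as an example, so applying the same pattern to other tool names is an assumption, and copy_tool is a hypothetical name.

```shell
# Build and install a single tool via the "bin/<tool>--copy" target.
# Generalising the syten-dmrg example to other tool names is an assumption.
copy_tool() {
    tool=$1
    jobs=${2:-1}   # number of parallel make jobs, defaults to 1
    make -j"$jobs" "bin/${tool}--copy"
}

# Usage, from the toolkit's top-level directory:
#   copy_tool syten-dmrg 4
```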

Detailed Requirements

Required Libraries and other tools

Either LAPACKE and BLAS or the Intel MKL. If you have the MKL available, using it is strongly recommended, and it is particularly easy to use with the Intel compiler. Otherwise, you should install LAPACK, LAPACKE and BLAS, e.g. in the OpenBLAS version.
Python3 development headers are required, e.g. in the libpython3-dev package on Debian. The python3-config executable needs to be in your path.
Boost is used for serialisation and deserialisation, argument parsing and multiprecision floating point values. Version 1.67 works. If not available, the Makefile will download and install a version automatically.
liblz4 is used for compression. A version comes bundled with the toolkit, but it is also possible to use the system-provided version.
Doxygen can be used to generate documentation. Version 1.8.11 works flawlessly, 1.8.6 behaves a bit weirdly.
tcmalloc can and should be used with the toolkit if possible.
CMake is used by the fmt library. Install it using the module or package management system.
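
Before the first build it can help to check which of the command-line tools named above are reachable through PATH. A minimal sketch follows; it only tests for the presence of the executables and does not check their versions.

```shell
# Succeeds if the named executable can be found via PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# python3-config and cmake are needed for the build; doxygen only for "make doc".
for tool in python3-config cmake doxygen; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```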

Suitable Compilers

GCC versions 8.2 and up should work seamlessly. Earlier GCC versions are no longer supported; the last "working" commits are tagged as pre-C++17 (for GCC < 7.2) and pre-C++20 (for GCC < 8.2) respectively. Hence, if you have an older compiler, check out one of these tags, e.g. $ git checkout pre-C++17. When using GCC 9.1 or later with the Intel MKL, make sure to link using the GNU OpenMP libraries. Linking with the Intel OpenMP libraries (as was advisable before) will lead to wrong results because, among other problems, some OpenMP loops are simply not executed.
Clang 8.0.0 should work if you let it use its own standard library, though then you will also have to compile Boost using this library. You may have to disable some warnings (see make.inc.example).
Clang 9.0.0 should also work with the GCC 8.2 standard library. You may have to disable some warnings (see make.inc.example).
The Intel ICC compiler version 19.0.1 does not yet support many of the modern C++ features used in the code.
Both Intel ICC and Clang will work if you use the pre-C++17 tagged commit in git.
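
The GCC version thresholds above translate into a simple rule for choosing a git tag. The helper below is a sketch of that rule; suggest_tag is a hypothetical name, and the major and minor version numbers are passed in by hand (e.g. as reported by g++ --version).

```shell
# Print the git tag to check out for a given GCC major ($1) and minor ($2)
# version, or nothing if the current code should build with that compiler.
suggest_tag() {
    major=$1
    minor=$2
    if [ "$major" -lt 7 ] || { [ "$major" -eq 7 ] && [ "$minor" -lt 2 ]; }; then
        echo "pre-C++17"
    elif [ "$major" -lt 8 ] || { [ "$major" -eq 8 ] && [ "$minor" -lt 2 ]; }; then
        echo "pre-C++20"
    fi
}

# Example: GCC 7.4 is older than 8.2 but not older than 7.2.
suggest_tag 7 4
```

If the function prints a tag, pass it to git checkout; if it prints nothing, no older commit is needed.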

tcmalloc Installation

tcmalloc is a very fast allocator; using it can speed up calculations, especially small ones, by 10-20%. However, it is not a strict requirement: the standard allocator works equally well, just slower.

If it is already installed on your system, you can just add the following line to your make.inc file (see below):

LDFLAGS_EXTRA+=-ltcmalloc_minimal

Otherwise, you will need to manually download and extract the .so file. On Debian/Ubuntu, for example, apt-get download libtcmalloc-minimal4 can be used to acquire the package, which can then be extracted with dpkg -x. You then only have to copy the file from (e.g.) ./usr/lib/libtcmalloc_minimal.so.4.1.2 to e.g. the EXT_LIBDIR.
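
The copy step can be scripted so that it does not depend on the exact library version or subdirectory inside the package. The following is a sketch under stated assumptions: install_tcmalloc_from_tree is a hypothetical helper, and the directory names in the usage comment are placeholders.

```shell
# Copy every libtcmalloc_minimal.so* file found below an extracted package
# tree ($1) into a destination directory ($2), e.g. your EXT_LIBDIR.
install_tcmalloc_from_tree() {
    tree=$1
    dest=$2
    mkdir -p "$dest" &&
    find "$tree" -name 'libtcmalloc_minimal.so*' -exec cp -P {} "$dest"/ \;
}

# Usage on Debian/Ubuntu (directory names are placeholders):
#   apt-get download libtcmalloc-minimal4
#   dpkg -x libtcmalloc-minimal4_*.deb ./tcmalloc-extracted
#   install_tcmalloc_from_tree ./tcmalloc-extracted "$HOME/syten-ext/lib"
```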

Note that linking in tcmalloc will lead to very confusing errors from the Pyten module. If you want to use that module and your code is not error-free, it may be better not to link in tcmalloc.

Compile-Time Switches

A number of compile-time switches allow adapting the library to specific use cases, such as real-valued calculations, reduced floating point accuracy etc.

SYTEN_MAX_SYM and SYTEN_MAX_DEG

For performance reasons, the data structures used to store the labels of an irrep and those used to store the irreps on a specific tensor length have a compile-time fixed length. This improves data locality and space usage. However, it means that if you want to use more symmetries or groups with a higher degree, you have to recompile the tools/binaries you want to use.

By default, SYTEN_MAX_SYM is set to 3, i.e. tools can by default handle three symmetries (e.g. two different charges and one spin). If you want to use more symmetries, say 5, in your calculation, pass -DSYTEN_MAX_SYM=5 during compilation. You can do this with e.g. the line CPPFLAGS_EXTRA+=-DSYTEN_MAX_SYM=5 in your make.inc file. Tools compiled with a setting different from the one used to build a lattice/state/DMRG file will complain when you attempt to load such a file.

SYTEN_MAX_DEG is set to 2 by default. This means that each irrep can have up to two labels. That is enough for $ \mathrm{U}(1) $, $ \mathrm{SU}(2) $ and $ \mathrm{SU}(3) $ symmetries as well as (due to an implementation quirk) $ Z_k $. Again, this is a compile-time constant: a binary compiled for at most one label cannot read a file created with another binary compiled for two labels. If you wish to use a lower setting, you can do this by e.g. adding the line CPPFLAGS_EXTRA+=-DSYTEN_MAX_DEG=1 to your make.inc file.

SYTEN_COMPLEX_SCALAR

Similar to SYTEN_MAX_SYM and SYTEN_MAX_DEG, SYTEN_COMPLEX_SCALAR allows you to define whether the scalar type used in the dense sub-blocks of tensors is real or complex. By default, it is complex, but if the operators of the problem in question can be written down in purely real terms and you do not wish to do real-time evolution, you can add CPPFLAGS_EXTRA+=-DSYTEN_COMPLEX_SCALAR=0 to your make.inc file. This is again a compile-time constant: you cannot read a complex wavefunction with a tool that is compiled for real numbers.

When this define is set to false (i.e. only real numbers are allowed), some tools will behave differently. For example, while normally you have to pass in a $ \delta t = -10^{-3} \mathbb{i} $ to do imaginary time evolution with step-size $ 10^{-3} $, you will now have to use a value $ \delta t = 10^{-3} $ to achieve the same result. You can check whether a tool was compiled for real or complex numbers using the --version switch; the default tensor type is given as Standard scalar type.

SYTEN_SDEF_TYPE and SYTEN_SRDEF_TYPE

Again similar to the above, SYTEN_SDEF_TYPE and SYTEN_SRDEF_TYPE allow you to define the types of standard dense tensor values. If you only set SYTEN_SRDEF_TYPE (using -DSYTEN_SRDEF_TYPE=float), the base numeric type will be floats. Tensors will then be complex floats (unless you set SYTEN_COMPLEX_SCALAR to a false value). If you also set SYTEN_SDEF_TYPE, that type will be used directly, but you will then also have to tell the toolkit whether it is a complex type or a real type by also defining SYTEN_COMPLEX_SCALAR.

For single-precision real calculations, use either -DSYTEN_COMPLEX_SCALAR=0 -DSYTEN_SRDEF_TYPE=float or -DSYTEN_SDEF_TYPE=float -DSYTEN_SRDEF_TYPE=float -DSYTEN_COMPLEX_SCALAR=0.

For single-precision complex calculations, use -DSYTEN_COMPLEX_SCALAR=1 -DSYTEN_SRDEF_TYPE=float or -DSYTEN_SRDEF_TYPE=float or -DSYTEN_SDEF_TYPE=std::complex<float> -DSYTEN_SRDEF_TYPE=float.

For double-precision real calculations, use either -DSYTEN_COMPLEX_SCALAR=0 -DSYTEN_SRDEF_TYPE=double or -DSYTEN_COMPLEX_SCALAR=0 or -DSYTEN_SDEF_TYPE=double -DSYTEN_SRDEF_TYPE=double -DSYTEN_COMPLEX_SCALAR=0.

Double-precision complex calculations are the default.
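
The combinations above can be condensed into a small helper that prints the shortest listed flag set for each case. This is a sketch; scalar_flags is a hypothetical name and the helper only covers the SYTEN_SRDEF_TYPE variants, not the ones that set SYTEN_SDEF_TYPE explicitly.

```shell
# Print the preprocessor flags for a given precision (single|double) and
# scalar kind (real|complex), using the shortest variant listed above.
# Double-precision complex is the default and therefore needs no flags.
scalar_flags() {
    precision=$1
    kind=$2
    flags=""
    case "$kind" in
        real)    flags="-DSYTEN_COMPLEX_SCALAR=0" ;;
        complex) ;;
        *) echo "unknown scalar kind: $kind" >&2; return 1 ;;
    esac
    case "$precision" in
        single) flags="${flags:+$flags }-DSYTEN_SRDEF_TYPE=float" ;;
        double) ;;
        *) echo "unknown precision: $precision" >&2; return 1 ;;
    esac
    echo "$flags"
}

# Example: flags for single-precision real calculations.
scalar_flags single real
```

The printed flags would then go after CPPFLAGS_EXTRA+= in make.inc.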

SYTEN_STENSOR

To use the new ‘smart’ tensor class, define the STENSOR_RANK variable in make.inc to an integer value which will be the maximal rank supported. The Makefile will then automatically set the (internally-required) defines SYTEN_STENSOR_RANKS, SYTEN_STENSOR_MAX_RANK and SYTEN_STENSOR.

SYTEN_USE_MKL and SYTEN_USE_OPENBLAS

Both the MKL and OpenBLAS offer some additional toggles, e.g. to control threading. When using either of these libraries, you should hence set the associated define, e.g. by adding CPPFLAGS_EXTRA+=-DSYTEN_USE_MKL to your make.inc file.

SYTEN_USE_CUDA

Preliminary GPU support using the CUDA toolkit is currently in the testing phase. To enable it, define SYTEN_USE_CUDA and add the appropriate compiler and linker flags to your make.inc. At the moment, you need to link against libcudart, libcublas and libcusolver.
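
As an illustration, the make.inc additions for CUDA might resemble the output of the sketch below. The define and the three libraries come from the text above. The installation prefix /usr/local/cuda and its include/ and lib64/ subdirectories are assumptions that should be verified on your system, and cuda_make_inc is a hypothetical helper.

```shell
# Print make.inc lines enabling CUDA. $1 is the CUDA installation prefix;
# the include/ and lib64/ subdirectories are an assumed layout.
cuda_make_inc() {
    cuda_home=$1
    printf 'CPPFLAGS_EXTRA+=-DSYTEN_USE_CUDA -I%s/include\n' "$cuda_home"
    printf 'LDFLAGS_EXTRA+=-L%s/lib64 -lcudart -lcublas -lcusolver\n' "$cuda_home"
}

# Assumed default prefix; replace it with the location of your CUDA toolkit.
cuda_make_inc /usr/local/cuda
```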

clang-tidy

The Makefile supports a clang-tidy target which runs clang-tidy for each translation unit (object files and executables). You can customize the call to clang-tidy by setting the CLANGTIDYFLAGS_EXTRA and CPPFLAGS_CLANGTIDY_EXTRA variables in your make.inc file. At the moment (commit 59520276) there are still many false positives or warnings about issues we don't care about too much and only very few actually useful warnings.

Specific Installation Instructions

Installation on Debian

Debian Testing comes with nearly all packages necessary to successfully compile SyTen. As such, if you have root access on the machine, it is very easy to set up. Note that while here we use OpenBLAS, it is highly advisable to use the MKL if you have it.

First, install the necessary packages:

# apt install rsync make git g++ libboost-all-dev liblz4-1 liblz4-dev libopenblas-dev libopenblas-base liblapack3 liblapack-dev liblapacke liblapacke-dev libtcmalloc-minimal4 libgoogle-perftools-dev libpython3-dev ipython3 cmake

On a bare system this will take about one gigabyte of hard disk space, but in practice most of these packages are already installed.

The following two caveats apply:

OpenBLAS in version 0.2.20 mixes up complex and double pointers. Possible solutions:
- install OpenBLAS from unstable: apt install -t unstable libopenblas-dev
- or install the Netlib BLAS (apt install libblas-dev) and point the symlink /etc/alternatives/cblas.h-x86_64-linux-gnu at /usr/include/x86_64-linux-gnu/cblas-netlib.h instead of /usr/include/x86_64-linux-gnu/cblas-openblas.h and do not use OpenBLAS explicitly (do not define the SYTEN_USE_OPENBLAS macro)
You may see ‘strange’ linking errors complaining about missing symbols such as __zgesvd_stage2 or similar. In this case, you can either
- install OpenBLAS from unstable as above
- downgrade liblapacke to version 3.7 from version 3.8

No special compiler options are needed to find BLAS and LAPACKE, but make sure to pass -lblas -llapack -llapacke as the linker options.

To compile and install, execute

$ make copy -jN

where you replace N by the number of processors available on your system or the number of gigabytes of RAM available (each g++ instance takes roughly 1 GB of peak RAM). After a while, everything should be compiled and copied into place. If the external binary directory is already in your path, you should then be able to execute e.g. syten-add without problems.
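
One way to choose N is to take the smaller of the two numbers, so that neither the processors nor the memory are oversubscribed. The sketch below follows that reading, which is an interpretation rather than a rule from the toolkit. It relies on nproc and /proc/meminfo, which are available on typical Linux systems, and falls back to 1 where they are missing.

```shell
# Print the smaller of two integers ($1 = cores, $2 = RAM in GB), at least 1.
pick_jobs() {
    a=$1
    b=$2
    if [ "$a" -lt "$b" ]; then n=$a; else n=$b; fi
    if [ "$n" -lt 1 ]; then n=1; fi
    echo "$n"
}

# Detect both values where possible; fall back to 1 otherwise.
cores=1
if command -v nproc >/dev/null 2>&1; then cores=$(nproc); fi
mem_gb=1
if [ -r /proc/meminfo ]; then
    # MemTotal is reported in kB; convert to whole gigabytes.
    mem_gb=$(awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' /proc/meminfo)
    mem_gb=${mem_gb:-1}
fi

echo "suggested: make copy -j$(pick_jobs "$cores" "$mem_gb")"
```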
