SyTen
Contributing Guidelines

git usage

Feel free to create as many branches as needed, both in your private repository and on the current origin. Ideally, prefer feature branches to long-lived personal branches, but this is up to you.

Note that rebasing is strongly discouraged, but there is no reason not to merge as often and as happily as you like with other branches of yours (or master into one of your branches). If you feel that your work on a particular feature is finished, make sure that it compiles and runs with all currently-supported compilers and let me know that you would like to have something merged. Note that only generic tools and tensor network geometries should live on the master branch; individual lattice files are better kept in individual feature branches (unless they're useful as an example or to test a generic feature).

Coding (Style) House Rules and Guidelines

The following is an unordered list of things I’d like to see observed in the code. The keywords MUST, SHALL etc. are to be interpreted as described in RFC 2119.

Documentation

  1. Each namespace, class, function and namespace-local or class-local variable MUST be documented.
  2. Namespace and variable documentation SHOULD be a concise one-liner. More extensive documentation MAY be added.
  3. Class and function documentation SHOULD be more extensive. The usage of a function SHOULD be obvious from its documentation (not definition) alone. The usage of a class SHOULD be obvious from its documentation and the documentation of its member variables and functions.
  4. Documentation MUST be written in a form that Doxygen picks up (version 1.8.11 or later if in doubt).
  5. Documentation MUST be available at the declaration of a function, not at its definition (i.e. in the header file, not the .cpp file).
  6. Each file below inc, lat or bin MUST be documented with a file directive.
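
As an illustration of the rules above, a documented header might look like the following sketch. The file name, the namespace `example` and the function `clampToRange` are hypothetical and not part of the toolkit; only the Doxygen commands (`\file`, `\brief`, `\param`, `\return`) are standard. The function is a template, so it is defined inline in the header and carries its documentation there.

```cpp
/// \file clamp_example.h
/// \brief Hypothetical header illustrating Doxygen-visible documentation.

/// Hypothetical namespace used only for this illustration.
namespace example {

  /// Restricts a value to the closed interval [lower, upper].
  ///
  /// Call with `lower <= upper`. If `value` lies outside the interval,
  /// the nearest bound is returned, otherwise `value` itself.
  ///
  /// \param value the value to restrict
  /// \param lower smallest permitted result
  /// \param upper largest permitted result
  /// \return `value` clamped to [lower, upper]
  template<typename T>
  T clampToRange(T const& value, T const& lower, T const& upper) {
    if (value < lower) { return lower; }
    if (upper < value) { return upper; }
    return value;
  }

}
```

A reader who sees only the comment block should be able to tell how to call the function and what it returns, which is the intent of rule 3.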

Tests

  1. Shell-script driven tests MUST source init-shell and use its provided log() and plog() functions for output.
  2. Directly executable tests MUST use Test::log() for output.
  3. Tricky, pure functions with easy-to-set up arguments SHOULD be tested with a directly executable test program in test/.
  4. Model files and more complicated functions SHOULD be tested with a shell-script driven test, e.g. calculating the groundstate energy with DMRG and comparing to a known-good value, calculating correlators on that ground state etc.

Non-Stylistic Coding Conventions

  1. General syten namespace: All functions, classes, variables etc. MUST be defined in the general syten namespace. The only exception is the entry function main for binaries. Within that function, you may choose to either prefix all necessary calls by syten:: or directly call a helper function in the syten namespace (the latter is currently more popular).
  2. Common command line interface for binaries and lattice file executables (lat/*.cpp and bin/*.cpp)
    1. MUST use bpo_helper.h and in particular MUST make use of SYTEN_BPO_INIT and SYTEN_BPO_EXEC.
    2. SHOULD respect the command line switch --cache (use the HDD as cache when possible).
    3. MUST respect the command line switches --help (print a help message) and --quiet (print as little information as reasonable).
    4. MUST NOT overwrite/circumvent the other command line switches injected by SYTEN_BPO_EXEC.
  3. Header files and code organisation
    1. Template functions declared in a header file x.h MUST be defined inline in that file.
    2. Non-template functions declared in a header file x.h MUST be defined in a file x.cpp.
    3. Translation units (.cpp files) associated with a header file MAY include toolkit-internal header files apart from the header file with which they are associated.

      Remarks
      Dependency graph building now seems to work.
    4. If multiple header files are included in a file, STL headers SHOULD come first, sorted alphabetically; third-party library headers SHOULD come second and SHOULD be grouped by library; toolkit-internal headers SHOULD come last and SHOULD be grouped by the area of the toolkit they come from.
  4. Namespaces
    1. Functions (including operators) MUST be declared in the enclosing namespace of one of their type arguments, i.e. a function void f(T1, T2, T3) MUST be declared either in the namespace that also directly contains T1 or in the namespace that also directly contains T2 or in the namespace that also directly contains T3. This is necessary for ADL overload resolution to work correctly.
  5. Compiler support
    1. All currently-supported compilers (cf. INSTALL.md) MUST be able to compile the code both in debug mode and in release mode with all relevant warnings enabled.
    2. C++17 and supported parts of C++20 MAY be used.

      Remarks
      In particular, make generous use of auto and if constexpr.
  6. Pointer usage and memory management
    1. new and delete MUST NOT be used outside of container classes ensuring appropriate cleanup in their destructors.
    2. If a memory allocation itself or the use of the allocated memory is performance-sensitive, l_malloc() and l_free() SHOULD be used, especially if the contained type is trivially constructible and trivially destructible. If necessary, placement new MAY be used on the returned pointer. See syten::DynArray for an example.
    3. Raw pointers SHOULD NOT be used.
    4. Classes which have at least one member object which directly or indirectly dynamically allocates memory (e.g. a member std::vector<double> or std::array<std::vector<double>, 2>) and which are not intended as temporary calculation helpers (e.g. functors) MUST provide a MemoryUsage::MemorySize allocSize() const member function which returns the sum of the return values of MemoryUsage::allocSize() called on these member objects. For examples, see DenseTensor::allocSize() or TensorBlock::allocSize() etc.
  7. Naming conventions
    1. Struct, class and type names MUST start with a capital letter unless they are intended to be used like return-value-deducing functions.
    2. Function and variable names MUST start with a lower-case letter.
    3. Macros MUST be written in all caps.
    4. The signature of main MUST be either int main() or int main(int argc, char** argv). In the second case, the argument names MUST be argc and argv.
  8. Serialisation Versioning
    1. Classes MUST be versioned for serialisation. For simple, non-template classes, versioning MAY follow this example:

      // in file a.h
      class A {
        static constexpr unsigned int version = 1;
        template<typename Archive> // or use load/save as necessary
        void serialize(Archive& ar, unsigned int const in_version) {
          if (in_version != version) { throw OutdatedFile("A", in_version, version); }
          // continue
        }
      };
      BOOST_CLASS_VERSION(A, A::version)
      // in file a.cpp
      constexpr unsigned int A::version;

      If the class is templated, but the valid template arguments are not templated again (e.g. Tensor<Rank rank>), you MAY follow the example above but use the BOOST_TEMPLATE_CLASS_VERSION macro.

      If the class is templated and valid template arguments may also be templates, the versioning MUST follow this example:

      constexpr unsigned int boost_version_MyClass = 1;
      template<typename Type>
      class MyClass : public boost::serialization::traits<Type,
          boost::serialization::object_class_info,
          boost::serialization::track_never,
          boost_version_MyClass> {
        template<typename Archive> // or use load/save as necessary
        void serialize(Archive& ar, unsigned int const in_version) {
          if (in_version != boost_version_MyClass) {
            throw OutdatedFile("MyClass", in_version, boost_version_MyClass);
          }
          // continue
        }
      };

      The key elements are: a) a single point where the version is declared (boost_version_MyClass), b) derivation from boost::serialization::traits with the version as the fourth template argument and c) the standard check in serialize.

  9. Scalar types: syten::SDef MUST be used as the default scalar type. Reasonable care SHOULD be taken not to rely on the specific type used, especially not on whether the type is defined as double or std::complex<double>. Member functions MUST NOT be called on variables of this type; use free-standing functions instead. Code relying on complex scalar values MUST check the preprocessor macro SYTEN_COMPLEX_SCALAR and provide reasonable errors, ideally exiting with error code SYTEN_EC_COMPLEX_SCALAR_REQUIRED.

    syten::Index MUST be used as the index type for dense tensors (i.e. when a tuple of them is required as coordinates, dimensions etc.). For flat arrays, use syten::Size as the index type.

    Remarks
    On 64-bit systems, addition and multiplication on 32 and 64 bit numbers are nearly equally expensive, but addressing using a 64-bit number does not require a zero extend.
    User-defined literal operators _r, _c and (with complex installations) _i are available to generate SRDef, SDef and (potentially) imaginary SDef easily.
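
The advice to prefer free-standing functions over member functions for the scalar type can be sketched as follows. The alias `Scalar` is a hypothetical stand-in for syten::SDef and `absSquared` is an illustrative helper, not a toolkit function. The point is that `std::conj` and `std::real` accept both `double` and `std::complex<double>`, whereas a member call such as `x.real()` would compile only for the complex case.

```cpp
#include <complex>

// Hypothetical stand-in for syten::SDef; switching the alias to
// std::complex<double> leaves the code below unchanged.
using Scalar = double;

// Squared magnitude written only with free-standing functions.
// Intended for double and std::complex<double> arguments.
template<typename T>
double absSquared(T const& x) {
  // std::conj(double) yields a std::complex<double> with zero imaginary
  // part, so the same expression is valid for real and complex scalars.
  return std::real(std::conj(x) * x);
}
```

The same function body serves both build configurations, so no preprocessor check is needed here; SYTEN_COMPLEX_SCALAR is only required where the code genuinely depends on complex values.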
  10. Specific Details
    1. Truncation parameters (singular value threshold, total number of states, maximal blocksize) SHOULD generally be passed around as a Truncation struct.
    2. Use the macros defined in inc/util/filename_extensions.h as extensions for file names (e.g. for states, lattices etc.). ".state", ".lat" or equivalents MUST NOT be used in standard code.
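
To illustrate why the namespace rule (item 4, Namespaces) matters for argument-dependent lookup, here is a minimal sketch. The namespaces `geometry` and `client` and all names inside them are hypothetical. Because `describe` and `operator+` are declared in the namespace that directly contains their argument type `Point`, unqualified calls from an unrelated namespace still resolve.

```cpp
#include <string>

namespace geometry {  // hypothetical namespace, not part of the toolkit

  /// Simple value type living in `geometry`.
  struct Point { int x; int y; };

  /// Declared next to `Point`, so ADL finds it from any namespace.
  inline std::string describe(Point const& p) {
    return "(" + std::to_string(p.x) + ", " + std::to_string(p.y) + ")";
  }

  /// Operators follow the same rule.
  inline Point operator+(Point const& a, Point const& b) {
    return Point{a.x + b.x, a.y + b.y};
  }

}

namespace client {  // hypothetical caller in an unrelated namespace

  /// Unqualified calls: `describe` and `operator+` are found via ADL
  /// because their argument type `Point` lives in `geometry`.
  inline std::string sumAsText(geometry::Point const& a,
                               geometry::Point const& b) {
    return describe(a + b);
  }

}
```

Had `describe` been declared in a third namespace, the call inside `client::sumAsText` would need explicit qualification, and generic code calling `describe(x)` unqualified would not find it at all.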

Stylistic Coding Conventions

  1. Value and reference types MUST be of the form Value[ const][&]. Specifically, for any type T, write the reference to T as T& and the const reference to T as T const&.

    Remarks
    Since Doxygen doesn’t understand that const Value& and Value const& are the same, this is a strict requirement.
  2. Functions MUST be written as TYPE func([Args]); where TYPE conforms to the rule for types above.

    Remarks
    Note that a function returning a reference to T is T& f(), not T &f().
  3. The content of each opened block MUST be indented by two spaces starting (inclusively) at the line following the opening { and ending (inclusively) at the last line with non-whitespace content apart from the closing } at or before that closing }.

    Example
    int main() {
      { int a = 0;
        int b = 2;
        { int c = 3;
          int d = 4; }
      }
    }
  4. Tabs MUST NOT be used.
  5. Lines SHOULD NOT be longer than 80 to 100 characters. Lines really SHOULD NOT be longer than 120 characters.
  6. Files pertaining primarily to one namespace at each level SHOULD open that namespace and define functions/variables in it, rather than using scope operators.
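
Taken together, the stylistic rules could look like the following sketch. The namespace `example`, the class `Samples` and the function `total` are hypothetical and exist only to show the `Value[ const][&]` form, the placement of `&` next to the type, two-space indentation without tabs and an opened namespace in place of scope operators.

```cpp
#include <cstddef>
#include <vector>

namespace example {  // hypothetical namespace for illustration only

  /// Minimal container wrapper used to demonstrate the conventions.
  class Samples {
  public:
    /// Returns a mutable reference: written `double&`, not `double &`.
    double& at(std::size_t const i) { return data_[i]; }

    /// Returns a const reference: written `double const&`.
    double const& at(std::size_t const i) const { return data_[i]; }

    /// Appends a value; the argument type follows `Value const&`.
    void push(double const& value) { data_.push_back(value); }

  private:
    std::vector<double> data_;
  };

  /// Sums all entries of a vector passed by const reference.
  inline double total(std::vector<double> const& values) {
    double sum = 0.0;
    for (double const& v : values) {
      sum += v;
    }
    return sum;
  }

}
```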