# Developer Guide

## Directory hierarchy, or "Where are all the files?"

The main directory with the code is the `cg` subproject; the header files are in `cg/include/cg`, whereas the source files are in `cg/src`. The two directory trees mirror each other for the most part. For example, one can find function declarations in `cg/include/cg/pid/eval_forces.h` and function definitions in `cg/src/cg/pid/eval_forces.cpp`. We will henceforth skip the `cg/include/cg` and `cg/src/cg` prefixes and just refer to files like `pid/eval_forces.h` and `pid/eval_forces.cpp`, with the understanding that the `.h` files go into the `include/` directory and the `.cpp` files go into the `src/` directory (for the most part, anyway).

## (Rough) Control flow of the program, Notes

- Most of the "action" happens in the `simul/` directories:
  - `program.h`, `program.cpp` - Program instance. It is invoked in the `src/main.cpp` file, creates a state instance and the execution system, and runs it.
  - `state.h`, `state_io.cpp` - State instance. `cg::simul::state` contains the *entire* state of the program at a given point (so, things like positions of the monomers, velocities, amino acid types, Verlet list etc.). `state.cpp` contains setup functions, which are invoked during various phases of the simulation.
  - `thread.h`, `thread_setup.cpp`, `thread_main.cpp` - Thread system. The (OpenMP) thread instances modify the `cg::simul::state` instance but are themselves *stateless* (for example, one can\* copy the thread instances). This separation (hopefully) makes the code easier to modify: new features can be added to the `state`, and at most one then needs to add the appropriate code to the `thread`. Another major advantage (compared to just having a `state` instance with various functions) is that the threads can maintain ephemeral buffers, such as thread-private force buffers, which avoids costly atomic operations.
    The separation also makes the code less dependent on OpenMP, should one decide to drop it in the future.
    - `thread_main.cpp` contains most of the procedures. In particular, it holds the finite state machine that manages the phases of the simulation, as well as the implementation of the core `advance_step` function, which does most of the computational heavy lifting.
    - `thread_setup.cpp` binds the state variables to the *kernels* used by the threads to perform operations.
- The kernels (for example `pid::eval_forces`, `tether::eval_forces` or `nl::legacy_update`) are function-like objects which take in the state variables (or rather pointers to them) and perform a given operation. To answer "Why not just functions?": (a) kernels allow for different modes of execution: for example, one can evaluate the forces sequentially (`operator()`), via an asynchronous OpenMP procedure (`omp_async()`), or for just a slice of the given task (`for_slice(from, to)`); (b) one doesn't need to pass all the parameters on every call, only to bind them when the kernel is instantiated. (NOTE: take care to initialize the kernels properly, as bugs may arise if some of the variables or parameters are left uninitialized.)
- The runtime system (`simul/runtime`) is responsible for splitting the main work of the program, i.e. the evaluation of forces, into appropriate *slices*, and for assigning these slices deterministically to the threads so as to balance the workload. One potential issue with using `#pragma omp for schedule(static) nowait` is that it is unclear how the work is distributed when many such for-loops follow one another, as is the case in `void thread::eval_forces()`; the runtime system makes the distribution explicit.
  It also allows for a more elegant implementation of vectorized procedures (see `pid::eval_forces::vect_iter` and the source file `pid/eval_forces.cpp` for details).
- Most of the rest of the code is either kernels, the input system (`input`), utilities (`utils`) or file format stuff (`files`).
- One unorthodox feature is the use of a custom-built library for heterogeneous vectors and computations. From the developer's perspective, this mostly shows up as some strange-looking declarations of data, like in `pid/bundle.h`. The documentation for this part is unfinished, but if one wants to create new datatypes, one can roughly follow how it's done in the already-written examples and it should work out of the box.
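
To make the kernel and slice ideas from the notes above more concrete, here is a minimal, self-contained sketch. It is not taken from the code base: `toy_state`, `toy_tether_forces`, `slice_for` and `eval_with_private_buffers` are hypothetical names invented for illustration, the physics is a 1D harmonic chain, and a plain sequential loop stands in for the OpenMP threads. It only illustrates the pattern of binding pointers to state once, offering `operator()` and `for_slice(from, to)`, splitting the work deterministically, and accumulating into thread-private buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the program state (the real cg::simul::state
// holds far more than this).
struct toy_state {
    std::vector<double> x;       // 1D positions of the beads
    std::vector<double> forces;  // total force acting on each bead
    double k = 1.0;              // spring constant
    double r0 = 1.0;             // rest length of a tether
};

// Hypothetical kernel: harmonic tethers between consecutive beads.
// The fields are bound once; afterwards the calls take no state arguments.
struct toy_tether_forces {
    const std::vector<double>* x = nullptr;  // bound input
    std::vector<double>* out = nullptr;      // bound output buffer
    double k = 0.0, r0 = 0.0;

    // Number of work items (tethers).
    std::size_t size() const { return x->size() < 2 ? 0 : x->size() - 1; }

    // Evaluate only the tethers with indices in [from, to).
    void for_slice(std::size_t from, std::size_t to) const {
        assert(x && out && from <= to && to <= size());  // catches unbound kernels
        for (std::size_t i = from; i < to; ++i) {
            double f = k * ((*x)[i + 1] - (*x)[i] - r0);  // > 0 when stretched
            (*out)[i] += f;      // bead i is pulled toward bead i + 1
            (*out)[i + 1] -= f;  // and vice versa
        }
    }

    // Sequential mode: the whole task is a single slice.
    void operator()() const { for_slice(0, size()); }
};

// Deterministic split of [0, n) into num_threads contiguous slices.
std::pair<std::size_t, std::size_t> slice_for(std::size_t n,
                                              std::size_t num_threads,
                                              std::size_t tid) {
    return {n * tid / num_threads, n * (tid + 1) / num_threads};
}

// Each simulated "thread" writes to its own private buffer, so no atomic
// operations are needed; the buffers are summed in a reduction step.
std::vector<double> eval_with_private_buffers(const toy_state& st,
                                              std::size_t num_threads) {
    std::vector<double> total(st.x.size(), 0.0);
    for (std::size_t tid = 0; tid < num_threads; ++tid) {
        std::vector<double> priv(st.x.size(), 0.0);
        toy_tether_forces kernel;  // the "setup" step: bind the state
        kernel.x = &st.x;
        kernel.out = &priv;
        kernel.k = st.k;
        kernel.r0 = st.r0;
        std::pair<std::size_t, std::size_t> s =
            slice_for(kernel.size(), num_threads, tid);
        kernel.for_slice(s.first, s.second);
        for (std::size_t i = 0; i < total.size(); ++i) total[i] += priv[i];
    }
    return total;
}
```

In this sketch, calling `kernel()` on a kernel bound to `st.forces` and calling `eval_with_private_buffers(st, n)` should give the same forces for any `n >= 1`, because every tether belongs to exactly one slice. The `assert` in `for_slice` is one cheap way to catch a kernel whose pointers were never bound.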