Development

Preview of the documentation

To generate the documentation, first instantiate the docs environment by executing the following command from the TrixiParticles.jl root directory:

```
julia --project=docs -e "using Pkg; Pkg.develop(PackageSpec(path=pwd())); Pkg.instantiate()"
```

This command only has to be run once. After that, maintain the docs environment as described under Installation.

With an instantiated docs environment, generate the docs with the following command (again from the TrixiParticles.jl root directory):

```
julia --project=docs --color=yes docs/make.jl
```

You can then open the generated files in docs/build with your web browser. Alternatively, run

```
python3 -m http.server -d docs/build
```

and open localhost:8000 in your web browser.

Release management

To create a new release for TrixiParticles.jl, perform the following steps:

1. Make sure that all PRs and changes that you want to go into the release are merged to main and that the latest commit on main has passed all CI tests.
2. Determine the currently released version of TrixiParticles.jl, e.g., on the release page. For this manual, we will assume that the latest release was v0.2.3.
3. Decide on the next version number. We follow semantic versioning, thus each version is of the form vX.Y.Z where X is the major version, Y the minor version, and Z the patch version. In this manual, we assume that the major version is always 0, thus the decision process on the new version is as follows:
   - If the new release contains breaking changes (i.e., user code might not work as before without modifications), increase the minor version by one and set the patch version to zero. In our example, the new version should thus be v0.3.0.
   - If the new release only contains minor modifications and/or bug fixes, the minor version is kept as-is and the patch version is increased by one. In our example, the new version should thus be v0.2.4.
4. Edit the version string in the Project.toml and set it to the new version. Push/merge this change to main.
5. Go to GitHub and add a comment to the commit that you would like to become the new release (typically this will be the commit where you just updated the version). You can comment on a commit by going to the commit overview and clicking on the title of the commit. The comment should contain the following text:
   ```
   @JuliaRegistrator register
   ```
6. Wait for the magic to happen! Specifically, JuliaRegistrator will create a new PR to the Julia registry with the new release information. After a grace period of ~15 minutes, this PR will be merged automatically. A short while after, TagBot will create a new release of TrixiParticles.jl in our GitHub repository.
7. Once the new release has been created, the new version can be obtained through the Julia package manager as usual.
8. To make sure people do not mistake the latest state of main as the latest release, we set the version in the Project.toml to a development version. The development version should be the latest released version, with the patch version incremented by one, and the -dev suffix added. For example, if you just released v0.3.0, the new development version should be v0.3.1-dev. If you just released v0.2.4, the new development version should be v0.2.5-dev.
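
The version arithmetic in the steps above can be sketched with Julia's built-in VersionNumber type. The helper names next_release and next_dev_version are hypothetical and not part of TrixiParticles.jl, and the sketch assumes that the major version is 0, as in this manual.

```julia
# Hypothetical helpers for illustration; they are not part of TrixiParticles.jl.

# Next release version, assuming the major version is 0 (as in this manual).
function next_release(current::VersionNumber; breaking::Bool)
    current.major == 0 ||
        throw(ArgumentError("this sketch only covers versions of the form v0.Y.Z"))
    if breaking
        # Breaking changes: bump the minor version and reset the patch version.
        return VersionNumber(0, current.minor + 1, 0)
    else
        # Minor modifications and/or bug fixes: bump the patch version only.
        return VersionNumber(0, current.minor, current.patch + 1)
    end
end

# Development version to set on main after a release:
# patch version incremented by one, with the -dev suffix added.
function next_dev_version(released::VersionNumber)
    return VersionNumber(released.major, released.minor, released.patch + 1, ("dev",))
end

latest = v"0.2.3"
breaking_release = next_release(latest; breaking=true)   # v"0.3.0"
patch_release = next_release(latest; breaking=false)     # v"0.2.4"
dev_version = next_dev_version(breaking_release)         # v"0.3.1-dev"
```

The helpers only mirror the decision rules. The version string in the Project.toml still has to be edited and merged by hand as described in the steps.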

Writing GPU-compatible code

When implementing new functionality that should run on both CPUs and GPUs, follow these rules:

Data structures must be generic and parametric. Do not hardcode concrete CPU array types like Vector or Matrix in fields. Use type parameters, so the same structure can store CPU arrays and GPU arrays.
Add an Adapt.jl rule in src/general/gpu.jl. Register the new type with Adapt.@adapt_structure ..., so adapt can recursively convert all arrays inside the structure to GPU arrays. This conversion is then applied automatically inside semidiscretize.
Use @threaded for all loops. Accessing GPU arrays inside regular loops is not allowed. With a GPU backend, @threaded loops are compiled to GPU kernels.
Write type-stable code and do not allocate inside @threaded loops. This is required for GPU kernels and is also essential for fast multithreaded CPU code.
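
As a minimal sketch of the first rule, the following compares a struct with a hardcoded CPU array type to a parametric one. The names HardcodedCache and GenericCache are hypothetical. The example runs on the CPU only and uses a view as a stand-in for "an array type other than Vector"; with a GPU package loaded, a GPU array would take its place.

```julia
# Hypothetical types for illustration; they are not part of TrixiParticles.jl.

# Not GPU-compatible: the field can only ever hold a CPU `Vector{Float64}`.
struct HardcodedCache
    density::Vector{Float64}
end

# GPU-compatible layout: the array type is a type parameter, so the same
# struct can hold a `Vector`, a view, or (with a GPU package) a GPU array.
struct GenericCache{ARRAY <: AbstractVector}
    density::ARRAY
end

# Per the rules above, the type would additionally be registered in
# src/general/gpu.jl (this needs Adapt.jl, so it is only a comment here):
# Adapt.@adapt_structure GenericCache

cpu_cache = GenericCache(zeros(Float32, 4))

# Stand-in for another array type: a view into a larger array.
view_cache = GenericCache(view(zeros(Float32, 8), 1:4))

# The field type is concrete in both cases, which keeps code type-stable.
cpu_field_type = fieldtype(typeof(cpu_cache), :density)
view_field_type = fieldtype(typeof(view_cache), :density)
```

Because the array type is part of the struct's type, the compiler specializes code for each storage type, which also supports the type stability requirement in the last rule.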

Writing fast GPU code

The following rules improve kernel performance and avoid common GPU pitfalls:

Avoid exceptions and bounds errors inside kernels. Perform all required checks before entering @threaded loops (that is, before GPU kernels). Then use @inbounds directly at the loop where bounds are guaranteed. In TrixiParticles.jl, we do not place @inbounds inside inner helper functions. Instead, mark helper functions with @propagate_inbounds so the loop-level @inbounds is propagated.
Avoid implicit Float64 literals in arithmetic. For example, prefer x / 2 over 0.5 * x so Float32 simulations stay in Float32. Verify this with @device_code, or by confirming the kernel runs on an Apple GPU (most Apple GPUs do not support Float64).
Use div_fast in performance-critical divisions, but only after benchmarking (!). It can significantly speed up kernels, but should not be applied indiscriminately. When introducing div_fast in code, add a reference to this section to document the rationale and benchmarking context, for example:
```
# Since this is one of the most performance-critical functions, using fast divisions
# here gives a significant speedup on GPUs.
# See the docs page "Development" for more details on `div_fast`.
result = div_fast(dividend, divisor)
```
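
The first two rules above can be illustrated with plain Julia. This is a sketch under stated assumptions: the function names half_value and halve! are hypothetical, and a regular for loop stands in for an @threaded loop so that the example runs on the CPU without TrixiParticles.jl.

```julia
# Hypothetical functions for illustration; they are not part of TrixiParticles.jl.

# Inner helper: no `@inbounds` here. `@propagate_inbounds` lets an `@inbounds`
# at the call site apply to the indexing inside this function.
Base.@propagate_inbounds function half_value(values, i)
    # `x / 2` keeps the element type; `0.5 * x` would promote Float32 to Float64.
    return values[i] / 2
end

function halve!(out, values)
    # Perform all checks before the loop (that is, before the kernel).
    axes(out) == axes(values) ||
        throw(DimensionMismatch("`out` and `values` must have the same axes"))

    # CPU stand-in: in TrixiParticles.jl, this would be an `@threaded` loop.
    for i in eachindex(out)
        # Bounds are guaranteed by the check above.
        @inbounds out[i] = half_value(values, i)
    end

    return out
end

x = 3.0f0
typeof(0.5 * x)  # Float64: the Float64 literal promotes the result
typeof(x / 2)    # Float32: the integer literal keeps the type of `x`

result = halve!(zeros(Float32, 3), Float32[2, 4, 6])
```

Writing the helper in terms of x / 2 rather than a typed literal such as 0.5f0 keeps it generic, so the same code serves both Float32 and Float64 simulations.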