Announcement Release2.5.0 - mctools/ncrystal GitHub Wiki

NCrystal v2.5.0 : Major technical upgrade!

The NCrystal v2.5.0 release brings major technical improvements to NCrystal "under the hood", with around 15.000 lines of code changed in more than 200 files! Although massive, the changes are mostly technical in nature and mainly affects the C++ API, so people using NCrystal in another way (e.g. through Python, C, Geant4, McStas, ANTS2, ...), should still find that everything keeps working as before — although perhaps with slightly improved performance and with a few new useful function having been added to the Python/C API's. The motivation for the work was to bring NCrystal's code up to modern C++ standards, introduce better caching strategies, allow usage in multi-threaded applications, and in general pave the way for future planned improvements (e.g. multi-phasic materials, better or new integrations with Geant4, OpenMC, etc.). As a side benefit, some fragile and hard-to-maintain code was cleaned up, and new features were made available concerning MT-safety, handling of random number (RNG) streams, and handling of input data sources.

The following sections will highlight some of the changes. Note that as many of these changes are rather technical in nature, most of the information is intended mostly for those dealing with NCrystal's C++ API (e.g. plugin developers). However, in particular the information in the Data source handling section might be of more general interest.

Table of Contents

Multi-thread safe physics processes and random number streams

In the past, physics processes in NCrystal were always represented by C++ classes which internally took care of RNG streams and neutron state-dependent caching. This is fine in single-threaded code, but in multi-threaded code two threads might try to access or modify the same cache or RNG stream state, which will usually result in crashes (if lucky) or mysteriously wrong results. In NCrystal v2.5.0, all physics processes must be implemented in a way which gets both RNG streams and cache-storage space provided by the caller, whenever asked to do any calculations (cf. NCProcImpl.hh to see the new base-classes and NCFactImpl.hh for the factory functions creating them). This provides the one calling the code with full flexibility (but also responsibility) concerning management of RNG streams and cache storage. With these building blocks it should now be possible for integrators to integrate NCrystal correctly into multi-thread capable applications like ANTS2 or Geant4.

Of course, not all users of the NCrystal C++ API will be thrilled at the prospect of suddenly having to manage the nitty-gritty of caches and RNG states in their own code. Especially users applying NCrystal in a single-thread context could be imagined to see this as a needlessly added complication. For that reason, we also provide so-called "managed physics processes" (cf. NCProc.hh to see them) which wraps the lower-lever building blocks in classes that provide functionality similar to the old physics process classes — keeping internally per-object cache-storage and RNG stream. The downside is that each such process object can only be used from one thread at a time. But fortunately they can be cheaply "cloned" to produce independent objects, in case such multi-threaded usage is actually needed — and the user really prefers the "managed physics processes" over just using the lower-lever building blocks from NCProcImpl.hh directly.

  //Create and use scatter objects just like in the "good" old days:

  auto sc = NCrystal::createScatter("Al_sg225.ncmat;temp=300K");
  std::cout<< "Al CrossSection @ "<<ekin<<" is "<<sc.crossSectionIsotropic(ekin)<<std::endl;;

  //Create some clones (this is very cheap and automatically creates independent
  //random number streams and cache-storage for each object):

  auto sc2 = sc.clone();
  auto sc3 = sc.clone();
  auto sc4 = sc.clone();

  //Voila, it is now safe to use [sc, sc2, sc3, sc4] in a multi-threaded
  //programme (each thread should use only one of the objects of course).

The Python and C interfaces do not change much, but the processes there are now also cloneable in the samme manner as the managed processes in C++ (unfortunately, the Python GIL makes it tricky to actually benefit from multi-threaded code in Python, but that is another story...).

Note that NCrystal::createScatter above no longer returns a pointer to an object. Instead it returns a (light-weight) object by value. In essence, this object internally keeps a std::shared_ptr to the underlying physics model (i.e. the "building block" from NCProcImpl.hh), as well as data related to random number stream and cache. For safety, it can not be copied directly to create new objects, this always requires an explicit call to the clone method. But the object does of course support move-semantics, so can be moved around if necessary.

In order to support the creation of independent RNG streams upon process cloning, the random number infrastructure was extended, allowing various cloning strategies depending on the capabilities of the RNG algoritm used (see NCRNG.hh for more details). The method used by default with NCrystal's builtin Xoroshiro128+ RNG algorithm, relies on the capability of the ultra-high period algorithm to perform cheap "jumps", skipping past a very large number of random numbers, in order to reach a state from which independent numbers can be produced. Specifically the Xoroshiro128+ RNG algorithm has a period of 2¹²⁸, and is capable of cheaply jumping ahead to a state which would in principle result from consuming exactly 2⁶⁴ numbers. In practice, no thread will ever be able to consume 2⁶⁴ random numbers, guaranteeing the independence of the produced RNG streams.

General multi-thread safety

Not just the physics processes have become multi-thread safe. All of NCrystal's code have now been rewritten, so global object access is mutex protected, statics at global scope are avoided (due to initialisation order issues) in favour of function-scope statics, etc. This means NCrystal should be MT-safe not only when using physics processes during a simulation loop, but also when setting up materials and physics processes during initialisation. Thus, it is in principle possible to initialise different materials in different threads, which could potentially improve initialisation speeds dramatically in some scenarios.

A word of warning though: NCrystal has obviously not been used much in actual multi-threaded applications at this point in time, so it expected that there will be a few bugs or issues to iron out. So as always, user feedback and issue reports are very much welcomed on this (and any other) topic!

Data source handling

What was wrong previously

First a bit of background: Originally, NCrystal only supported on-disk files, and caching was done based purely on the filenames. So for instance calling createInfo("myfile.ncmat;temp=400K") multiple times would return the same object each time, without NCrystal actually doing the CPU-intensive initialisation work more than once. Most of the time this behaviour was exactly what users wanted, but some users were rightfully expecting that changing the content of myfile.ncmat dynamically and reloading it (e.g. to see the effect of varying some parameter) should also result to changes in the loaded results. Later on, NCrystal added some support for users registering in-memory virtual files, which to some degree alleviated this, but it was not a complete solution. In addition, the requested capability of optionally having the NCrystal data library embedded in the compiled binary, and the appearance of externally developed plugins (with associated data files) which could be loaded dynamically or compiled statically, lead to a rather messy code with a lot of "duct tape" holding it all together. In addition to this, there were other problems. For instance, if a user wrote createInfo("Al_sg225.ncmat"), which file was actually loaded? Nominally one might argue that this should always use the Al_sg225.ncmat file from the NCrystal data library. But what if a user had a local copy of that file in the current working directory? Surely, opening a terminal and typing ncrystal_inspectfile Al_<TAB>, tab-completing to the local file: ncrystal_inspectfile Al_sg225.ncmat, should always bring up an inspection of that local file, and NOT the file from the NCrystal data library?

How is it now solved: TextData objects and factory infrastructure

All in all, what we had worked most of the time, but had nasty surprises and was not easy to maintain in the light of e.g. plugin support. Thus, in NCrystal v2.5.0, the data source management is completely rewritten. At a technical level, factories parsing data in various formats (.ncmat, .nxs, ...) no longer opens files directly, but instead uses new TextData objects which are delivered by the same factory infrastructure as Info, Scatter, and Absorption objects. Simply speaking, this infrastructure ensures that the delivered TextData objects will get a new unique-identifier (UID) if-and-only-if the content changed. The presence of such an UID makes the whole issue of implementing robust and efficient down-stream caching (e.g. in Info object factories) trivial.

For those curious, a few words about how NCrystal determines that content of a file has changed: This is trivial for in-memory files since such file content has to always be registered explicitly. For on-disk files, NCrystal is now conservative. Thus, instead of relying exclusively on things like file names or unreliable file time stamps, NCrystal will always re-read the entire file, before comparing the contents to previously read content in order to determine if a previously provided TextData object can be reused. This implies that NCrystal keeps an internal cache of previously read file content, but of course care has been taken to keep the memory usage sensible: if data from too many files is cached, the least-recently accessed object will be discarded (in fact all object factories now use such keep-last-N-accessed strategies, including Info/Scatter/Absorption factories). As input files are mostly small, O(10kB), and at most O(100) of them are cached, the impact on memory consumption should be at most O(1Mb), and typically much much lower since the majority of use-cases does not involve opening 100 files in a single job. Note that for larger files (i.e. those with full neutron scattering kernel data which normally requires around 1Mb per file) we keep just 10 such recently accessed files.

New feature: Flexible data source control.

The new factory based infrastructure makes it well-defined which files takes precedence, and by prefixing the factory name + :: to the filename, results are restricted to that factory. For example:

auto info1 = NCrystal::createInfo("stdlib::Al_sg225.ncmat");//Only take Al_sg225.ncmat from the standard NCrystal data library
auto info2 = NCrystal::createInfo("relpath::Al_sg225.ncmat");//Only take Al_sg225.ncmat from the current working directory
auto info3 = NCrystal::createInfo("./Al_sg225.ncmat");//Only take Al_sg225.ncmat from the current working directory
auto info4 = NCrystal::createInfo("Al_sg225.ncmat");//Check all sources in order of priority (by default preferring files
                                                    //in the current working directory)
auto info5 = NCrystal::createInfo("/some/where/Al_sg225.ncmat");//Only consider a file at that absolute path

Fine-tuning data sources are also possible. The full range of options is listed in NCDataSources.hh for C++, ncrystal.h for C, or by the command "help(NCrystal)" in Python. Here are some examples from Python (assuming NCrystal has been imported with import NCrystal as NC):

  1. Add extra custom directory in the search path:
NC.addCustomSearchDirectory('/my/extra/data/dir')
  1. Only allow files from that custom directory (note that NC.removeAllDataSources() also removes any files previously registered with NC.registerInMemoryFileData(..)):
NC.removeAllDataSources()
NC.addCustomSearchDirectory('/my/extra/data/dir/')
  1. Only allow files from the standard NCrystal data library (not even absolute or relative paths to on-disk files can be used):
NC.removeAllDataSources()
NC.enableStandardDataLibrary()
  1. Only allow files specified via relative and absolute paths, ignoring standard data library and other search paths:
NC.removeAllDataSources()
NC.enableAbsolutePaths()
NC.enableRelativePaths()
  1. Disallow files specified via relative paths (to make the application independent of the current working directory):
NC.enableRelativePaths(False)

New feature: Anonymous data handling.

As a nice side-effect of the new infrastructure, it is now possible to feed anonymous text data directly to the NCrystal::MatCfg constructor in C++. At the Python level, this has made the following new function possible:

mat = NCrystal.directMultiCreate( """NCMAT v1
@CELL
    lengths 4.07825 4.07825 4.07825
    angles 90. 90. 90.
@SPACEGROUP
    225
@ATOMPOSITIONS
    Au 0. 0.5 0.5
    Au 0. 0. 0.
    Au 0.5 0.5 0.
    Au 0.5 0. 0.5
@DEBYETEMPERATURE
    Au   167
""" )

# Now mat.info, mat.scatter, and mat.absorption are NCrystal.Info,
# NCrystal.Scatter, and NCrystal.Absorption objects respectively,
# which can be used in the usual fashion:

print(mat.info.getDensity())

Note that usage of such anonymous data is not in general recommended since it prevents proper caching and can not be as easily shared with colleagues using NCrystal in other contexts as a well defined file + filename. However, it might be suitable for some use-cases, such as when analysing how automatically varying input parameters affects derived results (e.g. when fitting crystal lattice parameters to a diffraction spectrum).

New feature: in-memory load of all supported formats.

In the past, only .ncmat files could be registered in and loaded from memory, while .nxs/.laz/.lau files required physical on-disk files. Based on the new TextData infrastructure, now also .nxs, .laz and .lau files can be loaded in this manner. But of course, we generally recommend NCMAT data unless you have a particular need to load another format.

Modern C++

Since NCrystal release v2.0.0, NCrystal requires C++11 or later C++ standards for compilation. However, by release v2.4.0, most of the code in NCrystal was still originating from a time where NCrystal was required to be C++98 compatible in order to support usage on computing clusters with RHEL6-like platforms (which were still widespread a few years ago). Apart from giving our code-base a rather dated "back to the 90s" feel, this crucially meant we could not take full advantage of the benefits of Modern C++: smart pointers, move semantics, return-value-optimisations, cross-platform multi-thread support, to name a few important ones. To help people using the NCrystal C++ API understand the changes (especially plugin developers!), we will in the following give a brief overview of the changes and design decisions that are most important to be aware of.

Shared memory management.

Until now, objects requiring dynamic allocation in NCrystal were created and deallocated using explicit calls to new and delete, and shared-ownership was handled by intrusive reference-counting: All classes with need for shared ownership had to derive from the RCBase base class, and code was responsible for invoking ref() and unref() methods on the objects at the appropriate times. Although RAII helper-classes such as RCHolder were available, the overall solution was still somewhat inelegant and problematic from the point of view of exception safety and multi-thread safety.

In the new release, lifetime management is now exclusively implemented with C++11's smart pointers: Non-shared objects reside in std::unique_ptr's, shared objects reside in either std::shared_ptr or in NCrystal::shared_obj's, and initialisation is performed with std::make_unique, std::make_shared, or NCrystal::makeSO respectively. The custom NCrystal::shared_obj template class is simply a thin wrapper around a std::shared_ptr, but with compile-time and run-time guarantees preventing such an object from being empty (null). In short, where std::shared_ptr's behave like pointers (users must check that they are not empty before using them), NCrystal::shared_obj's behave more like references (users can just go ahead and use them, no need to litter the code with if (myobj!=nullptr) checks everywhere). It also means that an NCrystal::shared_obj<T> object can be passed directly to a function expecting an argument of type const T&, again simplifying code.

Usage of these smart pointers makes the chance of lifetime-related bugs in NCrystal much much lower than previously, hopefully almost non-existent.

Move semantics and return-by-value

Even better than having smart pointers manage heap memory allocation, is to simply avoid heap usage in the first place and just create objects on the stack! In C++98 this was mostly an option for small or local temporary objects, due to performance concerns. However, with move-semantics and return-value-optimisations (RVO) in C++11, there is now usually little or no overhead involved in doing so. Therefore, in NCrystal v2.5.0, many functions have been changed so that they simply provide their results as a return value, as opposed to the less semantically-clear option of modifying passed-in parameters. Contrast the old:

//Old way:
double outdir[3];
double deltaE;
myScatter->generateScattering( myEkin, myNeutronDirection, outdir, deltaE );
//outdir and deltaE have now been modified with the result of a scattering (although
//this is not immediately obvious from just looking at the code here).
std::cout<<"outcome: "<<deltaE<<" "<<outdir[0]<<" "<<outdir[1]<<" "<<outdir[2]<<std::endl;

with the new:

//New way:
auto outcome = myScatter.sampleScatter( myEkin, myNeutronDirection );
//outcome.ekin and outcome.direction now hold the state of the outgoing neutron:
std::cout<<"outcome: "<<outcome.ekin<<" "<<outcome.direction"<<std::Endl;

Because of RVO, the new way is not only more semantically clear, but it is actually just as (if not more) efficient than the old way.

In addition to this, move semantics is used to avoid costly copying of data, and many data-heavy classes in NCrystal are now marked explicitly "MoveOnly" to ensure that there is no chance of accidentally invoking an expensive copy operation:

class NCRYSTAL_API Process : private MoveOnly {
public:
   //...
};

Strongly typed interfaces

One clear benefit of C++ over many languages is that it is supports strongly typed interfaces. To understand the benefit of this, consider a function like:

  //Example function (old way):
  double debyeIsotropicMSD(double debye_temperature, double temperature, double atomic_mass);

which although simple, is a bit of a "footgun", in that a developer must be very careful to not mix up the argument order when calling it. And if in the future the arguments change order for some reason (perhaps someone is cleaning up inconsistencies with another function in the same file where the temperature is the first argument), or simple replace arguments (perhaps the interface changes from taking a temperature value in kelvin to taking an equivalent kT value), then the previously working code will suddenly be providing arguments that are invalid or in the wrong order. But, crucially, everything will still compile happily and simply provide wrong run-time results!

In the new NCrystal release, we now have strong type encapsulation for a lot of common variables. Behind the scenes, all variables that used to be a double are still stored in a double - and once compiled nothing will have changed: the strong type encapsulation is simply a compile-time guarantee. Furthermore, the underlying double can of course still be accessed on either explicit conversion requests or through the member function .dbl(), which is of course important when ultimately needing to perform calculations with the values. With the strong types in the new release, the function above now looks as:

  //Same example function (new way):
  double debyeIsotropicMSD( DebyeTemperature, Temperature, AtomMass );

Not only will any argument-misordering now result in a clear compile error, but the calling code will also be forced to be written in a more readable manner. Contrast:

  //Calling the function (old way, won't compile anymore)
  double msd = debyeIsotropicMSD(350.0, 250.0, 12.0);

with:

  //Calling the function (new way):
  double msd = debyeIsotropicMSD( DebyeTemperature{350.0}, Temperature{250.0}, AtomMass{12.0});

Yes, the second and new version is more verbose, but anyone looking at the code will immediately know what the parameters are, without having to rely on (possibly outdated) comments, or hunt down and check the correct header file where the function is defined. And if someone ever changes the argument order of the function, the calling code will immediately stop compiling.

NCrystal release v2.5.0 introduces 14 such strong type-wrappers of variables that used to be represented as raw double's, but more might be added in the future as needed:

  • NeutronWavelength, NeutronEnergy, CosineScatAngle, CrossSect, Temperature, DebyeTemperature, AtomMass, SigmaBound, SigmaFree, SigmaAbsorption, Density, NumberDensity, MosaicityFWHM, MosaicitySigma

In addition to scalars, NCrystal code also has frequent usage of mathematical vectors, in particular three dimensional: directions, crystal axes, (h,k,l) points, plane normals, unit cell positions. Currently, 7 such vector types are used throughout NCrystal, with more likely to be added in the future:

  • NeutronDirection, LabAxis, CrystalAxis, HKLPoint, LCAxis, HKLInfo::Normal, AtomInfo::Pos, and Vector.

As intended, all of these are strongly typed, and can not be implicitly converted to each other. However, they can be explicitly converted (or more correctly, "reinterpreted", since no copying of data is needed), with the .as<>() template method. This is mostly used to convert other types to and from the Vector class, which is an internal class representing generic mathematical 3-vectors, and which have a lot of mathematical utilities available which are not available for the other types. Here is for instance how two NeutronDirection's can be used as generic mathematical Vector's, in order to calculate the length of their cross product (as a silly example):

double myFct(const NeutronDirection& dir1, const NeutronDirection& dir2 ) {
  return dir1.as<Vector>().cross( dir2.as<Vector>() ).mag();
}

Finally, it should be mentioned that all the vector types above have methods available for directly converting to/from std::array<double,3> and C-style arrays (double[3]).

Modern C++ utilities

To support easy and efficient internal development of code in NCrystal, several modern C++ utilities and classes were re-implemented in NCrystal (some of these classes will be eventually down the road be replaced with standardised versions from C++17 or C++20). A few of the more useful ones are (a few were available earlier than NCrystal v2.5.0 but included here for completeness):

  • NCrystal::Optional<T>: This custom implementation of C++17's std::optional, is a class template which contains an optional contained value, i.e. a value that may or may not be present. This provides clear benefits for semantics and convenience when for instance dealing with optional arguments or return values, and does away with the need for certain "magic" values of types used to specify inavailability (e.g. putting a double to -1.0 to signify "not-available").
  • NCrystal::Variant<T1,T2>: This custom implementation of C++17's std::variant, is a class template which contains either a T1 or a T2 value, or might be empty. This is not exactly like the more powerful std::variant, which supports more than two types, but it suffices for our needs.
  • NCrystal::Span this custom implementation of C++20's std::span is a type-erasure class which can point at any contigious range of values. Thus, Span<double> can represent a std::vector<double>, a std::array<double,N>, a raw double[N] C-style array, a SmallVector<double,N> (see next item), or possibly a sub-range of any of those. A function needing to work with arrays of numbers can simply take Span<double> arguments, leaving it up to the caller what kind of actual type they want to use for their array.
  • NCrystal::SmallVector<T,N> is as the name implies a small-vector implementation (a.k.a. small-buffer optimisation (SBO)). In essence, a SmallVector<double,5> object is intended to work exactly like a std::vector<double> object, with the difference being that if there are at most 5 items in the container, no actual heap storage is needed - the items are instead stored directly on the SmallVector object itself. Thus, SmallVector's should be used wherever a vector will typically have very few elements - but needs to support arbitrarily many elements for rare use-cases. The benefit is not only that the CPU and memory overhead of a heap allocation can be saved, but also that the items are more likely to be in a low-latency memory cache when needed.
  • More Python-like iteration: Due to their usefulness (less typing, less bugs), we add custom iteration helpers similar to Python's range and enumerate functions. Thus, in NCrystal code we can do:
for ( auto i : ncrange(v.size()) )
    std::cout<< v[i] << std::endl;
for ( auto e : enumerate(v) )
    std::cout<<" Item at index "<<e.idx<<" has value "<<e.val<<std::endl;

Various other note-worthy changes

A few updates were implemented which, while unrelated to other changes mentioned on this page, might be interesting to some users. They are:

  • AtomInfo and DynamicInfo sections on Info objects (in both C++ and Python) now have convenience links, making it easy to dig out the corresponding AtomInfo object from a DynamicInfo object, and vice-versa.
  • The concept of global versus per-element Debye temperature was not very useful from either a technical or physics point of view. Mostly it just resulted in complications for code trying to access the Debye temperature of a given element. Therefore, Debye temperatures are now always just available (in C/C++/Python) as per-element values. The exception (for reasons of backwards compatiblity) is that in NCMAT files, it is still allowed to specify a global Debye temperature. However, when loaded, it will simply be transferred to all elements as a per-element Debye temperature.
  • The name and signatures of the methods used to sample a scattering have changed, so the functions named generateScattering and generateScatteringNonOriented in C++ and Python have been replaced with functions named sampleScatter and sampleScatterIsotropic (note that the "isotropic" does not refer to the scattering being isotropic, but that the material in which the scattering took place is isotropic — true for instance for a crystal powder). In C++ the methods now take strongly typed arguments (e.g. the kinetic energy of the neutron must now be of the NeutronEnergy instead of a plain double), and return their results by value. The outcome always include the final-state energy of the neutron, and either a vector NeutronDirection, or the cosine of the scattering angle, μ=cos(θ). The functions in the C-API were accordingly changed from e.g. ncrystal_genscatter to ncrystal_samplescatter, and the arguments also slightly changed.
  • Note that when using the Python interface, NCrystal's default random number stream used to be replaced by the global stream from the standard Python random module whenever the NCrystal module was imported. This is no longer the case, users can of course still perform the change manually if for some reason they prefer the previous behaviour.
⚠️ **GitHub.com Fallback** ⚠️