Main changes and differences between McStas 2.x and 3.x described at 3.0 release time (Jan 2021)
Changes and differences between the McStas 2.x series and 3.0
The purpose of this page is to serve as documentation for users making the switch from the McStas 2.x series to the 3.0 series.
Most changes are to the back-end code-generator and will not be visible to users apart from obvious benefits like being able to use the power of GPUs for simulations.
Installation
Our install docs are now available on the McCode GitHub page at https://github.com/McStasMcXtrace/McCode/tree/master/INSTALL-McStas
To install a full McStas 3.0 suite, please install the "mcstas-suite-*-ng" packages, which are suffixed with ng (for next generation), e.g. mcstas-suite-python-ng. This allows friendly coexistence with the 2.x releases and does not force an automatic update on users. This is important since some changes in 3.0 break backwards compatibility, in particular for users who have written components themselves.
GPU acceleration is provided on the basis of the OpenACC programming paradigm. Thus, to take advantage of this feature users are required to install the NVIDIA C compiler (available at https://developer.nvidia.com/hpc-sdk). It also implies that, at present, GPU acceleration is only available on Linux systems.
Caveat emptor: Although a lot of care and effort has gone into proofing McStas 3.0, the changes to the code-generator are profound, and it is not unlikely that you will experience issues. As always, should you experience issues, reach out to your friendly development team, and we will try to help you as best we can.
New concepts in McStas 3.0
Funneled mode
Changes to the McStas grammar and syntax
- OUTPUT PARAMETERS have been superseded by the new understanding of DECLAREd variables (see below), and are now unnecessary. Hence any OUTPUT PARAMETERS(...) line is now ignored, and such lines have been removed from library components.
- DECLARE variables have a slightly stricter syntax than what is allowed by the C standard. Each declaration has to be on its own line. Here's what we mean:
DECLARE %{
double myvar1;
double myvar2;
int myint1;
int myint2;
%}
is allowed, whereas
DECLARE %{
double myvar1,myvar2;
int myint1,myint2;
%}
is not.
Variables declared in DECLARE should not be used as lvalues inside the TRACE section of either INSTRUMENT or COMPONENTs, i.e. assignments to such variables should be restricted to INITIALIZE and SAVE sections. If an assignment is nevertheless present in TRACE, the code will likely crash or yield unpredictable results when used in conjunction with GPU acceleration.
- USERVARS. This is a new concept that addresses some issues arising from the new massively parallel threading model required for GPU acceleration. A USERVAR is a variable that is attached to the particle being traced through the instrument, and hence is allowed to change its value along the particle trace. This is in direct contrast to DECLARE variables, which should not change. An important example is the use of user variables in combination with Monitor_nD instances: variables (not predefined) to be monitored by Monitor_nD must be declared as USERVARS. Please also see below for the syntax.
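The distinction between per-particle and shared state can be sketched in plain C. This is a toy model, not McStas-generated code: toy_particle, toy_slit_data, toy_slit_trace and toy_trace_bunch are hypothetical names made up for illustration, and the real _particle struct has a different layout. The sketch only assumes what is stated above, namely that a USERVAR travels with the particle while DECLARE-style data is filled in before tracing and is then only read.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the per-ray state; the real _particle struct differs. */
typedef struct {
    double x, y, z;        /* position */
    double vx, vy, vz;     /* velocity */
    double p;              /* weight */
    double example_flag;   /* plays the role of a USERVARS entry */
} toy_particle;

/* Toy stand-in for component data that is set once before tracing
   (the role DECLARE variables play) and only read afterwards. */
typedef struct {
    double slit_halfwidth;
} toy_slit_data;

/* One "TRACE" step: every write goes to the ray, none to the shared data. */
void toy_slit_trace(const toy_slit_data *slit, toy_particle *ray)
{
    int inside = ray->x > -slit->slit_halfwidth &&
                 ray->x <  slit->slit_halfwidth;
    ray->example_flag = inside ? 1.0 : 0.0;
}

/* Trace a bunch of rays. The iterations do not touch each other's data,
   so they could run in any order, or concurrently, with the same result. */
void toy_trace_bunch(const toy_slit_data *slit, toy_particle *rays, size_t n)
{
    assert(slit != NULL && (rays != NULL || n == 0));
    for (size_t i = 0; i < n; ++i)
        toy_slit_trace(slit, &rays[i]);
}
```

Had toy_slit_trace instead written the flag into toy_slit_data, every ray would overwrite the same memory location, which is the kind of conflict the rule against assigning to DECLARE variables in TRACE is meant to prevent.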
- DEFINITION PARAMETERS are no longer allowed. All component parameters must now be SETTING PARAMETERS. This allows the new code-generator to greatly cut the size of the generated code (in some extreme cases by a factor of 5). To facilitate (most of) the uses of DEFINITION parameters, we have enriched the grammar to also include vector and string types as SETTING PARAMETERS. Like so:
SETTING PARAMETERS( a=0.1, MCNUM b=1.2, double c=3.4, int d=5, string e="hello", vector f={6,7.8} )
- A few new macros and functions have been added:
  - INSTRUMENT_GETPAR("thing"), which will extract the value of the instrument parameter. Note that the parameter name should be given as a string, i.e. quoted.
  - COMP_GETPAR(""), akin to the MC_GETPAR macro available for McStas 1.x/2.x
  - MC_GETPAR3()
  - particle_getvar(_particle *p, const char *name, int *return), a function to extract the value of a particle state parameter (such as x, vx, or p) or of a USERVARS variable at the time of the function call. Note that the parameter name should be supplied as a string.
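To illustrate the idea of fetching a particle value by name, here is a minimal C sketch. It is not the McStas implementation: toy_particle, toy_fields and toy_getvar are hypothetical names, and the convention for the success flag (1 when the name is found, 0 otherwise) is this sketch's own choice and may differ from what the real function reports, so please check the McStas sources before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the per-ray state; the real _particle struct differs. */
typedef struct {
    double x, y, z, vx, vy, vz, p;
    double example_flag;   /* plays the role of a USERVARS entry */
} toy_particle;

/* Name -> member offset table, so the lookup itself stays generic. */
static const struct { const char *name; size_t offset; } toy_fields[] = {
    { "x",  offsetof(toy_particle, x)  },
    { "y",  offsetof(toy_particle, y)  },
    { "z",  offsetof(toy_particle, z)  },
    { "vx", offsetof(toy_particle, vx) },
    { "vy", offsetof(toy_particle, vy) },
    { "vz", offsetof(toy_particle, vz) },
    { "p",  offsetof(toy_particle, p)  },
    { "example_flag", offsetof(toy_particle, example_flag) },
};

/* Return the value of the named member. *found (if non-NULL) is set to 1
   when the name is known and to 0 otherwise, in which case 0.0 is returned. */
double toy_getvar(const toy_particle *ray, const char *name, int *found)
{
    assert(ray != NULL && name != NULL);
    size_t n = sizeof toy_fields / sizeof toy_fields[0];
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(name, toy_fields[i].name) == 0) {
            if (found) *found = 1;
            return *(const double *)((const char *)ray + toy_fields[i].offset);
        }
    }
    if (found) *found = 0;
    return 0.0;
}
```

A monitor-like consumer would call toy_getvar(&ray, "example_flag", &found) once per ray and skip the ray, or report an error, when found comes back as 0.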
- NOACC and CPU COMPONENT keywords, signalling that a component is not thread safe and will not run on GPUs. NOACC is used inside COMPONENT code, whereas CPU COMPONENT is used in .instr-files. For instance, a non-GPU component might have the following header
DEFINE COMPONENT foo
SETTING PARAMETERS( a=0, b=1, string c="d")
NOACC
If an instrument that includes this component is compiled with OpenACC, this will trigger running McStas in FUNNELed mode (see below). The simulation up to the point of foo will be run GPU-side, after which foo will be traced for all remaining rays using CPU-side calculations. Should any components remain to be run, the rays will be transferred back to the GPU device(s) and computed there until the end (or until another NOACC component is encountered).
The same effect can be targeted from the .instr-file by prepending the component instance with CPU:
CPU COMPONENT psd0 = PSD_monitor( ... )
AT(0,0,0) RELATIVE PREVIOUS
While PSD_monitor is ready for use on GPUs, in this case it will be forced to run CPU-side.
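The way a component sequence is cut into GPU and CPU sections can be sketched as follows. This is a toy model in plain C under the assumptions stated above (consecutive components on the same device form one section, and the ray bunch is handed over at each section boundary); toy_device, toy_component, toy_segment_end and toy_count_transfers are hypothetical names, not part of McStas.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TOY_GPU, TOY_CPU } toy_device;

typedef struct {
    const char *name;
    toy_device  device;   /* TOY_CPU models a NOACC or CPU COMPONENT entry */
} toy_component;

/* Index one past the last component that runs on the same device
   as comps[start], i.e. the end of the section beginning at start. */
size_t toy_segment_end(const toy_component *comps, size_t n, size_t start)
{
    assert(comps != NULL && start < n);
    size_t end = start + 1;
    while (end < n && comps[end].device == comps[start].device)
        ++end;
    return end;
}

/* Number of times the ray bunch changes device along the instrument. */
size_t toy_count_transfers(const toy_component *comps, size_t n)
{
    size_t transfers = 0;
    for (size_t start = 0; start < n; ) {
        size_t end = toy_segment_end(comps, n, start);
        if (end < n)
            ++transfers;   /* another section follows on the other device */
        start = end;
    }
    return transfers;
}
```

For a three-component sequence of a GPU-ready guide, the NOACC component foo and a GPU-ready monitor, this gives three sections and two hand-overs, which is why placing CPU-only components next to each other keeps the copying overhead down in this model.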
Raw ChangeLOG as included in the McStas distribution
Changes in McStas 3.0, December 15th, 2020
McStas 3.0 is the first official release in the 3.x series, with a modernised code-generator and support for GPU acceleration on NVIDIA cards.
Thanks:
- Thanks to all members of the joint McStas-McXtrace team, you guys ROCK!
- A special thanks to Jakob Garde who has continued to contribute (unpaid!) to the 3.0 efforts even after leaving DTU.
- Thanks to Guido Juckeland (HZDR,DE) and Sebastian Alfthan (CSC,FI) who were behind the GPU Hackathons we participated in
- Thanks to our NVIDIA mentors Vishal Metha, Christian Hundt and Alexey Romanenko
Installation:
- Our install docs are now available on the McCode GitHub page at https://github.com/McStasMcXtrace/McCode/tree/master/INSTALL-McStas
- The meta-packages for Debian/Ubuntu and RedHat/Centos/Fedora are named e.g. mcstas-suite-python-ng for 'next generation' for coexistence with the 2.x series packages
- OpenACC GPU (and CPU multicore) acceleration is at the time of release ONLY supported on Linux systems, as this is the only platform targeted by the NVIDIA HPC package. Versions 20.x should all work. On Windows 64bit systems, support is promised to arrive with WSL 2.0 (i.e. via a Linux-layer), but may also become supported with a targeted release by NVIDIA. macOS is unfortunately not supported by NVIDIA HPC acceleration.
Main new features and changes:
- New code-generation scheme based on functions instead of #defines, which brings
- Much improved compilation-times, the code is better suited for modern compilers
- In most cases a speed-up of order 20%
- The neutron _particle is now represented by a struct
- The component types and instances are also represented by structs
- In the generic TRACE function of a given component type, the _comp var is short-hand for "whatever the component instance is"
- New instrument section of USERVARS %{ double example_flag; %} which enriches the _particle struct
- In component DECLARE blocks, assignments can no longer be done and all declarations must be listed independently, i.e. double a; is OK, double a,b; is not. Variables in this scope are automatically so-called "OUTPUT PARAMETERS" (we may deprecate that keyword completely for the official McStas 3.0 release)
- Components no longer support DEFINITION PARAMETERS, instead the SETTING PARAMETERS must be used, which now includes a vector and string type supplementing the (default) double/MCNUM and int types.
- New macros have been added for
- INSTRUMENT_GETPAR(parameter_name)
- COMP_GETPAR(component, parameter_name) which is similar to the legacy MC_GETPAR
- MC_GETPAR3(component_class, component_name,parameter_name)
- The function particle_getvar(_particle,"variable",success) provides component-access to the instrument USERVARS, e.g. in Monitor_nD.
- Further, the new cogen implements support for Nvidia GPU's, for details see point 2 below.
- Support for OpenACC acceleration on NVIDIA GPU's on Linux systems
- #pragma driven, inserted by the code-generation, but also implemented in libs and comps
- Speedups measured using top-notch NVIDIA V100 datacenter cards are in the range of 10-600 with respect to a single-core of a modern CPU, see the figure in the below link. It was generated for an "ideally" parallel instrument. (https://camo.githubusercontent.com/96fcceec70d0761f8709eda4de7bcbde5597aee6/687474703a2f2f746d702e6d63737461732e6f72672f563130302d73706565647570732e706e67)
- Platform support / compiler configuration:
- Required compiler for GPU/OpenACC: NVIDIA HPC SDK 20.x or newer. Community edition works fine
- Required GPU hardware: NVIDIA Tesla card + configured driver
- Windows: At this point UNSUPPORTED for GPU/OpenACC since NVIDIA does not yet ship a package for this platform. Support should come with WSL 2.0 or via native support from NVIDIA.
- macOS: At this point UNSUPPORTED for OpenACC since NVIDIA does not ship a package for this platform.
- Linux: Full acceleration support with GPU, and with CPU/multicore.
- Install the compiler and put it on your system PATH. Install and configure Nvidia drivers for your card.
- We hope that GCC will offer better support for OpenACC in the near future.
- Tool support
- On Linux and macOS mcrun is preconfigured so that mcrun -c --openacc compiles with:
- Linux: nvc -ta:tesla,managed,deepcopy -DOPENACC
- Linux: You may configure for use on CPU/multicore via: nvc -ta:multicore -DOPENACC
- The --funnel option can be used to launch the FUNNEL simulation flow, see description below.
- For both of the above, adding -Minfo:accel will output verbose information on parallelisation
- In mcgui, the mcrun --openacc configuration can be selected via the preferences
- Both mcgui and mcrun allow combining --openacc and --mpi if you have multiple GPU's available. The n'th MPI process will attempt to use the k'th GPU, where k = n % #available GPU's.
- Special McStas 3.0 grammar for mixed CPU/GPU mode:
- A "FUNNEL" mode has been added, which allows
- Mixed GPU/CPU mode, where sections of the instrument are executed on each device type, with copying of neutron-bunches back and forth.
- When the CPU keyword is prepended in the instrument grammar, it signifies that the component should be executed on CPU rather than GPU, e.g. CPU SPLIT 10 COMPONENT Sample = Something()
- Sections before and after that are not marked CPU will be executed on GPU.
- If a component includes the NOACC token in the component header, the CPU-mode is forced through the compilation, as it signifies that the component does NOT support GPU. This is for the time being the case for Union_master. (Support is expected to come with McStas 3.1)
- Interoperability with McStas 2.7
- Support for MCPL event interchange has been added through the MCPL_input and MCPL_output components, which work on both CPU and GPU in McStas 3.0. Note, however, that when targeting GPU, MCPL_input reads ALL particle events during INITIALIZE and MCPL_output writes ALL particle events during SAVE, whereas on CPU in 3.0 or 2.7, reads and writes happen during the TRACE flow.
- Known limitations
- The Union subsystem works on CPU only for now, but can be used in the mixed GPU/CPU funnel mode as mentioned above. Union_master is a NOACC component.
- The same solution is applied in use of the NCrystal_sample and will eventually come for Sample_nxs.
- Not all features of all components correspond to those from McStas 2.7, but all essential components have been fully ported from the 2.7 tree to the 3.0 tree. Hence, some parts distributed with McStas 2.7 will either not exist in the 3.0 release or may not function, due to either: (1) very specialised features, (2) maintainability issues or (3) use of complex algorithms.
- Notable examples of unsupported / non-functional components or instruments are:
- The scatter_logger and shielding_logger components / instruments
- The Vitess_chopperfermi component
- The FZJ_BenchmarkSfin2 instrument / SANS_benchmark2 component
- The MCPL_merge instrument
- SEMSANS_instrument
- Sample_nxs
- Generally, most components/instruments are now ported to our OpenACC-based GPU technology, but you will likely find combinations of use that slipped through our not fully exhaustive test suite. Missing support may come in the form of either
- Code that does not compile
- Instruments that segfault during execution
- Instruments or components that produce obscure results
- At the time of release, the nightly tests http://new-nightly.mccode.org/ show that
- McStas 3.0 ships with 211 instruments that successfully compile
- These instruments use 147 of our components
- We don't ship an updated set of manuals for McStas 3.0, but essential documentation is available on the McCode GitHub wiki https://github.com/McStasMcXtrace/McCode/wiki
Tools:
- The tools for McStas 3.0 correspond to those distributed with McStas 2.7, apart from the above-mentioned mcrun options for --openacc and --funnel.
- On macOS (from 11.0 Big Sur onwards), mcgui will assume light/dark mode with the system settings. (The change came from using the system python3 with our app/miniconda-distributed Qt libs etc.)
- We no longer officially support the perl/PGPLOT backend; the perl-based tools may or may not work on your system.
- The mcformat utility has been deprecated.
Platforms:
- Nothing really new to report here. We still support 64bit Windows 10, all recent 64bit macOS including 11.0 Big Sur, Debian-based and RPM-based distros. (RPMs are built on/for CentOS, you may get varying mileage elsewhere.)
Libraries:
- Updated version 2.2.2 NCrystal library from T. Kittelmann (ESS) and X.X. Cai (CSNS), distributed with McStas on Unix platforms only. With McStas 3.0, NCrystal is more tightly integrated and should run without using ncrystal_preparemcstasdir.
- MCPL library from the same authors now included at v. 1.3.2
We hope you will enjoy this new release!!!