2025.7.3 Meeting Notes - parthenon-hpc-lab/parthenon GitHub Wiki

Tentative agenda

JM:

Removing "next step" from parameter input
Running into some strange issues on the CI machine
Working on adding docstrings, PR coming soon (possibly some conflicts with TOML PR)
LR: Saw a similar CI issue, but didn't diagnose it

BP:

Opened up a PR that optionally allows inputs to use TOML
Keeping backwards compatibility is tricky
Allows strong typing, dates, nested blocks, etc.
Need to "tomlize" certain aspects of current input files to use them with TOML
Has backward compatibility with tests, but need to test with downstreams
Since things are strongly typed, this creates some backward compatibility issues, but maybe worth going in this direction
Also a PR to output history headers on restart; this is required if the history outputs are changed on restart
JM: Surprised that the TOML switchover didn't reduce LOC more. Also a little concerned about how it is included via cmake, though it's not clear there is a better way. Maybe just include the header directly in our repo?

LR:

AR:

Fixed issue with scatter views on OpenMP
Fixed issues with unified par dispatch, got it working for Forrest (more in Forrest's update)
Subpack PR ready for review soon

FG:

Working on testing register pressure and timing in AthenaPK using unified par dispatch. Looks like no change in register pressure on Hopper H100s between current Parthenon and Adam's unified par dispatch.
Tricky to track down changes in register pressure though
AthenaPK is currently seeing a slowdown of 1/70 (roughly 1.4%) after moving to this PR

BB:

PR adding build info to parthenon output ready to go. (LR will review, JM will review as well)
Going to get time-dependent boundary conditions working

PM:

No updates, reviewed coalesced comms PR
Do we need dev execution space fences? They may be required when using multiple streams, since par_for calls may not run sequentially.
FG: We don't run on the default stream, but unsure how that is set.
JM: Does the Kokkos dev execution space use the default stream?
PM: Would like a formal reference showing that what we are doing is safe.
Some links from the chat:
- https://kokkos.org/kokkos-core-wiki/ProgrammingGuide/ProgrammingModel.html
- https://kokkos.org/kokkos-core-wiki/API/core/execution_spaces.html#functionality
- https://docs.nvidia.com/cuda/cuda-runtime-api/stream-sync-behavior.html
- https://developer.nvidia.com/blog/gpu-pro-tip-cuda-7-streams-simplify-concurrency/

Assigned reviewers to each PR
LR: Need to decide on the default behavior of coalesced comms. Probably worth adding a test that runs the same problem in a bunch of different configurations (comms, pack_size, restart from different outputs) for a few steps and checks for bitwise exactness.
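
A rough sketch of the shape such a test could take, in Python for brevity. `run_problem` is a hypothetical stand-in, not a Parthenon call: it sums a list in chunks of `pack_size`, so that changing the chunking changes the floating-point summation order. A real test would instead run the driver in each configuration and compare the output files byte for byte.

```python
import struct


def accumulate(values):
    """Left-to-right floating-point sum (explicit loop, so the order is fixed)."""
    total = 0.0
    for v in values:
        total += v
    return total


def run_problem(values, pack_size):
    """Hypothetical stand-in for a short run: reduce in chunks of pack_size."""
    partials = [
        accumulate(values[i:i + pack_size])
        for i in range(0, len(values), pack_size)
    ]
    return accumulate(partials)


def raw_bits(x):
    """Byte representation of a double, for bitwise (not approximate) comparison."""
    return struct.pack("<d", x)


def find_mismatches(values, pack_sizes):
    """Return the pack sizes whose result differs bitwise from the first one."""
    reference = raw_bits(run_problem(values, pack_sizes[0]))
    return [
        p for p in pack_sizes[1:]
        if raw_bits(run_problem(values, p)) != reference
    ]


values = [1e16, 1.0, 1.0, 1.0]
mismatches = find_mismatches(values, [1, 2, 4])
print("configurations that are not bitwise identical:", mismatches)
```

Comparing raw bytes rather than using `==` also catches differences such as `0.0` versus `-0.0`, which compare equal as numbers but are not bitwise identical.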