Debugging - GRTLCollaboration/engrenage GitHub Wiki
This code may be the simplest NR code I could imagine but it took a lot of work to debug from its initial iteration! I share the details with you as an honest assessment of how much work testing a new code can be. Don't underestimate it!
How did I debug?
Just rereading the code caught a few mistakes but it would have been impossible to debug without specific tests.
The most efficient way to start was to calculate specific tensor quantities like the Ricci tensor, as here, for which I had obtained analytic solutions already in Mathematica. These quickly identified some errors in the specific tensor quantities (mostly trivial sign and addition errors).
Once these were fixed the time evolution still seemed wrong. So I wrote a test of the RHS equations for a non trivial BH solution (ingoing Eddington Finkelstein coordinates), as here. This identified the last of the errors, including a typo in one of the equations in the papers I was copying. Lesson learned - rederive all equations from scratch! I am grateful to Ulrich Sperhake for suggesting this test.
At this point I had a strange bump in the aij evolution in the centre. I spent a lot of time debugging, thinking that factors of 1/r were spoiling my evolution, and basically rewriting the code so they were all explicitly factored out. This did not change anything! It turned out I just needed more Kreiss Oliger dissipation to stabilise things at the inner grid points. I found this by noticing that turning off the dissipation made the problem worse. Of course it would have been better to find this first, but I don't regret eliminating the 1/r issue from my line of enquiries, I learned a lot in the process about the rescaling and my next guess turned out to be good. The lesson here is that very often the bug isn't what you think it is, so always keep an open mind to other things.
Finally my performance was still pretty terrible and I needed very high resolutions to get good results. The outer boundaries were also spoiling things for me - implementing the correct fall off with r for the variables helped a bit, but I decided that I needed to implement logarithmic coordinates as in other spherically symmetric codes to be able to use only a small number of points efficiently. Fortunately Thomas Baumgarte had very kindly shared his code with me and I was able to copy almost exactly his implementation of the finite difference stencils (at the expense of lowering their order). This worked very nicely almost straight away, which shows the value of copying existing tested code wherever you can.
There may be other bugs I haven't yet detected, or ways to improve the code further, but seeing the correct collapse of the lapse in the BH example with only a few minutes runtime was a very happy moment for me! Whilst there were times where I was a bit annoyed, I never doubted that I would get the code working eventually. There is a large but finite number of things that can be wrong, and a systematic elimination of each will eventually get you to a working code, even if it takes a long time. Final lesson: don't give up!