RabbitMQ version to use - aiidateam/aiida-core GitHub Wiki

Summary

Unfortunately, RabbitMQ versions >= 3.8.15 have reduced some default timeouts, which in practice makes them incompatible with AiiDA and its design: AiiDA will work, but any workflow running for longer than 30 minutes (or whatever timeout RabbitMQ is configured with) will except, i.e. fail with an exception.

For more details, you can check issue #5105 (and optionally the related #5278 and #5300).

If you use RabbitMQ >= 3.8.15, you can edit a configuration file to change the default timeout (see below). Otherwise, you can try installing RabbitMQ <= 3.8.14 (e.g. 3.7.28). Since these versions are somewhat dated, we offer some suggestions on how to install one.

In the medium term, we may need to find a different solution, probably dropping RabbitMQ and replacing it with something else, which will require some development effort.

Using a recent RabbitMQ version

Ubuntu instructions

  • Become the root user (sudo su root)

  • Create /etc/rabbitmq/advanced.config with the following content (see docs):

    %% advanced.config
    [
      {rabbit, [
        {consumer_timeout, undefined}
      ]}
    ].
    
  • Restart RabbitMQ via

    service rabbitmq-server restart
    
  • Check that the configuration was properly picked up

    sudo rabbitmqctl environment | grep consumer_timeout
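
The file-creation step above could be scripted roughly as follows. This is only a sketch: the function name `write_advanced_config` is hypothetical, and the target path is passed as an argument so that the function can be tried on a scratch file before pointing it at /etc/rabbitmq/advanced.config (which requires root).

```shell
# Hypothetical helper: write the advanced.config content shown above
# to the path given as first argument (overwrites an existing file).
write_advanced_config() {
    target=$1
    cat > "$target" <<'EOF'
%% advanced.config
[
  {rabbit, [
    {consumer_timeout, undefined}
  ]}
].
EOF
}

# Try it on a scratch file first and inspect the result.
scratch=$(mktemp)
write_advanced_config "$scratch"
cat "$scratch"
```

Once the output looks right, the same function could be run as root with /etc/rabbitmq/advanced.config as its argument, followed by the restart and check commands listed above.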
    

MacOS (Homebrew)

These instructions assume you have installed RabbitMQ using Homebrew (recommended).

  • Create the /opt/homebrew/etc/rabbitmq/rabbitmq file with the following content (see docs):

    # 1000 hours in milliseconds (increase if you expect your workflows to run longer)
    consumer_timeout = 3600000000
    
  • Restart RabbitMQ via

    brew services restart rabbitmq
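
If you want a different limit than the 1000 hours used above, the value in milliseconds can be computed with shell arithmetic. A small sketch (the function name `hours_to_ms` is hypothetical, not part of RabbitMQ or AiiDA):

```shell
# Hypothetical helper: convert a number of hours to milliseconds,
# the unit expected by consumer_timeout.
hours_to_ms() {
    echo $(( $1 * 60 * 60 * 1000 ))
}

hours_to_ms 1000   # prints 3600000000, the value used above

# Print a config line for a 2000-hour limit instead.
echo "consumer_timeout = $(hours_to_ms 2000)"
```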
    

Both Ubuntu & MacOS - wrap up

Finally, run rabbitmq-diagnostics status (you may need to open a new terminal first) and check that your new configuration file is listed under the "Config files" section.

TBD: If you had already started AiiDA before changing the configuration, it might have created queues with the short timeout; these cannot be changed afterwards, so things will still fail (in the aiida-core docker container: remove the ~/.rabbitmq folder).

For more details, see also this issue comment.

Once RabbitMQ is properly configured, the warning about the unsupported RabbitMQ version can be suppressed with:

verdi config set warnings.rabbitmq_version false

Using an old RabbitMQ version

In general, if you have a version of RabbitMQ < 3.8.15 (or you can install one on your computer in some way), then you are OK and you don't need to read further.
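
The version condition above can also be checked mechanically. The following is a minimal sketch using awk to compare dotted version numbers; the function name `version_lt` is hypothetical, and the version string is assumed to be supplied by you (for instance, copied from the status output of your installation).

```shell
# Hypothetical helper: return 0 if version $1 is strictly lower than $2.
# Compares major, minor and patch numerically; missing parts count as 0.
version_lt() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 < y[i] + 0) exit 0
            if (x[i] + 0 > y[i] + 0) exit 1
        }
        exit 1   # versions are equal
    }'
}

# Example with the version suggested on this page.
if version_lt "3.7.28" "3.8.15"; then
    echo "default timeouts are compatible with AiiDA"
else
    echo "consumer_timeout needs to be reconfigured"
fi
```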

If, however, you are on a recent OS or distribution, the simplest way to use a compatible version of RabbitMQ is to run it via Docker.

Instructions

  1. Install docker e.g. via your package manager or following instructions here
  2. Run the following command in the command line:
    docker run --detach \
      --hostname aiida-rabbitmq \
      --name aiida-rabbitmq-server \
      --restart=unless-stopped \
      --publish=127.0.0.1:5671:5671 \
      --publish=127.0.0.1:5672:5672 \
      --mount=type=volume,src=rabbitmq-volume,dst=/var/lib/rabbitmq \
      rabbitmq:3.7.28
    

Some explanation on the command

  • It's important to specify a --hostname, see https://hub.docker.com/_/rabbitmq
  • We name the container aiida-rabbitmq-server to easily identify it
  • We set the restart policy to unless-stopped (so the container restarts after a failure or a reboot, if it was running before)
  • We publish the two relevant ports (5671 and 5672), bound to 127.0.0.1
  • We mount a named docker volume (rabbitmq-volume) for the data that RabbitMQ needs to store, so that messages persist between restarts
  • We start a 3.7 version (3.7.28, the latest 3.7 release available at the time of writing)

Note that the first time, docker will have to pull the image; subsequent restarts will be very fast.

Troubleshooting

  • If, when starting, the command complains that a container named aiida-rabbitmq-server already exists, use docker kill aiida-rabbitmq-server (if it is still running and you really want to restart it), and then docker rm aiida-rabbitmq-server to remove the old one.

    • Another option is instead to not specify a --restart in the command, but put --rm to clean up the container automatically upon stopping. Then one does not need to run docker rm, but needs to restart the service at every failure or reboot.
  • If docker complains that the ports (5671, 5672) are already in use, it means you already have some other RabbitMQ server running on your machine (probably a system-wide installation). Either stop it (and make sure it does not restart at the next reboot), or change the host ports in the command above to other ones. In the latter case, remember to also configure AiiDA to connect to the correct RabbitMQ ports!
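
For the second option, only the host side of each --publish flag changes (the format is ip:host_port:container_port); the ports inside the container stay 5671 and 5672. A small sketch that prints the adjusted command without running it, so it can be reviewed first (the function name `print_docker_command` and the example ports 5681 and 5682 are hypothetical):

```shell
# Hypothetical helper: print the docker command from this page with custom
# host ports. $1 is mapped to container port 5671, $2 to container port 5672.
print_docker_command() {
    printf '%s\n' "docker run --detach --hostname aiida-rabbitmq --name aiida-rabbitmq-server --restart=unless-stopped --publish=127.0.0.1:$1:5671 --publish=127.0.0.1:$2:5672 --mount=type=volume,src=rabbitmq-volume,dst=/var/lib/rabbitmq rabbitmq:3.7.28"
}

# Example: use host ports 5681 and 5682 instead of the defaults.
print_docker_command 5681 5682
```

The printed command can then be copied and run by hand; AiiDA must afterwards be pointed at the host ports chosen here.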