Troubleshooting - Surpr1se0/mosaic5G-docs GitHub Wiki

Intro

This page collects the errors encountered while configuring and deploying the different versions of the OpenAirInterface architecture, together with the fixes that were tried.

1. No configuration file found

After executing the docker-compose up gnb command, while in XenOrchestra, the following error occurs:

===================================
/proc/sys/kernel/core_pattern=|/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E
No configuration file found: please mount to /opt/oai-gnb/etc/gnb.conf
[INFO tini (1)] Spawned child process '/opt/oai-gnb/bin/entrypoint.sh' with pid '7'
[INFO tini (1)] Main child exited normally (with status '255')

Essentially, it states that the gnb.conf configuration file is not being found or mounted in the corresponding container as it should be.

Resolution:

  • Shut down the container
docker-compose down oai-gnb
  • Use the previously mentioned gnb image instead of the develop one
  • Give Docker permission to read the file through chmod (optional and not tested)
  • In the gnb section of docker-compose.yaml, change the extension of the mounted configuration file from .yaml to .conf, so that both file paths end in .conf.
  • Start the container again.
  • Check the logs:
docker logs oai-gnb
  • Additionally, check the amf logs to ensure that the gNB is connected.

2. DNS not being able to resolve name addresses

DNS cannot resolve names over the interface created by the OAI 5G deployment. Apart from that, the deployment works as expected: it is still possible to connect to the other interfaces and networks, as evidenced in the tests mentioned above.

Attempt no 1:


These were the steps followed to try to resolve or diagnose the problem:

  • Check whether the problem lies with DHCP on the virtual machine (rather than in the Docker instance):
sudo apt install isc-dhcp-client
sudo journalctl -u systemd-networkd | grep ens33
sudo journalctl -u systemd-networkd
sudo systemctl restart systemd-networkd
sudo nano /etc/netplan/01-network-manager-all.yaml

network:
  version: 2
  renderer: networkd
  ethernets:
    ens33:
      dhcp4: yes

sudo netplan apply
sudo dhclient -v ens33

# essentially we force dhcp
# from the client making a DHCP request to obtain
# an IP address with new DNS information

docker-compose was run again, but the problem remained.

  • Check the resolv.conf and named.conf files to see which DNS servers the instances are created with. The file is generated automatically by docker-compose and cannot be edited.
  • Try inserting DNS options into docker-compose, as the documentation suggests. This proved impossible, as there is no information on which parameters to insert or in which docker-compose components.
  • Try running other simulations. The problem does not appear there, as they do not attempt to communicate with external devices.
  • Force a known DNS server into the resolv.conf file via echo >. Also without success.
  • Try running the simulation on a different version of Ubuntu. The rest of the components do not run.
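
The resolv.conf check from the list above can be scripted. This is a minimal sketch, assuming the standard `nameserver <address>` line format; the function name `list_nameservers` is hypothetical, and the sample input is illustrative rather than captured from the deployment.

```shell
# Print the DNS servers listed in resolv.conf-style text read from stdin.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }'
}

# Illustrative input: 127.0.0.11 is Docker's embedded DNS resolver on
# user-defined networks.
printf '# generated file\nnameserver 127.0.0.11\noptions ndots:0\n' | list_nameservers
```

Inside a running container the same helper could be fed with `docker exec <container> cat /etc/resolv.conf | list_nameservers`, where `<container>` is a placeholder for the UE container name.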

Attempt no 2:


  • In docker-compose.yaml, add the following to the ue section and restart the container:
dns:
  - 8.8.8.8
  • Despite this, we continue to have connectivity with ext-dn.

  • Make the following change:

oai-smf:
  container_name: "rfsim5g-oai-smf"
  image: oaisoftwarealliance/oai-smf:v2.0.0
  environment:
    - TZ=Europe/Paris
  volumes:
    - ./mini_nonrf_config.yaml:/openair-smf/etc/config.yaml
  depends_on:
    - oai-amf
  networks:
    public_net:
      ipv4_address: 192.168.71.133

#VERSUS

oai-smf:
  container_name: "rfsim5g-oai-smf"
  image: oaisoftwarealliance/oai-smf:v2.0.0
  environment:
    - TZ=Europe/Paris
    - DEFAULT_DNS_IPV4_ADDRESS=172.21.3.100
  volumes:
    - ./mini_nonrf_config.yaml:/openair-smf/etc/config.yaml
  depends_on:
    - oai-amf
  networks:
    public_net:
      ipv4_address: 192.168.71.133
  • Even so, we don't have a positive result.

  • Go directly to mini_nonrf_config.yaml, the configuration file mounted as the smf volume. Change one of the DNS values and restart with docker-compose down && docker-compose up. It didn't work.

  • Run iptables -L to see the host's firewall rules. Insert the following rule:

iptables -A FORWARD -p icmp --icmp-type echo-request -j ACCEPT
  • Additionally, watch tail -f /var/log/ufw.log to check whether anything was being blocked. If so, run ufw disable. Nothing was blocked any longer, but the ping still did not work.

  • Run ip route inside the ue container. Enable packet forwarding with sudo sysctl -w net.ipv4.ip_forward=1. Configure NAT with sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE.

  • Still without positive results.

Attempt no 3:


Let's look at the architecture, but more specifically all of the connections.

  • NR UE: 12.1.1.1
  • UPF: 12.1.1.2
    • 71.134 - incoming from demo_private.
    • 72.134 - connection to SGI and EXT-DN

Let's check the pings:

  • 12.1.1.1 → 12.1.1.2 Works

  • From 12.1.1.1 → 192.168.72.134 Works

Then:

  • Ping from NR-UE → EXT_DN Works

But:

  • Ping from NR-UE → Internet Does not Work

The UPF has 3 interfaces:

  • ETH0: connection to demo_private.
  • ETH1: connection to demo_public, outwards and to ext-dn.
  • TUN0: connection with the UE.

💡According to the UPF container, all traffic without a defined route is sent via the gateway X.X.71.129 through the eth0 interface.

If the UE sends a ping to google.com, the packets reach the UPF and can also be seen on ETH0, but no traffic comes through on ETH1.

With this said, insert the following commands:

ip route del default via 192.168.71.129 dev eth0
ip route add 12.1.1.0/24 via 192.168.72.134 dev eth1

Masquerade packets coming from the UE and destined for ext-dn:

sudo iptables -t nat -A POSTROUTING -s 12.1.1.0/24 -o eth1 -j MASQUERADE

Finally, add a default route in the UPF so that any traffic not matched by the networks above leaves via X.X.72.135, which leads to ext-dn, through the correct interface:

ip route add default via 192.168.72.135 dev eth1
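
To confirm that the default route now points at the intended interface, its egress device can be extracted from the routing table. This is a minimal sketch, assuming the usual `ip route` output format from iproute2; the function name `default_route_dev` is hypothetical and the sample input is illustrative, not captured output.

```shell
# Print the egress interface of the default route.
# Reads `ip route` style lines on stdin, for example:
#   default via 192.168.72.135 dev eth1
default_route_dev() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++) {
            if ($i == "dev") { print $(i + 1); exit }
        }
    }'
}

# Illustrative input matching the default route added above.
printf 'default via 192.168.72.135 dev eth1\n12.1.1.0/24 dev tun0\n' | default_route_dev
```

Inside the UPF container, `ip route show | default_route_dev` would be expected to print eth1 once the commands above have been applied.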

3. UE authentication problems

When running the disaggregated version of the architecture, with the gNB split into CU and DU, the UE cannot authenticate to the network. The oai-amf docker logs show:

[UE] [ue] [info ] |    Index    |      Status      |       Global ID       |       UE Name       |               PLMN             |
[UE] [ue] [info ] |      1      |    UNREGISTERED     |         0x0       |         ue-rfsim        |            208, 99             |

At first we suspected that the UE was not being provisioned correctly in the SQL database, AMF or SMF. This was ruled out because the AMF received all of the correct UE parameters. What fixed the problem was dropping the manual deployment of the UE and CN and using a docker-compose setup to instantiate the network elements.

4. Container not Starting

The oai-gNB in a docker-compose does not run, showing this error:

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 81, in main
    command_func()
  File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 203, in perform_command
    handler(command, command_options)
  File "/usr/lib/python3/dist-packages/compose/metrics/decorator.py", line 18, in wrapper
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 1186, in up
    to_attach = up(False)
                ^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 1166, in up
    return self.project.up(
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/project.py", line 697, in up
    results, errors = parallel.parallel_execute(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/parallel.py", line 108, in parallel_execute
    raise error_to_reraise
  File "/usr/lib/python3/dist-packages/compose/parallel.py", line 206, in producer
    result = func(obj)
             ^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/project.py", line 679, in do
    return service.execute_convergence_plan(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/service.py", line 579, in execute_convergence_plan
    return self._execute_convergence_recreate(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/service.py", line 499, in _execute_convergence_recreate
    containers, errors = parallel_execute(
                         ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/parallel.py", line 108, in parallel_execute
    raise error_to_reraise
  File "/usr/lib/python3/dist-packages/compose/parallel.py", line 206, in producer
    result = func(obj)
             ^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/service.py", line 494, in recreate
    return self.recreate_container(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/service.py", line 612, in recreate_container
    new_container = self.create_container(
                    ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/service.py", line 330, in create_container
    container_options = self._get_container_create_options(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/service.py", line 921, in _get_container_create_options
    container_options, override_options = self._build_container_volume_options(
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/service.py", line 960, in _build_container_volume_options
    binds, affinity = merge_volume_bindings(
                      ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/service.py", line 1548, in merge_volume_bindings
    old_volumes, old_mounts = get_container_data_volumes(
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/compose/service.py", line 1579, in get_container_data_volumes
    container.image_config['ContainerConfig'].get('Volumes') or {}
    ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
KeyError: 'ContainerConfig'

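This problem happens when docker-compose is used without first rebuilding the Docker image referenced in the compose file: the compose file points at a build that is not present on the system. The traceback comes from the Python-based docker-compose (v1), and the same KeyError is commonly reported when that version recreates an existing container on a newer Docker Engine. In that case, removing the old container with docker-compose down before bringing the service up again may also help.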

5. RAN Nodes with same ID

The FlexRIC logs show the following when connecting to the CU and DU:

[E2AP]: E2 SETUP-REQUEST rx from PLMN 208.99 Node ID 3584 RAN type ngran_gNB_CU
nearRT-RIC: /flexric/src/lib/msg_hand/reg_e2_nodes.c:174: add_reg_e2_node: Assertion `it_node == end_node && "Trying to add an already existing E2 Node"' failed.
Aborted (core dumped)

To register both nodes of the gNB in the FlexRIC, the CU and the DU must be given different IDs within the same PLMN in their configurations. After this small change, the FlexRIC accepts both nodes without errors.
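
A quick way to spot the clash before starting the nodes is to compare the IDs in the configuration files. This is a minimal sketch, assuming each file contains a line of the form `gNB_ID = 0xe00;` (0xe00 is 3584 in decimal, the Node ID in the log above); the parameter name, the function name `dup_gnb_ids` and the file names are assumptions, not taken from the deployment.

```shell
# Print every gNB_ID value that is used by more than one of the given files.
# Assumes lines of the form:  gNB_ID = 0xe00;
# Values are compared as text, so 0xe00 and 3584 would not be seen as equal.
dup_gnb_ids() {
    for f in "$@"; do
        # Keep only the first gNB_ID assignment of each file.
        sed -n 's/^[[:space:]]*gNB_ID[[:space:]]*=[[:space:]]*\([^;[:space:]]*\).*/\1/p' "$f" | head -n 1
    done | sort | uniq -d
}

# Demonstration with two throwaway files that share the same ID.
tmp=$(mktemp -d)
printf '    gNB_ID = 0xe00;\n' > "$tmp/cu.conf"
printf '    gNB_ID = 0xe00;\n' > "$tmp/du.conf"
dup_gnb_ids "$tmp/cu.conf" "$tmp/du.conf"
rm -rf "$tmp"
```

An empty output means that no ID is shared between the files passed as arguments.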

6. Address Sanitizer in UPF

rfsim5g-oai-upf     | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf     | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf     | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf     | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf     | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf     | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf     | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf     | AddressSanitizer:DEADLYSIGNAL

This problem normally shows up right after instantiating the UPF. It indicates that the UPF process crashed, apparently due to a segmentation fault. This could be related to the build or code of the UPF, or to the IP address allocation inside the network shared by the swarm nodes.

This type of problem can happen when the components are started in the wrong order; in that case, adding the depends_on directive to the docker-compose.yaml file solves it.

Additionally, changing the "ASAN_OPTIONS" in the upf section of the docker-compose file can help, either by deactivating or by activating it.

Another reported solution is to reduce the ASLR entropy: sudo sysctl -w vm.mmap_rnd_bits=28

The Docker documentation states: "You should create overlay networks with /24 blocks (the default), which limits you to 256 IP addresses". Following that advice also helped.

This can also be a memory allocation problem, so check that enough RAM is available using free -h.

7. Using Absolute Paths for container volumes

This error appears after instantiating the gNB with docker-compose when the volumes are mounted using absolute paths instead of paths relative to the docker-compose file.

ERROR: for oai-gnb  Cannot start service oai-gnb: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "/root/docker-gnb/etc/gnb.conf" to rootfs at "/opt/oai-gnb/etc/gnb.conf": create mountpoint for /opt/oai-gnb/etc/gnb.conf mount: cannot create subdirectories in "/var/lib/docker/overlay2/20239959ec090e62fbfbf19c8dbfa721ab5597059d7bf0b318958649bcc0b447/merged/opt/oai-gnb/etc/gnb.conf": not a directory: unknown: Are you trying to mount a directory onto a file (or vice-versa)? Check if the specified host path exists and is the expected type
ERROR: Encountered errors while bringing up the project.

To fix it:

# Instead of using: 
~/docker-gnb/etc/gnb.conf

# Use: 
../../etc/gnb.conf
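
The error message asks to check that the host path exists and has the expected type. Docker typically creates a missing bind-mount source as a directory, which then cannot be mounted onto a file. This is a minimal sketch of that check; the function name `check_bind_source` is hypothetical.

```shell
# Report whether a host path is suitable for bind-mounting onto a file.
# Returns 0 for a regular file, 1 for a directory, 2 if the path is missing.
check_bind_source() {
    if [ -f "$1" ]; then
        echo "ok: $1 is a regular file"
    elif [ -d "$1" ]; then
        echo "error: $1 is a directory, not a file"
        return 1
    else
        echo "error: $1 does not exist"
        return 2
    fi
}

# Run from the directory that holds the docker-compose file, so that the
# relative path is resolved the same way Compose resolves it.
check_bind_source ../../etc/gnb.conf || true
```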

8. Incorrect Network Labels for E1 Docker Compose

The error can look like this:

network rfsim5g-oai-core-net was found but has incorrect label com.docker.compose.network set to "rfsim5g-oai-core-net" (expected: "core_net")

When starting the docker compose file located in /5g_rfsimulator_e1, the labels of the Docker networks can conflict. Make sure every label uses the same name; if that does not help, try creating the networks externally:

docker network create --driver bridge --attachable --subnet=192.168.78.0/28 rfsim5g-oai-ue-net

Inside the docker-compose file:

  rfsim5g-oai-ue-net:
    name: rfsim5g-oai-ue-net
    external: true

Use this for every network.

9. Integrity Failed in CUUP:

This problem can happen when attempting to ping the UPF through the tunnel interface oai_tun0, in the e1_split simulation.

The error that shows up in the CUUP looks like this:

[PDCP] E discard NR PDU, integrity failed
Error in SDAP ([SDAP] E /oai-ran/openair2/SDAP/nr_sdap/nr_sdap.c:49:sdap_data_req: Entity not found with ue: 0x0 and pdusession id: 0)

This essentially means that the integrity mechanisms used between the UE and the gNB (in this case the CUCP) do not match. The integrity algorithm, as well as the ciphering algorithm, is not properly synced with the CUUP, which causes the problem. Open the CUCP configuration file at ../../conf_files/gnb-cucp.sa.f1.conf and make sure you are using:

  # preferred ciphering algorithms
  # the first one of the list that an UE supports in chosen
  # valid values: nea0, nea1, nea2, nea3
  ciphering_algorithms = ( "nea0", "nea2" );

  # preferred integrity algorithms
  # the first one of the list that an UE supports in chosen
  # valid values: nia0, nia1, nia2, nia3
  integrity_algorithms = ( "nia2", "nia0" );

  # setting 'drb_ciphering' to "no" disables ciphering for DRBs, no matter
  # what 'ciphering_algorithms' configures; same thing for 'drb_integrity'
  drb_ciphering = "yes";
  drb_integrity = "yes";

10. Docker Networks not working between Swarm nodes:

Docker Swarm overlay networks can sometimes misbehave, failing to sync properly or to propagate changes across the network. Sometimes stray endpoints are even created with the same IP address as other containers, which can prevent those containers from starting. The best way to fix these problems is to:

  • First check the journalctl -e -u docker logs for errors

  • Check systemctl status docker for errors

  • Restart the containers until the services are healthy

  • Enable important modules used by Swarm:

    # Check if you have the modules
    lsmod | grep ip_vs
    
    # Add the modules
    sudo modprobe ip_vs
    sudo modprobe ip_vs_rr
    sudo modprobe ip_vs_wrr
    sudo modprobe ip_vs_sh
    sudo modprobe nf_conntrack
    
    # Check the changes
    lsmod | grep ip_vs
    
    # Make permanent adjustments
    echo "ip_vs" | sudo tee -a /etc/modules-load.d/ipvs.conf
    echo "ip_vs_rr" | sudo tee -a /etc/modules-load.d/ipvs.conf
    echo "ip_vs_wrr" | sudo tee -a /etc/modules-load.d/ipvs.conf
    echo "ip_vs_sh" | sudo tee -a /etc/modules-load.d/ipvs.conf
    echo "nf_conntrack" | sudo tee -a /etc/modules-load.d/ipvs.conf
    
    # Reboot the system
    reboot
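
After the reboot, the module list can be checked in one pass. This is a minimal sketch that reads `lsmod` output from stdin and prints whichever of the modules above are not loaded; the function name `missing_swarm_modules` is hypothetical and the sample input is illustrative. Modules compiled directly into the kernel do not appear in `lsmod`, so a reported module is a hint rather than proof of a problem.

```shell
# Print the modules from the list above that are absent from the
# `lsmod` output supplied on stdin.
missing_swarm_modules() {
    loaded=$(awk '{ print $1 }')
    for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack; do
        echo "$loaded" | grep -qx "$m" || echo "$m"
    done
}

# Illustrative input: only ip_vs and nf_conntrack are loaded.
printf 'Module Size Used by\nip_vs 176128 0\nnf_conntrack 172032 1 ip_vs\n' | missing_swarm_modules
```

On a real host the call would be `lsmod | missing_swarm_modules`; an empty output means all five modules are loaded.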
    

Helpful Links that tackle the same problems:

Lastly, and most importantly, Docker Swarm does not cope well with small subnets, so make sure every network is at least a /24.
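
The subnet size can be checked from the CIDR notation alone. This is a minimal sketch, with the hypothetical function name `is_at_least_slash24`; it only looks at the prefix length and does not validate the address itself.

```shell
# Succeed if the CIDR describes a network of /24 size or larger,
# that is, if the number after the slash is 24 or smaller.
is_at_least_slash24() {
    prefix=${1##*/}
    [ "$prefix" -le 24 ]
}

# The /28 used in section 8 holds only 16 addresses, so it fails the check.
is_at_least_slash24 192.168.78.0/28 || echo "192.168.78.0/28 is too small for Swarm"
```

The subnet of an existing network can be read with `docker network inspect -f '{{range .IPAM.Config}}{{.Subnet}}{{end}}' <network>`, where `<network>` is a placeholder for the network name.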

11. E2 Agent not compatible with gNB nodes:

UPDATE as of 24/03/2025: This is no longer needed; an updated procedure is described in Custom Docker Images. The FlexRIC container did not work with the default images of the RAN components because of incorrect permissions on the SMs in the named volume "SharedXAppLibs".

This problem can occur if you are not using the custom-built E2 docker images mentioned in Custom Docker Images.

You must build your own CUUP and gNB (which can be the DU and CUCP) with E2 Agent support in order to connect properly to the FlexRIC. Follow the tutorials in those wiki pages.

12. SCTP connection failed:

This problem most commonly occurs in the DU on startup. It can be fixed by passing the CUCP IP address instead of the hostname in the build or environment commands of the docker-compose file, that is, by replacing cucp in --MACRLCs.[0].local_n_address cucp with the CUCP IP.

It can also occur when the UE cannot find the DU IP, so make sure the correct IP is set in the environment variables.

13. Synchronization Failures in UE:

This problem occurs if you are using the -E flag which stands for:

Apply three-quarter of sampling frequency, (example 23.04 Msps for LTE 20MHz) to reduce the data rate on USB/PCIe transfers (only valid for some bandwidths).

This flag must be used either in all of the containers or in none of them. Using it only in the gNB or only in the UE causes synchronization issues.
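
The all-or-none rule can be checked mechanically against the option strings passed to each node. This is a minimal sketch; the function name `e_flag_consistent` and the example option strings are hypothetical and only illustrate the comparison.

```shell
# Return 0 when either every option string contains the -E flag or none does.
# Each argument is the full option string of one gNB or UE process.
e_flag_consistent() {
    with=0
    total=0
    for opts in "$@"; do
        total=$((total + 1))
        case " $opts " in
            *" -E "*) with=$((with + 1)) ;;
        esac
    done
    [ "$with" -eq 0 ] || [ "$with" -eq "$total" ]
}

# Hypothetical option strings: the gNB uses -E but the UE does not.
e_flag_consistent "--sa -E --rfsim" "--sa --rfsim" \
    || echo "mismatch: -E must be set on all nodes or on none"
```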

14. Package Python TK is Missing:

This can happen while attempting to build the gNB from source. It means that the package is obsolete or has been removed from the repositories. Try:

sudo apt-get update
sudo apt-get install python3-tk

If this does not work:

# Find where tk and python are referenced inside the build files and change the package to the current version
cd ~/openairinterface5g
grep -R "python-tk" .

In the cmake files, look for tools/build_helper and change the following line so that it installs python3-tk:

        $SUDO apt-get -y install python3-tk  $boost_libs_ubuntu libusb-1.0-0-dev