Troubleshooting - Surpr1se0/mosaic5G-docs GitHub Wiki
Intro
This page collects the errors encountered while configuring and deploying the different versions of the OpenAirInterface architecture, together with how they were resolved.
1. No configuration file found
After executing the docker-compose up gnb command while in XenOrchestra, the following error occurs:
===================================
/proc/sys/kernel/core_pattern=|/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E
No configuration file found: please mount to /opt/oai-gnb/etc/gnb.conf
[INFO tini (1)] Spawned child process '/opt/oai-gnb/bin/entrypoint.sh' with pid '7'
[INFO tini (1)] Main child exited normally (with status '255')
Essentially, it states that the gnb.conf configuration file is not being found or mounted into the corresponding container as it should be.
Resolution:
- Shut down the container:
docker-compose down oai-gnb
- Use the previously mentioned gnb image instead of develop.
- Give docker permission to read the file through chmod (optional and not tested).
- In the gnb section of docker-compose.yaml, change the mounted configuration file's extension from .yaml to .conf in both files.
- Reconnect the container.
- Check the logs:
docker logs oai-gnb
- Additionally, check the amf logs to ensure that the gNB is already connected.
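A quick pre-flight check can rule out one possible cause before restarting the container. The sketch below assumes that the host-side path of the gnb.conf bind mount might be missing or might be a directory, which the error message itself does not confirm. The helper name mount_source_kind and the example path are illustrative only.

```shell
# Classify a host path before using it as a bind-mount source.
# Prints "file", "directory" or "missing".
# ASSUMPTION: a missing or directory-typed source is one possible reason
# for the container not finding its configuration file.
mount_source_kind() {
    if [ -f "$1" ]; then
        echo file
    elif [ -d "$1" ]; then
        echo directory
    else
        echo missing
    fi
}

# Example (hypothetical path, adjust to the one used in docker-compose.yaml):
#   mount_source_kind ./gnb.conf
```

Anything other than "file" means the path in the volumes entry should be corrected before bringing the container up again.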
2. DNS not being able to resolve name addresses
DNS cannot resolve addresses over the interface created by the OAI 5G deployment. Other than that, it works as expected. Despite this, it is possible to connect to digital interfaces and different networks, as evidenced in the tests mentioned above.
Attempt no 1:
These were the steps followed to try to resolve or diagnose the problem:
- Check whether the problem lies with DHCP on the virtual machine (and not in the Docker instance):
sudo apt install isc-dhcp-client
sudo journalctl -u systemd-networkd | grep ens33
sudo journalctl -u systemd-networkd
sudo systemctl restart systemd-networkd
sudo nano /etc/netplan/01-network-manager-all.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    ens33:
      dhcp4: yes
sudo netplan apply
sudo dhclient -v ens33
# essentially we force dhcp
# from the client making a DHCP request to obtain
# an IP address with new DNS information
Docker-compose was run again but without resolution.
- Check the resolv.conf and named.conf files to see which DNS servers the instances are created with. The file is automatically generated by docker-compose and cannot be edited.
- Try inserting options into docker-compose, as the documentation suggests. This proved impossible, as there is no information on which parameters to insert or in which components of docker-compose.
- Try running other simulations. The problem does not appear there, as they do not attempt to communicate with external devices.
- Force a known DNS server into the resolv.conf file via echo >. Also without success.
- Try running the simulation with a different version of Ubuntu. The rest of the components do not run.
Attempt no 2:
- In docker-compose.yaml, in the ue section, add the following and restart the container:
dns:
  - 8.8.8.8
- Despite this, we continue to have connectivity with ext-dn.
- Make the following change:
oai-smf:
    container_name: "rfsim5g-oai-smf"
    image: oaisoftwarealliance/oai-smf:v2.0.0
    environment:
        - TZ=Europe/Paris
    volumes:
        - ./mini_nonrf_config.yaml:/openair-smf/etc/config.yaml
    depends_on:
        - oai-amf
    networks:
        public_net:
            ipv4_address: 192.168.71.133
#VERSUS
oai-smf:
    container_name: "rfsim5g-oai-smf"
    image: oaisoftwarealliance/oai-smf:v2.0.0
    environment:
        - TZ=Europe/Paris
        - DEFAULT_DNS_IPV4_ADDRESS=172.21.3.100
    volumes:
        - ./mini_nonrf_config.yaml:/openair-smf/etc/config.yaml
    depends_on:
        - oai-amf
    networks:
        public_net:
            ipv4_address: 192.168.71.133
- Even so, we don't have a positive result.
- Go directly to mini_nonrf_config.yaml, the smf volume configuration file. Change one of the DNS values and run docker-compose down followed by docker-compose up to restart. It didn't work.
- Run iptables -L to see the host's packet-filtering rules. Insert the following rule:
iptables -A FORWARD -p icmp --icmp-type echo-request -j ACCEPT
- Additionally, check tail -f /var/log/ufw.log to see whether anything was blocked. If applicable, run ufw disable. There is no longer any blocking, but the ping still does not work.
- Run ip route inside the ue container. Enable packet forwarding via sudo sysctl -w net.ipv4.ip_forward=1. Configure NAT with sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE.
- Still without positive results.
Attempt no 3:
Let's look at the architecture, but more specifically all of the connections.
NR UE: 12.1.1.1
UPF: 12.1.1.2
- 71.134: incoming from demo_private.
- 72.134: connection to SGi and EXT-DN.
Let's check the pings:
- 12.1.1.1 → 12.1.1.2: works
- 12.1.1.1 → 192.168.72.134: works
Then:
- Ping from NR-UE → EXT_DN: works
But:
- Ping from NR-UE → Internet: does not work
The UPF has 3 interfaces:
- ETH0: connection to demo_private.
- ETH1: connection to demo_public → outwards and extdn.
- TUN0: connection with the UE.
💡 According to the UPF container, all traffic that does not have a defined route is being sent to the gateway X.X.71.129 through the eth0 interface.
If the UE sends a ping to google.com, the traffic reaches the UPF and can also be seen on ETH0. On ETH1, however, we don't see any traffic come through.
With this said, insert the following commands:
ip route del default via 192.168.71.129 dev eth0
ip route add 12.1.1.0/24 via 192.168.72.134 dev eth1
Masquerade packets coming from the UE and destined for ext-dn:
sudo iptables -t nat -A POSTROUTING -s 12.1.1.0/24 -o eth1 -j MASQUERADE
Add a default route in the UPF so that any traffic not covered by a more specific route leaves by default via X.X.72.135, which leads to ext-dn, through the correct interface:
ip route add default via 192.168.72.135 dev eth1
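To confirm that the route changes above took effect, the default route can be read back from the ip route output. The following is a minimal sketch: default_route is a hypothetical helper name, and it assumes the usual "default via <gateway> dev <interface>" line format.

```shell
# Print "<gateway> <device>" for the default route found in `ip route`
# output read from standard input; print nothing if there is none.
# ASSUMPTION: the default route line looks like
#   default via 192.168.72.135 dev eth1
default_route() {
    awk '$1 == "default" {
        gw = ""; dev = ""
        for (i = 2; i < NF; i++) {
            if ($i == "via") gw = $(i + 1)
            if ($i == "dev") dev = $(i + 1)
        }
        print gw, dev
        exit
    }'
}

# Inside the UPF container one would run:
#   ip route | default_route
```

After the commands above, the expected result would be the X.X.72.135 gateway on eth1 rather than the X.X.71.129 gateway on eth0.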
3. UE authentication problems
When running the disaggregated version of the architecture, with the gNB separated into CU and DU, the UE cannot be authenticated into the network. The oai-amf docker logs show:
[UE] [ue] [info ] | Index | Status | Global ID | UE Name | PLMN |
[UE] [ue] [info ] | 1 | UNREGISTERED | 0x0 | ue-rfsim | 208, 99 |
At first we thought this problem was caused by the UE not being provisioned correctly in the SQL database, AMF or SMF. This was disproven by the fact that the AMF received all of the correct UE parameters. What fixed the problem was abandoning the manual deployment of the UE and CN and using a docker-compose solution to instantiate the network elements.
4. Container not Starting
The oai-gNB in a docker-compose does not run, showing this error:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 81, in main
command_func()
File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 203, in perform_command
handler(command, command_options)
File "/usr/lib/python3/dist-packages/compose/metrics/decorator.py", line 18, in wrapper
result = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 1186, in up
to_attach = up(False)
^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 1166, in up
return self.project.up(
^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/project.py", line 697, in up
results, errors = parallel.parallel_execute(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/parallel.py", line 108, in parallel_execute
raise error_to_reraise
File "/usr/lib/python3/dist-packages/compose/parallel.py", line 206, in producer
result = func(obj)
^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/project.py", line 679, in do
return service.execute_convergence_plan(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/service.py", line 579, in execute_convergence_plan
return self._execute_convergence_recreate(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/service.py", line 499, in _execute_convergence_recreate
containers, errors = parallel_execute(
^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/parallel.py", line 108, in parallel_execute
raise error_to_reraise
File "/usr/lib/python3/dist-packages/compose/parallel.py", line 206, in producer
result = func(obj)
^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/service.py", line 494, in recreate
return self.recreate_container(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/service.py", line 612, in recreate_container
new_container = self.create_container(
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/service.py", line 330, in create_container
container_options = self._get_container_create_options(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/service.py", line 921, in _get_container_create_options
container_options, override_options = self._build_container_volume_options(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/service.py", line 960, in _build_container_volume_options
binds, affinity = merge_volume_bindings(
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/service.py", line 1548, in merge_volume_bindings
old_volumes, old_mounts = get_container_data_volumes(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/compose/service.py", line 1579, in get_container_data_volumes
container.image_config['ContainerConfig'].get('Volumes') or {}
~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
KeyError: 'ContainerConfig'
This problem happens when you use the docker-compose implementation without rebuilding the previously referenced Docker image. The error arises because the docker-compose file references a build that is not present on the system.
5. RAN Nodes with same ID
The FlexRIC logs show the following when connecting to the CU and DU:
[E2AP]: E2 SETUP-REQUEST rx from PLMN 208.99 Node ID 3584 RAN type ngran_gNB_CU
nearRT-RIC: /flexric/src/lib/msg_hand/reg_e2_nodes.c:174: add_reg_e2_node: Assertion `it_node == end_node && "Trying to add an already existing E2 Node"' failed.
Aborted (core dumped)
In order to register both nodes of the gNB in the FlexRIC, the CU and DU configurations must use different IDs within the same PLMN. After making this small change, the FlexRIC accepts both nodes correctly without generating any errors.
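The ID clash can be caught before starting the nodes by comparing the two configuration files. This is a rough sketch under an assumption: that the node ID appears in each file on a line of the form gNB_ID = 0xe00; (3584 in the log above is 0xe00 in hexadecimal). The helper names gnb_id and ids_differ are hypothetical.

```shell
# Extract the value assigned to gNB_ID in a .conf file.
# ASSUMPTION: the ID is written as `gNB_ID = <value>;` on its own line.
gnb_id() {
    sed -n 's/^[[:space:]]*gNB_ID[[:space:]]*=[[:space:]]*\([^;[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# Return 0 when both files declare an ID and the IDs differ numerically
# (so 0xe00 and 3584 count as the same ID); return 1 otherwise.
ids_differ() {
    a=$(gnb_id "$1")
    b=$(gnb_id "$2")
    [ -n "$a" ] && [ -n "$b" ] && [ "$((a))" -ne "$((b))" ]
}

# Example (hypothetical file names):
#   ids_differ cu.conf du.conf || echo "CU and DU share the same gNB_ID"
```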
6. Address Sanitizer in UPF
rfsim5g-oai-upf | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf | AddressSanitizer:DEADLYSIGNAL
rfsim5g-oai-upf | AddressSanitizer:DEADLYSIGNAL
This problem normally shows up after instantiating the UPF. It apparently means that the UPF process has crashed due to a segmentation fault. This could be related to the build or code of the UPF, or even to the IP address allocation inside the network shared by the swarm nodes.
This type of problem can happen when the components are started in the wrong order, so adding the depends_on directive to the docker-compose.yaml file solves it.
Additionally, changing ASAN_OPTIONS in the upf section of the docker-compose file can also help, by either activating or deactivating it.
One reported solution is:
sudo sysctl -w vm.mmap_rnd_bits=28
The Docker documentation states: "You should create overlay networks with /24 blocks (the default), which limits you to 256 IP addresses". Following that advice also helped with this problem.
This can also be a memory allocation problem, so check that enough RAM is available using free -h.
7. Using Absolute Paths for container volumes
This error appears after instantiating the gNB with docker-compose when the volumes are mounted using absolute paths instead of paths relative to the docker-compose file. Docker can generate errors like this in that case.
ERROR: for oai-gnb Cannot start service oai-gnb: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "/root/docker-gnb/etc/gnb.conf" to rootfs at "/opt/oai-gnb/etc/gnb.conf": create mountpoint for /opt/oai-gnb/etc/gnb.conf mount: cannot create subdirectories in "/var/lib/docker/overlay2/20239959ec090e62fbfbf19c8dbfa721ab5597059d7bf0b318958649bcc0b447/merged/opt/oai-gnb/etc/gnb.conf": not a directory: unknown: Are you trying to mount a directory onto a file (or vice-versa)? Check if the specified host path exists and is the expected type
ERROR: Encountered errors while bringing up the project.
To fix it:
# Instead of using:
~/docker-gnb/etc/gnb.conf
# Use:
../../etc/gnb.conf
8. Incorrect Network Labels for E1 Docker Compose
The error can look like this:
network rfsim5g-oai-core-net was found but has incorrect label com.docker.compose.network set to "rfsim5g-oai-core-net" (expected: "core_net")
When starting the docker compose file located in /5g_rfsimulator_e1, the labels of the Docker networks can cause problems. Make sure every label has the same name; if not, try creating the networks externally:
docker network create --driver bridge --attachable --subnet=192.168.78.0/28 rfsim5g-oai-ue-net
Inside the docker-compose file:
rfsim5g-oai-ue-net:
    name: rfsim5g-oai-ue-net
    external: true
Use this for every network.
9. Integrity Failed in CUUP:
This problem can happen when attempting to ping the UPF through the tunnel interface oai_tun0 in the e1_split simulation.
The error that shows up in the CUUP looks like this:
[PDCP] E discard NR PDU, integrity failed
Error in SDAP ([SDAP] E /oai-ran/openair2/SDAP/nr_sdap/nr_sdap.c:49:sdap_data_req: Entity not found with ue: 0x0 and pdusession id: 0)
This essentially means that the integrity mechanisms of the UE and the gNB (in this case the CUCP) do not match. The integrity and encryption algorithms are not synced properly with the CUUP, which causes the problem. Open the CUCP configuration file at ../../conf_files/gnb-cucp.sa.f1.conf and make sure you are using:
# preferred ciphering algorithms
# the first one of the list that an UE supports in chosen
# valid values: nea0, nea1, nea2, nea3
ciphering_algorithms = ( "nea0", "nea2" );
# preferred integrity algorithms
# the first one of the list that an UE supports in chosen
# valid values: nia0, nia1, nia2, nia3
integrity_algorithms = ( "nia2", "nia0" );
# setting 'drb_ciphering' to "no" disables ciphering for DRBs, no matter
# what 'ciphering_algorithms' configures; same thing for 'drb_integrity'
drb_ciphering = "yes";
drb_integrity = "yes";
10. Docker Networks not working between Swarm nodes:
Docker Swarm overlay networks can sometimes misbehave, failing to sync properly or to propagate changes throughout the network. Sometimes random endpoints are even created with the same IP address as other containers, which can prevent the containers from starting. The best way to fix these problems is to:
- Check the journalctl -e -u docker logs for errors.
- Check systemctl status docker for errors.
- Restart the containers until the services are healthy.
- Enable important modules used by Swarm:
# Check if you have the modules
lsmod | grep ip_vs
# Add the modules
sudo modprobe ip_vs
sudo modprobe ip_vs_rr
sudo modprobe ip_vs_wrr
sudo modprobe ip_vs_sh
sudo modprobe nf_conntrack
# Check the changes
lsmod | grep ip_vs
# Make permanent adjustments
echo "ip_vs" | sudo tee -a /etc/modules-load.d/ipvs.conf
echo "ip_vs_rr" | sudo tee -a /etc/modules-load.d/ipvs.conf
echo "ip_vs_wrr" | sudo tee -a /etc/modules-load.d/ipvs.conf
echo "ip_vs_sh" | sudo tee -a /etc/modules-load.d/ipvs.conf
echo "nf_conntrack" | sudo tee -a /etc/modules-load.d/ipvs.conf
# Reboot the system
reboot
Helpful Links that tackle the same problems:
- https://github.com/moby/moby/issues/24170
- https://forums.docker.com/t/newbie-step-by-step-guide-to-connect-ip-in-multiple-containers-on-separate-hosts/138362/3
- https://docs.docker.com/engine/network/tutorials/overlay/
- https://forums.docker.com/t/overlay-network-not-working-not-working-between-two-containers-part-ii/116222/3
- https://github.com/moby/moby/issues/37338
- https://forums.docker.com/t/not-able-to-attach-container-to-overlay-network-with-ipv6-enabled-but-works-fine-ip4/144534
- https://gitlab.eurecom.fr/oai/openairinterface5g/-/issues/727
- https://gitlab.eurecom.fr/oai/openairinterface5g/-/tree/develop/ci-scripts/yaml_files/5g_rfsimulator_e1?ref_type=heads
Lastly, and most importantly, Docker Swarm does not work well with small subnets, so make sure every network is at least a /24 (256 addresses).
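The subnet-size rule can be checked mechanically. The sketch below is only an illustration: subnet_large_enough is a hypothetical helper that looks at the prefix length of a CIDR string and does not query Docker itself.

```shell
# Return 0 when the CIDR prefix length is 24 or shorter (256 addresses or
# more), 1 when the subnet is smaller, 2 when the input has no valid prefix.
subnet_large_enough() {
    prefix=${1##*/}
    case $prefix in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$prefix" -le 24 ]
}

# Example with the subnet used earlier on this page:
#   subnet_large_enough 192.168.78.0/28 || echo "subnet too small for Swarm"
```

With the /28 subnet from the docker network create example in section 8, the check fails, which matches the advice to widen such networks before using them across Swarm nodes.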
11. E2 Agent not compatible with gNB nodes:
UPDATE as of 24/03/2025 : This is no longer needed and an updated procedure is mentioned in Custom Docker Images. The FlexRIC container did not work with the default images of RAN components due to incorrect permissions being in place in the SMs of the named volume "SharedXAppLibs".
This problem can occur if you are not using the E2 custom built docker images that are mentioned in Custom Docker Images.
You must build your own CUUP and gNB (which can be the DU and CUCP) with E2 Agent compatibility in order to connect properly to the FlexRIC. Follow the tutorials in the wiki pages.
12. SCTP connection failed:
This problem most commonly occurs in the DU upon starting it. It can usually be fixed by passing the CUCP IP address, rather than a hostname as in --MACRLCs.[0].local_n_address cucp, in the build or environment commands of the docker-compose file.
This problem can also occur when the UE is not able to find the DU IP so make sure you have the correct IP set under the environment variables.
13. Synchronization Failures in UE:
This problem occurs if you are using the -E flag, which stands for:
Apply three-quarter of sampling frequency, (example 23.04 Msps for LTE 20MHz) to reduce the data rate on USB/PCIe transfers (only valid for some bandwidths).
This flag must be used in all of the containers or in none of them. It cannot be used in only the gNB or only the UE; otherwise you will get synchronization issues.
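The all-or-none rule for -E can be verified by comparing the option strings passed to each container. This is a minimal sketch: the helper names are hypothetical, and the option strings in the example are placeholders to be replaced with the real ones from the docker-compose file.

```shell
# Return 0 when the option string contains a standalone -E flag.
has_three_quarter_flag() {
    case " $1 " in
        *" -E "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Return 0 when both option strings agree on -E (both set or both unset).
e_flag_consistent() {
    if has_three_quarter_flag "$1"; then a=1; else a=0; fi
    if has_three_quarter_flag "$2"; then b=1; else b=0; fi
    [ "$a" -eq "$b" ]
}

# Example (placeholder option strings for the gNB and the UE):
#   e_flag_consistent "$GNB_OPTIONS" "$UE_OPTIONS" || echo "-E mismatch"
```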
14. Package Python TK is Missing:
This can happen while attempting to build gNB from the sources. It means that the package is obsolete or removed from the repositories. Try to use:
sudo apt-get update
sudo apt-get install python3-tk
If this does not work, find where tk and python are referenced inside the build files and change the version to the updated one:
cd ~/openairinterface5g
grep -R "python-tk" .
Go inside the cmake files and look for tools/build_helper. In there, change the following line:
$SUDO apt-get -y install python3-tk $boost_libs_ubuntu libusb-1.0-0-dev