ns3 performance issues - nps-ros2/ns3_testbed GitHub Wiki

ns-3 network modeling is CPU-bound when modeling multicast DDS traffic using 10 Wifi ad-hoc networks in network namespaces (nns).

  • Multicast traffic is demanding in an ad-hoc DDS network: communication is transmitted to all other nodes.
  • ns-3 must model both network routing and transmission physics.
  • ns-3 binds to Linux tap devices.
  • The OS manages routing for the host and all network namespaces.

Papers

I did not find any papers that identify a significant ns-3 ad-hoc Wifi performance bottleneck. However, 802.11b ad-hoc Wifi is known not to scale well.

ns-3 Documentation

ns-3 describes its realtime scheduler at https://www.nsnam.org/docs/release/3.12/manual/html/realtime.html. In realtime mode there are two clocks: the simulation clock and the machine clock. While an event is being handled, the simulation clock is frozen. If the next event lies in the machine clock's future, ns-3 waits for the machine clock to reach it, which keeps the simulation in step with real time. Because of OS-specific time granularity, ns-3 combines sleeping with busy-waiting so that events start at the correct wall time. If the next event lies in the machine clock's past, ns-3 is not keeping up.

We can choose what ns-3 does when it cannot keep up with realtime by setting a synchronization mode:

  • Mode BestEffort (default): keep processing events, without waiting, until ns-3 catches up with realtime.
  • Mode HardLimit: abort if an event starts later than a tolerance threshold (default 100 ms).
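A minimal sketch of the two policies, written as a plain-Python model and not against the ns-3 API. The function run_realtime, the exception HardLimitExceeded, and the sample event list are hypothetical names introduced for illustration. Each event carries a scheduled time and a processing cost, and the wall clock is simulated so the behavior is repeatable.

```python
class HardLimitExceeded(RuntimeError):
    """Raised when an event starts later than the allowed tolerance."""


def run_realtime(events, mode="BestEffort", hard_limit_s=0.100):
    """Replay (scheduled_time_s, cost_s) events against a simulated wall clock.

    events must be sorted by scheduled time. Returns how late each event
    started, in seconds (0.0 means the event started on time).
    """
    wall = 0.0
    lateness = []
    for scheduled, cost in events:
        if scheduled > wall:
            wall = scheduled        # event is in the future: wait for the clock
        late = wall - scheduled     # > 0 when the simulation is behind
        if mode == "HardLimit" and late > hard_limit_s:
            raise HardLimitExceeded(f"jitter = {late:.6f} s")
        lateness.append(late)
        wall += cost                # handling the event consumes wall time
    return lateness


# Hypothetical workload: the third event is expensive and delays the fourth.
events = [(0.0, 0.05), (0.01, 0.05), (0.02, 0.2), (0.03, 0.0), (1.0, 0.0)]
print([round(x, 2) for x in run_realtime(events)])
```

In BestEffort mode the backlog is worked off and the last event starts on time again. In HardLimit mode the same workload raises as soon as one event starts more than 100 ms late.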

ns-3 Experiment

I updated our ns3_mobility.cpp program to use HardLimit mode by adding this line:

ns3::Config::SetDefault("ns3::RealtimeSimulatorImpl::SynchronizationMode",ns3::EnumValue(ns3::RealtimeSimulatorImpl::SYNC_HARD_LIMIT));

then ran our nodes. Our ns-3 program crashed immediately with this report:

msg="RealtimeSimulatorImpl::ProcessOneEvent (): Hard real-time limit exceeded (jitter = 100004653)", file=../src/core/model/realtime-simulator-impl.cc, line=379
terminate called without an active exception
Aborted (core dumped)

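confirming that ns-3 does not keep up with machine time. Assuming ns-3's default nanosecond time resolution, the reported jitter of 100004653 is about 100.005 ms, just over the default 100 ms limit.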
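To read the jitter value from the report, a small sketch like the following can help. It assumes the jitter is printed in nanoseconds (ns-3's default time resolution). The helper name jitter_ms is hypothetical.

```python
import re


def jitter_ms(report_line, ns_per_unit=1):
    """Extract the jitter from a hard-limit report and convert it to milliseconds.

    ns_per_unit is an assumption: 1 means the value is already in nanoseconds.
    """
    match = re.search(r"jitter = (\d+)", report_line)
    if match is None:
        raise ValueError("no jitter value found")
    return int(match.group(1)) * ns_per_unit / 1e6


line = ('msg="RealtimeSimulatorImpl::ProcessOneEvent (): '
        'Hard real-time limit exceeded (jitter = 100004653)"')
print(f"{jitter_ms(line):.6f} ms")  # prints "100.004653 ms"
```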

ns-3 Examples

  • https://www.nsnam.org/doxygen/wifi-simple-adhoc-grid_8cc_source.html

    This example simulates a 5x5 grid of nodes talking 802.11b in ad-hoc mode. It programmatically generates its own traffic and runs purely in simulation time. As such, it does not use tap devices and does not simulate network flows in real time (it does not use ns3::RealtimeSimulatorImpl).

Performance Tests

In all tests described below we reduce the transmit rate to one message per second for each of the ten nodes and also, significantly, set ns-3 node proximity to 10 meters instead of 30 so that all nodes are always in range. Overall performance degrades significantly when nodes go out of range, while connections are still being established, or when network traffic falls behind.

The runs stabilize after about 45 seconds once all connections are registered.

We examine performance using these approaches:

  • CPU and network utilization
  • GNU gprof profiler, which identifies time spent in various function calls.
  • Wireshark, which shows network traffic.

CPU and Network Load

Here we see the impact of starting ten nodes, and we see that CPU and network loads become reasonable after about 45 seconds:

10_1_per_sec

  • The CPU history shows that it takes 45 seconds for the network to stabilize.
  • The network traffic shows no traffic before the nodes start, excessive traffic for 45 seconds, then stable traffic after that.

If we configure ns-3 to use the constant position mobility model instead of the random walk mobility model, the network stabilizes in about 25 seconds instead of 45. This indicates that the mobility model contributes to the performance degradation but is not the main burden.

GNU gprof Profiler

Set up CMake

cd build
cmake -DCMAKE_CXX_FLAGS=-pg -DCMAKE_EXE_LINKER_FLAGS=-pg -DCMAKE_SHARED_LINKER_FLAGS=-pg ..

Run profiler

  • Build ns3_mobility configured to run for 3 minutes.

  • Run it. On a normal exit the program writes gmon.out to the current directory.

  • Generate profile:

    gprof ns3_mobility > z2.stats
    

Partial result for flat profile:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 28.52      3.41     3.41     1801     0.00     0.00  ns3::Time::~Time()
 16.59      5.39     1.98  9505264     0.00     0.00  ns3::Time::Time(ns3::int64x64_t const&)
  5.44      6.04     0.65 134644027     0.00     0.00  ns3::Time::PeekResolution()
  4.65      6.59     0.56  9506659     0.00     0.00  ns3::Time::PeekInformation(ns3::Time::Unit)
  4.48      7.13     0.54  5923199     0.00     0.00  ns3::SimpleRefCount<ns3::Object, ns3::ObjectBase, ns3::ObjectDeleter>::Ref() const
  4.31      7.64     0.52  9970792     0.00     0.00  ns3::SimpleRefCount<ns3::Object, ns3::ObjectBase, ns3::ObjectDeleter>::Unref() const
  3.69      8.08     0.44  9505320     0.00     0.00  ns3::int64x64_t::int64x64_t(long double)
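gprof rounds the s/call columns to 0.00 in the flat profile. As a rough cross-check, the sketch below recomputes the average self time per call from the self seconds and calls columns. The values are copied from the first four rows of the flat profile, and the helper name self_us_per_call is hypothetical.

```python
# (name, self_seconds, calls) copied from the flat profile
FLAT_PROFILE = [
    ("ns3::Time::~Time()", 3.41, 1801),
    ("ns3::Time::Time(ns3::int64x64_t const&)", 1.98, 9505264),
    ("ns3::Time::PeekResolution()", 0.65, 134644027),
    ("ns3::Time::PeekInformation(ns3::Time::Unit)", 0.56, 9506659),
]


def self_us_per_call(self_seconds, calls):
    """Average self time per call, in microseconds."""
    return self_seconds / calls * 1e6


for name, self_s, calls in FLAT_PROFILE:
    print(f"{self_us_per_call(self_s, calls):12.3f} us/call  {name}")
```

On these numbers ns3::Time::~Time() averages roughly 1.9 ms of self time per call while the other rows stay well under a microsecond, so that row deserves a closer look before drawing conclusions from it.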

Partial result for call graph:

		     Call graph (explanation follows)


granularity: each sample hit covers 2 byte(s) for 0.08% of 11.94 seconds

index % time    self  children    called     name
                0.00    0.01       1/1798        main [12]
                0.00   10.71    1797/1798        ns3::MakeEvent<ns3::NodeContainer const&, ns3::NodeContainer>(void (*)(ns3::NodeContainer const&), ns3::NodeContainer)::EventFunctionImpl1::Notify() [2]
[1]     89.7    0.00   10.71    1798         interval_function(ns3::NodeContainer const&) [1]
                0.15    4.99    1800/1801        ns3::Seconds(double) [3]
                3.40    0.00    1800/1801        ns3::Time::~Time() [5]
                0.18    0.88    1800/1800        ns3::EventId ns3::Simulator::Schedule<ns3::NodeContainer const&, ns3::NodeContainer>(ns3::Time const&, void (*)(ns3::NodeContainer const&), ns3::NodeContainer) [11]
                0.00    0.56    1800/1800        ns3::EventId::~EventId() [25]
                0.00    0.20    1798/7197        ns3::NodeContainer::NodeContainer(ns3::NodeContainer const&) [19]
                0.16    0.00    1800/1800        std::setprecision(int) [54]
                0.00    0.07    1800/7202        ns3::NodeContainer::~NodeContainer() [41]
                0.01    0.06   17997/90074       ns3::Ptr<ns3::Node>::~Ptr() [33]
                0.01    0.03   16198/16198       ns3::Ptr<ns3::RandomWalk2dMobilityModel> ns3::Object::GetObject<ns3::RandomWalk2dMobilityModel>() const [68]
                0.00    0.00    1800/1800        ns3::Ptr<ns3::ConstantPositionMobilityModel> ns3::Object::GetObject<ns3::ConstantPositionMobilityModel>() const [75]
                0.00    0.00   16198/16198       ns3::Ptr<ns3::RandomWalk2dMobilityModel>::~Ptr() [78]
                0.00    0.00    1800/1800        ns3::Ptr<ns3::ConstantPositionMobilityModel>::~Ptr() [81]
                0.00    0.00   17998/17998       ns3::Ptr<ns3::Node>::operator->() [100]
                0.00    0.00   16199/16199       ns3::Ptr<ns3::RandomWalk2dMobilityModel>::operator->() [101]
                0.00    0.00    1800/1800        ns3::Ptr<ns3::ConstantPositionMobilityModel>::operator->() [134]
-----------------------------------------------
                                                 <spontaneous>
[2]     89.7    0.00   10.71                 ns3::MakeEvent<ns3::NodeContainer const&, ns3::NodeContainer>(void (*)(ns3::NodeContainer const&), ns3::NodeContainer)::EventFunctionImpl1::Notify() [2]
                0.00   10.71    1797/1798        interval_function(ns3::NodeContainer const&) [1]
-----------------------------------------------
                0.00    0.00       1/1801        main [12]
                0.15    4.99    1800/1801        interval_function(ns3::NodeContainer const&) [1]
[3]     43.1    0.16    4.99    1801         ns3::Seconds(double) [3]
                0.39    4.61 9513031/9513031     ns3::Time::FromDouble(double, ns3::Time::Unit) [4]
-----------------------------------------------
                0.39    4.61 9513031/9513031     ns3::Seconds(double) [3]
[4]     41.8    0.39    4.61 9513031         ns3::Time::FromDouble(double, ns3::Time::Unit) [4]
                0.03    3.26 9504481/9504481     ns3::Time::From(ns3::int64x64_t const&, ns3::Time::Unit) [6]
                0.08    1.24 9506902/9506902     ns3::int64x64_t::int64x64_t(double) [8]
-----------------------------------------------
                0.00    0.00       1/1801        main [12]
                3.40    0.00    1800/1801        interval_function(ns3::NodeContainer const&) [1]
[5]     28.5    3.41    0.00    1801         ns3::Time::~Time() [5]
-----------------------------------------------
                0.03    3.26 9504481/9504481     ns3::Time::FromDouble(double, ns3::Time::Unit) [4]
[6]     27.5    0.03    3.26 9504481         ns3::Time::From(ns3::int64x64_t const&, ns3::Time::Unit) [6]
                1.98    0.06 9505264/9505264     ns3::Time::Time(ns3::int64x64_t const&) [7]
                0.56    0.65 9506659/9506659     ns3::Time::PeekInformation(ns3::Time::Unit) [10]
                0.01    0.00 9508892/9508892     ns3::int64x64_t::int64x64_t(ns3::int64x64_t const&) [72]
                0.01    0.00 9507115/9507115     ns3::operator*=(ns3::int64x64_t&, ns3::int64x64_t const&) [74]
-----------------------------------------------
                1.98    0.06 9505264/9505264     ns3::Time::From(ns3::int64x64_t const&, ns3::Time::Unit) [6]
[7]     17.0    1.98    0.06 9505264         ns3::Time::Time(ns3::int64x64_t const&) [7]
                0.06    0.00 9506725/9506725     ns3::int64x64_t::GetHigh() const [62]
-----------------------------------------------
                0.08    1.24 9506902/9506902     ns3::Time::FromDouble(double, ns3::Time::Unit) [4]
[8]     11.1    0.08    1.24 9506902         ns3::int64x64_t::int64x64_t(double) [8]
                0.44    0.80 9505320/9505320     ns3::int64x64_t::int64x64_t(long double) [9]
-----------------------------------------------

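Summary: in this profile, interval_function accounts for 89.7% of the 11.94 sampled seconds, mostly in ns3::Seconds (43.1%) and ns3::Time::~Time (28.5%). ns-3 spends its time on events, intervals, and ns3::Time.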

Wireshark

The graph below shows an even network load on all 10 networks and shows that it takes approximately 45 seconds to stabilize:

wireshark_10_1_sec

A packet capture of nns2 across this interval captures about 2.5MB in 7,548 packets:

wireshark_10_protocol_stats

Of this:

  • 79.6% of the packets are Real-Time Publish-Subscribe Wire Protocol and are DDS publish-subscribe overhead (DDS model: https://en.wikipedia.org/wiki/Data_Distribution_Service).
  • 11.2% of the packets are Address Resolution Protocol for link layer MAC address discovery.
  • 9.0% of the packets are data.

This table shows that there were 163.4 packets per second over 46.193 seconds and packets averaged 341 bytes each:

10_basic_properties
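The capture figures can be cross-checked with a few lines of arithmetic. This is only a sanity check on the numbers quoted above; the function names are hypothetical.

```python
PACKETS = 7548        # packets in the nns2 capture
DURATION_S = 46.193   # capture length in seconds
MEAN_BYTES = 341      # average packet size reported by Wireshark


def packets_per_second(packets, duration_s):
    """Average packet rate over the capture."""
    return packets / duration_s


def total_megabytes(packets, mean_bytes):
    """Approximate capture size in MB (1 MB = 1e6 bytes)."""
    return packets * mean_bytes / 1e6


print(f"{packets_per_second(PACKETS, DURATION_S):.1f} packets/s")  # 163.4
print(f"{total_megabytes(PACKETS, MEAN_BYTES):.2f} MB")            # 2.57
```

The computed rate matches the 163.4 packets per second in the table, and the computed size is consistent with the roughly 2.5MB capture.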

This table shows UDP packet statistics for each of the ten robots:

wireshark_10_ports

Notes:

  • The Ground Station robot at 10.0.0.1 transmits 962 packets while the other nine transmit about 550 packets.
  • There are three ports per robot. Of these, one port receives nothing, one port receives about 350 packets, and one port receives 987 packets for the GS and about 88 for each robot.

This table shows packets per second:

wireshark_10_ppsec

There are around 200 packets per second until the system stabilizes at about 40 packets per second. 40pps makes sense: 10 robots * 2 packets/sec * 2 for ack = 40pps.
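The steady-state estimate can be written as a one-line formula. The sketch below assumes, as the text does, that every data packet is matched by one acknowledgement. The function name and the ack_factor parameter are hypothetical.

```python
def steady_state_pps(robots, packets_per_robot_per_s, ack_factor=2):
    """Expected packets per second when each data packet is matched by one ack."""
    return robots * packets_per_robot_per_s * ack_factor


expected = steady_state_pps(robots=10, packets_per_robot_per_s=2)
startup_pps = 200  # approximate rate observed before the system stabilizes
print(expected)                # 40
print(startup_pps / expected)  # 5.0, startup is about five times steady state
```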

Wireshark Single-Shot

When each of the nine robots transmits only once to the GS:

  • UDP and ARP get set up in 22.4 seconds.
  • 4913 packets are generated.
  • None of the actual robot data reaches the GS.

Here is the packet distribution:

wireshark_2_50_udp

This IO graph shows packet load then no packet traffic after 22 seconds:

wireshark_2_50_io

No robot subscription data makes it to the GS. Network traffic is primarily DDS and ARP overhead:

wireshark_2_50_protocol_stats

Further study

We have two goals:

  1. Reduce initialization overload: start robots at intervals instead of all at once.
  2. Determine whether ns-3 can keep up once the ad hoc infrastructure is established and stable.

Stagger start times
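A minimal sketch of staggered startup, assuming each robot can be started by calling a function. The names staggered_starts and start_staggered are hypothetical and are not part of testbed_runner. The sleep function is injectable so the logic can be checked without waiting.

```python
import time


def staggered_starts(count, interval_s=1.0, first_start_s=0.0):
    """Start offsets (seconds) for count robots launched interval_s apart."""
    return [first_start_s + i * interval_s for i in range(count)]


def start_staggered(launchers, interval_s=1.0, sleep=time.sleep):
    """Call each launcher in order, pausing interval_s between consecutive starts."""
    for index, launch in enumerate(launchers):
        if index > 0:
            sleep(interval_s)
        launch()


print(staggered_starts(3))  # [0.0, 1.0, 2.0]
```

With ten robots and a 1 second interval the last robot starts 9 seconds after the first, which spreads the DDS discovery and ARP traffic over time instead of producing it all at once.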

Know when ns-3 Cannot Keep Up

Here we patch ns-3 so that exceeding the hard limit prints a warning instead of aborting. Ref. https://github.com/nps-ros2/ns3_gazebo/wiki/Installing-ns-3

  • In ~/repos/ns-3-allinone/ns-3.29/src/core/model/realtime-simulator-impl.cc, replace the fatal-error call at line 378 with std::cerr << "Hard real-time limit exceeded" << std::endl; so that ns-3 warns instead of failing.

  • Rebuild:

    cd ~/repos/ns-3-allinone/ns-3.29
    ./waf build
    

Run

  • Start ns-3

    cd ~/gits/ns3_gazebo/ns3_testbed/ns3_mobility/build
    ./ns3_mobility -c 10
    
  • Start GUI

    cd ~/gits/ns3_gazebo/ns3_testbed/ns3_testbed_gui
    ./tg.py
    
  • Start System Monitor and/or Wireshark

  • Start robots, pick robot configuration file

    sudo /bin/bash
    cd ~/gits/ns3_gazebo/ns3_testbed/csv_setup
    ros2 run cpp_testbed_runner testbed_runner -c 10 -s ex1sec.csv -n -p
    

Results

  • Robots start at 1 second intervals.
  • Nine robots transmit at 1 Hz.
  • ns-3 prints a warning whenever it cannot keep up, limited to at most 10 warnings per second.
  • ns-3 falls behind after 2 or 3 robots have started.
  • It takes perhaps 80 seconds for traffic to stabilize.
  • Usually ns-3 keeps up within 100 ms but occasionally falls behind momentarily.
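The 10 Hz cap on warnings can be illustrated with a small rate limiter. This is a sketch of the idea in Python, not the code used in the patched ns-3. The class name RateLimitedWarning and its parameters are hypothetical.

```python
import sys
import time


class RateLimitedWarning:
    """Print a warning at most max_hz times per second and count the rest."""

    def __init__(self, max_hz=10.0, clock=time.monotonic, stream=sys.stderr):
        self.min_gap_s = 1.0 / max_hz
        self.clock = clock
        self.stream = stream
        self.last_emit = None
        self.suppressed = 0

    def warn(self, message):
        """Return True if the message was printed, False if it was suppressed."""
        now = self.clock()
        if self.last_emit is not None and now - self.last_emit < self.min_gap_s:
            self.suppressed += 1
            return False
        self.last_emit = now
        print(message, file=self.stream)
        return True


limiter = RateLimitedWarning(max_hz=10.0)
limiter.warn("Hard real-time limit exceeded")
```

Counting suppressed warnings keeps some indication of how often ns-3 fell behind even though most messages are not printed.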

Todo

  • ns-3 prints coordinates to stdout using multiple << calls. This should be optimized to make just one << call. Then remeasure.

ns3 Infrastructure Mode Failure

Wifi infrastructure mode does not work. Ping does not work. See https://github.com/nps-ros2/ns3_gazebo/wiki/ns-3-Wifi-Infrastructure-Tap-Problem.

This section provides links and examines a hostapd effort that did not work.

  • GS attempts to ping Sta: destination host unreachable.
  • Sta attempts to ping GS: the ping request hangs.

Examples in ns-3 place packets directly onto the WifiNetDevice. Maybe device configuration above ns-3 is required. Below we attempt this configuration by configuring the virtual Ethernet devices inside the network namespaces. These devices are above the Tap devices.

Here we describe configuring two virtual Ethernet devices for Wifi Infrastructure mode: wifi_veth1 as the Access Point (AP) and wifi_veth2 as a station (Sta).

Configuration consists of creating several configuration files and running the tools hostapd and wpa_supplicant with those configuration files as parameters.

Setup for nns1:

ip link add wifi_veth1 type veth peer name wifi_vethb1
ip address add 10.0.0.2/9 dev wifi_vethb1
ip link set wifi_veth1 netns nns1
ip netns exec nns1 ip addr add 10.0.0.1/9  dev wifi_veth1
ip link add name wifi_br1 type bridge
ip link set wifi_br1 up
ip link set wifi_vethb1 up
ip netns exec nns1 ip link set wifi_veth1 up
ip link set wifi_vethb1 master wifi_br1
ip tuntap add wifi_tap1 mode tap
ip addr flush dev wifi_tap1
ip address add 10.0.0.3/9 dev wifi_tap1
ip link set wifi_tap1 up
ip link set wifi_tap1 master wifi_br1

Setup for nns2:

ip link add wifi_veth2 type veth peer name wifi_vethb2
ip address add 10.0.0.5/9 dev wifi_vethb2
ip link set wifi_veth2 netns nns2
ip netns exec nns2 ip addr add 10.0.0.4/9  dev wifi_veth2
ip link add name wifi_br2 type bridge
ip link set wifi_br2 up
ip link set wifi_vethb2 up
ip netns exec nns2 ip link set wifi_veth2 up
ip link set wifi_vethb2 master wifi_br2
ip tuntap add wifi_tap2 mode tap
ip addr flush dev wifi_tap2
ip address add 10.0.0.6/9 dev wifi_tap2
ip link set wifi_tap2 up
ip link set wifi_tap2 master wifi_br2
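The two setups differ only in the namespace index and the addresses. A sketch that generates the same command list for a given index follows. It only builds strings and does not run anything. The addressing rule (three consecutive 10.0.0.x addresses per namespace) is an assumption inferred from the nns1 and nns2 listings, and the function name is hypothetical.

```python
def namespace_setup_commands(index, prefix_len=9):
    """Return the ip commands for namespace nns<index>, as in the listings above.

    Assumed addressing: veth, vethb and tap get 10.0.0.(3*(index-1) + 1..3).
    """
    base = 3 * (index - 1)
    if index < 1 or base + 3 > 254:
        raise ValueError("index out of range for the 10.0.0.x addressing used here")
    veth, vethb = f"wifi_veth{index}", f"wifi_vethb{index}"
    bridge, tap, nns = f"wifi_br{index}", f"wifi_tap{index}", f"nns{index}"
    veth_ip = f"10.0.0.{base + 1}/{prefix_len}"
    vethb_ip = f"10.0.0.{base + 2}/{prefix_len}"
    tap_ip = f"10.0.0.{base + 3}/{prefix_len}"
    return [
        f"ip link add {veth} type veth peer name {vethb}",
        f"ip address add {vethb_ip} dev {vethb}",
        f"ip link set {veth} netns {nns}",
        f"ip netns exec {nns} ip addr add {veth_ip} dev {veth}",
        f"ip link add name {bridge} type bridge",
        f"ip link set {bridge} up",
        f"ip link set {vethb} up",
        f"ip netns exec {nns} ip link set {veth} up",
        f"ip link set {vethb} master {bridge}",
        f"ip tuntap add {tap} mode tap",
        f"ip addr flush dev {tap}",
        f"ip address add {tap_ip} dev {tap}",
        f"ip link set {tap} up",
        f"ip link set {tap} master {bridge}",
    ]


for command in namespace_setup_commands(2):
    print(command)
```

Printing the commands first makes it easy to review them before running them with root privileges.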

Tool hostapd is not installed by default, so install it:

sudo apt install hostapd

Some links:

We will create configuration files in ~/gits/ns3_gazebo/ns3_testbed2/config.

Create hostapd.conf file for each virtual Ethernet device, ref http://sources.buildroot.org/iwd/git/doc/8021x-wired-testing.txt.

hostapd_nns1.conf:

interface=wifi_veth1
ssid=wifi-default
driver=wired
ieee8021x=1
use_pae_group_addr=1
eap_server=1
eap_user_file=hostapd.eap_user
ca_cert=newcertca.crt
server_cert=newcertca.crt

hostapd_nns2.conf:

interface=wifi_veth2
ssid=wifi-default
driver=wired
ieee8021x=1
use_pae_group_addr=1
eap_server=1
eap_user_file=hostapd.eap_user
ca_cert=newcertca.crt
server_cert=newcertca.crt

These need files hostapd.eap_user and newcertca.crt.

Create hostapd.eap_user, ref. http://sources.buildroot.org/iwd/git/doc/8021x-wired-testing.txt. Put in this or equivalent:

# Phase 1 users
*	PEAP
# Phase 2
"test"	MSCHAPV2	"password"	[2]

Create newcertca.crt, ref. https://github.com/sensepost/hostapd-mana/wiki/Creating-PSK-or-EAP-Networks:

openssl genrsa -out server.key 2048
openssl req -new -sha256 -key server.key -out csr.csr
openssl req -x509 -sha256 -days 365 -key server.key -in csr.csr -out server.pem
ln -s server.pem newcertca.crt

Try

Note: This failed. It is included for completeness.

Start ns-3:

cd ~/gits/ns3_gazebo/ns3_testbed2/ns3_mobility/build
./ns3_mobility2 -c 2

Run hostapd from your nns window, or use sudo and configure the veth device before moving it into the nns. From the nns window:

nns1:

cd ~/gits/ns3_gazebo/ns3_testbed2/config
ip netns exec nns1 /bin/bash
hostapd hostapd_nns1.conf

nns2:

cd ~/gits/ns3_gazebo/ns3_testbed2/config
ip netns exec nns2 /bin/bash
hostapd hostapd_nns2.conf

Now for wpa_supplicant, ref. http://sources.buildroot.org/iwd/git/doc/8021x-wired-testing.txt:

Create wpa_supplicant.conf configuration file:

ap_scan=0
fast_reauth=1
network={
    ssid="wifi-default"
    scan_ssid=0
    key_mgmt=IEEE8021X
    eap=PEAP
    phase2="auth=MSCHAPV2"
    identity="test"
    password="password"
    ca_cert="newcertca.crt" # replace with your CA certificate path
}

Run wpa_supplicant from nns:

wpa_supplicant -i <veth1> -c <nns_specific_wpa_supplicant.conf>

Simulating Wifi Traffic

Here we examine network traffic flows where traffic is generated in ns-3 and network flows stay within the ns-3 environment. Traffic does not flow through Tap devices to network ports outside ns-3.

ns-3 provides about 40 wireless configuration examples in ~/repos/ns-allinone-3.30.1/ns-3.30.1/examples. ns-3 also provides seven tutorials. Examples include wifi-simple-adhoc.cc, which uses 802.11b and takes input parameters including the physical layer mode, the signal strength, packet size, number of packets, and packet transmit interval. Tutorials include third.cc, which builds a network that includes Wifi infrastructure nodes.

Using ns-3's examples we can experiment with ad-hoc and infrastructure modes. They output to a pcap file for each node.

We could replicate the work in "Implementation and Validation of an IEEE 802.11ah Module for ns-3" to see if we get their results, but I discourage this. We would need to work with a branch of ns-3 such as https://github.com/duraraxbaccano/802.11ah-ns3, which came from ns-3 v3.23 (we are using ns-3 v3.31).

Instead, I recommend that we use 802.11n, which is integrated into ns-3, along with the settings identified in the Thesis "Mapping ad hoc communications network of a large number fixed-wing UAV swarm" by Alexis Pospischil, https://apps.dtic.mil/dtic/tr/fulltext/u2/1045965.pdf, discussed at https://gitlab.nps.edu/ros2/ros2_cybersecurity_group/-/wikis/Large-Swarm.

Although 802.11ah may offer more range, we use 802.11n because this is a more common protocol and it reflects Alexis' UAV swarm.

ns-3 Sockets

Rather than connecting ns-3 Node objects to Tap devices, we install ns3::InternetStackHelper and ns3::Ipv4AddressHelper, and create an ns3::Ipv4InterfaceContainer. Unless we need to capture data for the application to consume, we do not need to create source and sink ns3::Socket objects. Interval-based transmit is performed using ns3::Simulator::ScheduleWithContext which generates UDP traffic at assigned nodes. Packets are picked up at receiving nodes that are assigned to sockets for callback.
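The pattern described above (a scheduled, interval-based transmit per node and a receive callback per node) can be modeled without ns-3. The sketch below is a plain-Python discrete-event loop, not the ns-3 API. It assumes ideal delivery with no loss, delay, or collisions, and the names simulate_broadcasts and on_receive are hypothetical.

```python
import heapq


def simulate_broadcasts(node_count, interval_s, duration_s, on_receive):
    """Tiny discrete-event loop: every node broadcasts once per interval.

    on_receive(receiver, sender, time_s) is called for each delivered packet.
    Returns the number of broadcasts sent before duration_s.
    """
    queue = [(0.0, node) for node in range(node_count)]  # (time, sender)
    heapq.heapify(queue)
    sent = 0
    while queue:
        now, sender = heapq.heappop(queue)
        if now >= duration_s:
            continue                       # past the end: do not reschedule
        sent += 1
        for receiver in range(node_count):
            if receiver != sender:         # a broadcast reaches every other node
                on_receive(receiver, sender, now)
        heapq.heappush(queue, (now + interval_s, sender))
    return sent


received = []
sent = simulate_broadcasts(3, interval_s=1.0, duration_s=2.0,
                           on_receive=lambda r, s, t: received.append((r, s, t)))
print(sent, len(received))  # 6 12
```

Each broadcast reschedules the sender's next transmit, which mirrors how a transmit function in ns-3 would schedule itself again after each packet.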

Prototype

Goal: have ns-3 simulate the ad hoc Wifi network defined in the Thesis.

Steps:

  • Build ns-3's ad hoc Wifi example.
  • Add multiple nodes and have them broadcast (not point-to-point) to other nodes.
  • Create 51 nodes and have them broadcast messages at the periodicity given in the Thesis.
  • Consider staggering transmit times to avoid packet collisions. Current performance is unrealistic.
  • Fix mobility model and antenna strength.
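Two of the steps above lend themselves to a quick calculation: how many receptions a full broadcast round produces, and how transmit times could be spread across one period. The sketch below assumes all nodes are in range of each other. The 1.0 second period is a placeholder, not the value from the Thesis, and the function names are hypothetical.

```python
def receptions_per_period(node_count):
    """Each broadcast is heard by every other node, assuming all are in range."""
    return node_count * (node_count - 1)


def transmit_offsets(node_count, period_s):
    """Spread transmit times evenly across one period to reduce collisions."""
    slot = period_s / node_count
    return [node * slot for node in range(node_count)]


NODES = 51
PERIOD_S = 1.0  # placeholder period, not taken from the Thesis

print(receptions_per_period(NODES))           # 2550
offsets = transmit_offsets(NODES, PERIOD_S)
print(f"{offsets[1] * 1000:.1f} ms between consecutive transmitters")
```

Evenly spaced offsets are only one possible staggering scheme. Random offsets within the period would also avoid having every node transmit at the same instant.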