ns3 performance issues - nps-ros2/ns3_testbed GitHub Wiki
ns-3 network modeling is CPU-bound when modeling multicast DDS traffic using 10 Wifi ad-hoc networks in network namespaces (nns).
- Multicast traffic is demanding in an ad-hoc DDS network: communication is transmitted to all other nodes.
- ns-3 has network routing and transmission physics modeling to tend to.
- ns-3 binds to Linux tap devices.
- The OS manages routing for the host and all network namespaces.
I did not find any papers that identified a significant ns-3 ad-hoc Wifi performance bottleneck. However, 802.11b ad-hoc Wifi is known not to scale well.
- Here is a performance and scalability evaluation of the ns-3 distributed scheduler: https://dl.acm.org/citation.cfm?id=2263079. Note: ad-hoc Wifi scalability is not examined.
ns-3 describes their realtime scheduler at https://www.nsnam.org/docs/release/3.12/manual/html/realtime.html. In realtime mode, there are two clocks: the simulation clock and the machine clock. During the handling of an event, the simulation clock is frozen. If the next event is in the machine clock future, the simulation clock waits for the machine clock in order to keep ns-3 in step with real time (Due to OS-specific time granularity, ns-3 uses a combination of sleep and busy-wait to start events at the correct wall time). If the next event is in the machine clock past, then ns-3 cannot keep up.
We can choose what to do if ns-3 cannot keep up with realtime by setting a mode:
- Mode BestEffort (default): Process events until ns-3 catches up with realtime.
- Mode HardLimit: Abort if a tolerance threshold is exceeded, default 100ms.
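The two-clock behavior and the two modes can be illustrated with a small standalone sketch. This is not ns-3's implementation: the names SyncMode, Action, and NextAction are hypothetical, and times are assumed to be nanoseconds since the start of the run.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration only; this is not ns-3 code.
enum class SyncMode { BestEffort, HardLimit };
enum class Action { WaitForWallClock, RunNow, Abort };

// Decide what to do with the next event, given its scheduled time and the
// current machine (wall) clock, both assumed to be in nanoseconds.
Action NextAction(std::int64_t eventNs, std::int64_t wallNs, SyncMode mode,
                  std::int64_t hardLimitNs = 100000000) {  // 100 ms
    assert(hardLimitNs >= 0);
    if (eventNs > wallNs) {
        // Event is in the machine-clock future: wait so the simulation
        // does not run ahead of real time.
        return Action::WaitForWallClock;
    }
    const std::int64_t jitterNs = wallNs - eventNs;  // how far behind we are
    if (mode == SyncMode::HardLimit && jitterNs > hardLimitNs) {
        return Action::Abort;  // tolerance exceeded
    }
    // BestEffort, or HardLimit still within tolerance: run at once to catch up.
    return Action::RunNow;
}
```

With these assumptions, an event that is 100004653 ns late runs immediately in BestEffort mode but aborts in HardLimit mode with the default tolerance.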
I updated our ns3_mobility.cpp program to use HardLimit mode by adding this line:
ns3::Config::SetDefault("ns3::RealtimeSimulatorImpl::SynchronizationMode",ns3::EnumValue(ns3::RealtimeSimulatorImpl::SYNC_HARD_LIMIT));
then ran our nodes. Our ns-3 program crashed immediately with this report:
msg="RealtimeSimulatorImpl::ProcessOneEvent (): Hard real-time limit exceeded (jitter = 100004653)", file=../src/core/model/realtime-simulator-impl.cc, line=379
terminate called without an active exception
Aborted (core dumped)
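This verifies that ns-3 does not keep up with machine time. If the reported jitter of 100004653 is in nanoseconds, it is about 100.005 ms, just past the default 100ms tolerance.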
- https://www.nsnam.org/doxygen/wifi-simple-adhoc-grid_8cc_source.html
This example simulates a 5x5 grid of nodes talking 802.11b in ad-hoc mode. It programmatically generates its own traffic and runs purely in simulation time. As such, it does not use tap devices and does not simulate network flows in real time (it does not use ns3::RealtimeSimulatorImpl).
In all tests described below we reduce the transmit frequency to once per second for ten nodes and also, significantly, reconfigure ns-3 node proximity to 10 meters instead of 30 so that all nodes are always in range. Overall performance degrades significantly when nodes go out of range, while connections are being established, or when network traffic falls behind.
The runs stabilize after about 45 seconds once all connections are registered.
We examine performance using these approaches:
- CPU and network utilization
- GNU gprof profiler, which identifies time spent in various function calls.
- Wireshark, which shows network traffic.
Here we see the impact of starting ten nodes, and we see that CPU and network loads become reasonable after about 45 seconds:
- The CPU history shows that it takes 45 seconds for the network to stabilize.
- The network traffic shows no traffic before the nodes start, excessive traffic for 45 seconds, then stable traffic after that.
If we tune ns-3 to use the constant position mobility model instead of the random walk mobility model, the network stabilizes in about 25 seconds, indicating the mobility model contributes somewhat to performance degradation but is not a significant burden.
cd build
cmake -DCMAKE_CXX_FLAGS=-pg -DCMAKE_EXE_LINKER_FLAGS=-pg -DCMAKE_SHARED_LINKER_FLAGS=-pg ..
- Compile to run for 3 minutes.
- Run.
- Generate profile:
gprof ns3_mobility > z2.stats
Partial result for flat profile:
Flat profile:
Each sample counts as 0.01 seconds.
     % cumulative     self               self    total
  time    seconds  seconds     calls   s/call   s/call  name
 28.52       3.41     3.41      1801     0.00     0.00  ns3::Time::~Time()
 16.59       5.39     1.98   9505264     0.00     0.00  ns3::Time::Time(ns3::int64x64_t const&)
  5.44       6.04     0.65 134644027     0.00     0.00  ns3::Time::PeekResolution()
  4.65       6.59     0.56   9506659     0.00     0.00  ns3::Time::PeekInformation(ns3::Time::Unit)
  4.48       7.13     0.54   5923199     0.00     0.00  ns3::SimpleRefCount<ns3::Object, ns3::ObjectBase, ns3::ObjectDeleter>::Ref() const
  4.31       7.64     0.52   9970792     0.00     0.00  ns3::SimpleRefCount<ns3::Object, ns3::ObjectBase, ns3::ObjectDeleter>::Unref() const
  3.69       8.08     0.44   9505320     0.00     0.00  ns3::int64x64_t::int64x64_t(long double)
Partial result for call graph:
Call graph (explanation follows)
granularity: each sample hit covers 2 byte(s) for 0.08% of 11.94 seconds
index % time self children called name
0.00 0.01 1/1798 main [12]
0.00 10.71 1797/1798 ns3::MakeEvent<ns3::NodeContainer const&, ns3::NodeContainer>(void (*)(ns3::NodeContainer const&), ns3::NodeContainer)::EventFunctionImpl1::Notify() [2]
[1] 89.7 0.00 10.71 1798 interval_function(ns3::NodeContainer const&) [1]
0.15 4.99 1800/1801 ns3::Seconds(double) [3]
3.40 0.00 1800/1801 ns3::Time::~Time() [5]
0.18 0.88 1800/1800 ns3::EventId ns3::Simulator::Schedule<ns3::NodeContainer const&, ns3::NodeContainer>(ns3::Time const&, void (*)(ns3::NodeContainer const&), ns3::NodeContainer) [11]
0.00 0.56 1800/1800 ns3::EventId::~EventId() [25]
0.00 0.20 1798/7197 ns3::NodeContainer::NodeContainer(ns3::NodeContainer const&) [19]
0.16 0.00 1800/1800 std::setprecision(int) [54]
0.00 0.07 1800/7202 ns3::NodeContainer::~NodeContainer() [41]
0.01 0.06 17997/90074 ns3::Ptr<ns3::Node>::~Ptr() [33]
0.01 0.03 16198/16198 ns3::Ptr<ns3::RandomWalk2dMobilityModel> ns3::Object::GetObject<ns3::RandomWalk2dMobilityModel>() const [68]
0.00 0.00 1800/1800 ns3::Ptr<ns3::ConstantPositionMobilityModel> ns3::Object::GetObject<ns3::ConstantPositionMobilityModel>() const [75]
0.00 0.00 16198/16198 ns3::Ptr<ns3::RandomWalk2dMobilityModel>::~Ptr() [78]
0.00 0.00 1800/1800 ns3::Ptr<ns3::ConstantPositionMobilityModel>::~Ptr() [81]
0.00 0.00 17998/17998 ns3::Ptr<ns3::Node>::operator->() [100]
0.00 0.00 16199/16199 ns3::Ptr<ns3::RandomWalk2dMobilityModel>::operator->() [101]
0.00 0.00 1800/1800 ns3::Ptr<ns3::ConstantPositionMobilityModel>::operator->() [134]
-----------------------------------------------
<spontaneous>
[2] 89.7 0.00 10.71 ns3::MakeEvent<ns3::NodeContainer const&, ns3::NodeContainer>(void (*)(ns3::NodeContainer const&), ns3::NodeContainer)::EventFunctionImpl1::Notify() [2]
0.00 10.71 1797/1798 interval_function(ns3::NodeContainer const&) [1]
-----------------------------------------------
0.00 0.00 1/1801 main [12]
0.15 4.99 1800/1801 interval_function(ns3::NodeContainer const&) [1]
[3] 43.1 0.16 4.99 1801 ns3::Seconds(double) [3]
0.39 4.61 9513031/9513031 ns3::Time::FromDouble(double, ns3::Time::Unit) [4]
-----------------------------------------------
0.39 4.61 9513031/9513031 ns3::Seconds(double) [3]
[4] 41.8 0.39 4.61 9513031 ns3::Time::FromDouble(double, ns3::Time::Unit) [4]
0.03 3.26 9504481/9504481 ns3::Time::From(ns3::int64x64_t const&, ns3::Time::Unit) [6]
0.08 1.24 9506902/9506902 ns3::int64x64_t::int64x64_t(double) [8]
-----------------------------------------------
0.00 0.00 1/1801 main [12]
3.40 0.00 1800/1801 interval_function(ns3::NodeContainer const&) [1]
[5] 28.5 3.41 0.00 1801 ns3::Time::~Time() [5]
-----------------------------------------------
0.03 3.26 9504481/9504481 ns3::Time::FromDouble(double, ns3::Time::Unit) [4]
[6] 27.5 0.03 3.26 9504481 ns3::Time::From(ns3::int64x64_t const&, ns3::Time::Unit) [6]
1.98 0.06 9505264/9505264 ns3::Time::Time(ns3::int64x64_t const&) [7]
0.56 0.65 9506659/9506659 ns3::Time::PeekInformation(ns3::Time::Unit) [10]
0.01 0.00 9508892/9508892 ns3::int64x64_t::int64x64_t(ns3::int64x64_t const&) [72]
0.01 0.00 9507115/9507115 ns3::operator*=(ns3::int64x64_t&, ns3::int64x64_t const&) [74]
-----------------------------------------------
1.98 0.06 9505264/9505264 ns3::Time::From(ns3::int64x64_t const&, ns3::Time::Unit) [6]
[7] 17.0 1.98 0.06 9505264 ns3::Time::Time(ns3::int64x64_t const&) [7]
0.06 0.00 9506725/9506725 ns3::int64x64_t::GetHigh() const [62]
-----------------------------------------------
0.08 1.24 9506902/9506902 ns3::Time::FromDouble(double, ns3::Time::Unit) [4]
[8] 11.1 0.08 1.24 9506902 ns3::int64x64_t::int64x64_t(double) [8]
0.44 0.80 9505320/9505320 ns3::int64x64_t::int64x64_t(long double) [9]
-----------------------------------------------
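Summary: ns-3 spends its profiled time on events, intervals, and ns3::Time. About 90% of the 11.94 sampled seconds falls under interval_function, mostly in ns3::Seconds() and in the ns3::Time constructor and destructor.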
The graph below shows an even network load on all 10 networks and shows that it takes approximately 45 seconds to stabilize:
A packet capture of nns2 across this interval captures about 2.5MB in 7,548 packets:
Of this:
- 79.6% of the packets are Real-Time Publish-Subscribe Wire Protocol and are DDS publish-subscribe overhead (DDS model: https://en.wikipedia.org/wiki/Data_Distribution_Service).
- 11.2% of the packets are Address Resolution Protocol for link layer MAC address discovery.
- 9.0% of the packets are data.
This table shows that there were 163.4 packets per second over 46.193 seconds and packets averaged 341 bytes each:
This table shows UDP packet statistics for each of the ten robots:
Notes:
- The Ground Station robot at 10.0.0.1 transmits 962 packets while the other nine transmit about 550 packets.
- There are three ports per robot. Of these, one port receives nothing, one port receives about 350 packets, and one port receives 987 packets for the GS and about 88 for each robot.
This table shows packets per second:
There are around 200 packets per second until the system stabilizes at about 40 packets per second. 40pps makes sense: 10 robots * 2 packets/sec * 2 for ack = 40pps.
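The rates quoted in this section can be checked with a few lines of arithmetic. The sketch below uses the capture figures (7,548 packets over 46.193 seconds) and the steady-state estimate from the text. The function names are hypothetical, and the factor of two for acknowledgements is taken from the text rather than from a protocol analysis.

```cpp
#include <cassert>

// Average packet rate over a capture window.
double PacketsPerSecond(long packets, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(packets) / seconds;
}

// Steady-state estimate: each robot sends `sendsPerSec` packets per second
// and each packet is assumed to be matched by one acknowledgement.
int SteadyStatePps(int robots, int sendsPerSec) {
    assert(robots >= 0 && sendsPerSec >= 0);
    const int ackFactor = 2;  // data packet plus its ack (assumption from the text)
    return robots * sendsPerSec * ackFactor;
}
```

PacketsPerSecond(7548, 46.193) gives roughly 163.4, matching the capture table, and SteadyStatePps(10, 2) gives 40, so the startup load of around 200 packets per second is about five times the estimated steady-state load.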
By having nine robots transmit only once to the GS:
- UDP and ARP get set up in 22.4 seconds.
- 4913 packets are generated.
- None of the actual robot data reaches the GS.
Here is the packet distribution:
This IO graph shows packet load then no packet traffic after 22 seconds:
No robot subscription data makes it to the GS. Network traffic is primarily DDS and ARP overhead:
We have two goals:
- Reduce initialization overload: start robots at intervals instead of all at once.
- Determine whether ns-3 can keep up once the ad hoc infrastructure is established and stable.
- Change src/testbed_runner.cpp to sleep after starting each robot thread, ref. https://stackoverflow.com/questions/4184468/sleep-for-milliseconds.
- Rebuild:
cd ~/gits/ns3_gazebo/ns3_testbed/cpp_testbed_runner
colcon build
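A minimal sketch of the staggered start, assuming each robot runs in its own std::thread. StartStaggered and robotMain are hypothetical names; the actual code in src/testbed_runner.cpp may be organized differently.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Start `count` worker threads, pausing `gap` after each launch so the
// robots do not all begin network discovery at the same instant.
// `robotMain` is a hypothetical stand-in for the per-robot entry point.
std::vector<std::thread> StartStaggered(
        int count, std::chrono::milliseconds gap,
        const std::function<void(int)>& robotMain) {
    assert(count >= 0);
    std::vector<std::thread> threads;
    threads.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        threads.emplace_back(robotMain, i);  // launch robot i
        std::this_thread::sleep_for(gap);    // wait before launching the next
    }
    return threads;  // the caller is responsible for joining these
}
```

Calling StartStaggered(10, std::chrono::milliseconds(1000), robotMain) would correspond to the 1 second start interval used in the runs described here; the caller then joins each returned thread.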
Here we change ns-3's fail code to instead print a warning. Ref. https://github.com/nps-ros2/ns3_gazebo/wiki/Installing-ns-3
- Change ~/repos/ns-3-allinone/ns-3.29/src/core/model/realtime-simulator-impl.cc line 378 to std::cerr << "Hard real-time limit exceeded" << std::endl; instead of fail.
- Rebuild:
cd ~/repos/ns-3-allinone/ns-3.29
./waf build
- Start ns-3
cd ~/gits/ns3_gazebo/ns3_testbed/ns3_mobility/build
./ns3_mobility -c 10
- Start GUI
cd ~/gits/ns3_gazebo/ns3_testbed/ns3_testbed_gui
./tg.py
- Start System Monitor and/or Wireshark
- Start robots, pick robot configuration file
sudo /bin/bash
cd ~/gits/ns3_gazebo/ns3_testbed/csv_setup
ros2 run cpp_testbed_runner testbed_runner -c 10 -s ex1sec.csv -n -p
- Robots initialize at 1 second intervals.
- ns-3 prints a warning when ns-3 cannot keep up, but does not warn faster than 10Hz.
- Nine robots transmit at 1Hz.
- Robots start at 1 sec. intervals. ns-3 gets behind after 2 or 3 robots get started.
- It takes perhaps 80 seconds for traffic to stabilize.
- Usually ns-3 keeps up within 100ms but occasionally gets momentarily behind.
- ns-3 prints coordinates to stdout using multiple << calls. This should be optimized to make just one << call. Then remeasure.
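Two of the items above lend themselves to a short sketch: building the coordinate line in memory so that stdout receives a single insertion, and limiting warnings to 10Hz. FormatPosition and WarningThrottle are hypothetical names, and the output field layout is assumed. Whether the single-insertion change measurably helps is untested here, which is why the list says to remeasure.

```cpp
#include <cassert>
#include <chrono>
#include <sstream>
#include <string>

// Build one output line in memory; the caller can then emit it with a
// single insertion, for example: std::cout << FormatPosition(3, 1.5, 2.0);
std::string FormatPosition(int node, double x, double y) {
    std::ostringstream line;
    line << "node " << node << " x=" << x << " y=" << y << '\n';
    return line.str();
}

// Allows at most one warning per `minGap` (100 ms corresponds to 10Hz).
class WarningThrottle {
public:
    explicit WarningThrottle(std::chrono::milliseconds minGap)
        : minGap_(minGap) {
        assert(minGap.count() >= 0);
    }

    // Returns true if a warning may be printed at time `now`.
    bool Allow(std::chrono::steady_clock::time_point now) {
        if (hasLast_ && now - last_ < minGap_) {
            return false;  // too soon after the last printed warning
        }
        last_ = now;
        hasLast_ = true;
        return true;
    }

private:
    std::chrono::milliseconds minGap_;
    std::chrono::steady_clock::time_point last_{};
    bool hasLast_ = false;
};
```

In use, the warning site would call Allow(std::chrono::steady_clock::now()) and print only when it returns true.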
Wifi infrastructure mode does not work. Ping does not work. See https://github.com/nps-ros2/ns3_gazebo/wiki/ns-3-Wifi-Infrastructure-Tap-Problem.
This section provides links and examines a hostapd effort that did not work.
- GS attempts to ping Sta: destination host unreachable.
- STA attempts to ping GS: ping request hangs.
Examples in ns-3 place packets directly onto the WifiNetDevice. Maybe device configuration above ns-3 is required. Below we attempt this configuration by configuring the virtual Ethernet devices inside the network namespaces. These devices are above the Tap devices.
Here we describe configuring two virtual Ethernet devices for Wifi Infrastructure mode: wifi_veth1 as the Access Point (AP) and wifi_veth2 as a station (Sta).
Configuration consists of creating several configuration files and running tools hostapd and wpa_supplicant, giving configuration files as parameters.
Setup for nns1:
ip link add wifi_veth1 type veth peer name wifi_vethb1
ip address add 10.0.0.2/9 dev wifi_vethb1
ip link set wifi_veth1 netns nns1
ip netns exec nns1 ip addr add 10.0.0.1/9 dev wifi_veth1
ip link add name wifi_br1 type bridge
ip link set wifi_br1 up
ip link set wifi_vethb1 up
ip netns exec nns1 ip link set wifi_veth1 up
ip link set wifi_vethb1 master wifi_br1
ip tuntap add wifi_tap1 mode tap
ip addr flush dev wifi_tap1
ip address add 10.0.0.3/9 dev wifi_tap1
ip link set wifi_tap1 up
ip link set wifi_tap1 master wifi_br1
Setup for nns2:
ip link add wifi_veth2 type veth peer name wifi_vethb2
ip address add 10.0.0.5/9 dev wifi_vethb2
ip link set wifi_veth2 netns nns2
ip netns exec nns2 ip addr add 10.0.0.4/9 dev wifi_veth2
ip link add name wifi_br2 type bridge
ip link set wifi_br2 up
ip link set wifi_vethb2 up
ip netns exec nns2 ip link set wifi_veth2 up
ip link set wifi_vethb2 master wifi_br2
ip tuntap add wifi_tap2 mode tap
ip addr flush dev wifi_tap2
ip address add 10.0.0.6/9 dev wifi_tap2
ip link set wifi_tap2 up
ip link set wifi_tap2 master wifi_br2
Tool hostapd is not installed by default, so install it:
sudo apt install hostapd
Some links:
- ns-3 Wifi AP state machine API, ns3::ApWifiMac, https://www.nsnam.org/doxygen/classns3_1_1_ap_wifi_mac.html, Wifi MAC high Sta API, ns3::StaWifiMac, https://www.nsnam.org/doxygen/classns3_1_1_sta_wifi_mac.html, 802.11 SSID information element, ns3::Ssid, https://www.nsnam.org/doxygen/classns3_1_1_ssid.html#details.
- ns-3 WifiNetDevice documentation: https://www.nsnam.org/reviews/2016/socis-final/wifi-user.html.
- Wifi source code examples, ns3's /examples/wireless/ directory.
- wpa_supplicant: https://wiki.archlinux.org/index.php/WPA_supplicant
- hostapd.conf template: http://web.mit.edu/freebsd/head/contrib/wpa/hostapd/hostapd.conf
- Wireless network configuration: https://wiki.archlinux.org/index.php/Wireless_network_configuration#Get_the_name_of_the_interface
- Configuring veth: http://sources.buildroot.org/iwd/git/doc/8021x-wired-testing.txt
- About implementing 802.11ah for ns-3: https://www.researchgate.net/publication/301328811_Implementation_and_validation_of_an_IEEE_80211ah_module_for_NS-3#pf5
- Generate ca_cert ca.pem certificate: https://github.com/sensepost/hostapd-mana/wiki/Creating-PSK-or-EAP-Networks
- wpa_supplicant/hostapd doc: https://w1.fi/wpa_supplicant/devel. wpa_supplicant.conf: https://linux.die.net/man/5/wpa_supplicant.conf
- WifiMacHelper, https://www.nsnam.org/doxygen/classns3_1_1_wifi_mac_helper.html: "By default, it creates an Adhoc MAC layer without QoS".
We will create configuration files in ~/gits/ns3_gazebo/ns3_testbed2/config.
Create a hostapd.conf file for each virtual Ethernet device, ref. http://sources.buildroot.org/iwd/git/doc/8021x-wired-testing.txt.
hostapd_nns1.conf:
interface=wifi_veth1
ssid=wifi-default
driver=wired
ieee8021x=1
use_pae_group_addr=1
eap_server=1
eap_user_file=hostapd.eap_user
ca_cert=newcertca.crt
server_cert=newcertca.crt
hostapd_nns2.conf:
interface=wifi_veth2
ssid=wifi-default
driver=wired
ieee8021x=1
use_pae_group_addr=1
eap_server=1
eap_user_file=hostapd.eap_user
ca_cert=newcertca.crt
server_cert=newcertca.crt
These need files hostapd.eap_user and newcertca.crt.
Create hostapd.eap_user, ref. http://sources.buildroot.org/iwd/git/doc/8021x-wired-testing.txt. Put in this or equivalent:
# Phase 1 users
* PEAP
# Phase 2
"test" MSCHAPV2 "password" [2]
Create newcertca.crt, ref. https://github.com/sensepost/hostapd-mana/wiki/Creating-PSK-or-EAP-Networks:
openssl genrsa -out server.key 2048
openssl req -new -sha256 -key server.key -out csr.csr
openssl req -x509 -sha256 -days 365 -key server.key -in csr.csr -out server.pem
ln -s server.pem newcertca.crt
Note: This failed. It is included for completeness. Start ns-3:
cd ~/gits/ns3_gazebo/ns3_testbed2/ns3_mobility/build
./ns3_mobility2 -c 2
Run this from your nns window or use sudo and configure veth before moving it to nns. From nns window:
nns1:
cd ~/gits/ns3_gazebo/ns3_testbed2/config
ip netns exec nns1 /bin/bash
hostapd hostapd_nns1.conf
nns2:
cd ~/gits/ns3_gazebo/ns3_testbed2/config
ip netns exec nns2 /bin/bash
hostapd hostapd_nns2.conf
Now for wpa_supplicant, ref. http://sources.buildroot.org/iwd/git/doc/8021x-wired-testing.txt:
Create wpa_supplicant.conf configuration file:
ap_scan=0
fast_reauth=1
network={
ssid="wifi-default"
scan_ssid=0
key_mgmt=IEEE8021X
eap=PEAP
phase2="auth=MSCHAPV2"
identity="test"
password="password"
ca_cert="newcertca.crt" # replace with your CA certificate path
}
Run wpa_supplicant from nns:
wpa_supplicant -i <veth1> -c <nns_specific_wpa_supplicant.conf>
Here we examine network traffic flows where traffic is generated in ns-3 and network flows stay within the ns-3 environment. Traffic does not flow through Tap devices to network ports outside ns-3.
ns-3 provides about 40 wireless configuration examples in ~/repos/ns-allinone-3.30.1/ns-3.30.1/examples. ns-3 also provides seven tutorials. Examples include wifi-simple-adhoc.cc, which uses 802.11b and takes input parameters including the physical layer mode, the signal strength, packet size, number of packets, and packet transmit interval. Tutorials include third.cc, which builds a network that includes Wifi Infrastructure nodes.
Using ns-3's examples we can experiment with ad-hoc and infrastructure modes. They output to a pcap file for each node.
We could replicate the work "Implementation and Validation of an IEEE 802.11 ah Module for ns-3" to see if we get their results, but I discourage this. We would need to work with a branch of ns-3 such as https://github.com/duraraxbaccano/802.11ah-ns3 which came from ns-3 v3.23 (we are using ns-3 v3.31).
Instead, I recommend that we use 802.11n, which is integrated into ns-3, along with settings identified in Thesis "Mapping ad hoc communications network of a large number fixed-wing UAV swarm" by Alexis Pospischil, https://apps.dtic.mil/dtic/tr/fulltext/u2/1045965.pdf, https://gitlab.nps.edu, discussed at https://gitlab.nps.edu/ros2/ros2_cybersecurity_group/-/wikis/Large-Swarm.
Although 802.11ah may offer more range, we use 802.11n because this is a more common protocol and it reflects Alexis' UAV swarm.
Rather than connecting ns-3 Node objects to Tap devices, we install ns3::InternetStackHelper and ns3::Ipv4AddressHelper, and create an ns3::Ipv4InterfaceContainer. Unless we need to capture data for the application to consume, we do not need to create source and sink ns3::Socket objects. Interval-based transmit is performed using ns3::Simulator::ScheduleWithContext, which generates UDP traffic at assigned nodes. Packets are picked up at receiving nodes that are assigned to sockets for callback.
Goal: have ns-3 simulate the ad hoc Wifi network defined in the Thesis.
Steps:
- Build ns-3's ad hoc Wifi example.
- Add multiple nodes and have them broadcast (not point-to-point) to other nodes.
- Create 51 nodes and have them broadcast messages at periodicity per Thesis.
- Consider staggering transmit times to avoid packet collisions. Current performance is unrealistic.
- Fix mobility model and antenna strength.
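One way to approach the staggering step is to spread each node's first transmission evenly across one transmit period. The sketch below is an illustration under assumptions, not a tested ns-3 configuration: StaggeredOffsetsUs is a hypothetical helper, and the 1 second period used in the example is a placeholder since the Thesis periodicity is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Spread the first transmission of each node evenly across one period so
// that nodes sharing the channel do not all transmit at the same instant.
// Returns per-node start offsets in microseconds.
std::vector<std::int64_t> StaggeredOffsetsUs(int nodes, std::int64_t periodUs) {
    assert(nodes > 0 && periodUs > 0);
    std::vector<std::int64_t> offsets;
    offsets.reserve(static_cast<std::size_t>(nodes));
    for (int i = 0; i < nodes; ++i) {
        // Integer division rounds down, so every offset stays below periodUs.
        offsets.push_back(periodUs * i / nodes);
    }
    return offsets;
}
```

For 51 nodes and an assumed 1,000,000 microsecond period, consecutive nodes start about 19.6 ms apart. Each offset could then serve as the initial delay when scheduling that node's first transmit event.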