Scenario Demo - nps-ros2/ns3_gazebo GitHub Wiki
We use lessons learned from the ROS2 ns3 Gazebo Demos 1-3 to build this demo.
Here we demonstrate defining ad-hoc Wi-Fi communication using a CSV file and testing network flows using ns-3 and a test launcher script.
Here are the sections of this page:
- Setup - Set up the environment to run the demos.
- Demo 1: Proof of concept. Here we show how we put together the ns-3 network simulator program, the testbed GUI showing performance, and the program that starts the robots.
- Demo 2: Scenario builder. Here we implement a parser for building the scenario from a CSV file.
- Demo 3: C++. Because of a performance bottleneck, we reimplement the robot code in C++ and observe the same bottleneck.
- Demo 4: Simtime mode. We test using simulation time instead of real time, but DDS still runs in real time when managing its reliable protocol.
- Demo 5: Latency. We perform more experiments to evaluate measured latency.
Setup
Setup for this demo includes the following installation:
- The ROS2 environment, see https://github.com/nps-ros2/ns3_gazebo/wiki/Installing-the-ROS2-Environment
- The ns-3 (ns-3.29) Network Simulator, see https://github.com/nps-ros2/ns3_gazebo/wiki/Installing-ns-3
- Setup for Network Namespaces and network devices, see https://github.com/nps-ros2/ns3_gazebo/wiki/Installing-Network-Namespaces-and-Network-Topology
- Building the ns3 program and the ROS2 nodes
- Setting up global variables
- Defining your robot scenario
Once setup is complete we run the talker-listener demo between separate network namespaces and between Gazebo and the host.
Install the ROS2 environment
Install the ROS2 environment, see https://github.com/nps-ros2/nps-ros2-examples/wiki/Installing-the-ROS2-Environment
Install ns-3
Install the ns-3 network simulator, see https://github.com/nps-ros2/ns3_gazebo/wiki/Installing-ns-3.
Install Network Namespaces and Network Topology
Install network namespaces and their network devices as described at https://github.com/nps-ros2/ns3_gazebo/wiki/Installing-Network-Namespaces-and-Network-Topology. In this example, use --count 20 to support 1 Ground Station and 19 Robots. Specifically:
cd ~/gits/ns3_gazebo/scripts
sudo ./nns_setup.py setup -c 20
Build the ns-3 Mobility Program
Compile the ns3_mobility program that will provide a stationary antenna for nns1 and moving antenna locations for nodes nns2 and above:
cd ~/gits/ns3_gazebo/ns3_testbed/ns3_mobility
mkdir build
cd build
cmake ..
make
Build ROS2 Nodes
Build the ROS2 GS and robot testbed nodes:
cd ~/gits/ns3_gazebo/ns3_testbed/ns3_testbed_nodes
colcon build
Set Global Variables for ROS2 Nodes
Add this to .bashrc or run it directly:
source ~/gits/ns3_gazebo/ns3_testbed/ns3_testbed_nodes/install/local_setup.bash
Set Global Variables for ns-3
Define global variables so the ns3_mobility program can find ns-3 components when it runs. Add this to your .bashrc file:
# ns-3 compatibility
export LD_LIBRARY_PATH=~/repos/ns-3-allinone/ns-3.29/build/lib:$LD_LIBRARY_PATH
export PATH=$HOME/repos/ns-3-allinone/ns-3.29/build/src/fd-net-device:$HOME/repos/ns-3-allinone/ns-3.29/build/src/tap-bridge:$PATH
Define your Robot Scenario
Define your robot communication and QoS setup as described in https://github.com/nps-ros2/ns3_gazebo/wiki/Defining-your-Robot-Scenario.
Demo 1 proof-of-concept
The ns3 testbed demo consists of these parts, each run in a separate command window:
- The ns-3 network simulator program supporting mobility and Wifi for 20 nodes.
- The testbed GUI showing communication latency and loss.
- The Ground Station and the robots, communicating as defined by settings in your spreadsheet.
Start the ns-3 Program:
cd ~/gits/ns3_gazebo/ns3_testbed/ns3_mobility/build
./ns3_mobility -c 20
Start the Testbed GUI:
cd ~/gits/ns3_gazebo/ns3_testbed/ns3_testbed_gui
./tg.py
Start the GS and Swarm Robots
- A root shell is required to start robot nodes within their own network namespaces.
- Use your own CSV swarm setup file or use the default CSV setup file at ~/gits/ns3_gazebo/ns3_testbed/csv_setup/example1.csv.
Start the root shell and the robots (R1, R2, R3, etc. depending on count) using the testbed_runner program:
sudo /bin/bash
cd ~/gits/ns3_gazebo/ns3_testbed/cpp_testbed_runner/build/cpp_testbed_runner
./testbed_runner.py -n -p -c 20
testbed_runner.py supports several options:
- -c: The count of ROS2 nodes to start.
- -s: The CSV scenario setup file.
- -n: Run the nodes in network namespaces instead of the system network space.
- -p: Send received subscription metadata to the pipe that the GUI is listening to.
- -v: Verbose, print out received subscription metadata and some other diagnostics.
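As an illustration of how these options fit together, here is a minimal sketch of an argument parser with the same five flags. It is not the actual testbed_runner.py source: the long option names, the defaults, and the build_parser helper are assumptions made for this example.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the five options listed above."""
    parser = argparse.ArgumentParser(prog="testbed_runner.py")
    # Long names and defaults below are assumptions, not taken from the real script.
    parser.add_argument("-c", "--count", type=int, default=2,
                        help="count of ROS2 nodes to start")
    parser.add_argument("-s", "--setup-file", default="example1.csv",
                        help="CSV scenario setup file")
    parser.add_argument("-n", "--use-nns", action="store_true",
                        help="run nodes in network namespaces")
    parser.add_argument("-p", "--use-pipe", action="store_true",
                        help="send received subscription metadata to the GUI pipe")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print received subscription metadata and diagnostics")
    return parser


# Same flags as the command shown above: ./testbed_runner.py -n -p -c 20
args = build_parser().parse_args(["-n", "-p", "-c", "20"])
print(args.count, args.use_nns, args.use_pipe, args.verbose)
```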
Demo 2: Scenario builder
- We run multiple nodes, for example one GS and nine robots, each in their own network namespace.
- My 4-CPU system does not keep up as manifested by CPU workload and pauses in ns-3 output.
Input Data
Robot data and QoS settings are configured for each ROS2 node. This example configures 39 robots:
Traffic Flow Monitor
Received packet traffic is refreshed once per second. Index is the message number for the message. Size is the size of the message received, in bytes. Latency is the delay in microseconds. Nine robots are monitored:
CPU Workload
This 4-CPU system does not always keep up, seen by one CPU always working at 100%:
ns-3 Positions
The ten GS and robot x y z positions are refreshed at 10Hz. GS is stationary. Some Robots move out of range. This is seen in the output of the ns3 mobility program:
Comments
- Uses a Linux pipe to flow data from the GS to the GUI.
- Uses a custom codec to encode/decode data through the pipe.
- GUI consumes data from pipe once per second.
- If a row in the GUI is blank, we zero the row.
- Latency is in microseconds and is calculated as current_time - data_timestamp.
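The pipe and latency points above can be sketched as follows. This is only an illustration: the record layout (node, index, size, send timestamp), the comma-separated encoding, and all function names are assumptions, and the testbed's actual custom codec may differ.

```python
import os
import time


def encode_record(node, index, size_bytes, sent_us):
    """Encode one record as a newline-terminated ASCII line (assumed format)."""
    return f"{node},{index},{size_bytes},{sent_us}\n".encode("ascii")


def decode_record(line):
    """Decode a line produced by encode_record back into typed fields."""
    node, index, size_bytes, sent_us = line.decode("ascii").strip().split(",")
    return node, int(index), int(size_bytes), int(sent_us)


def latency_us(current_time_us, data_timestamp_us):
    """Latency in microseconds: current_time - data_timestamp."""
    return current_time_us - data_timestamp_us


def now_us():
    """Wall-clock time in microseconds."""
    return time.time_ns() // 1000


# Round-trip one record through an anonymous Linux pipe.
read_fd, write_fd = os.pipe()
os.write(write_fd, encode_record("R2", 7, 500, now_us()))
node, index, size_bytes, sent_us = decode_record(os.read(read_fd, 1024))
os.close(read_fd)
os.close(write_fd)
print(node, index, size_bytes, latency_us(now_us(), sent_us))
```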
Demo 3: C++ Implementation
To ensure the lack of scalability is not a result of the Python implementation, we port the robot code from Python to C++. Results shown indicate that there is still a bottleneck on one CPU.
Setup
Setup is similar to the setup for Demo 1, except for the part about ROS2 Nodes.
Build
Instead of building Python nodes at ns3_testbed_nodes we build C++ nodes at cpp_testbed_runner:
cd ~/gits/ns3_gazebo/ns3_testbed/cpp_testbed_runner
colcon build
Set Global Variables
Add this to .bashrc or run it directly:
source ~/gits/ns3_gazebo/ns3_testbed/cpp_testbed_runner/install/local_setup.bash
Start the C++ Robots
Launch the robots using the C++ testbed_runner program from the directory where Colcon built it, specifying to use pipes and network namespaces and to start 30 of them:
sudo /bin/bash
cd ~/gits/ns3_gazebo/ns3_testbed/cpp_testbed_runner/build/cpp_testbed_runner
./testbed_runner -p -n -c 30
Results
My 4-CPU system still does not keep up. Ten nodes:
CPU Workload
System monitor htop:
Demo 4: Simtime Mode
We may be able to simulate the steady-state portion using ns-3 in simulation time mode instead of real-time mode.
We still require network namespaces because DDS must run outside ns-3 (ns-3 does not simulate DDS). DDS may use wall time for watchdog and discovery so we cannot properly simulate DDS timing unless we port DDS to use ns-3 simulation time instead of wall time.
The approach is to export ns-3 simulation time into the robots. Robots act on this time, not on wall time:
- Robots transmit at intervals based on simulation time.
- Robots receive data and timestamp receipt based on simulation time.
We would use memory mapped IO (MMIO) for interprocess communication. ns-3 writes simulation time to MMIO memory on each update of its simulation timer. Robots read this MMIO memory to obtain simulation time instead of wall time.
Here is a brief diagram of this approach:
Note that DDS will still use wall time, so DDS will incorrectly retransmit packets if wall time times out when simulation time would not have timed out. We could port DDS to read simulation time instead of wall time so DDS can be modeled accurately, too.
Goals
- We do not need to measure DDS configuration time. We can measure performance after network discovery has been set up.
Design
ns-3 design
- Copy ~/repos/ns-3-allinone to ~/repos/ns-3-custom.
- Use class shared_simtime_t with interfaces uint64_t t() and set_t(uint64_t) so processes can set or read simulation time via sizeof(uint64_t) bytes of shared memory named /testbed_shared_simtime. ns-3 will write time in default units of nanoseconds. Robots will read. See ns3_testbed2/ns3_simtime_support for class, header, test program, and changes to ns-3.29.
- Change file ns-3.29/src/core/model/realtime-simulator-impl.cc and .h so the scheduler also calls set_t(t) when it updates its simulation time. To simplify code management, we put this class right into these files. You may copy these changed files from the repository:
cd ~/repos/ns-3-custom/ns-3.29/src/core/model
cp ~/gits/ns3_gazebo/ns3_testbed2/ns3_simtime_support/changed-ns-3.29-files/realtime-simulator-impl.cc .
cp ~/gits/ns3_gazebo/ns3_testbed2/ns3_simtime_support/changed-ns-3.29-files/realtime-simulator-impl.h .
- Rebuild ns-3, ref. https://github.com/nps-ros2/ns3_gazebo/wiki/Installing-ns-3:
cd ~/repos/ns-3-custom/ns-3.29
./waf configure --enable-sudo
./waf build
- Change .bashrc so ns-3 accesses ns-3-custom:
export LD_LIBRARY_PATH=~/repos/ns-3-custom/ns-3.29/build/lib:$LD_LIBRARY_PATH
export PATH=$HOME/repos/ns-3-custom/ns-3.29/build/src/fd-net-device:$HOME/repos/ns-3-custom/ns-3.29/build/src/tap-bridge:$PATH
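To make the shared-time idea concrete, here is a minimal Python sketch of one writer and one reader sharing a single uint64 of nanoseconds through mapped memory. It is an analogue for illustration only: the real class is C++ and lives in the changed ns-3 files, while the SharedSimtime name, the file-backed mapping in a temporary directory (standing in for the named shared memory), and the native byte order are assumptions.

```python
import mmap
import os
import struct
import tempfile


class SharedSimtime:
    """Hypothetical Python analogue of the t() / set_t() interface."""

    _FORMAT = "=Q"  # one unsigned 64-bit integer, native byte order (assumption)
    SIZE = struct.calcsize(_FORMAT)  # 8 bytes, i.e. sizeof(uint64_t)

    def __init__(self, path):
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.ftruncate(fd, self.SIZE)  # make sure the backing file holds 8 bytes
            self._map = mmap.mmap(fd, self.SIZE)  # shared, read/write mapping
        finally:
            os.close(fd)  # the mapping keeps its own duplicate descriptor

    def set_t(self, t_ns):
        """Writer side (ns-3 in the design above): store simulation time in ns."""
        struct.pack_into(self._FORMAT, self._map, 0, t_ns)

    def t(self):
        """Reader side (a robot): load the current simulation time in ns."""
        return struct.unpack_from(self._FORMAT, self._map, 0)[0]

    def close(self):
        self._map.close()


# Stand-in path for this sketch; the design above names the region /testbed_shared_simtime.
path = os.path.join(tempfile.mkdtemp(), "testbed_shared_simtime")
writer = SharedSimtime(path)
reader = SharedSimtime(path)
writer.set_t(1_500_000_000)  # 1.5 s of simulation time
simtime_ns = reader.t()
print(simtime_ns)
writer.close()
reader.close()
```

In this sketch the reader never blocks; it simply sees whatever value the writer stored last, which matches the design above where ns-3 writes on each timer update and robots only read.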
Question: Will simulation time update when there is no network traffic?
Robot design
Change the robot to read the custom timestamp instead of chrono. Specifically, replace function _now() with shared_simtime_t::t(). Units are in nanoseconds. This change is available in ns3_testbed_simtime/cpp_testbed_runner. Build it:
cd ~/gits/ns3_gazebo/ns3_testbed_simtime/cpp_testbed_runner
colcon build
Run
Start ns-3
Start ns-3, keeping the range length short:
cd ~/gits/ns3_gazebo/ns3_testbed/ns3_mobility/build
./ns3_mobility -c 5 -l 2
Start the Testbed GUI
cd ~/gits/ns3_gazebo/ns3_testbed/ns3_testbed_gui
./tg.py
Start Wireshark
sudo wireshark
Define your Configuration
This configuration sets GS R1 to listen and R2..R30 to transmit 500-byte "odometry" messages at 1 Hz. Note that we do not use all 30 robots; the number actually used is set in the ns-3 ns3_mobility program.
Publish,,,,,,,
Node,Subscription,Frequency,Size,History,Depth,Reliability,Durability
R2-30,odometry,1,500,keep_last,0,reliable,volatile
Subscribe,,,,,,,
Node,Subscription,History,Depth,Reliability,Durability,,
R1,odometry,keep_last,0,reliable,volatile,,
R1,image,keep_last,0,reliable,volatile,,
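The following sketch shows one way a parser could read this layout. It is not the testbed's actual parser: the function names, the returned dictionary shape, and the interpretation of R2-30 as the inclusive range R2 through R30 are assumptions based on the description above.

```python
import csv
import io

EXAMPLE = """Publish,,,,,,,
Node,Subscription,Frequency,Size,History,Depth,Reliability,Durability
R2-30,odometry,1,500,keep_last,0,reliable,volatile
Subscribe,,,,,,,
Node,Subscription,History,Depth,Reliability,Durability,,
R1,odometry,keep_last,0,reliable,volatile,,
R1,image,keep_last,0,reliable,volatile,,
"""


def expand_nodes(spec):
    """Expand 'R2-30' to ['R2', ..., 'R30']; a plain name like 'R1' to ['R1']."""
    if "-" not in spec:
        return [spec]
    first, last = spec.split("-", 1)
    prefix = first.rstrip("0123456789")  # 'R'
    start = int(first[len(prefix):])     # 2
    return [f"{prefix}{i}" for i in range(start, int(last) + 1)]


def parse_scenario(text):
    """Return {'Publish': [...], 'Subscribe': [...]} with one dict per node entry."""
    sections = {"Publish": [], "Subscribe": []}
    section = None
    header = None
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0]:
            continue  # skip blank rows
        if row[0] in sections:
            section, header = row[0], None  # section marker; next row is its header
        elif section is None:
            continue  # ignore anything before the first section marker
        elif header is None:
            header = [name for name in row if name]  # drop trailing empty cells
        else:
            fields = dict(zip(header, row))
            for node in expand_nodes(fields.pop("Node")):
                sections[section].append({"Node": node, **fields})
    return sections


scenario = parse_scenario(EXAMPLE)
print(len(scenario["Publish"]), "publishers,", len(scenario["Subscribe"]), "subscriptions")
```

All values are kept as strings here; a real loader would likely convert Frequency, Size, and Depth to numbers before building QoS settings.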
Start the Robots
sudo /bin/bash
cd ~/gits/ns3_gazebo/ns3_testbed_simtime/cpp_testbed_runner/build/cpp_testbed_runner
./testbed_runner -p -n -c 10
Results
Upper Limit
- 6 robots (5 publishers and 1 subscriber) transmitting 500 bytes and 2500 bytes at 10 Hz.
- 10 robots (9 publishers and 1 subscriber) transmitting 500 bytes at 1 Hz.
Latency
Here is latency for ten robots, specifically, 9 publishers and 1 subscriber, 500 bytes odometry at 1Hz, using simtime timestamps:
Here is latency using walltime instead of simtime timestamps:
Notes
- Our simulation time implementation may not be necessary. We can watch the latency in the GUI. Shortly after all robots have registered, latency will either continue to increase or will slowly decrease to a fraction of a second.
- There is significantly more latency jitter when running ns-3 and robots on simtime rather than running using walltime. Which is correct?
- All messages from all robots are sent at once. It would be more realistic to stagger transmission times between robots. All at once is worst-case.
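One simple way to stagger transmissions, as suggested in the last note, is to spread the publishers' start offsets evenly across one publishing period. The sketch below is an assumption about how that could be done, not the scheme used in the testbed; the function name and the even spacing are illustrative.

```python
def staggered_offsets_ns(publisher_count, frequency_hz):
    """Return one start offset per publisher, in nanoseconds.

    Offsets are spread evenly over one period so that no two publishers
    transmit at the same instant.
    """
    period_ns = round(1_000_000_000 / frequency_hz)
    return [i * period_ns // publisher_count for i in range(publisher_count)]


# Four publishers at 1 Hz start a quarter of a second apart.
offsets = staggered_offsets_ns(4, 1)
print(offsets)
```

Each robot would delay its first transmission by its own offset and then publish at the normal period, so the all-at-once worst case becomes an evenly interleaved schedule.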
Demo 5: Latency
Latency in Demo 4 seems high. This page examines latency with five and then with two nodes, where the distance between robots is 0 to 2 meters.
Rather than testing with 1 transmitter and 4 receivers, I used 4 transmitters and 1 receiver to fit the data capture code. The first two results are for 5 robots; the third result is for 2 robots.
5 Robots
4 transmit 1 receive
All Transmit at the Same Time.
4 transmit 1 receive, transmitting at the same time averages 1.5 ms or more of latency:
Staggered Transmission
4 transmit 1 receive, staggered starting times averages about 1.5ms latency:
2 Robots
1 transmit 1 receive
1 transmit 1 receive exhibited intermittent results. Sometimes delay was 0.7 ms, sometimes 1.0 ms:
Conclusions
- The latency we measured may be appropriate.
- The inconsistency in delay in the 2 Robots scenario may be troubling.
- We can try more experiments: take larger samples and test with all swarm sizes from 2 to 10.
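For the larger samples proposed above, a small summary helper could make runs with different swarm sizes easier to compare. This is a sketch under assumptions: the function name and the chosen statistics are illustrative, and the sample values below are placeholders that echo the two delay levels mentioned earlier, not measurements.

```python
import statistics


def summarize_latency(samples_us):
    """Summarize a list of latency samples given in microseconds."""
    return {
        "count": len(samples_us),
        "mean_us": statistics.mean(samples_us),
        "median_us": statistics.median(samples_us),
        "pstdev_us": statistics.pstdev(samples_us),  # population standard deviation
        "min_us": min(samples_us),
        "max_us": max(samples_us),
    }


# Placeholder values only (0.7 ms and 1.0 ms levels), not measured data.
placeholder_samples_us = [700, 1000, 700, 1000]
summary = summarize_latency(placeholder_samples_us)
print(summary)
```

A spread that stays large as the sample count grows would suggest the two delay levels are a real bimodal effect rather than noise from a small sample.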
Extended run
Here is latency over 15 minutes for R1-R2 communication. Note popular latency regions and periodicity:
Here is latency using the original setup: