Generating your own files to run the neural network - lasseufpa/5gm-lidar GitHub Wiki

Generate the data and the neural network input and output

To generate the inputs and outputs, you'll need the InSite results (noOverlappingTx4m_s1130) and the Blensor results (blensor_scans), available at the links below:

Blensor Scans: https://nextcloud.lasseufpa.org/s/iNPZPbD22LneZ2B

InSite Simulation: https://nextcloud.lasseufpa.org/s/MDgwnpHYgk9kQMx

1. How to convert InSite into MIMO channels

Assume the InSite data is in the folder D:\insitedata\noOverlappingTx4m_s1130 (where 1130 is the SUMO random seed). An alternative naming convention is D:\insitedata\simul45_20180915 (the date format is year, month, day).

We have the following subtasks:

  1. Channel generation: Convert the InSite channel data into a database (episodedata.db) and also into files that can be easily read by Python (npz) and Matlab (hdf5).
  2. Lists: Create two lists as text files. The first lists all receivers, valid and invalid, while the second covers only the valid receivers and also indicates which ones are LOS and which are NLOS.

Channel generation

  1. First, you'll need to convert the InSite info into a database. For that, you'll need the code found in this link (5gm-data).

It is recommended to rename any previously generated episodedata.db to avoid losing data.
Now, in the cloned folder (5gm-data), run:

cd 5gm-data
python todb.py D:\insitedata\noOverlappingTx4m_s1130

At the end, the script reports how many episodes were processed. Take note of this number because it will be requested later. With the provided data, the output should look like:

Processed episode: 2086 scene: 1, total 2086
Warning: could not find file  D:\insitedata\noOverlappingTx4m_s1130\run02086\study\model.paths.t001_01.r002.p2m  Stopping...

Processed  2086  scenes (RT simulations)

In the provided data, we only have one scene per episode.

A file episodedata.db was created in the current folder. Move it to its final folder.
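The directory listing shown further below contains several databases named episodedata_<something>.db. One way to keep such a named copy of a freshly generated database is sketched here. The helper keep_named_copy and its tag argument are hypothetical and not part of 5gm-data. The sketch copies instead of moving, because the next step expects episodedata.db to be in the current folder.

```python
import os
import shutil


def keep_named_copy(db_path, tag):
    """Copy db_path to episodedata_<tag>.db in the same folder.

    Hypothetical helper (not part of 5gm-data). It refuses to overwrite
    an existing copy so that older results are not lost.
    """
    folder = os.path.dirname(os.path.abspath(db_path))
    target = os.path.join(folder, "episodedata_{}.db".format(tag))
    if os.path.exists(target):
        raise FileExistsError("refusing to overwrite " + target)
    shutil.copy2(db_path, target)  # copy2 also preserves the timestamps
    return target
```

For example, keep_named_copy("episodedata.db", "noOverlappingTx4m_s1130") would create episodedata_noOverlappingTx4m_s1130.db next to the original file.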

  2. Now, we'll write the ray information as both npz and hdf5 files in the folder ./insitedata. Note that data is written for all Tx / Rx pairs, and NaN is used whenever a pair is not valid.

Make sure episodedata.db is in the current folder. You may have several different .db files there:

D:\github\5gm-data>dir *.db
 Volume in drive D is Data
 Volume Serial Number is 1CBF-8747

 Directory of D:\github\5gm-data

19/09/2018  17:55       189.898.752 episodedata.db
19/09/2018  17:55       189.898.752 episodedata_correct_tx4m.db
11/06/2018  23:34       568.399.872 episodedata_e116.db
23/07/2018  16:53       382.160.896 episodedata_e119.db
12/09/2018  23:25       225.923.072 episodedata_flat_new_marcus.db
10/09/2018  23:20       162.029.568 episodedata_longepisodes.db
15/08/2018  11:47       189.792.256 episodedata_no_cir_overlap_tx4m.db
12/08/2018  02:40       143.196.160 episodedata_no_cir_overlap_tx5m.db
15/09/2018  17:56       225.796.096 episodedata_simul45_flat_new_marcus.db

Make sure you are using the right (newly created) episodedata.db, then edit convert5gmv1ToChannels.py to set, for example, the correct number of scenes per episode (one, in this example) and the output folder. If the output folder is .\insitedata, you may want to delete all of its files before creating new ones. Old files would be overwritten, but if the new run produces fewer files than the old one, you would mix files from 2 distinct simulations in the same folder:

del .\insitedata\*
mkdir .\insitedata
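A portable alternative to the two commands above is sketched below, for readers who are not on a Windows prompt. The function name reset_folder is hypothetical. It removes the whole folder, so it should only be pointed at a folder whose contents can be regenerated.

```python
import os
import shutil


def reset_folder(path):
    """Delete path (if it exists) and recreate it empty.

    Hypothetical helper: it avoids mixing files from two distinct
    simulations in the same output folder.
    """
    if os.path.isdir(path):
        shutil.rmtree(path)
    os.makedirs(path)
```

Calling reset_folder("insitedata") before running the conversion script leaves an empty output folder.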

Then, run: python convert5gmv1ToChannels.py

Recall that episodedata.db stores how many rays were obtained (sometimes fewer than the maximum of, e.g., 25 rays), and the arrays saved by the script use NaN when the number of rays is smaller than the maximum. You can always check the number of rays by looking at the two lists that will be created later. At the end, you will find .hdf5 files for Matlab and .npz files for Python in the output folder, and the script reports how many LOS, NLOS and total valid (Sum) channels there are:

Start time =  62580000  and sampling period =  0.1  seconds
Episode: 2085 out of 2086
['flow7.5020', 'flow7.5021', 'flow7.5022', 'flow7.5023', 'flow8.2489', 'flow8.2493', 'flow9.2527', 'flow9.2528', 'flow9.2529', 'flow9.2531']
 Done: 100.00% Scene: 1 time per scene: 0:06:09.742866 time to finish: 0:00:00
==> Wrote file ./insitedata/urban_canyon_v2i_5gmv1_rays_e2085.npz
==> Wrote file ./insitedata/urban_canyon_v2i_5gmv1_rays_e2085.hdf5
numLOS =  6482
numNLOS =  4712
Sum =  11194
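The NaN padding described above can be handled as in the following minimal sketch. The values are made up for illustration, and the layout of the real arrays inside the npz files is not assumed here; the sketch only shows the idea for a single list of per-ray values. With NumPy arrays, numpy.isnan plays the same role as math.isnan.

```python
import math

nan = float("nan")


def count_valid_rays(ray_values):
    """Count the entries that are not NaN (NaN marks a missing ray)."""
    return sum(1 for v in ray_values if not math.isnan(v))


def is_valid_pair(ray_values):
    """Treat a Tx / Rx pair as valid if it has at least one real ray."""
    return count_valid_rays(ray_values) > 0


# Made-up examples: a receiver with 2 rays out of a maximum of 4,
# and an invalid receiver for which every entry is NaN.
receiver_a = [-95.2, -101.7, nan, nan]
receiver_b = [nan, nan, nan, nan]

print(count_valid_rays(receiver_a), is_valid_pair(receiver_a))  # 2 True
print(count_valid_rays(receiver_b), is_valid_pair(receiver_b))  # 0 False
```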

Create the 4 lists

We will use only four lists, stored as text (in fact CSV) files. The steps below show how to generate them.
In the 5gm-data folder:

  1. Write the 1st list by redirecting the stdout with

python ak_generateInfoList.py > list1_valids_and_invalids.csv

Note that the output depends on the episodedata.db that is in the current folder. You then need to edit this csv file to remove its first and last lines (which should be similar to the ones below).

First lines:

########## Important ##########
Will try to open database in file (should be in your current folder):  episodedata.db
Successfully opened  episodedata.db
##############################

Last lines:

numValidChannels =  11194
numInvalidChannels =  9666
Sum =  20860
numLOS =  6482
numNLOS =  4712
Sum =  11194
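Instead of editing the file by hand, the banner and summary lines can be filtered out with a few lines of Python. This is a sketch under the assumption that no data row starts with any of the prefixes listed in NON_DATA_PREFIXES; the function name and the sample data row are hypothetical.

```python
# Prefixes of the banner (first lines) and summary (last lines) shown above.
NON_DATA_PREFIXES = (
    "#",
    "Will try to open",
    "Successfully opened",
    "numValidChannels",
    "numInvalidChannels",
    "numLOS",
    "numNLOS",
    "Sum =",
)


def keep_data_rows(lines):
    """Return only the data rows, dropping banner, summary and blank lines."""
    rows = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith(NON_DATA_PREFIXES):
            continue
        rows.append(stripped)
    return rows


sample = [
    "########## Important ##########",
    "Successfully opened  episodedata.db",
    "some,made,up,data,row",  # placeholder, not the real row format
    "numValidChannels =  11194",
    "Sum =  20860",
]
print(keep_data_rows(sample))  # ['some,made,up,data,row']
```

To apply it to a file, read the lines with open(...).readlines(), filter them, and write the result back with "\n".join(...).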

  2. Then we need a second list, which includes information only for the valid receivers. Run:

python ak_generateInSitePlusSumoList.py D:\insitedata\noOverlappingTx4m_s1130 > list2_only_valids.csv

where the first argument (argv[1]) is the InSite output folder. Then we can split the list of valid receivers into two further lists, one with only the NLOS receivers and one with only the LOS receivers:

grep LOS=0 list2_only_valids.csv > list2_only_validsNLOS.csv
grep LOS=1 list2_only_valids.csv > list2_only_validsLOS.csv

If you have wc (e.g. you are using Linux), you can check whether the line counts below match the numLOS and numNLOS values reported for list1_valids_and_invalids.csv (see above):

wc *.csv
  20860   20860 1255435 list1_valids_and_invalids.csv
  11194   11194 1282362 list2_only_valids.csv
   6482    6482  741028 list2_only_validsLOS.csv
   4712    4712  541334 list2_only_validsNLOS.csv
  43248   43248 3820159 total
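When grep and wc are not available (e.g. on a plain Windows prompt), the same split and count can be done in Python. The sketch below mirrors the substring matching done by grep; the function name and the sample rows are made up, and only the LOS=0 / LOS=1 markers come from the commands above.

```python
def split_by_los(lines):
    """Split rows into (LOS, NLOS) lists using the same substrings as grep."""
    los = [ln for ln in lines if "LOS=1" in ln]
    nlos = [ln for ln in lines if "LOS=0" in ln]
    return los, nlos


# Made-up rows; real rows come from list2_only_valids.csv.
rows = ["row-a,LOS=1", "row-b,LOS=0", "row-c,LOS=1"]
los_rows, nlos_rows = split_by_los(rows)
print(len(los_rows), len(nlos_rows))  # 2 1

# Consistency checks with the totals reported in this section.
assert 6482 + 4712 == 11194   # numLOS + numNLOS == valid channels
assert 11194 + 9666 == 20860  # valid + invalid == all receivers
```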

Now you have the channels and the 4 lists. You can continue to use Python to do beam-selection or work with LIDAR PCDs with Matlab. We assume the latter case in the next section.

3. How to convert Blensor output into obstacle matrices

This process is similar to the channel generation from the InSite files, but it generates the so-called obstacle matrices instead, using the LIDAR output from Blensor.

  1. First, we need the episodedata.db corresponding to the RT simulations. Copy it, or create a symbolic link to it, in the 5gm-lidar folder.

Now, in the 5gm-lidar folder, run generateMatrixChannels.py:

python generateMatrixChannels.py > matrixChannels.csv

Note that the output depends on the episodedata.db that is in the current folder. You then need to edit this csv file to remove its first and last lines (which should be similar to the ones below).

First lines:

########## Important ##########
Will try to open database in file (should be in your current folder):  episodedata.db
Successfully opened  episodedata.db

Last lines:

numValidChannels =  11194
numInvalidChannels =  9666
Sum =  20860
numLOS =  6482
numNLOS =  4712
Sum =  11194

Also, make sure to add the following header to the CSV created above:

Val,EpisodeID,SceneID,VehicleArrayID,VehicleName,x,y,z,LOS
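With this header in place, the file can be read by column name. The sketch below prepends the header when it is missing and parses the rows with the standard csv module. The function name read_rows and the sample row are hypothetical; the actual values written by generateMatrixChannels.py may be formatted differently.

```python
import csv
import io

HEADER = "Val,EpisodeID,SceneID,VehicleArrayID,VehicleName,x,y,z,LOS"


def read_rows(csv_text):
    """Prepend HEADER if it is missing and return the rows as dictionaries."""
    if not csv_text.startswith(HEADER):
        csv_text = HEADER + "\n" + csv_text
    return list(csv.DictReader(io.StringIO(csv_text)))


# Made-up row, used only to illustrate access by column name.
rows = read_rows("V,0,0,2,flow7.5020,746.1,560.2,1.6,1\n")
print(rows[0]["VehicleName"], rows[0]["LOS"])  # flow7.5020 1
```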

Note that the downloaded Blensor files should be inside a folder in their zip archives. You must run the lnScans.py script to create symbolic links to these files inside the InSite folder:

python lnScans.py insiteFolder BlensorScansFolder

Now the obstacle matrices can be created using the readPCD.py script. Note that this script uses python2 instead of python3. Its arguments are:

  • epi_begin: is the starting episode (0 in our provided data).
  • epi_end: is the ending episode (2085 in our provided data).
  • You can choose between 3D and 2D matrices (use a capital letter to avoid mistakes).
  • You can use data with or without noise (1 for noisy data and 0 for noiseless data).

After choosing the parameters, replace them in the line below and run:

python2 readPCD.py epi_begin epi_end 3D/2D 1/0
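Since a wrong argument (e.g. a lowercase "3d") is easy to type, the command line can be assembled and checked before it is launched. This is a minimal sketch; build_readpcd_command is a hypothetical helper, and only the argument order shown above is taken from this page.

```python
def build_readpcd_command(epi_begin, epi_end, dims, noise):
    """Validate the four arguments and return the command as a list."""
    if epi_begin < 0 or epi_end < epi_begin:
        raise ValueError("need 0 <= epi_begin <= epi_end")
    if dims not in ("2D", "3D"):
        raise ValueError("dims must be '2D' or '3D' (capital D)")
    if noise not in (0, 1):
        raise ValueError("noise must be 1 (noisy) or 0 (noiseless)")
    return ["python2", "readPCD.py", str(epi_begin), str(epi_end),
            dims, str(int(noise))]


# Values for the provided data: episodes 0 to 2085, 3D matrices, with noise.
cmd = build_readpcd_command(0, 2085, "3D", 1)
print(" ".join(cmd))  # python2 readPCD.py 0 2085 3D 1
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the 5gm-lidar folder.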

After running the script above, you should have a folder such as 'obstacles_new_3D' with the obstacle matrices inside npz files organized per episode.

4. Using Python and Matlab to process MIMO channels and generate beam-selection outputs

  1. Codebook design

The codebooks are generated with Matlab. Note that this code uses functions from https://github.com/aldebaro/dsp-telecom-book-code, so you should download that repository and run ak_setPath.m to make it work properly. With Matlab, run:

D:\github\5gm-lidar\matlab\upa_codebook_creation.m

Assume we ran it twice (changing the number of Nrx and Ntx in lines 17 and 18) to generate:

>> upa_codebook_creation
Wrote upa_codebook_16x16_N832.mat
>> upa_codebook_creation
Wrote upa_codebook_4x4_N52.mat

We used Nrx and Ntx as 16 and 4 in the example.

  2. Generating the equivalent channel gains that depend on the beams:

Assuming convert5gmv1ToChannels.py has already written the output files (e.g. in the folder .\insitedata), edit getBestBeamsFromChannelRays.py and change the parameters below to suit the current simulation:

insiteCSVFile = 'D:/github/5gm-data/list2_only_valids.csv'
txCodebookInputFileName = 'D:/github/5gm-lidar/matlab/upa_codebook_16x16_N832.mat'
rxCodebookInputFileName = 'D:/github/5gm-lidar/matlab/upa_codebook_4x4_N52.mat'
numEpisodes = 2086  # total number of episodes
outputFolder = 'D:/github/5gm-data/outputnn/'

Make sure the output folder (e.g. D:\github\5gm-data\outputnn) does not contain files from previous simulations:

del outputnn\*

Note that two files will be written to the output folder for each episode: a) output_e_XXX.npz and b) outputs_positions_e_XXX.hdf5. Besides the file format, they differ in content: the npz file has only the array episodeOutputs with the equivalent channels per beam pair, while the hdf5 file has the array episodeOutputs and the array receiverPositions with the positions of all receivers.

Evaluate the beams:

python getBestBeamsFromChannelRays.py
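To make the idea of an equivalent channel per beam pair concrete, the sketch below uses the usual definition |w^H H f|, where H is the MIMO channel matrix (Nr x Nt), f is a Tx codeword and w is an Rx codeword. This is an illustration under that assumption, not the code of getBestBeamsFromChannelRays.py; the function names and the tiny matrices are made up, and plain Python lists are used to keep the example self-contained.

```python
def equivalent_gain(H, f, w):
    """Return |w^H H f| for channel H (Nr x Nt), Tx codeword f, Rx codeword w."""
    Hf = [sum(h * x for h, x in zip(row, f)) for row in H]
    return abs(sum(complex(wk).conjugate() * v for wk, v in zip(w, Hf)))


def best_beam_pair(H, tx_codebook, rx_codebook):
    """Search all pairs and return (tx_index, rx_index, gain) of the strongest."""
    best = None
    for i, f in enumerate(tx_codebook):
        for j, w in enumerate(rx_codebook):
            g = equivalent_gain(H, f, w)
            if best is None or g > best[2]:
                best = (i, j, g)
    return best


# Made-up 2 x 2 channel whose only nonzero entry links Tx antenna 1 to Rx antenna 0.
H = [[0, 2j],
     [0, 0]]
tx_codebook = [[1, 0], [0, 1]]  # two toy Tx codewords
rx_codebook = [[1, 0], [0, 1]]  # two toy Rx codewords
print(best_beam_pair(H, tx_codebook, rx_codebook))  # (1, 0, 2.0)
```

Counting how often each Tx and Rx index wins over all valid channels gives histograms of the kind discussed next.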

When it finishes, getBestBeamsFromChannelRays.py prints some statistics and the histogram of each codebook (Tx and Rx). Confirm that the statistics match the numLOS and numNLOS values reported previously, and copy the histogram values so they can be placed in the Matlab script that prunes the codebooks (reduces their sizes).

Edit matlab\upa_codebook_prune_unused.m and copy the generated histograms into the script.

tx_indices_histogram = [ 0 0 0 0 0 0 0 0 82 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 403 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 336 0 0 0 0 0 0 0 0 0 0 0 0 0 0 190 1 591 0 0 0 0 0 0 0 0 0 0 0 0 2 396 0 788 1 0 0 0 0 0 0 0 0 0 0 2 283 37 0 67 455 2 0 0 0 0 0 0 0 1 3 168 55 0 0 0 95 228 3 0 0 0 0 1 12 83 98 10 0 0 0 0 0 5 126 84 7 0 70 91 63 12 0 0 0 0 0 0 0 0 0 7 78 87 0 0 0 16 15 0 0 0 0 0 0 0 8 3 0 0 0 0 0 0 0 0 0 0 0 0 16 25 0 0 0 0 0 0 0 0 0 0 0 0 0 19 93 0 0 0 0 0 0 0 0 0 0 0 0 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 7 0 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 230 3 0 0 0 0 0 216 0 0 0 0 0 0 0 55 4 201 2 0 0 0 0 381 0 0 0 0 0 1 0 13 0 21 132 5 1 0 0 218 0 0 0 0 1 56 0 0 0 0 13 157 7 0 0 612 0 0 0 0 79 18 0 0 0 0 0 13 369 6 0 777 0 0 0 0 17 0 0 0 0 0 0 0 1 436 23 40 0 0 0 0 0 0 0 0 0 0 0 0 1 132 589 202 0 0 0 0 0 0 0 0 0 0 0 0 0 2 209 175 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 72 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 14 0 0 0 0 0 0 0 0 0 0 30 0 0 0 0 0 0 0 0 0 0 9 0 0 1 142 2 0 0 0 0 2 0 0 0 0 0 1 0 0 169 5 0 0 0 0 0 0 0 0 0 0 0 0 0 43 ];
rx_indices_histogram = [ 0 0 758 0 0 9 528 26 318 454 0 428 0 217 1271 168 52 5 88 92 0 0 0 0 3 2 0 0 0 0 0 0 0 0 0 0 252 148 0 84 1 177 0 1 7 3806 0 3 944 1029 0 323 ];

  3. Prune the codebooks based on the statistics (histograms):

At D:\github\5gm-lidar\matlab, run

upa_codebook_prune_unused.m

to generate

Wrote tx_upa_codebook_16x16_N832_valid.mat with 109 codewords
Wrote rx_upa_codebook_4x4_N52_valid.mat with 28 codewords
prob of most popular Tx index=0.070395
prob of most popular Rx index=0.34
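The Rx figures above can be reproduced from the Rx histogram listed earlier: pruning keeps the codewords that were chosen at least once, and the probability of the most popular index is its count divided by the total. The sketch below is only a check of these numbers; upa_codebook_prune_unused.m may do more than this, and the function name is hypothetical.

```python
# rx_indices_histogram copied from the output shown earlier (52 entries).
rx_indices_histogram = [
    0, 0, 758, 0, 0, 9, 528, 26, 318, 454, 0, 428, 0, 217, 1271, 168,
    52, 5, 88, 92, 0, 0, 0, 0, 3, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    252, 148, 0, 84, 1, 177, 0, 1, 7, 3806, 0, 3, 944, 1029, 0, 323,
]


def prune_unused(histogram):
    """Return the indices used at least once and the top index probability."""
    kept = [i for i, count in enumerate(histogram) if count > 0]
    top_prob = max(histogram) / sum(histogram)
    return kept, top_prob


kept, top_prob = prune_unused(rx_indices_histogram)
print(len(kept))                  # 28
print(sum(rx_indices_histogram))  # 11194
print(round(top_prob, 2))         # 0.34
```

The sum of the histogram equals the number of valid channels (11194), which is a quick way to confirm that the histogram was copied correctly.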

Now edit D:\github\5gm-data\getBestBeamsFromChannelRays.py to point to the new pruned codebooks, whose file names have the prefix 'tx_' or 'rx_' and the suffix '_valid':

txCodebookInputFileName = 'D:/github/5gm-lidar/matlab/tx_upa_codebook_16x16_N832_valid.mat'
rxCodebookInputFileName = 'D:/github/5gm-lidar/matlab/rx_upa_codebook_4x4_N52_valid.mat'

  4. Run getBestBeamsFromChannelRays.py again

This creates the final npz files with the smaller codebooks (files in the output folder will be overwritten):

D:\github\5gm-lidar>python getBestBeamsFromChannelRays.py

Note the new histograms do not have zeros:

total numOfInvalidChannels =  9666
total numOfValidChannels =  11194
Sum =  20860
total numNLOS =  4712
total numLOS =  6482
Sum =  11194
tx_indices_histogram = [ 82 403 336 190 1 591 2 396 788 1 2 283 37 67 455 2 1 3 168 55 95 228 3 1 12 83 98 10 5 126 84 7 70 91 63 12 7 78 87 16 15 8 3 16 25 19 93 17 7 9 4 10 3 1 230 3 216 55 4 201 2 381 1 13 21 132 5 1 218 1 56 13 157 7 612 79 18 13 369 6 777 17 1 436 23 40 1 132 589 202 2 209 175 3 72 1 3 10 14 30 9 1 142 2 2 1 169 5 43 ];
rx_indices_histogram = [ 758 9 528 26 318 454 428 217 1271 168 52 5 88 92 3 2 252 148 84 1 177 1 7 3806 3 944 1029 323 ];

The files output_e_XXX.npz and outputs_positions_e_XXX.hdf5 have the results for each beam pair.

  5. Generate ML outputs

Edit createBeamsOutputsAsNpz.py. You will need to set the sizes of the pruned ("valid") codebooks.

cd D:\gits\lasse\software\5gm-lidar
python createBeamsOutputsAsNpz.py

This saves the file beams_output.npz, which holds the outputs for the neural networks.

5. Running the neural network with the generated inputs and outputs (Beam-Selection):

  1. First, you'll need to convert the obstacles matrices into a single file. For that, run (replacing obstacleMatricesFolder with your obstacle matrices folder):
    5gm-lidar> python createInputFromLIDAR.py obstacleMatricesFolder beams_input.npz

It will save beams_input.npz, the neural network input.

  2. Now, you can run the neural network with:
    5gm-lidar> python classifierTopKBeams.py