nlu training - socrob/mbot_documentation GitHub Wiki
You must complete the NLU Setup before following this document.
Generating input and output data
GPSR
Change into the gpsr folder:
$ cd $ROS_WORKSPACE/mbot_natural_language_processing/mbot_nlu_training/common/src/mbot_nlu_training/gpsr
Generate data for intent training:
$ python intention_NN/gpsr_data_generator.py
Generate data for slots training:
$ python slots_NN/gpsr_data_generator.py
Two files are generated in the respective folders:
input / inputs_slot_filling
output / outputs_slot_filling
The input files contain the phrases, and the output files contain the ground-truth intent or label for each phrase. The data generator can take a while on less powerful PCs (tip: run it on Harode or Batatinha).
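As a quick sanity check after generation, the sketch below compares the number of entries in a generated input file and its matching output file. It assumes both files are plain text with one phrase or label per line, which this page does not confirm; the helper names `count_lines` and `check_paired` are hypothetical. If the generator writes another format, adapt the loading step accordingly.

```python
def count_lines(path):
    """Count the non-empty lines in a text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


def check_paired(inputs_path, outputs_path):
    """Return True when both files hold the same number of entries."""
    # Assumption: one phrase per line in the inputs, one label per line in the outputs.
    return count_lines(inputs_path) == count_lines(outputs_path)
```

Call `check_paired` with the paths of the two files produced by one generator run; a `False` result suggests the phrases and labels are out of step.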
For ERL data generation, repeat the steps above in the ERL training folder:
$ cd $ROS_WORKSPACE/mbot_natural_language_processing/mbot_nlu_training/common/src/mbot_nlu_training/erl
Setting up the PC for training
This is usually done on Harode or Batatinha.
TensorFlow setup using a virtual environment
Create a Virtualenv environment by issuing the following command:
$ virtualenv --system-site-packages -p python3 tensorflow
This requires virtualenv to be installed already. If it is not, ask someone with admin privileges to set it up. For reference, see the TensorFlow installation reference.
Activate the Virtualenv environment by issuing the following command:
$ source ~/tensorflow/bin/activate
The preceding source command should change your prompt to the following:
(tensorflow)$
Ensure pip ≥8.1 is installed:
(tensorflow)$ easy_install -U pip
Issue the following command to install TensorFlow in the active Virtualenv environment:
(tensorflow)$ pip3 install --upgrade tensorflow-gpu==1.4.0
When you are done using TensorFlow, you may deactivate the environment by invoking the deactivate function as follows:
(tensorflow)$ deactivate
Uninstalling TensorFlow
To uninstall TensorFlow, simply remove the tree you created. For example:
$ rm -r ~/tensorflow
Install progressbar
progressbar is used in the training script to give feedback on progress.
Install progressbar by invoking the following command:
(tensorflow)$ easy_install -U progressbar2
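For orientation, here is a minimal sketch of how a loop can report progress with progressbar2. It is not taken from the training script: the `run_epochs` helper and its `step` callback are hypothetical, and the fallback to silent iteration when the package is missing is an addition for illustration.

```python
try:
    import progressbar  # provided by the progressbar2 package
except ImportError:
    progressbar = None  # fall back to running without a bar


def run_epochs(n_epochs, step):
    """Call step(epoch) once per epoch, updating a progress bar if available."""
    bar = None
    if progressbar is not None:
        bar = progressbar.ProgressBar(max_value=n_epochs)
    for epoch in range(n_epochs):
        step(epoch)  # hypothetical callback standing in for one epoch of work
        if bar is not None:
            bar.update(epoch + 1)
    if bar is not None:
        bar.finish()
```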
Run the training
Copy the mbot_nlu_training folder to the training PC using scp.
Training for intent or slots is started by running train_nn_model.py, which can be found in the gpsr/intention_NN and gpsr/slots_NN folders (mentioned above) respectively.
Before running the code, make sure the GPU index is specified in the environment variable [CUDA_VISIBLE_DEVICES] at the end of the training file:
(tensorflow)$ python3 train_nn_model.py
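The GPU selection mentioned above can be sketched as follows. This is an illustration of the general mechanism, not the code in train_nn_model.py: the `select_gpu` helper is hypothetical, and GPU index 0 is an assumption. The variable should be set before TensorFlow initializes CUDA, so the safest place is before `import tensorflow`.

```python
import os


def select_gpu(index):
    """Expose only the GPU with the given index to CUDA applications."""
    # CUDA reads this variable at initialization, so set it before importing tensorflow.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


select_gpu(0)  # assumption: GPU 0 is the one to train on
```

Setting the variable to a different index for each run is one way to keep two trainings on separate GPUs.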
Results
The classifiers from the training are stored after each epoch in latest_intent_classifier and latest_slots_classifiers respectively.
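To see which classifier file was written most recently, a small helper like the one below can be pointed at either results folder. It is a sketch that assumes the classifiers are stored as ordinary files directly inside the folder; the `newest_file` name is hypothetical and the actual file layout is not described on this page.

```python
import os


def newest_file(directory):
    """Return the most recently modified file in directory, or None if it has no files."""
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    files = [path for path in paths if os.path.isfile(path)]
    return max(files, key=os.path.getmtime) if files else None
```

For example, `newest_file("latest_intent_classifier")` would return the file saved after the latest completed epoch, under the assumption above.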
Tips
- The intent and slots training can run in parallel. To do this, create two virtual environments with different names, then run the intent training from the first and the slots training from the second.
- Use glances to monitor the resources consumed by the process.
- Use tmux to create sessions for training. This way the training keeps running on the training PC even if your own PC is switched off or loses its connection.
- The training script can currently select which GPU is used for training. This is done with the environment variable [CUDA_VISIBLE_DEVICES] at the bottom of the training script.