04 02 sentiment intent analysis - VforVitorio/F1_Strat_Manager GitHub Wiki
Sentiment and Intent Analysis
Relevant source files
- scripts/NLP_radio_processing/N00_data_labeling.ipynb
- scripts/NLP_radio_processing/N02_sentiment_analysis_vader.ipynb
- scripts/NLP_radio_processing/N03_bert_sentiment.ipynb
- scripts/NLP_radio_processing/NLP_utils/N03_bert_sentiment.py
- scripts/NLP_radio_processing/NLP_utils/N04_radio_info.py
- scripts/NLP_radio_processing/NLP_utils/N05_ner_models.py
- scripts/NLP_radio_processing/NLP_utils/N06_model_merging.py
- scripts/NLP_radio_processing/images/The-RoBERTa-model-architecture.jpg
Sentiment Analysis Model
The system uses a fine-tuned RoBERTa-base model to classify radio messages into three sentiment categories: positive, neutral, and negative. This sentiment information helps the strategy engine understand the emotional context of communications, which can inform decisions about driver management and race approach.
Model Architecture
RoBERTa (Robustly Optimized BERT Pretraining Approach) was selected for its strong performance on sentiment analysis tasks and its ability, once fine-tuned, to handle domain-specific language such as F1 radio communications.
- scripts/NLP_radio_processing/N03_bert_sentiment.ipynb (lines 46-86)
- scripts/NLP_radio_processing/NLP_utils/N03_bert_sentiment.py (lines 34-109)
Model Training
The sentiment model was fine-tuned on a dataset of 530 manually labeled F1 radio messages with the following distribution:
Sentiment | Count | Percentage |
---|---|---|
Neutral | 379 | 71.5% |
Negative | 101 | 19.1% |
Positive | 50 | 9.4% |
The model was trained using a cross-entropy loss function with class weighting to handle the class imbalance. The training process involved:
- Splitting data into train (70%), validation (15%), and test (15%) sets
- Tokenizing with a maximum sequence length of 128 tokens
- Fine-tuning the pre-trained RoBERTa-base model
- Early stopping based on validation loss
- scripts/NLP_radio_processing/N03_bert_sentiment.ipynb (lines 202-211)
- scripts/NLP_radio_processing/N03_bert_sentiment.ipynb (lines 303-337)
- scripts/NLP_radio_processing/N00_data_labeling.ipynb (lines 482-483)
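The class weighting mentioned above can be illustrated with a small sketch. It assumes inverse-frequency weights (weight = N / (K * count)), which is one common scheme; the source does not say which weighting formula the notebook actually uses. The function names here are hypothetical, and the loss is computed for a single example in plain Python rather than with the training framework.

```python
import math

# Label counts taken from the distribution table above.
label_counts = {"Neutral": 379, "Negative": 101, "Positive": 50}


def inverse_frequency_weights(counts):
    """Return weight_c = N / (K * n_c) for each class c (assumed scheme)."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


def weighted_cross_entropy(logits, target_index, weights):
    """Per-example weighted cross-entropy: -w[target] * log softmax(logits)[target]."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    log_prob = logits[target_index] - log_sum_exp
    return -weights[target_index] * log_prob


class_weights = inverse_frequency_weights(label_counts)
```

Under this scheme the weights come out to roughly 0.47 for Neutral, 1.75 for Negative, and 3.53 for Positive, so an error on a rare positive message costs about 7.6 times as much as an error on a neutral one.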
Prediction Function
The model uses the following function to predict sentiment from text:
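A minimal sketch of the post-processing step such a prediction function needs: turning the three output logits of the classifier into a label and a confidence score. The label order, the function names, and the returned dictionary keys are assumptions for illustration; the real index-to-label mapping depends on how the training labels were encoded, and the tokenization and forward pass are left out.

```python
import math

# Assumed label order; not confirmed by the source.
SENTIMENT_LABELS = ["negative", "neutral", "positive"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - max_logit) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_sentiment(logits, labels=SENTIMENT_LABELS):
    """Pick the most probable label and report its probability as confidence."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"sentiment": labels[best], "confidence": probs[best]}
```

In a full implementation the logits would come from running the tokenized message (truncated to 128 tokens, as in training) through the fine-tuned model.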
Intent Classification
The intent classification model categorizes radio messages based on their communicative purpose, using a fine-tuned RoBERTa-large model.
Intent Categories
The system recognizes five distinct intent types:
Intent Type | Description | Example |
---|---|---|
INFORMATION | Factual updates about race conditions | "Hamilton is 2 seconds behind" |
PROBLEM | Messages indicating issues | "My left wing is broken" |
ORDER | Direct instructions to the driver | "Box this lap for softs" |
WARNING | Alerts about potential issues | "Watch your fuel consumption" |
QUESTION | Queries requiring driver input | "How are the tyres feeling?" |
- scripts/NLP_radio_processing/NLP_utils/N04_radio_info.py (lines 90-116)
- scripts/NLP_radio_processing/NLP_utils/N06_model_merging.py (lines 90-91)
Model Implementation
The intent classifier uses the same architectural approach as the sentiment model but with RoBERTa-large as the base model for improved performance on this more complex task:
- scripts/NLP_radio_processing/NLP_utils/N06_model_merging.py (lines 167-206)
- scripts/NLP_radio_processing/NLP_utils/N06_model_merging.py (lines 296-340)
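A five-way classification head outputs a class index, so the classifier needs a mapping between indices and the intent names listed earlier. The sketch below shows one way to hold that mapping. The index order is an assumption (the source does not state it), and `decode_intent` is a hypothetical helper, not a function from the repository.

```python
# Intent names from the category table; the index order is assumed.
INTENT_LABELS = ["INFORMATION", "PROBLEM", "ORDER", "WARNING", "QUESTION"]

# Index -> name and name -> index lookups.
ID2LABEL = dict(enumerate(INTENT_LABELS))
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def decode_intent(class_index):
    """Translate a predicted class index into its intent name."""
    try:
        return ID2LABEL[class_index]
    except KeyError:
        raise ValueError(f"unknown intent class index: {class_index}") from None
```

Mappings of this shape are what Hugging Face model configurations store under `id2label` and `label2id`, which keeps the label names attached to the saved model.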
Integrated Radio Analysis Pipeline
The system combines sentiment analysis, intent classification, and named entity recognition into a unified pipeline to extract comprehensive information from each radio message.
Pipeline Architecture
Output Format
The pipeline generates a standardized JSON output containing the original message and complete analysis:
- scripts/NLP_radio_processing/NLP_utils/N06_model_merging.py (lines 387-406)
- scripts/NLP_radio_processing/NLP_utils/N04_radio_info.py (lines 61-76)
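As a rough illustration of what a record holding the original message plus its analysis could look like, the sketch below builds and serializes one. The key names (`message`, `analysis`, `sentiment`, `intent`, `entities`), the entity label `TYRE_COMPOUND`, and the helper `build_analysis_record` are all hypothetical; the actual schema is defined in the referenced source files. The values are hand-written examples, not model output.

```python
import json


def build_analysis_record(message, sentiment, intent, entities):
    """Bundle a radio message with its analysis in a JSON-serializable dict."""
    return {
        "message": message,
        "analysis": {
            "sentiment": sentiment,
            "intent": intent,
            "entities": entities,
        },
    }


record = build_analysis_record(
    "Box this lap for softs",
    sentiment="neutral",  # illustrative value
    intent="ORDER",       # illustrative value
    entities=[{"text": "softs", "label": "TYRE_COMPOUND"}],  # hypothetical label
)

# Serialize for downstream consumers such as the strategy engine.
serialized = json.dumps(record, indent=2)
print(serialized)
```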
Integration with Strategy Engine
The sentiment and intent analysis system feeds directly into the F1 Strategy Engine, providing critical context for strategic decision-making during races.
Key Use Cases
Radio analysis enhances strategic decision-making in several key ways:
- Driver Mood Detection: Sentiment analysis identifies driver stress or confidence levels, allowing appropriate strategy adjustments
- Action Prioritization: Intent classification distinguishes between urgent orders and routine information
- Information Extraction: Entity recognition pulls out specific information like track conditions or technical issues
- Competitor Intelligence: Analysis of team radio from other drivers can reveal strategy intentions
Model Loading and Inference
The system defines utility functions to load and run inference with the trained models:
Model Loading
The load_sentiment_model(), load_intent_model(), and load_bert_ner_model() functions handle model loading with appropriate configurations:
- scripts/NLP_radio_processing/NLP_utils/N06_model_merging.py (lines 125-162)
- scripts/NLP_radio_processing/NLP_utils/N06_model_merging.py (lines 169-206)
- scripts/NLP_radio_processing/NLP_utils/N06_model_merging.py (lines 213-238)
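Loading transformer weights is slow, so loaders like these are often wrapped so that each model is built once and reused across messages. The sketch below shows that pattern with `functools.lru_cache`. It is an assumption about a sensible design, not a description of what the repository's loaders do, and `make_cached_loader` and the fake builder are hypothetical names.

```python
from functools import lru_cache


def make_cached_loader(build_model):
    """Wrap a zero-argument model builder so the model is built only once."""

    @lru_cache(maxsize=1)
    def load():
        return build_model()

    return load


# Hypothetical stand-in for an expensive load (e.g. reading weights from disk).
build_calls = []


def build_fake_sentiment_model():
    build_calls.append(1)  # record each real build so reuse can be checked
    return {"name": "sentiment", "num_labels": 3}


load_sentiment_model_cached = make_cached_loader(build_fake_sentiment_model)
first = load_sentiment_model_cached()
second = load_sentiment_model_cached()  # served from the cache, no rebuild
```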
Prediction Process
The overall prediction process is streamlined through the analyze_radio_message() function:
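A minimal sketch of how a function like this might orchestrate the three models. The name `analyze_radio_message_sketch`, its parameters, and the result layout are assumptions; the real `analyze_radio_message()` signature is not shown here. The three predictors are passed in as callables so the example runs without any trained model.

```python
def analyze_radio_message_sketch(message, predict_sentiment, predict_intent, extract_entities):
    """Run sentiment, intent, and entity extraction on one message and merge the results."""
    text = message.strip()
    if not text:
        raise ValueError("message must not be empty")
    return {
        "message": text,
        "analysis": {
            "sentiment": predict_sentiment(text),
            "intent": predict_intent(text),
            "entities": extract_entities(text),
        },
    }


# Hypothetical stand-ins for the three trained models.
result = analyze_radio_message_sketch(
    "  Box this lap for softs ",
    predict_sentiment=lambda text: "neutral",
    predict_intent=lambda text: "ORDER",
    extract_entities=lambda text: [w for w in text.split() if w == "softs"],
)
```

Passing the predictors in as arguments keeps the orchestration testable on its own; in practice they would wrap the models returned by the loading functions.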