04 03 named entity recognition - VforVitorio/F1_Strat_Manager GitHub Wiki

Named Entity Recognition

2. Entity Types

The F1 NER system recognizes nine domain-specific entity types, each capturing critical information for race strategy:

| Entity Type | Description | Example |
| ----------- | ----------- | ------- |
| ACTION | Direct commands or actions | "push now", "follow my instruction" |
| SITUATION | Racing context or circumstances | "Hamilton is 2 seconds behind" |
| INCIDENT | Accidents or on-track events | "Ferrari in the wall" |
| STRATEGY_INSTRUCTION | Strategic directives | "We're looking at Plan B" |
| POSITION_CHANGE | References to overtakes or positions | "You're P4", "gaining on Verstappen" |
| PIT_CALL | Specific pit stop instructions | "Box this lap" |
| TRACK_CONDITION | Mentions of track state | "yellows in turn 7", "track is drying" |
| TECHNICAL_ISSUE | Car-related problems | "losing grip on the rear" |
| WEATHER | Weather conditions | "rain expected in 5 minutes" |

3. Technical Implementation

The NER component implements a fine-tuned BERT model customized for the F1 domain with a BIO (Beginning-Inside-Outside) tagging scheme.

3.1 BIO Tagging Approach

The NER system uses BIO tagging, a standard approach in sequence labeling:

- B-: Marks the Beginning of an entity
- I-: Marks the Inside (continuation) of an entity
- O: Marks tokens Outside any entity

For example, the message "Ferrari in the wall, no? Yes, that's Charles stopped" would be tagged as:

| Word | Tag |
| ------- | ---------- |
| Ferrari | B-INCIDENT |
| in | I-INCIDENT |
| the | I-INCIDENT |
| wall | I-INCIDENT |
| , | O |
| no | O |
| ? | O |
| Yes | O |
| , | O |
| that's | B-INCIDENT |
| Charles | I-INCIDENT |
| stopped | I-INCIDENT |
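To make the tagging scheme concrete, the following minimal sketch converts character-span annotations into word-level BIO tags for the example message. The function names `spans_to_bio` and `span_of`, the `(start, end, label)` span format, and the regex-based word splitting are assumptions for illustration; the project's actual annotation format and conversion code may differ.

```python
import re

def spans_to_bio(text, spans):
    """Convert character-span annotations into word-level BIO tags.

    `spans` is a list of (start, end, label) tuples with `end` exclusive.
    This span format is assumed for illustration.
    """
    tokens, tags = [], []
    # Split into words and standalone punctuation, keeping character offsets.
    for match in re.finditer(r"[\w']+|[^\w\s]", text):
        tag = "O"
        for start, end, label in spans:
            if match.start() >= start and match.end() <= end:
                # The token that opens the span gets B-, later tokens get I-.
                prefix = "B" if match.start() == start else "I"
                tag = f"{prefix}-{label}"
                break
        tokens.append(match.group())
        tags.append(tag)
    return tokens, tags

def span_of(text, phrase, label):
    """Helper: locate the first occurrence of `phrase` and return its span."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)

text = "Ferrari in the wall, no? Yes, that's Charles stopped"
spans = [
    span_of(text, "Ferrari in the wall", "INCIDENT"),
    span_of(text, "that's Charles stopped", "INCIDENT"),
]
tokens, tags = spans_to_bio(text, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

The sketch assumes every span starts exactly at a token boundary; a span that begins mid-word or on whitespace would need extra handling.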

3.2 Model Architecture

The NER system uses a BERT-based token classification model. Its technical components include:

- Base Model: BERT-large-cased-finetuned-conll03-english
- Customization: Fine-tuned on annotated F1 radio communications
- Output Layer: 19 output classes (B- and I- for each of the 9 entity types, plus O)
- Training Approach: Focused fine-tuning with class weights to handle entity imbalance
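The 19-class output layer and the class weighting can be illustrated with a short sketch. The label ordering, the name `inverse_frequency_weights`, and the inverse-frequency formula itself are assumptions; the source only states that class weights are used, not how they are computed.

```python
from collections import Counter

ENTITY_TYPES = [
    "ACTION", "SITUATION", "INCIDENT", "STRATEGY_INSTRUCTION",
    "POSITION_CHANGE", "PIT_CALL", "TRACK_CONDITION",
    "TECHNICAL_ISSUE", "WEATHER",
]

# "O" first, then a B-/I- pair per entity type: 1 + 2 * 9 = 19 labels.
# The ordering is an assumption; only the label count comes from the text.
LABELS = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in ("B", "I")]
LABEL2ID = {label: i for i, label in enumerate(LABELS)}
ID2LABEL = {i: label for label, i in LABEL2ID.items()}

def inverse_frequency_weights(tag_sequences, labels=LABELS):
    """Weight each class by total / (num_classes * count).

    Rare tags receive larger weights. Labels that never occur fall back
    to 1.0 to avoid division by zero. This formula is one common choice,
    not necessarily the one used in the project.
    """
    counts = Counter(tag for sequence in tag_sequences for tag in sequence)
    total = sum(counts.values())
    return [
        total / (len(labels) * counts[label]) if counts[label] else 1.0
        for label in labels
    ]

# Toy corpus: "O" dominates, so it ends up with the smallest weight.
toy_tags = [["O", "O", "O", "B-PIT_CALL"], ["O", "O", "B-WEATHER", "I-WEATHER"]]
weights = inverse_frequency_weights(toy_tags)
print(len(LABELS), weights[LABEL2ID["O"]], weights[LABEL2ID["B-PIT_CALL"]])
```

In a PyTorch training loop, a weight list like this could be turned into a tensor and passed as the `weight` argument of `torch.nn.CrossEntropyLoss`. Because the CoNLL-03 base checkpoint has a different label count, loading it with a 19-class head in Hugging Face Transformers would typically also require `ignore_mismatched_sizes=True`.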

4. Data Processing Pipeline

The NER system transforms raw text into structured entity data through the following key steps:

- Tokenization: The raw text is tokenized using BERT's WordPiece tokenizer
- Prediction: Tokens are processed through the model to predict a BIO tag for each token
- Entity Extraction: Consecutive tokens tagged with the same entity type are merged into a single entity
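The entity extraction step can be sketched as a single pass over the predicted tags. This is a minimal illustration, not the project's post-processing code: the function name `extract_entities` and the `(type, text)` output format are assumptions, and re-joining WordPiece sub-tokens (the `##` pieces) is left out.

```python
def extract_entities(tokens, tags):
    """Merge consecutive B-/I- tags of the same type into (type, text) pairs."""
    entities = []
    current_type, current_tokens = None, []

    def flush():
        # Close the entity being built, if any, and reset the buffer.
        nonlocal current_type, current_tokens
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        prefix, _, entity_type = tag.partition("-")
        if prefix == "B":
            flush()
            current_type, current_tokens = entity_type, [token]
        elif prefix == "I" and entity_type == current_type:
            current_tokens.append(token)
        elif prefix == "I":
            # A stray I- tag without a matching B- is treated as a new entity.
            flush()
            current_type, current_tokens = entity_type, [token]
        else:
            flush()
    flush()
    return entities

# Tokens and tags from the BIO example in section 3.1.
tokens = ["Ferrari", "in", "the", "wall", ",", "no", "?", "Yes", ",",
          "that's", "Charles", "stopped"]
tags = ["B-INCIDENT", "I-INCIDENT", "I-INCIDENT", "I-INCIDENT",
        "O", "O", "O", "O", "O",
        "B-INCIDENT", "I-INCIDENT", "I-INCIDENT"]
print(extract_entities(tokens, tags))
```

Treating a stray `I-` tag as the start of a new entity is a lenient design choice; a stricter variant could discard such tokens instead.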

6. Example Output

Here's an example of a processed radio message showing the full pipeline output:

This structured output enables the expert system to understand:

- The message contains a pit stop instruction (ORDER intent)
- The emotional tone is neutral
- There are specific actions (box), pit instructions (this lap for softs), and a situational update (Hamilton is catching up)

7. From NER to Strategy Decisions

The NER-extracted entities directly inform strategic decision-making in the expert system. Examples of how NER-extracted information influences strategy:

- PIT_CALL entities: Trigger pit stop preparation rules
- TRACK_CONDITION entities: Activate weather strategy adjustment rules
- TECHNICAL_ISSUE entities: Inform tire management or defensive driving recommendations
- POSITION_CHANGE entities: Update race situation awareness for overtake opportunities
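The mapping above can be pictured as a simple lookup from entity type to rule group. The names `RULE_GROUPS` and `rule_groups_for` are hypothetical, and the real expert system presumably matches facts against rules rather than consulting a dictionary; the sketch only shows which entity types are relevant to which kind of decision.

```python
# Entity type -> rule group, taken from the list above.
# The dictionary itself is an illustrative device, not the engine's API.
RULE_GROUPS = {
    "PIT_CALL": "pit stop preparation",
    "TRACK_CONDITION": "weather strategy adjustment",
    "TECHNICAL_ISSUE": "tire management / defensive driving",
    "POSITION_CHANGE": "race situation awareness",
}

def rule_groups_for(entities):
    """Return the rule groups touched by (entity_type, text) pairs.

    Groups are listed in first-seen order without duplicates; entity
    types with no mapped group are skipped.
    """
    groups = []
    for entity_type, _text in entities:
        group = RULE_GROUPS.get(entity_type)
        if group and group not in groups:
            groups.append(group)
    return groups

entities = [
    ("PIT_CALL", "Box this lap"),
    ("TRACK_CONDITION", "track is drying"),
    ("ACTION", "push now"),
]
print(rule_groups_for(entities))
```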

8. Implementation Details

The F1 NER system is implemented using the following key technologies and patterns:

Data Preparation:
- Character-span annotation to BIO tag conversion
- Custom tokenization handling for BERT model
Model Configuration:
- BertForTokenClassification with custom classification head
- Entity-specific class weighting to handle imbalance
Inference Functions:
- analyze_f1_radio(): Main function for entity extraction
- Custom post-processing to merge BIO tags into entities
Integration Interface:
- Module organization: NLP_utils/N05_ner_models.py
- Integration point: analyze_radio_message() in N06_model_merging.py
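One reason tokenization needs custom handling is that WordPiece can split a single word into several sub-tokens, so word-level BIO labels have to be aligned to sub-token positions before training. The sketch below shows one common convention, labeling only the first sub-token of each word and masking the rest. The function name `align_labels` and the choice of convention are assumptions; the `word_ids` list mirrors what `word_ids()` returns on a Hugging Face fast tokenizer encoding.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss

def align_labels(word_label_ids, word_ids):
    """Expand word-level label ids to sub-token level.

    `word_ids` maps each sub-token to the index of its source word, with
    None for special tokens such as [CLS] and [SEP]. Only the first
    sub-token of each word keeps the word's label; special tokens and
    continuation pieces are masked out of the loss.
    """
    aligned, previous = [], None
    for word_id in word_ids:
        if word_id is None or word_id == previous:
            aligned.append(IGNORE_INDEX)
        else:
            aligned.append(word_label_ids[word_id])
        previous = word_id
    return aligned

# Hypothetical example: three words, the first split into two sub-tokens.
word_label_ids = [5, 6, 0]             # illustrative label ids, one per word
word_ids = [None, 0, 0, 1, 2, None]    # [CLS], word 0 (2 pieces), word 1, word 2, [SEP]
print(align_labels(word_label_ids, word_ids))
```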

9. Usage in the Strategy System

The structured entity information from the NER component is consumed by the F1 Strategy Engine as RadioFact objects, which trigger specific rules in the expert system:

The NER results inform strategic decisions such as:

When to pit based on weather changes mentioned in radio
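As a rough picture of this hand-off, the sketch below groups extracted entities by type into a fact-like object that a rule could inspect. `RadioFactSketch`, its fields, and `to_radio_fact` are hypothetical stand-ins written as a plain dataclass; the real `RadioFact` class and its attributes are defined in the strategy engine and may look quite different.

```python
from dataclasses import dataclass, field

@dataclass
class RadioFactSketch:
    """Hypothetical stand-in for the engine's RadioFact; field names are assumed."""
    message: str
    entities: dict = field(default_factory=dict)  # entity type -> list of texts

    def has(self, entity_type):
        """True if at least one entity of this type was extracted."""
        return bool(self.entities.get(entity_type))

def to_radio_fact(message, entities):
    """Group (entity_type, text) pairs by type and wrap them in a fact."""
    grouped = {}
    for entity_type, text in entities:
        grouped.setdefault(entity_type, []).append(text)
    return RadioFactSketch(message=message, entities=grouped)

# Entity texts reuse the examples from the entity type table.
fact = to_radio_fact(
    "Box this lap, rain expected in 5 minutes",
    [("PIT_CALL", "Box this lap"), ("WEATHER", "rain expected in 5 minutes")],
)

# A weather-driven pit rule could then check for both entity types.
if fact.has("PIT_CALL") and fact.has("WEATHER"):
    print("Consider a weather-driven pit stop:", fact.entities["WEATHER"])
```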