named entity recognition - VforVitorio/F1_Strat_Manager GitHub Wiki

Named Entity Recognition

This document details the Named Entity Recognition (NER) component of the F1 Strategy Manager's Natural Language Processing pipeline. The NER system extracts structured information from team radio communications, identifying key racing concepts such as actions, track conditions, and strategic instructions. This structured data enables the expert system to make informed strategy decisions based on radio communications.

For information about the overall NLP pipeline, see NLP Pipeline, and for details on other components like sentiment analysis and intent classification, see Sentiment and Intent Analysis.

1. Overview and Purpose

The Named Entity Recognition system identifies and extracts domain-specific entities from Formula 1 team radio messages. While traditional NER systems focus on general entities like people and locations, our custom F1 NER system recognizes racing-specific concepts such as track conditions, pit call instructions, and technical issues.

2. Entity Types

The F1 NER system recognizes nine domain-specific entity types, each capturing critical information for race strategy:

Entity Type	Description	Example
ACTION	Direct commands or actions	"push now", "follow my instruction"
SITUATION	Racing context or circumstances	"Hamilton is 2 seconds behind"
INCIDENT	Accidents or on-track events	"Ferrari in the wall"
STRATEGY_INSTRUCTION	Strategic directives	"We're looking at Plan B"
POSITION_CHANGE	References to overtakes or positions	"You're P4", "gaining on Verstappen"
PIT_CALL	Specific pit stop instructions	"Box this lap"
TRACK_CONDITION	Mentions of track state	"yellows in turn 7", "track is drying"
TECHNICAL_ISSUE	Car-related problems	"losing grip on the rear"
WEATHER	Weather conditions	"rain expected in 5 minutes"

3. Technical Implementation

The NER component implements a fine-tuned BERT model customized for the F1 domain with a BIO (Beginning-Inside-Outside) tagging scheme.

3.1 BIO Tagging Approach

The NER system uses BIO tagging, a standard approach in sequence labeling:

B-: Marks the Beginning of an entity
I-: Marks the Inside (continuation) of an entity
O: Marks tokens Outside any entity

For example, the message "Ferrari in the wall, no? Yes, that's Charles stopped" would be tagged as:

Word	Tag
Ferrari	B-INCIDENT
in	I-INCIDENT
the	I-INCIDENT
wall	I-INCIDENT
,	O
no	O
?	O
Yes	O
,	O
that's	B-INCIDENT
Charles	I-INCIDENT
stopped	I-INCIDENT
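
The tagged example above can be collapsed back into entity spans with a small helper. This is a minimal sketch and not the project's own post-processing code: the function name `merge_bio` is hypothetical, and dropping an `I-` tag that does not continue an open entity is an assumed policy.

```python
from typing import List, Tuple


def merge_bio(words: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse word-level BIO tags into (entity_type, text) pairs."""
    entities: List[Tuple[str, str]] = []
    current_type = None
    current_words: List[str] = []

    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            # A B- tag always closes any open entity and starts a new one.
            if current_type is not None:
                entities.append((current_type, " ".join(current_words)))
            current_type, current_words = tag[2:], [word]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the entity that is currently open.
            current_words.append(word)
        else:
            # "O", or an I- tag with no matching open entity (assumed: dropped).
            if current_type is not None:
                entities.append((current_type, " ".join(current_words)))
            current_type, current_words = None, []

    if current_type is not None:
        entities.append((current_type, " ".join(current_words)))
    return entities


words = ["Ferrari", "in", "the", "wall", ",", "no", "?",
         "Yes", ",", "that's", "Charles", "stopped"]
tags = ["B-INCIDENT", "I-INCIDENT", "I-INCIDENT", "I-INCIDENT", "O", "O", "O",
        "O", "O", "B-INCIDENT", "I-INCIDENT", "I-INCIDENT"]

entities = merge_bio(words, tags)
# entities == [("INCIDENT", "Ferrari in the wall"),
#              ("INCIDENT", "that's Charles stopped")]
```

Because the second span starts with a fresh `B-INCIDENT` tag, the two incidents stay separate even though they share the same entity type.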

3.2 Model Architecture

The NER system uses a BERT-based token classification model with the following technical components:

Base Model: BERT-large-cased-finetuned-conll03-english
Customization: Fine-tuned on annotated F1 radio communications
Output Layer: 19 output classes (B- and I- for each of the 9 entity types, plus O)
Training Approach: Focused fine-tuning with class weights to handle entity imbalance
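
To make the 19-class output layer concrete, the sketch below builds the label set from the nine entity types and derives per-class weights from tag frequencies. The label ordering and the inverse-frequency formula are assumptions chosen for illustration; the source only states that class weights are used, not how they are computed.

```python
from collections import Counter
from typing import Iterable, List

ENTITY_TYPES = [
    "ACTION", "SITUATION", "INCIDENT", "STRATEGY_INSTRUCTION",
    "POSITION_CHANGE", "PIT_CALL", "TRACK_CONDITION",
    "TECHNICAL_ISSUE", "WEATHER",
]

# "O" plus a B- and an I- label for each entity type: 1 + 2 * 9 = 19 classes.
# The ordering is an assumption; the real model may index labels differently.
LABELS = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in ("B", "I")]
label2id = {label: idx for idx, label in enumerate(LABELS)}
id2label = {idx: label for label, idx in label2id.items()}


def inverse_frequency_weights(tag_sequences: Iterable[List[str]]) -> List[float]:
    """Weight each label by total / (n_seen_labels * count).

    Labels that never occur in the data get a neutral weight of 1.0.
    """
    counts = Counter(tag for seq in tag_sequences for tag in seq)
    total = sum(counts.values())
    n_seen = len(counts)
    return [
        total / (n_seen * counts[label]) if counts[label] else 1.0
        for label in LABELS
    ]


# Toy corpus: "O" dominates, so it receives the smallest weight.
toy_tags = [["O", "O", "O", "B-PIT_CALL"]]
weights = inverse_frequency_weights(toy_tags)
```

A weight list like this could then be handed to a weighted cross-entropy loss during fine-tuning, so that rare entity tags contribute more to the gradient than the very frequent `O` tag.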

4. Data Processing Pipeline

The NER system transforms raw text into structured entity data through the following processing steps:

Tokenization: The raw text is tokenized using BERT's WordPiece tokenizer
Prediction: Tokens are processed through the model to predict BIO tags
Entity Extraction: Consecutive tokens with matching entity types are combined
Structured Output: Entities are formatted into a JSON structure for the expert system
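
One detail between the tokenization and entity extraction steps is that WordPiece can split a word into several sub-tokens, so token-level predictions have to be mapped back to words. The sketch below assumes a "first sub-token wins" policy and a `word_ids` list of the kind a Hugging Face fast tokenizer exposes (`None` for special tokens, otherwise the index of the source word); both are assumptions, as are the function name and the illustrative split of the second word.

```python
from typing import List, Optional


def word_level_tags(word_ids: List[Optional[int]], token_tags: List[str]) -> List[str]:
    """Keep the predicted tag of the first sub-token of every word."""
    tags: List[str] = []
    previous_word = None
    for word_id, tag in zip(word_ids, token_tags):
        if word_id is None:
            # Special tokens such as [CLS] and [SEP] carry no word.
            continue
        if word_id != previous_word:
            # First sub-token of a new word: its tag represents the word.
            tags.append(tag)
        previous_word = word_id
    return tags


# Illustrative alignment for a three-word message where the second word
# is (hypothetically) split into two WordPiece tokens.
word_ids = [None, 0, 1, 1, 2, None]
token_tags = ["O", "B-PIT_CALL", "I-PIT_CALL", "I-PIT_CALL", "I-PIT_CALL", "O"]

aligned = word_level_tags(word_ids, token_tags)
# aligned == ["B-PIT_CALL", "I-PIT_CALL", "I-PIT_CALL"]
```

The resulting word-level tags are what the entity extraction step then merges into spans.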

5. Integration with NLP Pipeline

The NER component is integrated with sentiment analysis and intent classification to give a more complete picture of each radio message.

The pipeline outputs a standardized JSON format including:

Original message text
Sentiment classification (positive, negative, neutral)
Intent classification (ORDER, INFORMATION, QUESTION, etc.)
Extracted entities with their types
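
The four fields listed above could be assembled as follows. This is only a sketch: the key names (`message`, `sentiment`, `intent`, `entities`, `type`, `text`) and the helper `build_pipeline_output` are hypothetical, and the sentiment and intent values are supplied by hand rather than predicted by the real models.

```python
import json
from typing import Dict, List, Tuple


def build_pipeline_output(
    message: str,
    sentiment: str,
    intent: str,
    entities: List[Tuple[str, str]],
) -> Dict[str, object]:
    """Bundle the three analyses into one JSON-serializable dictionary."""
    return {
        "message": message,          # original message text
        "sentiment": sentiment,      # positive / negative / neutral
        "intent": intent,            # ORDER, INFORMATION, QUESTION, ...
        "entities": [
            {"type": etype, "text": text} for etype, text in entities
        ],
    }


result = build_pipeline_output(
    message="Box this lap",
    sentiment="neutral",             # illustrative value, not a model prediction
    intent="ORDER",                  # illustrative value, not a model prediction
    entities=[("PIT_CALL", "Box this lap")],
)
serialized = json.dumps(result, indent=2)
```

Keeping entities as a list of small objects, rather than a single string, lets downstream rules filter by entity type without re-parsing the message.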

6. Example Output

The full pipeline output for a processed radio message, here a pit call for softs accompanied by an update that Hamilton is catching up, enables the expert system to understand:

The message contains a pit stop instruction (ORDER intent)
The emotional tone is neutral
There are specific actions (box), pit instructions (this lap for softs), and a situational update (Hamilton is catching up)

7. From NER to Strategy Decisions

The NER-extracted entities directly inform strategic decision-making in the expert system. Examples of how this information influences strategy:

PIT_CALL entities: Trigger pit stop preparation rules
TRACK_CONDITION entities: Activate weather strategy adjustment rules
TECHNICAL_ISSUE entities: Inform tire management or defensive driving recommendations
POSITION_CHANGE entities: Update race situation awareness for overtake opportunities
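
The mapping described in the list above can be pictured as a lookup from entity type to a group of rules. The group names and the function below are hypothetical stand-ins; they do not correspond to identifiers in the expert system itself.

```python
from typing import Dict, List

# Hypothetical rule-group names derived from the list above.
RULE_GROUPS: Dict[str, str] = {
    "PIT_CALL": "pit_stop_preparation",
    "TRACK_CONDITION": "weather_strategy_adjustment",
    "TECHNICAL_ISSUE": "tire_management_or_defensive_driving",
    "POSITION_CHANGE": "race_situation_awareness",
}


def rule_groups_for(entities: List[Dict[str, str]]) -> List[str]:
    """Return the rule groups to activate, in first-seen order, without duplicates."""
    groups: List[str] = []
    for entity in entities:
        group = RULE_GROUPS.get(entity["type"])
        if group is not None and group not in groups:
            groups.append(group)
    return groups


extracted = [
    {"type": "PIT_CALL", "text": "Box this lap"},
    {"type": "TRACK_CONDITION", "text": "track is drying"},
    {"type": "PIT_CALL", "text": "Box this lap"},
    {"type": "ACTION", "text": "push now"},
]
active_groups = rule_groups_for(extracted)
# active_groups == ["pit_stop_preparation", "weather_strategy_adjustment"]
```

Entity types without an entry in the table, such as `ACTION` here, are simply ignored by this sketch.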

8. Implementation Details

The F1 NER system is implemented using the following key technologies and patterns:

Data Preparation:
- Character-span annotation to BIO tag conversion
- Custom tokenization handling for BERT model
Model Configuration:
- BertForTokenClassification with custom classification head
- Entity-specific class weighting to handle imbalance
Inference Functions:
- analyze_f1_radio(): Main function for entity extraction
- Custom post-processing to merge BIO tags into entities
Integration Interface:
- Module organization: NLP_utils/N05_ner_models.py
- Integration point: analyze_radio_message() in N06_model_merging.py
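
The character-span to BIO conversion listed under Data Preparation might look roughly like the sketch below. It assumes annotations arrive as `(start, end, label)` character offsets with an exclusive end, that spans are aligned to word boundaries, and that a simple regex word splitter stands in for the real tokenizer; none of these details are confirmed by the source.

```python
import re
from typing import List, Tuple

# Words (keeping apostrophes, as in "that's") or single punctuation marks.
TOKEN_PATTERN = re.compile(r"[\w']+|[^\w\s]")


def spans_to_bio(text: str, spans: List[Tuple[int, int, str]]) -> Tuple[List[str], List[str]]:
    """Convert character-level annotations into word-level BIO tags."""
    words: List[str] = []
    tags: List[str] = []
    for match in TOKEN_PATTERN.finditer(text):
        tag = "O"
        for start, end, label in spans:
            # A word belongs to a span only if it lies fully inside it.
            if match.start() >= start and match.end() <= end:
                prefix = "B-" if match.start() == start else "I-"
                tag = prefix + label
                break
        words.append(match.group())
        tags.append(tag)
    return words, tags


text = "Ferrari in the wall, no? Yes, that's Charles stopped"
first = "Ferrari in the wall"
second = "that's Charles stopped"
spans = [
    (text.index(first), text.index(first) + len(first), "INCIDENT"),
    (text.index(second), text.index(second) + len(second), "INCIDENT"),
]

bio_words, bio_tags = spans_to_bio(text, spans)
```

Run on the message from section 3.1, this reproduces the word and tag columns of the BIO table shown there.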

9. Usage in the Strategy System

The structured entity information from the NER component is consumed by the F1 Strategy Engine as RadioFact objects, which trigger specific rules in the expert system.

The NER results inform strategic decisions such as:

When to pit based on weather changes mentioned in radio
How to respond to technical issues reported by the driver
Adjusting race tactics based on incidents or track conditions
Monitoring competitor strategies mentioned in radio communications
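
As a rough illustration of how pipeline output might be turned into a fact for the rule engine, the sketch below uses a plain dataclass as a stand-in for `RadioFact`. The real class and its fields are not shown in this document, so the field names, the `to_radio_fact` helper, and the toy rule are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RadioFact:
    """Stand-in for the engine's RadioFact; the field names are hypothetical."""
    sentiment: str
    intent: str
    entities: Dict[str, List[str]] = field(default_factory=dict)


def to_radio_fact(output: dict) -> RadioFact:
    """Group extracted entities by type so rules can look them up directly."""
    grouped: Dict[str, List[str]] = {}
    for entity in output["entities"]:
        grouped.setdefault(entity["type"], []).append(entity["text"])
    return RadioFact(output["sentiment"], output["intent"], grouped)


def should_review_pit_timing(fact: RadioFact) -> bool:
    """Toy rule: a weather mention or an explicit pit call prompts a pit review."""
    return bool(fact.entities.get("WEATHER") or fact.entities.get("PIT_CALL"))


fact = to_radio_fact({
    "message": "rain expected in 5 minutes",
    "sentiment": "neutral",      # illustrative value
    "intent": "INFORMATION",     # illustrative value
    "entities": [{"type": "WEATHER", "text": "rain expected in 5 minutes"}],
})
review_needed = should_review_pit_timing(fact)
```

Grouping entities by type up front keeps each rule's condition to a single dictionary lookup.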
