Machine Learning Models - VforVitorio/F1_Strat_Manager GitHub Wiki

🤖 Machine Learning Models

AI models for lap time prediction, tire degradation, and gap calculation

Data Processing Utilities

Model Artifacts and Deployment

Development Environment

Machine Learning Models

This page documents the machine learning models implemented in the F1 Strategy Manager system. These models are critical components that provide predictive analytics and data processing capabilities across various aspects of Formula 1 racing strategy. For information about the natural language processing pipeline specifically, see NLP Pipeline.

Overview of Machine Learning Components

The F1 Strategy Manager leverages multiple specialized machine learning models to handle different aspects of race strategy prediction:

Lap Time Prediction (XGBoost) - Forecasts expected lap times based on numerous race factors

Tire Degradation Modeling (TCN) - Predicts tire performance decline over race stints

Vision-based Gap Calculation (YOLOv8) - Uses computer vision to identify cars and calculate gaps

These models work together to feed the expert system with the predictions needed to generate optimal race strategies.

Diagram: Data Sources (FastF1 API Data, Telemetry Stream, Race Video Feed) → ML Models (XGBoost Lap Time Prediction, TCN Tire Degradation Model, YOLOv8 Computer Vision Model) → Model Outputs (Lap Time Predictions, Degradation Forecasts, Car Position & Gap Data) → Expert System Integration (TelemetryFact, DegradationFact, GapFact) → Rule Evaluation.

Sources: lap_prediction.ipynb 1-42

scripts/ML_tyre_pred/N01_tire_prediction.ipynb 1-42

README.md 32-36

Model Architecture and Data Flow

The machine learning subsystem follows a multi-stage pipeline pattern, where raw data is processed through feature engineering steps before being fed into specialized models. The outputs are then standardized for consumption by the expert system.

Diagram: Input Processing (validate_lap_data(), add_sequential_features(), prepare_features_for_prediction()) → Model Execution (XGBoost predict(), TCN forward(), YOLOv8 inference()) → Output Transformation (format_lap_predictions(), calculate_degradation_rate()). The inputs are race telemetry data, tire performance data, and video frames; the outputs are facts for the expert system.

Sources: scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 49-107

scripts/ML_tyre_pred/ML_utils/N01_tire_prediction.py 25-136

Lap Time Prediction Model

The lap time prediction component uses XGBoost to forecast lap times with high accuracy (MAE = 0.09s and RMSE = 0.15s). It takes into account multiple factors that influence lap performance including tire compound, tire age, fuel load, and track conditions.

Model Features and Implementation

The XGBoost model processes a rich set of features derived from race telemetry:

| Feature Category | Examples | Notes |
| --- | --- | --- |
| Driver/Team | DriverNumber, TeamID | Captures team-specific performance |
| Tire | CompoundID, TyreAge | Critical for performance understanding |
| Speed | SpeedI1, SpeedI2, SpeedFL, SpeedST | Speed at different track sectors |
| Sequential | Prev_LapTime, LapTime_Delta, LapTime_Trend | Captures performance trends |
| Race Context | Position, FuelLoad, DRSUsed | Situational race factors |

The model handles sequential data by creating derived features that track changes between laps and performance trends:

Diagram: Feature Engineering Pipeline: Raw Lap Data → Data Validation (validate_lap_data()) → Sequential Feature Creation (add_sequential_features()) → Feature Preparation (prepare_features_for_prediction()) → XGBoost Model → Predict Lap Time. Sequential features: Previous Lap Metrics (Prev_LapTime, Prev_SpeedI1, etc.), Delta Features (LapTime_Delta, SpeedI1_Delta, etc.), Trend Features (LapTime_Trend).

Sources: lap_prediction.ipynb 75-107

scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 150-228

Usage and Integration

The lap time prediction model is exposed through a central predict_lap_times() function that handles the entire pipeline:

Loading the trained model

Validating input telemetry data

Engineering sequential features

Making predictions

Formatting results for downstream use

The function serves as the main interface between raw telemetry data and the expert system, providing both historical lap time analysis and future lap time forecasts.

Sources: scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 357-407

Tire Degradation Model

The tire degradation component uses Temporal Convolutional Networks (TCN) to model how tire performance decreases over time. The model captures the non-linear nature of tire degradation for different compounds.

Degradation Metrics and Analysis

The system calculates several key degradation metrics from raw lap time data:

| Metric | Description | Usage |
| --- | --- | --- |
| TireDegAbsolute | Raw lap time increase from baseline | Direct performance loss |
| TireDegPercent | Percentage lap time increase | Relative performance change |
| FuelAdjustedDegAbsolute | Degradation with fuel effect removed | Isolates tire effects |
| DegradationRate | Lap-to-lap change in performance | Rate of performance loss |

These metrics are calculated using functions like calculate_fuel_adjusted_metrics() and calculate_degradation_rate().

Diagram: Tire Degradation Analysis: Raw Lap Data → calculate_fuel_adjusted_metrics() → calculate_degradation_rate() → Processed Degradation Metrics (TireDegAbsolute, TireDegPercent, FuelAdjustedDegAbsolute, DegradationRate) → TCN Model → Predict Future Degradation → Stint Strategy Optimization.

Sources: scripts/ML_tyre_pred/ML_utils/N01_tire_prediction.py 25-136

TCN Architecture and Implementation

The Temporal Convolutional Network is specifically designed to handle sequence modeling problems. For tire degradation, it:

Takes a window of previous lap performance data (typically 5 laps)

Processes it through convolutional layers with dilated filters

Outputs predictions for future degradation (next 3-5 laps)

This approach captures how tire performance evolves over time, allowing for more accurate pit stop planning.

Diagram: TCN Model Structure: Input: 5-lap Window [tire_age, lap_time, speeds, etc.] → Conv Layer 1 (Dilation=1) → Conv Layer 2 (Dilation=2) → Conv Layer 3 (Dilation=4) → Output: Future Degradation Predictions for Next 3-5 Laps. Features: tire age, compound type, lap time trend, speeds at track sectors, fuel effect adjustment.

Sources: scripts/ML_tyre_pred/N01_tire_prediction.ipynb 14-41
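To illustrate why a short stack with dilations 1, 2 and 4 can cover a 5-lap window, the sketch below computes the receptive field of stacked dilated causal convolutions and applies one such layer in plain Python. The kernel size of 2, the helper names, and the lap times are assumptions for illustration; the notebook's actual layer sizes are not stated on this page.

```python
def receptive_field(kernel_size, dilations):
    """Past time steps visible to the last output of a stack of
    dilated causal convolutions (one layer per dilation)."""
    return 1 + (kernel_size - 1) * sum(dilations)


def dilated_causal_conv(sequence, weights, dilation):
    """Apply one 1D causal convolution with the given dilation.

    Positions before the start of the sequence count as zero
    (left padding), so the output has the same length as the input.
    """
    out = []
    for t in range(len(sequence)):
        total = 0.0
        for j, weight in enumerate(weights):
            idx = t - j * dilation  # look back j * dilation steps
            if idx >= 0:
                total += weight * sequence[idx]
        out.append(total)
    return out


# Assumed kernel size of 2 with the dilations named in the diagram.
rf = receptive_field(kernel_size=2, dilations=[1, 2, 4])

# Hypothetical lap times for one 5-lap window.
window = [80.0, 80.1, 80.3, 80.6, 81.0]
# Weights [1, -1] turn the layer into a lap-to-lap difference.
deltas = dilated_causal_conv(window, weights=[1.0, -1.0], dilation=1)
```

Under these assumptions the receptive field is 8 laps, which is more than the 5-lap input, so the last output position can see the whole window.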

Vision-based Gap Calculation

The YOLOv8 computer vision model is used to identify teams from race footage, enabling gap calculation when telemetry data is unavailable or needs confirmation.

YOLOv8 Implementation

The system uses YOLOv8, a state-of-the-art object detection model that achieves over 90% mAP50 on team identification tasks. This model:

Takes video frames as input

Detects F1 cars in the frame

Identifies the team/driver through livery recognition

Calculates spatial relationships between detected cars

This provides an independent source of gap data that complements telemetry-based calculations.

Sources:

README.md 34-35

Integration with Expert System

The machine learning models are integrated with the expert system through a fact-based architecture. Each model outputs predictions that are converted into fact objects:

Class diagram: TelemetryFact (driver_number, lap_time, predicted_lap_time, position), DegradationFact (driver_number, compound_id, tire_age, degradation_rate, predicted_rates, performance_cliff_lap), and GapFact (lead_car, follow_car, gap_seconds, detection_confidence) extend the abstract Fact. XGBoostModel (predict_lap_times()), TCNModel (predict_degradation()), and YOLOv8Model (detect_cars(), calculate_gaps()) create these facts, which are inputs to F1StrategyEngine (get_recommendations()).

These facts trigger rules in the expert system, which then generates strategic recommendations based on the combined insights from all models.

Sources:

README.md 38-42
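As a rough illustration of the fact-based hand-off, the sketch below models the three fact types with the fields listed in the class diagram. Plain dataclasses stand in for the expert system's real fact classes, and the field types and the telemetry_fact_from_row() adapter are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TelemetryFact:
    driver_number: int
    lap_time: float
    predicted_lap_time: float
    position: int


@dataclass
class DegradationFact:
    driver_number: int
    compound_id: int
    tire_age: int
    degradation_rate: float
    predicted_rates: List[float] = field(default_factory=list)
    performance_cliff_lap: Optional[int] = None  # None: no cliff predicted


@dataclass
class GapFact:
    lead_car: int
    follow_car: int
    gap_seconds: float
    detection_confidence: float


def telemetry_fact_from_row(row):
    """Hypothetical adapter from one prediction row (a dict) to a fact."""
    return TelemetryFact(
        driver_number=row["DriverNumber"],
        lap_time=row["LapTime"],
        predicted_lap_time=row["PredictedLapTime"],
        position=row["Position"],
    )
```

Keeping the conversion in a small adapter like this means the rule engine only ever sees fact objects, never raw model output.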

Performance and Accuracy Metrics

The machine learning models are evaluated on various metrics to ensure reliable strategy recommendations:

| Model | Key Metrics | Performance |
| --- | --- | --- |
| XGBoost Lap Time Prediction | MAE, RMSE | MAE = 0.09s, RMSE = 0.15s |
| TCN Tire Degradation | Degradation Rate Accuracy | Within ±0.05s/lap |
| YOLOv8 Team Detection | mAP50 | >90% |

These metrics guide ongoing model improvements and help users understand the confidence level of strategy recommendations.

Sources:

README.md 33-35

lap_prediction.ipynb 108-116
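For readers unfamiliar with the two lap time metrics, here is a minimal sketch of how MAE and RMSE are computed. The lap times are made-up values, not results from the notebook.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average size of the prediction error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: penalizes large errors more than MAE."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical lap times in seconds.
actual = [80.0, 80.2, 80.5, 80.9]
predicted = [80.1, 80.2, 80.4, 81.2]

lap_mae = mae(actual, predicted)
lap_rmse = rmse(actual, predicted)
```

RMSE is never smaller than MAE, so a gap between the two (as in 0.09s vs 0.15s) suggests that a few laps carry noticeably larger errors than the rest.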

Future Model Enhancements

The machine learning subsystem is designed for extensibility, with plans for:

Enhanced weather impact modeling

Driver-specific performance modeling

Circuit-specific optimization models

Expanded vision-based analytics

These enhancements will continue to improve the accuracy and scope of the strategy recommendations.

Sources: scripts/ML_tyre_pred/N01_tire_prediction.ipynb 38-41


Lap Time Prediction


Purpose and Scope

The Lap Time Prediction system is a core component of the F1 Strategy Manager that estimates future lap times for drivers based on historical telemetry data, tire conditions, and other race factors. These predictions enable strategic decision-making, particularly for pit stop timing and race pace management.

For related prediction systems, see Tire Degradation Modeling which focuses specifically on tire performance decline over time.

System Overview

The lap time prediction system uses an XGBoost machine learning model trained on historical race data to predict lap times with high accuracy (Mean Absolute Error of approximately 0.09 seconds). The system processes telemetry data from FastF1 API, enriches it with sequential features, and generates predictions that feed into the F1 Strategy Engine's decision-making process.

Diagram: Data Collection (FastF1 API, Race Telemetry Data) → Data Processing (Data Validation, Feature Engineering, Sequential Feature Creation) → Prediction System (XGBoost Model, Prediction Formatting, Next Lap Prediction) → Output (F1 Strategy Engine, Streamlit Dashboard).

Sources: scripts/lap_prediction.ipynb 1-20

scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 15-22

Data Pipeline

Data Sources

The prediction system loads data from preprocessed parquet files containing comprehensive race telemetry:

| Data Source | File | Description | Key Fields |
| --- | --- | --- | --- |
| Laps | Spain_2023_laps.parquet | Individual lap telemetry | LapTime, Compound, TyreLife, SpeedI1, SpeedI2, SpeedFL, SpeedST, Position |
| Weather | Spain_2023_weather.parquet | Environmental conditions | AirTemp, TrackTemp, Humidity, Pressure, WindSpeed, WindDirection |
| Intervals | Spain_2023_openf1_intervals.parquet | Gap and position data | gap_to_leader, interval_in_seconds, undercut_window, drs_window |
| Pitstops | Spain_2023_pitstops.parquet | Pit stop events | PitInTime, PitOutTime, Compound, TyreLife, FreshTyre |

The load_all_data() function handles data loading and performs deduplication checks across datasets. The system processes 1,312 lap records, 154 weather measurements, 8,933 interval records, and 43 pit stop events for comprehensive race analysis.

Sources: scripts/lap_prediction.ipynb 124-172

scripts/lap_prediction.ipynb 224-259

Data Validation and Processing

The modular prediction system in N00_model_lap_prediction.py implements a structured pipeline for data processing:

Input Validation: validate_lap_data() checks for required columns and correct data types

Data Type Conversion: Ensures numerical values are properly formatted for speed measurements

Missing Value Handling: Adds LapNumber and placeholder LapTime columns when missing

Sequential Feature Creation: add_sequential_features() generates time-series relationships

Required Input Columns: DriverNumber, Stint, CompoundID, TyreAge, SpeedI1, SpeedI2, SpeedFL, SpeedST, Position

Sequence diagram: InputData → validate_lap_data() → add_sequential_features() → prepare_features_for_prediction() → XGBoost Model → format_lap_predictions().

validate_lap_data(): receives the raw telemetry CSV/DataFrame; checks required columns, validates data types, and adds LapNumber if missing; passes on a validated DataFrame.

add_sequential_features(): creates Prev_* columns, calculates *_Delta features, and generates LapTime_Trend; passes on a DataFrame with sequential features.

prepare_features_for_prediction(): aligns with model.feature_names_in_, handles missing/extra columns, and corrects the column order; passes on the feature matrix X.

XGBoost Model: runs model.predict(X).

format_lap_predictions(): calculates RMSE/MAE metrics, adds the PredictedLapTime column, and generates the next lap prediction; returns the formatted prediction results.

Sources: scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 86-143

scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 150-228
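The validation and feature alignment steps can be sketched as follows. This is a dependency-free approximation that uses lists of dicts instead of DataFrames; validate_rows(), align_features(), the row-order fallback for LapNumber, and the 0.0 fill value are assumptions, not the module's actual code.

```python
REQUIRED_COLUMNS = [
    "DriverNumber", "Stint", "CompoundID", "TyreAge",
    "SpeedI1", "SpeedI2", "SpeedFL", "SpeedST", "Position",
]
SPEED_COLUMNS = ["SpeedI1", "SpeedI2", "SpeedFL", "SpeedST"]


def validate_rows(rows):
    """Return cleaned copies of the rows, or None if a column is missing."""
    cleaned = []
    for index, row in enumerate(rows):
        missing = [col for col in REQUIRED_COLUMNS if col not in row]
        if missing:
            print(f"Row {index}: missing required columns {missing}")
            return None
        new_row = dict(row)
        for col in SPEED_COLUMNS:
            # Speeds must be numeric; float() raises ValueError otherwise.
            new_row[col] = float(new_row[col])
        # Assumption: when LapNumber is absent, fall back to row order.
        new_row.setdefault("LapNumber", index + 1)
        # Placeholder LapTime so later steps can rely on the key existing.
        new_row.setdefault("LapTime", None)
        cleaned.append(new_row)
    return cleaned


def align_features(rows, feature_names, fill_value=0.0):
    """Build a feature matrix whose columns follow feature_names exactly.

    Features the model expects but a row lacks get fill_value (an assumed
    default); keys the model does not know are dropped.
    """
    return [[row.get(name, fill_value) for name in feature_names]
            for row in rows]
```

Ordering the columns by the model's own feature list, rather than by the input's column order, is what protects against silently feeding a value into the wrong feature slot.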

Model Architecture

XGBoost Model

The lap time prediction system utilizes an XGBoost regression model, which was selected for its:

High accuracy on time-series data

Robustness to outliers

Ability to capture non-linear relationships

Fast prediction speed for real-time strategy decisions

The model is trained on historical race data and achieves a Mean Absolute Error (MAE) of approximately 0.09 seconds, making it reliable for strategic decision-making.

Sources: scripts/lap_prediction.ipynb 15-16

Feature Engineering

The add_sequential_features() function creates time-series features essential for accurate lap time prediction:

Core Features

CompoundID: Tire compound mapping (1=SOFT, 2=MEDIUM, 3=HARD, 4=INTERMEDIATE, 5=WET)

TyreAge: Number of laps on current tire set

Position: Current race position

Speed Measurements: SpeedI1, SpeedI2, SpeedFL, SpeedST at track sectors

Sequential Features (Generated)

Previous Lap Values: Prev_LapTime, Prev_SpeedI1, Prev_SpeedI2, Prev_SpeedFL, Prev_SpeedST, Prev_TyreAge

Delta Features: LapTime_Delta, SpeedI1_Delta, SpeedI2_Delta, SpeedFL_Delta, SpeedST_Delta

Trend Analysis: LapTime_Trend (second derivative for pace trajectory)

Feature Processing Details

Flowchart: Raw Telemetry Data → Group by DriverNumber → Group by Stint → Sort by LapNumber → len(stint_data) >= 2? If no: skip stint (insufficient laps). If yes: create sequential features: add Prev_* columns from lap i-1 → calculate *_Delta features → calculate LapTime_Trend (i>=2) → fill NaN values with 0 → combine all processed rows.

The system requires a minimum of 2 laps per stint to generate meaningful sequential features; stints with fewer laps are skipped and the skip is logged.

Sources: scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 31-46

scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 150-228
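The grouping and feature steps above can be approximated without pandas as shown below. This is a sketch under stated assumptions: rows are dicts, the first lap of each stint is dropped because it has no previous lap, and LapTime_Trend is taken as the change in LapTime_Delta. The real add_sequential_features() works on a DataFrame and may differ in these details.

```python
from collections import defaultdict

SPEED_COLUMNS = ["SpeedI1", "SpeedI2", "SpeedFL", "SpeedST"]


def add_sequential_features(rows):
    """Add Prev_*, *_Delta and LapTime_Trend per driver and stint."""
    stints = defaultdict(list)
    for row in rows:
        stints[(row["DriverNumber"], row["Stint"])].append(row)

    out = []
    for key, laps in stints.items():
        if len(laps) < 2:
            print(f"Skipping stint {key}: fewer than 2 laps")
            continue
        laps = sorted(laps, key=lambda r: r["LapNumber"])
        for i in range(1, len(laps)):
            cur, prev = laps[i], laps[i - 1]
            feat = dict(cur)
            feat["Prev_LapTime"] = prev["LapTime"]
            feat["Prev_TyreAge"] = prev["TyreAge"]
            feat["LapTime_Delta"] = cur["LapTime"] - prev["LapTime"]
            for col in SPEED_COLUMNS:
                feat[f"Prev_{col}"] = prev[col]
                feat[f"{col}_Delta"] = cur[col] - prev[col]
            if i >= 2:
                # Change in delta: is the pace loss speeding up or easing?
                prev_delta = prev["LapTime"] - laps[i - 2]["LapTime"]
                feat["LapTime_Trend"] = feat["LapTime_Delta"] - prev_delta
            else:
                # Mirrors "fill NaN values with 0" when no trend exists yet.
                feat["LapTime_Trend"] = 0.0
            out.append(feat)
    return out
```

Grouping by stint as well as driver keeps a pit stop from producing a misleading delta between an old-tire lap and a fresh-tire lap.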

Prediction Process

Model Loading and Initialization

The load_lap_prediction_model() function handles model initialization and feature extraction:

Flowchart: model_path parameter → model_path is None? If yes: use the default ../../outputs/week3/xgb_sequential_model.pkl; otherwise use the provided path → pickle.load(model) → read model.feature_names_in_ → Load successful? If yes: print "Model loaded with N features" and return (model, feature_names). If no: print "Error loading model" and return (None, None).

The function extracts model.feature_names_in_ to ensure feature alignment during prediction and provides comprehensive error handling with logging.

Sources: scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 49-79
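A minimal sketch of that loading logic, assuming the model is a pickled object exposing feature_names_in_. The default path and printed messages follow the flowchart; the exact exception handling in the module is not shown on this page and is an assumption here.

```python
import pickle

DEFAULT_MODEL_PATH = "../../outputs/week3/xgb_sequential_model.pkl"


def load_lap_prediction_model(model_path=None):
    """Return (model, feature_names), or (None, None) on any failure."""
    path = DEFAULT_MODEL_PATH if model_path is None else model_path
    try:
        # pickle.load can execute arbitrary code: only load trusted files.
        with open(path, "rb") as handle:
            model = pickle.load(handle)
        feature_names = list(model.feature_names_in_)
    except Exception as exc:  # missing file, bad pickle, missing attribute
        print(f"Error loading model: {exc}")
        return None, None
    print(f"Model loaded with {len(feature_names)} features")
    return model, feature_names
```

Returning a (None, None) pair instead of raising lets the caller stop the pipeline with a single `if model is None` check.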

Complete Prediction Pipeline

The predict_lap_times() function orchestrates the entire prediction workflow:

Flowchart: predict_lap_times() takes input_data (CSV path or DataFrame) and the model_path parameter. Step 1: load_lap_prediction_model(); if model is None, print "Failed to load model" and return None. Step 2: validate_lap_data(); if df is None, print "Data validation failed" and return None. Step 3: add_sequential_features(); if len(df_seq) == 0, print "Failed to create sequential features" and return None. Step 4: prepare_features_for_prediction(). Step 5: model.predict(X). Step 6: format_lap_predictions(). Finally, print "Predictions complete: N rows" and return result_df.

The pipeline includes comprehensive error handling at each step with informative logging and graceful failure modes.

Sources: scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 359-406

Future Lap Prediction

The format_lap_predictions() function generates next lap predictions for strategic planning:

Flowchart: Prediction Results DataFrame → for each DriverNumber → for each Stint → take driver_stint_data.iloc[-1] → create next_lap dictionary → set prediction fields (DriverNumber: same; Stint: same; LapNumber: last + 1; CompoundID: same; TyreAge: last + 1; Position: same; LapTime: None; PredictedLapTime: same as last; IsNextLapPrediction: True) → append to next_lap_predictions[] → repeat while more drivers/stints remain → pd.concat([result_df, next_lap_df]) → fill missing IsNextLapPrediction values with False.

This synthetic next-lap row reuses the current race state with lap number and tire age incremented by one and carries over the last predicted lap time, giving strategists an immediate next-lap performance estimate.

Sources: scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 318-350
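The next-lap row construction can be sketched in plain Python as below. The field values follow the flowchart; build_next_lap_rows() and the list-of-dicts representation are hypothetical stand-ins for the DataFrame logic in format_lap_predictions().

```python
def build_next_lap_rows(result_rows):
    """Append one synthetic next-lap row per (driver, stint)."""
    last_by_stint = {}
    for row in result_rows:
        key = (row["DriverNumber"], row["Stint"])
        current = last_by_stint.get(key)
        if current is None or row["LapNumber"] > current["LapNumber"]:
            last_by_stint[key] = row

    next_rows = []
    for last in last_by_stint.values():
        next_rows.append({
            "DriverNumber": last["DriverNumber"],
            "Stint": last["Stint"],
            "LapNumber": last["LapNumber"] + 1,
            "CompoundID": last["CompoundID"],
            "TyreAge": last["TyreAge"] + 1,
            "Position": last["Position"],
            "LapTime": None,  # the lap has not been driven yet
            "PredictedLapTime": last["PredictedLapTime"],  # carried over
            "IsNextLapPrediction": True,
        })

    # Historical rows are flagged False, matching the final fill step.
    history = [dict(row, IsNextLapPrediction=False) for row in result_rows]
    return history + next_rows
```

The IsNextLapPrediction flag lets downstream consumers separate measured laps from the synthetic row without inspecting LapTime.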

Integration with Strategy Engine

The lap time predictions are a critical input to the F1 Strategy Engine, which uses them to evaluate potential strategy options:

Diagram: Lap Time Prediction System (XGBoost Model → Lap Time Predictions, Next Lap Estimates) → Expert System (TelemetryFact Creation → F1LapTimeRules → F1CompleteStrategyEngine) → Strategy Output (Strategy Recommendations, Lap Time Visualization).

The predictions flow into the expert system as facts, where they are processed by rule-based components to generate strategic recommendations.

Sources: scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 405-406

Implementation Structure

Main Training Notebook

The scripts/lap_prediction.ipynb notebook provides the complete training and evaluation pipeline:

Data Loading: Uses load_all_data() to process Spain 2023 Grand Prix data from parquet files

Feature Engineering: Creates sequential features with 5-lap windows for time-series analysis

Model Training: Implements XGBoost with GridSearchCV for hyperparameter optimization

Evaluation: Generates comprehensive performance metrics and visualizations

Model Persistence: Saves trained model to outputs/week3/xgb_sequential_model.pkl

Modular Prediction Interface

The N00_model_lap_prediction.py module provides production-ready prediction capabilities:

| Function | Purpose | Key Parameters |
| --- | --- | --- |
| load_lap_prediction_model() | Load trained XGBoost model | model_path (optional) |
| validate_lap_data() | Validate input telemetry data | input_data (CSV path or DataFrame) |
| add_sequential_features() | Create time-series features | df (validated DataFrame) |
| prepare_features_for_prediction() | Align features with model | df, feature_names |
| format_lap_predictions() | Structure prediction output | df, predictions array |
| predict_lap_times() | Complete prediction pipeline | input_data, model_path, include_next_lap |

Compound Mapping Constants

compound_colors = {1: 'red', 2: 'yellow', 3: 'gray', 4: 'green', 5: 'blue'}

compound_names = {1: 'SOFT', 2: 'MEDIUM', 3: 'HARD', 4: 'INTERMEDIATE', 5: 'WET'}

Sources: scripts/lap_prediction.ipynb 7-16

scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 31-46

scripts/ML_tyre_pred/ML_utils/N00_model_lap_prediction.py 359-406

Performance and Limitations

Model Accuracy

The XGBoost model achieves strong performance metrics:

Mean Absolute Error (MAE): 0.09 seconds

Handles various tire compounds and track conditions reliably

Limitations

Requires at least two laps in a stint to generate sequential features

Performance may degrade in highly unusual race conditions

Dependent on the quality and completeness of telemetry data

Sources: scripts/lap_prediction.ipynb 15-16


Tire Degradation Modeling


Purpose and Scope

This document details the tire degradation modeling subsystem of the F1 Strategy Manager. The system uses LSTM and TCN (Temporal Convolutional Network) models to predict how tire performance deteriorates over successive laps in a Formula 1 race. This capability is critical for optimal pit stop strategy planning and feeds directly into the expert system's rule engine.

The system implements sequence-based machine learning models that use 5-lap windows to predict degradation patterns for the next 3-5 laps, with fuel-adjusted metrics to isolate true tire degradation from fuel load effects.

For information about how these predictions are used in strategy decisions, see Degradation Rules.

Sources: scripts/ML_tyre_pred/N01_tire_prediction.ipynb 7-42

Core Concepts

Tire Degradation in Formula 1

Tire degradation in Formula 1 racing refers to the progressive loss of tire performance over the course of a stint. This degradation is reflected in increasing lap times as tires wear out. The degradation pattern is not strictly linear and depends on multiple factors:

Tire compound (Soft=1, Medium=2, Hard=3)

Track conditions (temperature, surface)

Driving style

Fuel load

The system models this complex relationship using sequence-based machine learning techniques (LSTM and TCN) that can capture non-linear patterns in sequential lap data.

Implementation Approach

The tire degradation prediction system implements the following approach:

Sequence Length: 5 laps (input) → predict next 3-5 laps

Features: Tire age, compound, lap time trends, fuel load

Target: Derived degradation metric or direct lap time prediction

Models: LSTM network for primary predictions, XGBoost with quantile regression for uncertainty estimation

Sources: scripts/ML_tyre_pred/N01_tire_prediction.ipynb 24-36

Compound Mappings

The system uses standardized compound mappings defined in the utility module:

| CompoundID | Compound Name | Color Coding |
| --- | --- | --- |
| 1 | SOFT | red |
| 2 | MEDIUM | yellow |
| 3 | HARD | gray |
| 4 | INTERMEDIATE | green |
| 5 | WET | blue |

Sources: scripts/ML_tyre_pred/ML_utils/N01_tire_prediction.py 8-19

Fuel Effect Adjustment

A key aspect of accurate tire degradation modeling is isolating the tire effect from the fuel effect. As cars burn fuel during a race, they become lighter and therefore faster, which can mask the true tire degradation.

The system uses a constant of 0.055 seconds per lap (LAP_TIME_IMPROVEMENT_PER_LAP) as the empirical lap time gain from fuel burn. This gain is added back to each lap time, in proportion to the number of laps since the baseline lap, to create "fuel-adjusted" metrics that more accurately reflect pure tire degradation.

FuelAdjustedLapTime = ActualLapTime + (LapsFromBaseline * 0.055)

Sources: scripts/ML_tyre_pred/ML_utils/N01_tire_prediction.py 21-97

Data Processing Pipeline

The tire degradation modeling system follows a sophisticated data processing pipeline that transforms raw telemetry data into prediction-ready inputs.

Data Pipeline Diagram

Diagram: Raw Telemetry Data (CSV Files) → Load Sequential Data (pd.read_csv()) → calculate_fuel_adjusted_metrics() (fuel effect removal) → calculate_degradation_rate() (rate calculations) → Sequential Feature Engineering (previous lap features, deltas, trends) → Model-Ready Sequences (5-lap windows). The diagram also includes the plotting functions plot_lap_time_deltas() and plot_fuel_adjusted_degradation().

Sources: scripts/ML_tyre_pred/N01_tire_prediction.ipynb 464-467

scripts/ML_tyre_pred/ML_utils/N01_tire_prediction.py 25-97

scripts/ML_tyre_pred/ML_utils/N01_tire_prediction.py 100-135

Required Input Data

The tire degradation model requires specific telemetry data for each lap:

| Column Name | Description | Purpose |
| --- | --- | --- |
| LapTime | Time taken to complete the lap | Primary performance metric |
| CompoundID | Tire compound (1=Soft, 2=Medium, 3=Hard) | Determines degradation pattern |
| TyreAge | Number of laps tire has completed | Primary predictor of degradation |
| Stint | Current stint number | Tracks tire changes |
| SpeedI1, SpeedI2, SpeedFL, SpeedST | Speed at various track sectors | Additional performance indicators |
| FuelLoad | Estimated fuel load (normalized) | Used in fuel adjustment |
| Position | Current race position | Race context |
| DriverNumber | Driver's race number | Identifies the driver |
| LapsSincePitStop | Laps since last pit stop | Alternative tire age metric |
| DRSUsed | DRS usage indicator | Performance modifier |
| TeamID | Team identifier | Team-specific patterns |

The system also generates derived sequential features including Prev_LapTime, LapTime_Delta, speed deltas, and LapTime_Trend for enhanced predictive capability.

Sources: scripts/ML_tyre_pred/N01_tire_prediction.ipynb 464-467

Degradation Metrics Calculation

The calculate_fuel_adjusted_metrics() function calculates several key metrics to quantify tire degradation:

FuelAdjustedLapTime: Actual lap time with fuel effect added back

FuelAdjustedDegAbsolute: Absolute degradation compared to baseline

FuelAdjustedDegPercent: Percentage degradation compared to baseline

DegradationRate: Rate of change in lap time from one lap to the next

Fuel Adjustment Process

Diagram: Input Data (LapTime, TyreAge, CompoundID) → Establish Baseline (TyreAge=1 or minimum available) → LapsFromBaseline = TyreAge - baseline_tire_age → FuelEffect = LapsFromBaseline * 0.055 → FuelAdjustedLapTime = LapTime + FuelEffect → Degradation Metrics (FuelAdjustedDegAbsolute, FuelAdjustedDegPercent) → calculate_degradation_rate() (DegradationRate per tire age) → Output: enhanced data with degradation metrics.

Sources: scripts/ML_tyre_pred/ML_utils/N01_tire_prediction.py 25-135
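The fuel adjustment process can be sketched for a single stint as follows. The 0.055 s/lap constant and the formulas come from this page; fuel_adjusted_metrics(), the list-of-dicts input, and computing DegradationRate on the fuel-adjusted lap time are assumptions made for illustration.

```python
LAP_TIME_IMPROVEMENT_PER_LAP = 0.055  # seconds gained per lap from fuel burn


def fuel_adjusted_metrics(laps):
    """Compute fuel-adjusted degradation metrics for ONE stint.

    laps: list of dicts with TyreAge and LapTime.
    """
    laps = sorted(laps, key=lambda r: r["TyreAge"])
    baseline = laps[0]  # TyreAge == 1 when present, else the minimum

    out = []
    prev_adjusted = None
    for lap in laps:
        laps_from_baseline = lap["TyreAge"] - baseline["TyreAge"]
        fuel_effect = laps_from_baseline * LAP_TIME_IMPROVEMENT_PER_LAP
        adjusted = lap["LapTime"] + fuel_effect
        deg_abs = adjusted - baseline["LapTime"]
        deg_pct = deg_abs / baseline["LapTime"] * 100.0
        # Assumption: the rate is the lap-to-lap change in adjusted time.
        rate = 0.0 if prev_adjusted is None else adjusted - prev_adjusted
        out.append({
            "TyreAge": lap["TyreAge"],
            "FuelAdjustedLapTime": adjusted,
            "FuelAdjustedDegAbsolute": deg_abs,
            "FuelAdjustedDegPercent": deg_pct,
            "DegradationRate": rate,
        })
        prev_adjusted = adjusted
    return out


# Hypothetical stint: raw lap times look flat, yet the tires are degrading.
stint = [
    {"TyreAge": 1, "LapTime": 80.0},
    {"TyreAge": 2, "LapTime": 80.0},
    {"TyreAge": 3, "LapTime": 80.1},
]
metrics = fuel_adjusted_metrics(stint)
```

In this example the second lap matches the first on the stopwatch, but after adding back the fuel gain it shows 0.055 s of degradation that the raw times hide.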

Sequential Feature Engineering

The system treats tire degradation as a sequential problem using historical lap data. The sequential dataset includes:

Previous lap features: Prev_LapTime, Prev_SpeedI1, Prev_SpeedI2, Prev_SpeedFL, Prev_SpeedST, Prev_TyreAge

Delta features: LapTime_Delta, SpeedI1_Delta, SpeedI2_Delta, SpeedFL_Delta, SpeedST_Delta

Trend features: LapTime_Trend

The approach uses 5-lap windows to predict future degradation patterns for the next 3-5 laps, focusing on capturing both short-term performance changes and longer-term degradation trends.

Sources: scripts/ML_tyre_pred/N01_tire_prediction.ipynb 1000-1004
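One way to turn a stint into 5-lap input windows with multi-lap targets is a sliding window, sketched below. The function name, the horizon of 3 laps (the lower end of the stated 3-5 range), and the sample values are assumptions; the notebook's own windowing code is not shown on this page.

```python
def make_windows(values, input_len=5, horizon=3):
    """Split one stint's series into (input, target) pairs.

    Each input holds input_len consecutive laps and each target the
    horizon laps that follow. Stints that are too short yield no pairs.
    """
    windows = []
    last_start = len(values) - input_len - horizon
    for start in range(last_start + 1):
        x = values[start:start + input_len]
        y = values[start + input_len:start + input_len + horizon]
        windows.append((x, y))
    return windows


# Hypothetical degradation values for a 10-lap stint.
stint = [0.00, 0.02, 0.05, 0.07, 0.10, 0.14, 0.19, 0.25, 0.32, 0.40]
pairs = make_windows(stint)
```

A 10-lap stint produces 3 training pairs with these settings, while any stint shorter than 8 laps produces none, which is worth keeping in mind when stints are short.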

Model Architecture

LSTM and TCN Models

The tire degradation prediction system implements both LSTM (Long Short-Term Memory) and TCN (Temporal Convolutional Network) architectures for capturing temporal patterns in tire degradation data.

LSTM Architecture

Diagram: Input Sequence (5 laps with features) → LSTM Layer 1 (hidden_size units) → Dropout Layer (regularization) → LSTM Layer 2 (hidden_size units) → Dropout Layer (regularization) → Dense Layer (output_size predictions) → Output: next 3-5 lap predictions.

TCN Architecture

Diagram: Input Sequence (5 laps with features) → 1D Convolutional Layers (dilated convolutions) → Residual Connections (skip connections) → ReLU Activation (non-linear transforms) → Global Pooling (temporal aggregation) → Dense Layers (final predictions) → Output: degradation predictions.

Sources: scripts/ML_tyre_pred/N01_tire_prediction.ipynb 24-26

Model Implementation Details

The system uses PyTorch for neural network implementation:

LSTM Implementation:

Multi-layer LSTM with dropout regularization

Handles variable sequence lengths

Outputs predictions for multiple future laps

TCN Implementation:

1D convolutional layers with dilated convolutions

Residual connections for gradient flow

Temporal pooling for sequence aggregation

Training Approach:

Sequence length: 5 laps (input) → predict next 3-5 laps

Features include tire age, compound, lap time trends, fuel load

Both direct lap time prediction and derived degradation metrics

Alternative Modeling Approach

The system also implements XGBoost with quantile regression for uncertainty estimation, providing confidence bounds around degradation predictions (10th, 50th, 90th percentiles).

Sources: scripts/ML_tyre_pred/N01_tire_prediction.ipynb 24-41
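Quantile regression is generally trained by minimizing the pinball (quantile) loss, which is what pushes each model toward its target percentile. The sketch below shows that loss in plain Python; it is not taken from the notebook, does not reproduce XGBoost's own objective, and uses made-up values.

```python
def pinball_loss(actual, predicted, quantile):
    """Average pinball loss for one quantile level in (0, 1).

    Under-predictions cost quantile per unit of error and
    over-predictions cost (1 - quantile) per unit.
    """
    total = 0.0
    for a, p in zip(actual, predicted):
        diff = a - p
        if diff >= 0:
            total += quantile * diff
        else:
            total += (quantile - 1.0) * diff
    return total / len(actual)


# Hypothetical degradation rate (s/lap) and a prediction that is too low.
actual = [0.10]
too_low = [0.05]

loss_p90 = pinball_loss(actual, too_low, quantile=0.9)
loss_p10 = pinball_loss(actual, too_low, quantile=0.1)
```

The same under-prediction costs nine times more at the 90th percentile than at the 10th, so the upper-bound model learns to sit above most observations and the lower-bound model below them.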

Model Training and Evaluation

Training Process

The system trains both LSTM and TCN models using PyTorch:

Diagram: Sequential Lap Data (5-lap windows) → Data Preprocessing (normalization and feature engineering) → LSTM Model Training (PyTorch implementation), TCN Model Training (temporal convolutions), and XGBoost Training (quantile regression) → Model Evaluation (MAE, RMSE metrics) → Model Persistence (torch.save() / pickle).

Model Outputs Directory

The trained models are saved to ../../outputs/week5/models/ with the following structure:

LSTM models for tire degradation prediction

TCN models for sequence modeling

XGBoost models for uncertainty quantification

Model metadata and training statistics

Sources: scripts/ML_tyre_pred/N01_tire_prediction.ipynb 91-92

Visualization and Analysis

Degradation Analysis Functions

The system provides comprehensive visualization capabilities through utility functions:

Diagram: Tire Degradation Visualization: Tire Degradation Data (fuel-adjusted metrics) feeds plot_lap_time_deltas() (delta vs tire age by compound), plot_speed_vs_tire_age() (sector speeds vs tire age), plot_regular_vs_adjusted_degradation() (comparison of fuel effects), plot_fuel_adjusted_degradation() (absolute degradation trends), plot_fuel_adjusted_percentage_degradation() (percentage degradation), and plot_degradation_rate() (rate of degradation).

Key Visualization Functions

| Function | Purpose | Output |
| --- | --- | --- |
| plot_lap_time_deltas() | Shows lap time changes by tire age | Line plots by compound |
| plot_speed_vs_tire_age() | Sector speed degradation analysis | Speed trends over tire life |
| plot_regular_vs_adjusted_degradation() | Compares raw vs fuel-adjusted metrics | Side-by-side comparison |
| plot_fuel_adjusted_degradation() | Pure tire degradation trends | Compound-specific degradation |
| plot_degradation_rate() | Rate of performance loss | Degradation rate per lap |

All plotting functions use compound-specific colors and return matplotlib figures for integration into analysis workflows.

Sources: scripts/ML_tyre_pred/ML_utils/N01_tire_prediction.py 138-416

Integration with Strategy Engine

The tire degradation predictions are integrated into the F1 Strategy Engine through degradation facts and rules.

System Integration Diagram

Diagram: FastF1 Telemetry Data (CSV lap data) → calculate_fuel_adjusted_metrics() and calculate_degradation_rate() → LSTM Model (torch.nn.LSTM), TCN Model (1D convolutions), and XGBoost Model (quantile regression) → DegradationFact (expert system facts) → F1StrategyEngine (Experta-based rules) → F1DegradationRules (tire-specific rules) → Strategy Recommendations (pit stop timing).

Integration with Expert System

The tire degradation predictions feed into the expert system to trigger rules like:

High Degradation Pit Stop: When degradation exceeds thresholds

Performance Cliff Detection: Detecting imminent severe performance drops

Comparative Strategy Evaluation: Assessing different compound choices

The output includes:

Predicted degradation rates for next 3-5 laps

Uncertainty bounds (10th, 50th, 90th percentiles)

Compound-specific degradation patterns

Fuel-adjusted performance metrics

These predictions serve as input facts for the F1DegradationRules component in the expert system.

Sources: scripts/ML_tyre_pred/N01_tire_prediction.ipynb 7-42

Conclusion

The tire degradation modeling subsystem provides crucial strategic intelligence for F1 race strategy. By accurately predicting how tire performance will evolve over future laps, it enables more informed pit stop decisions and compound selections.

Key strengths of the system include:

Fuel-adjusted metrics that isolate true tire degradation

Compound-specific modeling that accounts for different degradation patterns

Multiple model families (LSTM, TCN, and XGBoost with quantile regression) that provide both point predictions and uncertainty bounds

Integration with expert system for actionable strategy recommendations

This module represents a critical component in the F1 Strategy Manager's ability to provide data-driven strategy recommendations.


Vision-based Gap Calculation

This document covers the computer vision system for detecting Formula 1 cars in video footage and calculating time/distance gaps between vehicles. The system uses YOLO-based object detection to identify and track F1 cars, enabling real-time gap analysis for strategic decision-making.

For information about integrating gap data with strategic rules, see Gap Analysis Rules. For details about the overall machine learning pipeline, see Machine Learning Models.

System Overview

The vision-based gap calculation system processes F1 race video footage to extract structured gap data between competing cars. It combines computer vision techniques with domain-specific knowledge of F1 racing to provide accurate measurements that feed into the broader strategy analysis pipeline.

Core Capabilities

Real-time Car Detection: Uses fine-tuned YOLO models to detect and classify F1 cars by team

Gap Measurement: Calculates both distance (meters) and time (seconds) gaps between consecutive vehicles

Object Tracking: Maintains consistent car identification across video frames

Data Extraction: Exports structured CSV data for integration with the expert system

Interactive Visualization: Provides real-time overlay with gap measurements and team identification

Sources: scripts/gap_calculation.ipynb 1-50

scripts/YOLO_fine_tune.ipynb 1-30

Architecture Overview

Input Sources: F1 race video (MP4 files); YOLO model weights (model_anti_alpine.pt, yolo11n.pt)

Detection Pipeline: cv2.VideoCapture → frame processing (resize and normalize) → YOLO model inference → car detection

Tracking System: object ID assignment (track consistency); team classification (confidence thresholds); classification history (stabilization)

Gap Analysis: calculate_gap() for distance and time; car dimensions (5.63 m length); 300 km/h reference speed (83.33 m/s)

Output Systems: real-time visualization (OpenCV display); CSV data export (structured gap data); expert system (gap-based rules)

Sources: scripts/gap_calculation.ipynb 15-26

scripts/gap_calculation.ipynb 346-352

scripts/gap_calculation.ipynb 477-754

YOLO Model Configuration

The system uses fine-tuned YOLO models trained specifically for F1 car detection and team classification. The fine-tuned model reaches an overall mAP50 of 0.815 with inference at roughly 36 ms per frame on GPU.

Model Loading and Device Configuration

| Configuration | Value | Purpose |
| --- | --- | --- |
| DEVICE | 'cuda:0' or 'cpu' | GPU acceleration when available |
| FRAME_WIDTH | 1280 | Processing resolution for balance of speed/accuracy |
| CAR_LENGTH_METERS | 5.63 | Real F1 car length for scale calculation |
| GAP_DETECTION_THRESHOLD | 0.25 | Low threshold to maximize car detections |

The model weights are loaded from trained checkpoints. The primary model is model_anti_alpine.pt, which is fine-tuned for F1 car detection.
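The configuration values above could be grouped as in the following minimal sketch. The `GapVisionConfig` class and `pick_device` helper are hypothetical names introduced for illustration; only the four values come from the configuration listed above. The sketch avoids importing PyTorch so that it runs anywhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapVisionConfig:
    """Values from the configuration above; the class itself is illustrative."""
    device: str
    frame_width: int = 1280
    car_length_meters: float = 5.63
    gap_detection_threshold: float = 0.25


def pick_device(cuda_available: bool) -> str:
    """Prefer the first CUDA device, otherwise fall back to CPU."""
    return "cuda:0" if cuda_available else "cpu"


# With PyTorch installed, the flag would typically come from
# torch.cuda.is_available(); it is hard-coded here to stay dependency-free.
config = GapVisionConfig(device=pick_device(cuda_available=False))
print(config.device)  # cpu
```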

Model Architecture: input frame (1280x720) → YOLOv8 backbone (feature extraction) → feature pyramid (multi-scale detection) → detection head (10 F1 teams + background)

Team Classes: Ferrari, Mercedes, Red Bull, McLaren, Aston Martin, Alpine, Williams, Haas, Kick Sauber, Racing Bulls

Performance Metrics: mAP50 0.815; precision 0.654; recall 0.794; inference ~36 ms/frame

Sources: scripts/gap_calculation.ipynb 346-351

scripts/gap_calculation.ipynb 381-423

scripts/YOLO_fine_tune.ipynb 304-315

Gap Calculation Algorithm

The core gap calculation converts pixel distances between detected cars into real-world measurements using the known dimensions of F1 cars as a scale reference.

Distance and Time Conversion

The calculate_gap() function implements the mathematical conversion from pixel space to physical measurements:

| Parameter | Formula | Purpose |
| --- | --- | --- |
| Pixel Distance | np.hypot(cx2 - cx1, cy2 - cy1) | Euclidean distance between car centers |
| Scale Factor | CAR_LENGTH_METERS / avg_width | Pixels-to-meters conversion |
| Physical Distance | pixel_distance * scale | Distance in meters |
| Time Gap | physical_distance / 83.33 | Time at 300 km/h reference speed |

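The algorithm assumes a fixed reference speed of 300 km/h (300 / 3.6 ≈ 83.33 m/s) to convert physical distances into time gaps, so the reported gap is an approximation that is most accurate when the cars are travelling near that speed. These time gaps are the strategically relevant quantity for identifying overtaking opportunities.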
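The conversion described above can be sketched as follows. This is a minimal reconstruction from the formulas on this page, not the notebook's actual calculate_gap() code: the tuple-based signature and the error check are assumptions, and `math.hypot` stands in for `np.hypot` to avoid a NumPy dependency.

```python
import math
from typing import Tuple

CAR_LENGTH_METERS = 5.63     # reference car length from the configuration
REFERENCE_SPEED_MPS = 83.33  # 300 km/h

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def calculate_gap(box1: Box, box2: Box) -> Tuple[float, float]:
    """Return (distance_meters, gap_seconds) between two bounding boxes."""
    # Box centers.
    cx1, cy1 = (box1[0] + box1[2]) / 2, (box1[1] + box1[3]) / 2
    cx2, cy2 = (box2[0] + box2[2]) / 2, (box2[1] + box2[3]) / 2
    pixel_distance = math.hypot(cx2 - cx1, cy2 - cy1)

    # The average box width in pixels is taken to span one car length.
    avg_width = ((box1[2] - box1[0]) + (box2[2] - box2[0])) / 2
    if avg_width <= 0:
        raise ValueError("bounding boxes must have positive width")

    scale = CAR_LENGTH_METERS / avg_width  # meters per pixel
    distance_meters = pixel_distance * scale
    return distance_meters, distance_meters / REFERENCE_SPEED_MPS


distance, gap = calculate_gap((100, 200, 300, 260), (500, 200, 700, 260))
print(f"{distance:.2f} m, {gap:.3f} s")  # 11.26 m, 0.135 s
```

In the example both boxes are 200 pixels wide, so one pixel corresponds to 5.63 / 200 = 0.02815 m and the 400-pixel center distance maps to 11.26 m.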

Input Detection: Car 1 bounding box (x1, y1, x2, y2) and Car 2 bounding box (x1, y1, x2, y2)

Center Calculation: for each box, cx = (x1 + x2) / 2 and cy = (y1 + y2) / 2, giving (cx1, cy1) for Car 1 and (cx2, cy2) for Car 2

Distance Calculation: pixel_distance = sqrt((cx2 - cx1)² + (cy2 - cy1)²); avg_width = (box1_width + box2_width) / 2; scale = 5.63 m / avg_width

Output Metrics: distance_meters = pixel_distance × scale; gap_seconds = distance_meters / 83.33

Sources: scripts/gap_calculation.ipynb 439-461

Object Tracking and Classification

The system maintains consistent object identification across video frames to enable accurate gap tracking over time. This involves both spatial tracking and team classification stabilization.

Tracking Algorithm

| Component | Method | Purpose |
| --- | --- | --- |
| ID Assignment | Distance-based matching | Link detections across frames |
| Classification History | Rolling 5-frame buffer | Stabilize team identification |
| Confidence Thresholds | Team-specific values | Handle model uncertainty |
| Spatial Tracking | 100-pixel proximity threshold | Associate objects between frames |

The tracking system uses team-specific confidence thresholds to handle the varying detection reliability across different F1 teams:

class_thresholds = {
    'Williams': 0.90,  # Very high - difficult to classify
    'Alpine': 0.90,    # Very high - model struggles
    'McLaren': 0.30,   # Low - easily detected
    'Red Bull': 0.85,  # High - distinctive livery
    # ... other teams
}
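To make the tracking components concrete, here is a minimal sketch that combines distance-based ID assignment (100-pixel radius), a rolling 5-frame classification history, and a per-team confidence threshold. `SimpleTracker` and its methods are hypothetical names, and the greedy nearest-center matching is a simplification chosen for brevity; the notebook's actual tracking code may differ.

```python
import math
from collections import Counter, deque
from typing import Deque, Dict, Optional, Tuple

MATCH_RADIUS_PX = 100  # proximity threshold described above
HISTORY_LENGTH = 5     # rolling classification buffer

Point = Tuple[float, float]


class SimpleTracker:
    """Illustrative nearest-center tracker; not the notebook's implementation."""

    def __init__(self) -> None:
        self.centers: Dict[int, Point] = {}
        self.history: Dict[int, Deque[str]] = {}
        self._next_id = 0

    def update(self, center: Point, team: str, conf: float, threshold: float) -> int:
        """Match one detection to a track and record its team vote."""
        track_id = self._nearest(center)
        if track_id is None:
            track_id = self._next_id
            self._next_id += 1
            self.history[track_id] = deque(maxlen=HISTORY_LENGTH)
        self.centers[track_id] = center
        # Below-threshold classifications fall back to the generic label.
        self.history[track_id].append(team if conf >= threshold else "F1 Car")
        return track_id

    def stable_team(self, track_id: int) -> str:
        """Majority vote over the last few frames."""
        return Counter(self.history[track_id]).most_common(1)[0][0]

    def _nearest(self, center: Point) -> Optional[int]:
        """Return the closest existing track within the match radius, if any."""
        best_id, best_dist = None, float(MATCH_RADIUS_PX)
        for track_id, (x, y) in self.centers.items():
            dist = math.hypot(center[0] - x, center[1] - y)
            if dist <= best_dist:
                best_id, best_dist = track_id, dist
        return best_id


tracker = SimpleTracker()
a = tracker.update((400, 300), "McLaren", 0.80, threshold=0.30)
b = tracker.update((430, 310), "Williams", 0.60, threshold=0.90)  # weak vote
c = tracker.update((435, 312), "McLaren", 0.75, threshold=0.30)
print(a == b == c, tracker.stable_team(a))  # True McLaren
```

The low-confidence Williams classification is recorded as the generic "F1 Car" label, so a single misclassified frame does not flip the track's team. The sketch does not expire stale tracks or prevent two detections in the same frame from claiming one track, both of which a production tracker would need.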

Sources: scripts/gap_calculation.ipynb 405-423

scripts/gap_calculation.ipynb 557-604

Video Processing Pipeline

The main processing pipeline handles both real-time visualization and batch data extraction through two primary functions: process_video_with_yolo() and extract_gaps_from_video().

Real-time Processing

The real-time pipeline provides interactive visualization with the following capabilities:

| Feature | Controls | Functionality |
| --- | --- | --- |
| Threshold Adjustment | + / - keys | Modify detection sensitivity |
| Video Navigation | d key | Skip forward 10 seconds |
| Exit Control | q key | Terminate processing |
| Visual Feedback | Color-coded boxes | Team identification confidence |
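The keyboard controls could be handled along the lines of the sketch below. `ViewerState`, `handle_key`, and the 0.05 threshold step are assumptions made for this example; the page does not state how much each key press changes the threshold. In an OpenCV display loop the key would typically be read with `cv2.waitKey(1) & 0xFF` and converted with `chr()`.

```python
from dataclasses import dataclass


@dataclass
class ViewerState:
    """Hypothetical mutable state for the interactive viewer."""
    threshold: float = 0.25    # starting detection threshold
    skip_seconds: float = 0.0  # pending forward skip
    running: bool = True


def handle_key(key: str, state: ViewerState, step: float = 0.05) -> ViewerState:
    """Apply one key press; `step` is an assumed increment."""
    if key == "+":
        state.threshold = min(1.0, round(state.threshold + step, 2))
    elif key == "-":
        state.threshold = max(0.0, round(state.threshold - step, 2))
    elif key == "d":
        state.skip_seconds += 10.0  # skip forward 10 seconds
    elif key == "q":
        state.running = False       # terminate processing
    return state


state = ViewerState()
for key in "+d-q":
    handle_key(key, state)
print(state.threshold, state.skip_seconds, state.running)  # 0.25 10.0 False
```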

Batch Data Extraction

The batch processing system samples video at configurable intervals (default 10 seconds) to extract structured gap data:

Input Configuration: video_path, sample_interval_seconds = 10, output_csv path

Processing Loop: read video frame → timestamp = frame / fps → is this a sample frame? → if yes, YOLO inference → calculate gaps

Data Structure: frame, timestamp, car1_id, car2_id, car1_team, car2_team, distance_meters, gap_seconds

Output: CSV file with structured gap data; gap statistics and summary
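The sampling step of the batch loop can be illustrated with a small helper that decides which frames are sample frames. `iter_sample_frames` is a hypothetical function written for this page; it assumes samples are taken every `fps * sample_interval_seconds` frames starting at frame 0, which may differ from the notebook's exact rule.

```python
from typing import Iterator, Tuple


def iter_sample_frames(
    total_frames: int, fps: float, sample_interval_seconds: float = 10.0
) -> Iterator[Tuple[int, float]]:
    """Yield (frame_index, timestamp_seconds) for each sampled frame."""
    if fps <= 0 or sample_interval_seconds <= 0:
        raise ValueError("fps and sample interval must be positive")
    # Number of frames between consecutive samples (at least one).
    step = max(1, round(fps * sample_interval_seconds))
    for frame in range(0, total_frames, step):
        yield frame, frame / fps


samples = list(iter_sample_frames(total_frames=900, fps=30.0))
print(samples)  # [(0, 0.0), (300, 10.0), (600, 20.0)]
```

Only the yielded frames would go through YOLO inference and gap calculation, which is what keeps batch extraction cheap relative to frame-by-frame processing.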

Sources: scripts/gap_calculation.ipynb 815-1207

scripts/gap_calculation.ipynb 1216-1248

Data Output Format

The system exports gap measurements in a structured CSV format designed for integration with the expert system and further analysis.

CSV Schema

| Column | Data Type | Description | Example |
| --- | --- | --- | --- |
| frame | Integer | Video frame number | 1245 |
| timestamp | Float | Time in seconds from video start | 62.25 |
| car1_id | Integer | Unique identifier for leading car | 3 |
| car2_id | Integer | Unique identifier for following car | 7 |
| car1_team | String | Team name or "F1 Car" if uncertain | "Ferrari" |
| car2_team | String | Team name or "F1 Car" if uncertain | "Mercedes" |
| distance_meters | Float | Physical gap in meters | 45.2 |
| gap_seconds | Float | Time gap at 300 km/h | 0.54 |

Integration Points

The CSV output integrates with the broader F1 Strategy Manager system through several pathways:

Expert System Facts: Gap data transforms into GapFact objects for rule evaluation

Strategic Analysis: Time gaps inform undercut/overcut opportunity detection

Dashboard Visualization: Real-time gap trends in the Streamlit interface

Historical Analysis: Long-term gap pattern analysis for strategy optimization
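As a sketch of the first integration step, the snippet below writes the example row from the CSV schema to an in-memory file and parses it back into typed records. `GapRecord` and `parse_row` are hypothetical; the fields of the real GapFact class are not shown on this page, so the record simply mirrors the CSV columns.

```python
import csv
import io
from dataclasses import dataclass

FIELDNAMES = [
    "frame", "timestamp", "car1_id", "car2_id",
    "car1_team", "car2_team", "distance_meters", "gap_seconds",
]


@dataclass
class GapRecord:
    """Hypothetical typed view of one CSV row."""
    frame: int
    timestamp: float
    car1_id: int
    car2_id: int
    car1_team: str
    car2_team: str
    distance_meters: float
    gap_seconds: float


def parse_row(row: dict) -> GapRecord:
    """Convert the string values produced by csv.DictReader to typed fields."""
    return GapRecord(
        frame=int(row["frame"]),
        timestamp=float(row["timestamp"]),
        car1_id=int(row["car1_id"]),
        car2_id=int(row["car2_id"]),
        car1_team=row["car1_team"],
        car2_team=row["car2_team"],
        distance_meters=float(row["distance_meters"]),
        gap_seconds=float(row["gap_seconds"]),
    )


# Write the schema's example row to an in-memory file, then read it back.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
writer.writeheader()
writer.writerow({
    "frame": 1245, "timestamp": 62.25, "car1_id": 3, "car2_id": 7,
    "car1_team": "Ferrari", "car2_team": "Mercedes",
    "distance_meters": 45.2, "gap_seconds": 0.54,
})
buffer.seek(0)
records = [parse_row(row) for row in csv.DictReader(buffer)]
print(records[0].car1_team, records[0].gap_seconds)  # Ferrari 0.54
```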

Sources: scripts/gap_calculation.ipynb 1272-1299

scripts/gap_calculation.ipynb 1185-1207

Performance Characteristics

On a CUDA GPU the system runs at roughly 25-30 FPS, which is suitable for live race analysis. CPU-only processing drops to about 6-8 FPS and is better suited to batch extraction.

Computational Requirements

| Metric | GPU (CUDA) | CPU Only | Notes |
| --- | --- | --- | --- |
| Inference Time | ~36 ms/frame | ~150 ms/frame | YOLOv8 model performance |
| Processing FPS | ~25-30 FPS | ~6-8 FPS | Including visualization overhead |
| Memory Usage | ~2 GB VRAM | ~1 GB RAM | Model and frame buffers |
| Detection Accuracy | mAP50: 0.815 | Same | Model-dependent, not hardware |

Accuracy Metrics

The YOLO model performance varies significantly by F1 team due to livery distinctiveness:

| Team | mAP50 | Precision | Recall | Notes |
| --- | --- | --- | --- | --- |
| Haas | 0.995 | High | High | Distinctive white livery |
| Racing Bulls | 0.995 | High | High | Clear visual features |
| Mercedes | 0.995 | High | High | Silver distinctive |
| McLaren | 0.775 | Medium | Medium | Orange/blue challenging |
| Alpine | 0.811 | Medium | High | Blue sometimes confused |
| Williams | 0.450 | Low | High | Most challenging to classify |

Sources: scripts/YOLO_fine_tune.ipynb 304-320

scripts/gap_calculation.ipynb 702-706

Configuration and Deployment

The system provides flexible configuration options for different deployment scenarios, from development testing to production race analysis.

Key Configuration Parameters

# Detection sensitivity
GAP_DETECTION_THRESHOLD = 0.25  # Low for maximum car detection

# Visual processing
FRAME_WIDTH = 1280  # Balance of speed and accuracy
class_colors = {  # Team-specific visualization
    'Ferrari': (0, 0, 255),       # Red in BGR format
    'Mercedes': (200, 200, 200),  # Silver
    # ... other teams
}

# Physical constants
CAR_LENGTH_METERS = 5.63  # F1 car reference dimension
speed_mps = 83.33  # 300km/h reference speed

Deployment Considerations

GPU Acceleration: Strongly recommended for real-time processing

Model Weights: Requires access to trained YOLO checkpoint files

Video Input: Supports standard MP4 formats with configurable resolution

Output Storage: Configurable CSV export with automatic directory creation
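Automatic directory creation for the CSV export might look like the sketch below. `prepare_output_csv` and the example path are hypothetical; the sketch only shows the standard-library pattern of creating missing parent directories before writing the header.

```python
import csv
import tempfile
from pathlib import Path
from typing import List


def prepare_output_csv(output_csv: str, fieldnames: List[str]) -> Path:
    """Create missing parent directories and write the CSV header."""
    path = Path(output_csv)
    path.parent.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    with path.open("w", newline="") as handle:
        csv.DictWriter(handle, fieldnames=fieldnames).writeheader()
    return path


# A temporary directory keeps the example from touching real output folders.
with tempfile.TemporaryDirectory() as tmp:
    target = prepare_output_csv(f"{tmp}/outputs/race_01/gaps.csv", ["frame", "gap_seconds"])
    print(target.exists())  # True
```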

Sources: scripts/gap_calculation.ipynb 381-402

scripts/gap_calculation.ipynb 1190-1200
