03 01 lap time prediction - VforVitorio/F1_Strat_Manager GitHub Wiki

Lap Time Prediction

Data Pipeline

Data Sources

The prediction system relies on multiple data sources to build a comprehensive view of race conditions:

Data Source | Description                  | Key Fields
------------|------------------------------|------------------------------------------------
Laps        | Individual lap telemetry     | LapTime, Compound, TyreLife, speed measurements
Weather     | Environmental conditions     | AirTemp, TrackTemp, Humidity, WindSpeed
Intervals   | Gap information between cars | gap_to_leader, interval_in_seconds
Pitstops    | Pit stop information         | PitInTime, PitOutTime, compound change

The system primarily focuses on lap-specific data including tire compounds, tire age, sector times, and speed measurements at various points on the track.

Data Validation and Processing

Before prediction can occur, the system validates input data through the following steps:

  1. Input Validation: Checks for required columns and correct data types
  2. Data Type Conversion: Ensures numerical values are properly formatted
  3. Missing Value Handling: Adds placeholders or calculates values for missing data points
  4. Sequential Feature Creation: Generates features that capture time-series relationships
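
The first three validation steps can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `validate_and_clean`, the choice of required and numeric columns, and the use of plain dicts instead of a DataFrame are all assumptions made to keep the example dependency-free.

```python
# Assumed column sets; the real system may require a different list.
REQUIRED_COLUMNS = ("Driver", "LapNumber", "LapTime", "Compound", "TyreLife")
NUMERIC_COLUMNS = ("LapNumber", "LapTime", "TyreLife")


def validate_and_clean(laps):
    """Validate a list of lap records (dicts) and return cleaned copies.

    Hypothetical helper covering input validation, data type conversion,
    and placeholder insertion for missing values.
    """
    cleaned = []
    for index, lap in enumerate(laps):
        # 1. Input validation: every required column must be present.
        missing = [col for col in REQUIRED_COLUMNS if col not in lap]
        if missing:
            raise ValueError(f"lap {index} is missing required columns: {missing}")

        row = dict(lap)  # copy so the caller's data is left untouched
        for col in NUMERIC_COLUMNS:
            value = row[col]
            if value is None or value == "":
                # 3. Missing value handling: keep an explicit placeholder.
                row[col] = None
            else:
                # 2. Data type conversion: float() raises ValueError on bad input.
                row[col] = float(value)
        cleaned.append(row)
    return cleaned
```

Failing fast on a missing column while tolerating a missing value is one possible design choice: a missing column usually signals a broken data feed, whereas a single empty cell is common in live timing data.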

Model Architecture

XGBoost Model

The lap time prediction system utilizes an XGBoost regression model, which was selected for its:

  • High accuracy on time-series data
  • Robustness to outliers
  • Ability to capture non-linear relationships
  • Fast prediction speed for real-time strategy decisions

The model is trained on historical race data and achieves a Mean Absolute Error (MAE) of approximately 0.09 seconds, making it reliable for strategic decision-making.

Feature Importance

The model relies on several key feature types:

  1. Current State Features:
    • Tire compound (SOFT, MEDIUM, HARD, etc.)
    • Tire age (number of laps)
    • Current position in race
    • Speed measurements (SpeedI1, SpeedI2, SpeedFL, SpeedST)
  2. Sequential Features:
    • Previous lap time
    • Speed deltas between consecutive laps
    • Lap time trends
  3. External Factors:
    • Track status
    • Team/driver identifier
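
The sequential features listed above could be derived roughly as shown below. This is a sketch under stated assumptions: the output names `PrevLapTime`, `SpeedST_Delta`, and `LapTimeTrend` are hypothetical, and the trend is computed from the two preceding laps only, so that the lap time being predicted never leaks into its own features.

```python
from collections import defaultdict


def add_sequential_features(laps):
    """Return copies of the lap records with per-driver sequential features.

    Each record is a dict with Driver, LapNumber, LapTime and SpeedST keys.
    The added feature names are illustrative, not the project's actual names.
    """
    by_driver = defaultdict(list)
    for lap in laps:
        by_driver[lap["Driver"]].append(lap)

    enriched = []
    for driver_laps in by_driver.values():
        history = []  # laps already seen for this driver, in lap order
        for lap in sorted(driver_laps, key=lambda item: item["LapNumber"]):
            prev = history[-1] if history else None
            prev2 = history[-2] if len(history) >= 2 else None

            row = dict(lap)
            # Previous lap time (None on a driver's first recorded lap).
            row["PrevLapTime"] = prev["LapTime"] if prev else None
            # Speed delta between this lap and the previous one.
            row["SpeedST_Delta"] = lap["SpeedST"] - prev["SpeedST"] if prev else None
            # Lap time trend over the two preceding laps.
            row["LapTimeTrend"] = prev["LapTime"] - prev2["LapTime"] if prev2 else None

            enriched.append(row)
            history.append(lap)
    return enriched
```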

Prediction Process

Step 1: Model Loading

The prediction process begins by loading the pre-trained XGBoost model.
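
A minimal sketch of this step, assuming the trained model was serialized with Python's `pickle` module. The function name `load_model` and the storage format are assumptions. XGBoost also provides its own `save_model` / `load_model` methods, which would be an equally plausible choice.

```python
import pickle
from pathlib import Path


def load_model(model_path):
    """Load a serialized model object from disk (hypothetical helper).

    Only unpickle files from a trusted source: pickle can execute
    arbitrary code while loading.
    """
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"model file not found: {path}")
    with path.open("rb") as handle:
        return pickle.load(handle)
```

Checking for the file explicitly gives a clearer error message than the one raised by `open`, which helps when the model path is configured elsewhere in the system.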

Step 2: Prediction Pipeline

The complete prediction pipeline consists of the following steps:

  1. Load model - Retrieves the trained XGBoost model
  2. Validate data - Ensures input data meets requirements
  3. Add sequential features - Creates time-series based features
  4. Prepare features - Aligns input with model expectations
  5. Make predictions - Generates lap time estimates
  6. Format results - Structures output for consumption by other systems
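
The steps above can be sketched end to end as follows. Everything here is illustrative: `predict_lap_times`, `FEATURE_ORDER`, and the output keys are hypothetical names, and `StubModel` is a stand-in with a made-up formula so the example runs without the trained XGBoost model. Step 1 (loading) is assumed to have happened already, so the model is passed in as an argument.

```python
FEATURE_ORDER = ("TyreLife", "PrevLapTime", "SpeedST")  # assumed feature layout
REQUIRED = ("Driver", "LapNumber", "LapTime", "TyreLife", "SpeedST")


class StubModel:
    """Stand-in regressor, NOT the real model: previous lap time plus
    an invented 0.05 s of degradation per lap of tire age."""

    def predict(self, matrix):
        return [row[1] + 0.05 * row[0] for row in matrix]


def predict_lap_times(laps, model):
    # 2. Validate data: every record needs the required columns.
    for lap in laps:
        missing = [col for col in REQUIRED if col not in lap]
        if missing:
            raise ValueError(f"missing columns: {missing}")

    # 3. Add sequential features: previous lap time per driver.
    ordered = sorted(laps, key=lambda lap: (lap["Driver"], lap["LapNumber"]))
    rows, prev = [], None
    for lap in ordered:
        same_driver = prev is not None and prev["Driver"] == lap["Driver"]
        # On a driver's first lap, fall back to that lap's own time (placeholder).
        prev_time = prev["LapTime"] if same_driver else lap["LapTime"]
        rows.append(dict(lap, PrevLapTime=prev_time))
        prev = lap

    # 4. Prepare features: fixed column order, numeric values only.
    matrix = [[float(row[name]) for name in FEATURE_ORDER] for row in rows]

    # 5. Make predictions.
    predictions = model.predict(matrix)

    # 6. Format results for downstream consumers.
    return [
        {
            "Driver": row["Driver"],
            "LapNumber": row["LapNumber"],
            "PredictedLapTime": round(float(pred), 3),
        }
        for row, pred in zip(rows, predictions)
    ]
```

Any object with a `predict` method that accepts a list of feature rows can replace `StubModel`, which keeps the pipeline testable without the trained model file.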

Step 3: Next Lap Prediction

A key feature of the system is its ability to predict the next lap time based on the current state:

  1. Identifies the last lap for each driver
  2. Creates a synthetic next lap entry with incremented tire age
  3. Predicts the lap time for this future state

This next lap prediction is crucial for real-time strategy decisions during a race.
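
The first two steps of next lap prediction might look like the sketch below. The function name `build_next_lap_rows` and the `PrevLapTime` key are hypothetical, and the sketch assumes the driver stays on the same set of tires. The resulting rows would then be passed through the normal prediction pipeline.

```python
def build_next_lap_rows(laps):
    """Create one synthetic 'next lap' record per driver (illustrative).

    Each input record is a dict with Driver, LapNumber, LapTime and TyreLife.
    """
    # Step 1: find the most recent lap of each driver.
    last_by_driver = {}
    for lap in laps:
        current = last_by_driver.get(lap["Driver"])
        if current is None or lap["LapNumber"] > current["LapNumber"]:
            last_by_driver[lap["Driver"]] = lap

    # Step 2: copy that lap and advance it by one lap.
    next_rows = []
    for last in last_by_driver.values():
        nxt = dict(last)
        nxt["LapNumber"] = last["LapNumber"] + 1
        nxt["TyreLife"] = last["TyreLife"] + 1   # tires are one lap older
        nxt["PrevLapTime"] = last["LapTime"]     # last known time becomes a feature
        nxt.pop("LapTime", None)                 # the target is not known yet
        next_rows.append(nxt)
    return next_rows
```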

Prediction Function API

The system exposes a primary function for generating predictions. This function accepts telemetry data as input and returns structured predictions that can be used by other system components.

Performance and Limitations

Model Accuracy

The XGBoost model achieves strong performance metrics:

  • Mean Absolute Error (MAE): 0.09 seconds
  • Handles various tire compounds and track conditions reliably
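
For reference, MAE is the average absolute difference between actual and predicted lap times. The sketch below shows the calculation. The helper name is hypothetical, and the lap times in the usage example are made up for illustration, not results from the model.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one value is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative lap times in seconds (invented values).
example_mae = mean_absolute_error([90.0, 91.0], [90.5, 90.5])
```

An MAE of 0.09 seconds therefore means that, averaged over the evaluation laps, predictions differ from the recorded lap time by about nine hundredths of a second.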