TrajectoryAnalyzer - jasonxfrazier/puddy GitHub Wiki

TrajectoryAnalyzer

TrajectoryAnalyzer is the main class in Puddy for feature extraction, anomaly detection, and normalcy scoring of normalized trajectories.

Initialization

from puddy import TrajectoryAnalyzer

analyzer = TrajectoryAnalyzer(collection)

collection: a TrajectoryCollection that has already been loaded and normalized.

Methods

extract_features

extract_features(trajectory: NormalizedTrajectory) -> Dict[str, float]

Extracts geometric and movement features from a single trajectory.

Features and formulas:

total_distance

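total_distance = ∑ sqrt((x₂-x₁)² + (y₂-y₁)² + (z₂-z₁)²)

The sum runs over every pair of consecutive points (x₁, y₁, z₁) and (x₂, y₂, z₂) in the trajectory.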

bounding_box_volume

bounding_box_volume = (x_max - x_min) * (y_max - y_min) * (z_max - z_min)

mean_altitude

The average z-coordinate across the trajectory.

altitude_range

The range (max - min) of z-coordinates.

path_linearity

Fraction of total variance explained by the first principal component, via PCA:
```
path_linearity = explained_variance_ratio_[0] (from sklearn PCA)
```
Values near 1 indicate a nearly straight path.

total_turns

Number of direction changes where the angle θ between consecutive segment vectors v₁ and v₂ exceeds 45°:
```
cos(θ) = (v₁ · v₂) / (||v₁|| * ||v₂||)
```
A turn is counted when θ > π/4.

aspect_ratio

aspect_ratio = max([x_range, y_range, z_range]) / min([x_range, y_range, z_range])
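
The formulas above can be sketched with NumPy alone. This is an illustrative approximation, not Puddy's implementation: `sketch_features` is a hypothetical name, the input is assumed to be an (N, 3) array of x, y, z points, and PCA is replaced by an SVD of the centered points (the squared singular values give the same explained-variance ratio). Skipping zero-length segments and returning `inf` for a zero minimum range are assumptions about edge cases the text does not cover.

```python
import numpy as np

def sketch_features(points):
    """Compute the seven documented features for an (N, 3) array of points."""
    points = np.asarray(points, dtype=float)
    segments = np.diff(points, axis=0)           # displacement between consecutive points
    seg_len = np.linalg.norm(segments, axis=1)   # length of each segment
    ranges = points.max(axis=0) - points.min(axis=0)  # x_range, y_range, z_range

    # PCA stand-in: variance explained by each component is proportional
    # to the squared singular values of the centered data.
    centered = points - points.mean(axis=0)
    var = np.linalg.svd(centered, compute_uv=False) ** 2
    linearity = var[0] / var.sum() if var.sum() > 0 else float("nan")

    # Angle between consecutive segments (assumption: zero-length segments are skipped).
    v1, v2 = segments[:-1], segments[1:]
    norms = seg_len[:-1] * seg_len[1:]
    valid = norms > 0
    cos_theta = np.einsum("ij,ij->i", v1[valid], v2[valid]) / norms[valid]
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    turns = int(np.sum(theta > np.pi / 4))

    min_range = ranges.min()
    return {
        "total_distance": float(seg_len.sum()),
        "bounding_box_volume": float(np.prod(ranges)),
        "mean_altitude": float(points[:, 2].mean()),
        "altitude_range": float(ranges[2]),
        "path_linearity": float(linearity),
        "total_turns": turns,
        # Assumption: a flat axis (zero range) yields an infinite aspect ratio.
        "aspect_ratio": float(ranges.max() / min_range) if min_range > 0 else float("inf"),
    }

# An L-shaped path with a final climb: two 90° turns, 4 units travelled.
features = sketch_features([[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 2]])
```

A perfectly straight path gives `path_linearity` of 1 and `total_turns` of 0, which makes a convenient sanity check for any reimplementation.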

prepare_features

prepare_features() -> np.ndarray

Extracts, validates, and standard-scales all features for all trajectories in the collection.
Invalid trajectories (containing NaN or inf) are dropped.
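
As a rough sketch of the two steps described here (dropping invalid rows, then standard scaling), assuming the per-trajectory features have already been collected into a 2D array. `sketch_prepare` is a hypothetical helper, not part of Puddy, and treating constant columns as having unit scale is an assumption about an edge case the text does not describe.

```python
import numpy as np

def sketch_prepare(feature_rows):
    """Drop rows containing NaN or inf, then standard-scale each column."""
    X = np.asarray(feature_rows, dtype=float)
    keep = np.isfinite(X).all(axis=1)   # True for rows with only finite values
    X = X[keep]
    mean = X.mean(axis=0)
    std = X.std(axis=0)                 # population standard deviation
    std[std == 0] = 1.0                 # assumption: constant columns are left at zero
    return (X - mean) / std, keep

rows = [[1.0, 10.0], [2.0, float("nan")], [3.0, 10.0], [float("inf"), 5.0]]
scaled, keep = sketch_prepare(rows)
# keep marks which input rows survived, so scores can be matched back to trajectories
```

Returning the `keep` mask alongside the scaled matrix is a design choice for this sketch: once rows are dropped, positions in the feature matrix no longer line up with positions in the original collection.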

train_anomaly_detector

train_anomaly_detector(method: str = "isolation_forest", **kwargs) -> None

Fits an anomaly detection model using the chosen method:

"isolation_forest": Uses scikit-learn's IsolationForest.
"lof": Uses scikit-learn's LocalOutlierFactor in novelty mode (novelty=True), which enables scoring with the fitted model.

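To illustrate the two methods, here is a minimal scikit-learn sketch of fitting either detector on a feature matrix. `sketch_fit_detector` is a hypothetical helper, the random 7-column matrix stands in for real scaled features, and forwarding `**kwargs` to the model constructor is an assumption about how the method behaves.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

def sketch_fit_detector(X, method="isolation_forest", **kwargs):
    """Fit one of the two documented detector types on feature matrix X."""
    if method == "isolation_forest":
        model = IsolationForest(**kwargs)
    elif method == "lof":
        # novelty=True is required for score_samples on data seen after fitting
        model = LocalOutlierFactor(novelty=True, **kwargs)
    else:
        raise ValueError(f"unknown method: {method!r}")
    return model.fit(X)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 7))            # stand-in for scaled trajectory features

forest = sketch_fit_detector(X, "isolation_forest", random_state=0)
lof = sketch_fit_detector(X, "lof", n_neighbors=20)

# For both models, score_samples returns higher values for more normal points.
raw_scores = forest.score_samples(X)
```

How Puddy converts raw detector output into its normalcy scores is not shown here; the sketch stops at the raw scikit-learn scores.
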
get_normalcy_scores

get_normalcy_scores() -> np.ndarray

Returns an array of normalcy scores for all trajectories.
Higher score = more normal. Lower score = more anomalous.

find_anomalies

find_anomalies(threshold: float = 0.2) -> List[Tuple[NormalizedTrajectory, float]]

Returns a list of (trajectory, score) pairs where the normalcy score is less than threshold.
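
The filtering rule can be sketched in a few lines. `sketch_find_anomalies` is a hypothetical stand-in, and the trajectory names and scores are made-up placeholders rather than real output.

```python
def sketch_find_anomalies(trajectories, scores, threshold=0.2):
    """Return (trajectory, score) pairs whose score is strictly below threshold."""
    return [(t, float(s)) for t, s in zip(trajectories, scores) if s < threshold]

# Placeholder data for illustration only
trajectories = ["traj_a", "traj_b", "traj_c", "traj_d"]
scores = [0.05, 0.90, 0.20, 0.15]

anomalies = sketch_find_anomalies(trajectories, scores)
# traj_c is excluded: its score equals the threshold and the comparison is strict
```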

get_normalcy_df

get_normalcy_df(id_column_name: str = "identifier") -> pd.DataFrame

Returns a DataFrame with each trajectory’s identifier and its normalcy score, sorted from most to least normal.
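
A minimal pandas sketch of the shape of this result, assuming identifiers and scores are available as parallel sequences. `sketch_normalcy_df` is hypothetical, and the `normalcy_score` column name is an assumption, since the text does not name the score column.

```python
import pandas as pd

def sketch_normalcy_df(identifiers, scores, id_column_name="identifier"):
    """Build an identifier/score table sorted from most to least normal."""
    df = pd.DataFrame({id_column_name: identifiers, "normalcy_score": scores})
    # Highest score first, since higher means more normal
    return df.sort_values("normalcy_score", ascending=False).reset_index(drop=True)

# Placeholder data for illustration only
df = sketch_normalcy_df(["a", "b", "c"], [0.1, 0.9, 0.5], id_column_name="flight_id")
```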

visualize_trajectories_sample

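import numpy as np

# Assumes `analyzer` has a trained detector and that
# visualize_trajectories_sample has already been imported.
scores = analyzer.get_normalcy_scores()

# Treat the bottom 20% of normalcy scores as anomalies
threshold = np.percentile(scores, 20)

visualize_trajectories_sample(
    analyzer.collection.trajectories,
    scores,
    normal_sample=20,
    show_all_anomalies=True,
    threshold=threshold,
)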

Plots, in 3D, a random sample of normal trajectories and (optionally) all anomalies, each colored by its normalcy score.

See the TrajectoryCollection, ColumnConfig, and other API docs for more details on preparing your data for analysis.