TrajectoryAnalyzer - jasonxfrazier/puddy GitHub Wiki
TrajectoryAnalyzer
TrajectoryAnalyzer is the main class in Puddy for feature extraction, anomaly detection, and normalcy scoring of normalized trajectories.
Initialization
from puddy import TrajectoryAnalyzer
analyzer = TrajectoryAnalyzer(collection)
- collection: a TrajectoryCollection that has already been loaded and normalized.
Methods
extract_features
extract_features(trajectory: NormalizedTrajectory) -> Dict[str, float]
Extracts geometric and movement features from a single trajectory.
Features and formulas:
- total_distance: total_distance = ∑ sqrt((x₂-x₁)² + (y₂-y₁)² + (z₂-z₁)²)
- bounding_box_volume: bounding_box_volume = (x_max - x_min) * (y_max - y_min) * (z_max - z_min)
- mean_altitude: The average z-coordinate across the trajectory.
- altitude_range: The range (max - min) of z-coordinates.
- path_linearity: Fraction of total variance explained by the first principal component, via PCA: path_linearity = explained_variance_ratio_[0] (from sklearn PCA)
- total_turns: Number of direction changes where the angle between consecutive segments exceeds 45°, computed from cos(θ) = (v₁ · v₂) / (||v₁|| * ||v₂||). Counts a turn if θ > π/4.
- aspect_ratio: aspect_ratio = max([x_range, y_range, z_range]) / min([x_range, y_range, z_range])
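The formulas above can be sketched with NumPy alone. This is an illustrative reimplementation, not Puddy's actual code: the function name sketch_features is hypothetical, and the input is assumed to be an (N, 3) array of x, y, z points. The PCA ratio is computed from singular values of the centered points, which matches sklearn's explained_variance_ratio_[0]. Skipping zero-length segments and returning inf for a degenerate aspect ratio are assumptions of this sketch.

```python
import numpy as np


def sketch_features(points):
    """Compute the documented features for an (N, 3) array of x, y, z points (N >= 3)."""
    pts = np.asarray(points, dtype=float)
    segments = np.diff(pts, axis=0)              # vectors between consecutive points
    seg_len = np.linalg.norm(segments, axis=1)   # Euclidean length of each segment
    ranges = pts.max(axis=0) - pts.min(axis=0)   # x_range, y_range, z_range

    # Share of variance along the first principal axis, taken from the
    # singular values of the centered points.
    centered = pts - pts.mean(axis=0)
    variances = np.linalg.svd(centered, compute_uv=False) ** 2
    linearity = variances[0] / variances.sum() if variances.sum() > 0 else float("nan")

    # Angle between each pair of consecutive segments.
    # Assumption: pairs involving a zero-length segment are skipped.
    v1, v2 = segments[:-1], segments[1:]
    norms = seg_len[:-1] * seg_len[1:]
    ok = norms > 0
    cos_theta = np.clip(np.sum(v1[ok] * v2[ok], axis=1) / norms[ok], -1.0, 1.0)
    turns = int(np.sum(np.arccos(cos_theta) > np.pi / 4))

    # Assumption: a flat bounding box (zero range on some axis) yields inf.
    aspect = float(ranges.max() / ranges.min()) if ranges.min() > 0 else float("inf")

    return {
        "total_distance": float(seg_len.sum()),
        "bounding_box_volume": float(ranges.prod()),
        "mean_altitude": float(pts[:, 2].mean()),
        "altitude_range": float(ranges[2]),
        "path_linearity": float(linearity),
        "total_turns": turns,
        "aspect_ratio": aspect,
    }


# Three unit steps along x, then y, then z: two 90° turns.
l_shaped = sketch_features([[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]])
# Points on a straight diagonal: no turns, fully linear.
straight = sketch_features([[0, 0, 0], [1, 1, 1], [2, 2, 2]])
```

For the L-shaped path, total_distance is 3.0 and total_turns is 2. The straight path has no turns and a path_linearity of 1 up to floating-point error.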
prepare_features
prepare_features() -> np.ndarray
Extracts, validates, and standard-scales all features for all trajectories in the collection.
Invalid trajectories (containing NaN or inf) are dropped.
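The validate-then-scale step might look roughly like the following. This is a sketch under assumptions, not Puddy's implementation: sketch_prepare is a hypothetical name, the input is assumed to be one row of feature values per trajectory, and scaling uses the population standard deviation (the convention sklearn's StandardScaler follows), with constant columns left unscaled.

```python
import numpy as np


def sketch_prepare(feature_rows):
    """Drop rows containing NaN or inf, then scale each column to zero mean and unit variance."""
    X = np.asarray(feature_rows, dtype=float)
    keep = np.isfinite(X).all(axis=1)   # True only for rows with no NaN or inf
    X = X[keep]
    mean = X.mean(axis=0)
    std = X.std(axis=0)                 # population standard deviation (ddof=0)
    std[std == 0] = 1.0                 # avoid dividing by zero on constant columns
    return (X - mean) / std, keep


rows = [
    [1.0, 10.0],
    [2.0, float("nan")],   # dropped: contains NaN
    [3.0, 30.0],
    [float("inf"), 5.0],   # dropped: contains inf
]
scaled, kept_mask = sketch_prepare(rows)
```

Returning the mask alongside the scaled matrix is a design choice of this sketch: it lets a caller map each remaining row back to the trajectory it came from after invalid rows are removed.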
train_anomaly_detector
train_anomaly_detector(method: str = "isolation_forest", **kwargs) -> None
Fits an anomaly detection model using the chosen method:
- "isolation_forest": uses IsolationForest
- "lof": uses LocalOutlierFactor (novelty mode)
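As a rough illustration of the two options, the sketch below builds the corresponding scikit-learn estimators directly. The helper name sketch_make_detector and the synthetic data are hypothetical, and how Puddy forwards **kwargs or turns raw model output into normalcy scores is not shown here. In scikit-learn, LocalOutlierFactor only exposes score_samples for new data when constructed with novelty=True, and for both estimators a lower score_samples value means a more abnormal point.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor


def sketch_make_detector(method="isolation_forest", **kwargs):
    """Return an unfitted scikit-learn detector for the given method name."""
    if method == "isolation_forest":
        return IsolationForest(**kwargs)
    if method == "lof":
        # novelty=True enables scoring of samples not seen during fit
        return LocalOutlierFactor(novelty=True, **kwargs)
    raise ValueError(f"Unknown method: {method!r}")


# Synthetic stand-in for a scaled feature matrix (200 trajectories, 7 features).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 7))

# One typical point near the center of the data and one far outside it.
X_new = np.array([[0.0] * 7, [50.0] * 7])

lof = sketch_make_detector("lof", n_neighbors=10).fit(X_train)
forest = sketch_make_detector("isolation_forest", random_state=0).fit(X_train)

lof_raw = lof.score_samples(X_new)        # lower = more abnormal
forest_raw = forest.score_samples(X_new)  # lower = more abnormal
```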
get_normalcy_scores
get_normalcy_scores() -> np.ndarray
Returns an array of normalcy scores for all trajectories.
Higher score = more normal. Lower score = more anomalous.
find_anomalies
find_anomalies(threshold: float = 0.2) -> List[Tuple[NormalizedTrajectory, float]]
Returns a list of (trajectory, score) pairs where the normalcy score is less than threshold.
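The filtering rule can be shown in a few lines of plain Python. This is a hypothetical stand-in (sketch_find_anomalies is not part of Puddy) that assumes trajectories and scores are aligned sequences. The comparison is strict, so a score exactly equal to threshold is not reported.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def sketch_find_anomalies(
    trajectories: Sequence[T],
    scores: Sequence[float],
    threshold: float = 0.2,
) -> List[Tuple[T, float]]:
    """Pair each trajectory with its score and keep those scoring strictly below threshold."""
    return [(t, float(s)) for t, s in zip(trajectories, scores) if s < threshold]


# Strings stand in for trajectory objects here.
flagged = sketch_find_anomalies(["a", "b", "c"], [0.9, 0.1, 0.2])
```

Here only "b" is flagged: 0.1 is below the default threshold of 0.2, while 0.2 itself is not.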
get_normalcy_df
get_normalcy_df(id_column_name: str = "identifier") -> pd.DataFrame
Returns a DataFrame with each trajectory’s identifier and its normalcy score, sorted from most to least normal.
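A frame of this shape could be assembled with pandas as sketched below. The helper name sketch_normalcy_df and the score column name normalcy_score are assumptions made for illustration; the actual column name Puddy uses is not stated here.

```python
import pandas as pd


def sketch_normalcy_df(identifiers, scores, id_column_name="identifier"):
    """Build a two-column frame sorted from most normal (highest score) to least."""
    df = pd.DataFrame(
        {
            id_column_name: list(identifiers),
            "normalcy_score": list(scores),  # hypothetical column name
        }
    )
    return df.sort_values("normalcy_score", ascending=False).reset_index(drop=True)


example_df = sketch_normalcy_df(["t1", "t2", "t3"], [0.3, 0.9, 0.1], id_column_name="flight_id")
```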
visualize_trajectories_sample
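import numpy as np

scores = analyzer.get_normalcy_scores()
# 20th percentile of the scores: the lowest-scoring 20% fall below this value
threshold = np.percentile(scores, 20)
visualize_trajectories_sample(
    analyzer.collection.trajectories,
    scores,
    normal_sample=20,
    show_all_anomalies=True,
    threshold=threshold
)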
Plots a random sample of normal trajectories and (optionally) all anomalies, colored by their normalcy score in 3D.
See the TrajectoryCollection, ColumnConfig, and other API docs for more details on preparing your data for analysis.