Meeting notes week 4 - davidlabee/Graph4Air GitHub Wiki

🗓️ Meeting Notes – 29 April 2025, 14:00

🗒️ Last Meeting’s Feedback

  • Graph Augmentation: Avoid over-densifying the graph; ensure new edges reflect functional similarity while preserving spatial structure.
  • GAT vs. GCN: GAT outperforms GCN; investigate why attention helps in this context.
  • Train/Test Splitting: Random splits may not fit the use case; focus on spatial interpolation and map quality over raw accuracy.
  • Aggregation: Consider the road-segment aggregation for hierarchical models; multi-graph by road type may be a faster alternative.
  • Baseline Documentation: Clearly describe and justify baseline models and evaluation choices in the wiki.

🗒️ Progress

Pieter

David

  • Drafted and refined the introduction section of the thesis report in LaTeX.
  • Conducted additional literature research on graph-augmentation techniques (similarity-based edges, spatial graphs).
  • Ran multiple parameter configurations of the augmentation on the full Amsterdam network; hit memory limits.
  • Implemented an optimized approach that compares only node pairs within a given spatial range; full results are still pending.
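
As a rough illustration of the range-limited comparison in the last item, the sketch below buckets nodes into a square grid so that only nearby pairs are ever compared. It is not David's implementation: the function names, the use of cosine similarity, and the `radius` and `min_sim` parameters are all assumptions made for the example.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def similar_pairs_within_range(coords, feats, radius, min_sim):
    """Return sorted (i, j, sim) with i < j, distance <= radius, sim >= min_sim.

    Hypothetical helper: nodes are bucketed into square cells of side `radius`,
    so any pair within range lies in the same cell or in adjacent cells.
    """
    cells = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        cells[(math.floor(x / radius), math.floor(y / radius))].append(i)

    edges = []
    for (cx, cy), members in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    for i in members:
                        # i < j ensures each unordered pair is handled once.
                        if i < j and math.dist(coords[i], coords[j]) <= radius:
                            sim = cosine(feats[i], feats[j])
                            if sim >= min_sim:
                                edges.append((i, j, sim))
    return sorted(edges)


# Toy data: node 2 has the same features as node 0 but is far away.
coords = [(0.0, 0.0), (1.0, 0.0), (100.0, 100.0), (0.5, 0.5)]
feats = [(1.0, 0.0), (1.0, 0.1), (1.0, 0.0), (0.0, 1.0)]
edges = similar_pairs_within_range(coords, feats, radius=2.0, min_sim=0.9)
```

Only pairs that fall in the same or neighbouring cells are scored, so memory and runtime scale with local node density rather than with the square of the network size.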

🗒️ New Meeting Notes

General Notes

  • External validation: Zhendong will provide Palmes tube data for final validation, and possibly for inclusion in training.
  • Train/Test Split: Use an 80% node mask for training: the model sees features for all nodes but labels only for the 80% of nodes in the training mask.
  • Evaluation Metrics: Validate using the Palmes dataset; report MAE and RMSE.
  • Dataset Imbalance: Highway segments dominate; consider penalizing over-represented classes in the loss.
  • Continue refining baseline GAT/GCN architectures and document choices in the wiki.
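
The masking and imbalance points above could look roughly like the sketch below. It is only an illustration in plain Python: the helper names are hypothetical, and inverse-frequency weighting is just one possible way to penalize over-represented classes, not a scheme agreed in the meeting.

```python
import random
from collections import Counter


def make_train_mask(num_nodes, train_frac=0.8, seed=0):
    """Boolean mask with about `train_frac` of the nodes marked as training nodes."""
    rng = random.Random(seed)
    idx = list(range(num_nodes))
    rng.shuffle(idx)
    train = set(idx[: int(round(train_frac * num_nodes))])
    return [i in train for i in range(num_nodes)]


def inverse_frequency_weights(road_types):
    """Per-node weight total / (n_classes * class_count): rare classes weigh more."""
    counts = Counter(road_types)
    total, n_classes = len(road_types), len(counts)
    return [total / (n_classes * counts[t]) for t in road_types]


def weighted_masked_mae(pred, target, mask, weights):
    """Weighted mean absolute error over the nodes where mask is True."""
    num = sum(w * abs(p - t) for p, t, m, w in zip(pred, target, mask, weights) if m)
    den = sum(w for m, w in zip(mask, weights) if m)
    return num / den


# Toy example: three highway segments and one residential segment.
road_types = ["highway", "highway", "highway", "residential"]
weights = inverse_frequency_weights(road_types)
mask = make_train_mask(num_nodes=10)
```

The loss is computed only on masked nodes, while a graph model would still receive the features of every node as input, which matches the split described above.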

Discussion Points

  • Discussed the external validation workflow with Palmes data, which could either be integrated into training or held out.
  • Revisited train/test masking strategy and its impact on interpolation quality.
  • Confirmed MAE/RMSE as primary metrics.
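
For reference, the two confirmed metrics can be written out in a few lines. The function names and the toy values are placeholders for illustration, not project data.

```python
import math


def mae(pred, obs):
    """Mean absolute error between predictions and observations."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    """Root mean squared error; squaring makes large errors count more than in MAE."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


# Placeholder values standing in for model predictions and held-out measurements.
pred = [10.0, 20.0, 30.0]
obs = [12.0, 20.0, 26.0]
print(f"MAE = {mae(pred, obs):.3f}, RMSE = {rmse(pred, obs):.3f}")
```

Reporting both is informative because RMSE is always at least as large as MAE, and a wide gap between them points to a few large errors rather than uniformly moderate ones.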

Demos and Updates

  • Pieter demonstrated his GitHub wiki updates on the baseline model architecture.
  • Pieter showcased progress on data aggregation; the current square-grid approach is functional.
  • David shared preliminary results of the optimized similarity algorithm (still running).

Feedback for David

  • Propose clustering nodes first to optimize feature-similarity augmentation.
  • Zhendong will grant access to multiple CPU cores to speed up pairwise similarity.
  • Zhendong to parallelize the similarity-computation function.
  • Instead of leaving nodes out, compute similarity across all node pairs.
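
To put rough numbers on the trade-off between clustering first and comparing all node pairs, the sketch below counts how many comparisons each option needs. The cluster labels are made up for illustration, and the helper names are hypothetical.

```python
from collections import Counter
from math import comb


def all_pairs(num_nodes):
    """Number of unordered node pairs when every node is compared with every other."""
    return comb(num_nodes, 2)


def within_cluster_pairs(labels):
    """Number of unordered pairs when only nodes in the same cluster are compared."""
    return sum(comb(size, 2) for size in Counter(labels).values())


# Hypothetical assignment of 10 nodes to 3 clusters.
labels = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
n_all = all_pairs(len(labels))
n_clustered = within_cluster_pairs(labels)
```

Restricting comparisons to clusters cuts the pair count sharply but never scores pairs that straddle two clusters, whereas the all-pairs option is exhaustive and therefore depends more on the extra CPU cores and parallelization mentioned above.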

Feedback for Pieter

  • Continue working on the current data aggregation method and start testing it with the models.
  • Check the statistics (distribution) of the current aggregation and try other options.
  • Think about an alternative graph construction in which highway segments are kept separate from the rest.
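
One possible reading of the last suggestion is to partition the existing edges by the road type of their endpoints. The sketch below assumes that nodes carry a road-type label and that edges are plain node-index pairs; both the data layout and the function name are hypothetical.

```python
def split_edges_by_road_type(edges, road_type, highway_label="highway"):
    """Partition edges into highway-only, non-highway-only, and mixed edges.

    edges: iterable of (u, v) node pairs.
    road_type: mapping from node id to a road-type string (assumed layout).
    """
    highway_edges, other_edges, cross_edges = [], [], []
    for u, v in edges:
        u_hw = road_type[u] == highway_label
        v_hw = road_type[v] == highway_label
        if u_hw and v_hw:
            highway_edges.append((u, v))
        elif not u_hw and not v_hw:
            other_edges.append((u, v))
        else:
            # Edge linking a highway node to a non-highway node.
            cross_edges.append((u, v))
    return highway_edges, other_edges, cross_edges


# Toy chain of four nodes: two highway segments followed by two residential ones.
edges = [(0, 1), (1, 2), (2, 3)]
road_type = {0: "highway", 1: "highway", 2: "residential", 3: "residential"}
highway_edges, other_edges, cross_edges = split_edges_by_road_type(edges, road_type)
```

Whether the mixed edges are dropped, kept in a third graph, or used to link the two subgraphs is a design decision left open here.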

Next Meeting: 13 May 2025, 11:00