Meeting notes week 4 - davidlabee/Graph4Air GitHub Wiki
🗓️ Meeting Notes – 29 April 2025, 14:00
🗒️ Last Meeting’s Feedback
- Graph Augmentation: Avoid over-densifying the graph; ensure new edges reflect functional similarity while preserving spatial structure.
- GAT vs. GCN: GAT outperforms GCN; investigate why attention helps in this context.
- Train/Test Splitting: Random splits may not fit the use case; focus on spatial interpolation and map quality over raw accuracy.
- Aggregation: Consider the road-segment aggregation for hierarchical models; multi-graph by road type may be a faster alternative.
- Baseline Documentation: Clearly describe and justify baseline models and evaluation choices in the wiki.
🗒️ Progress
Pieter
- Thought through the general GNN workflow → Thinking about the Model Architecture
- Drafted the Model Architecture page for the Baseline Models → Baseline Model Architecture
- Created a graph at a different aggregation level; shared progress and notes → Pieter's Graph Design
David
- Drafted and refined the introduction section of the thesis report in LaTeX.
- Conducted additional literature research on graph-augmentation techniques (similarity-based edges, spatial graphs).
- Ran multiple parameter configurations of the augmentation on the full Amsterdam network; hit memory limits.
- Implemented an optimized approach that compares only node pairs within a given spatial range; full results are still pending.
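A minimal sketch of the range-restricted comparison described in David's last item, not the actual implementation. It assumes planar node coordinates, cosine similarity on the feature vectors, and a simple grid bucketing scheme. The names `similarity_edges`, `radius` and `min_sim` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_edges(coords, feats, radius, min_sim):
    """Return sorted (i, j) pairs, i < j, with distance <= radius and cosine >= min_sim."""
    # Bucket nodes into square cells of side `radius`. Any two nodes within
    # `radius` of each other then share a cell or sit in adjacent cells,
    # so only the 3x3 neighbourhood of each cell has to be scanned.
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        cells[(math.floor(x / radius), math.floor(y / radius))].append(idx)

    edges = []
    for (cx, cy), members in cells.items():
        for dx, dy in product((-1, 0, 1), repeat=2):
            neighbours = cells.get((cx + dx, cy + dy), ())
            for i in members:
                for j in neighbours:
                    if i >= j:
                        continue  # visit each unordered pair once
                    if math.dist(coords[i], coords[j]) > radius:
                        continue
                    if cosine(feats[i], feats[j]) >= min_sim:
                        edges.append((i, j))
    return sorted(edges)
```

Raising `min_sim` or shrinking `radius` is one way to keep the augmented graph from becoming too dense, since both reduce the number of candidate edges.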
🗒️ New Meeting Notes
General Notes
- External validation: Zhendong will provide Palmes tube data for final validation; it may also be included in training.
- Train/Test Split: Use an 80% node mask for training: the model sees features for all nodes, but labels only for the 80% of nodes selected by the mask.
- Evaluation Metrics: Validate using the Palmes dataset; report MAE and RMSE.
- Dataset Imbalance: Highway segments dominate; consider penalizing over-represented classes in the loss.
- Continue refining baseline GAT/GCN architectures and document choices in the wiki.
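The masking and imbalance points above can be sketched as follows. This is an illustration under assumptions, not the agreed implementation: the mask is drawn uniformly at random with a fixed seed, and the imbalance is handled with inverse-frequency weights per road type. The function names and the `road_types` labels are hypothetical.

```python
import random
from collections import Counter


def make_train_mask(num_nodes, train_frac=0.8, seed=0):
    """Boolean list: True for nodes whose labels are visible during training."""
    rng = random.Random(seed)
    n_train = round(train_frac * num_nodes)
    train_ids = set(rng.sample(range(num_nodes), n_train))
    return [i in train_ids for i in range(num_nodes)]


def inverse_frequency_weights(road_types):
    """Per-node loss weights proportional to 1 / (size of the node's class).

    The weights are rescaled so that they average to 1 over all nodes.
    """
    counts = Counter(road_types)
    raw = [1.0 / counts[t] for t in road_types]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]
```

In a training loop, the mask would select which per-node loss terms are summed, and each selected term would be multiplied by its weight, so that over-represented classes such as highway segments contribute less per node.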
We Thought About the Model Architecture
- Discussed the external validation workflow with Palmes data, which could either be integrated into training or held out.
- Revisited train/test masking strategy and its impact on interpolation quality.
- Confirmed MAE/RMSE as primary metrics.
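For reference, a small sketch of how MAE and RMSE could be computed on the nodes left out of training. It assumes predictions and targets are plain per-node lists and that `train_mask` is a boolean list like the one discussed above; `evaluate_held_out` is a hypothetical helper name.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def evaluate_held_out(y_true, y_pred, train_mask):
    """Report MAE and RMSE on the nodes that were NOT used for training."""
    held = [i for i, is_train in enumerate(train_mask) if not is_train]
    true_held = [y_true[i] for i in held]
    pred_held = [y_pred[i] for i in held]
    return {"mae": mae(true_held, pred_held), "rmse": rmse(true_held, pred_held)}
```

The same two functions could be applied unchanged to the Palmes measurements if those are kept as a separate external validation set.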
We Also Did Some Things
- Pieter demonstrated his GitHub wiki updates on the baseline model architecture.
- Pieter showcased progress on data aggregation; the current square-grid approach is functional.
- David shared preliminary results of the optimized similarity algorithm (still running).
Feedback for David
- Propose clustering nodes first to optimize feature-similarity augmentation.
- Zhendong will grant access to multiple CPU cores to speed up pairwise similarity.
- Zhendong to parallelize the similarity-computation function.
- Instead of leaving nodes out, compute similarity across all node pairs.
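One possible way to spread an all-pairs similarity computation over several CPU cores is sketched below. It is an assumption-laden illustration, not the project's function: it uses cosine similarity, splits the rows of the upper triangle into interleaved blocks, and uses `concurrent.futures.ProcessPoolExecutor` from the standard library. The names `all_pairs_similarity`, `n_workers`, `n_blocks` and `min_sim` are hypothetical.

```python
import math
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def _cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def _row_block(rows, feats, min_sim):
    """Compare each node in `rows` with every later node; keep pairs >= min_sim."""
    out = []
    for i in rows:
        for j in range(i + 1, len(feats)):
            s = _cosine(feats[i], feats[j])
            if s >= min_sim:
                out.append((i, j, s))
    return out


def all_pairs_similarity(feats, min_sim=-1.0, n_workers=1, n_blocks=8):
    """Return sorted (i, j, similarity) triples over all pairs i < j."""
    n = len(feats)
    # Interleave rows across blocks: early rows have more partners than late
    # rows, so striding gives each block a comparable amount of work.
    blocks = [range(b, n, n_blocks) for b in range(n_blocks)]
    worker = partial(_row_block, feats=feats, min_sim=min_sim)
    if n_workers == 1:
        results = list(map(worker, blocks))  # serial fallback
    else:
        with ProcessPoolExecutor(max_workers=n_workers) as pool:
            results = list(pool.map(worker, blocks))
    return sorted(t for block in results for t in block)
```

Filtering by `min_sim` inside the worker keeps the returned list small, which matters because the full set of pairs grows quadratically with the number of nodes. On platforms that start worker processes with spawn, the call with `n_workers > 1` needs to sit under an `if __name__ == "__main__":` guard.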
Feedback for Pieter
- Continue working on this data aggregation method and start testing with the models.
- Check the stats (distribution) of the current aggregation and try other options.
- Think about an alternative graph construction in which highway segments are kept separate from the rest.
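As a starting point for the last item, the sketch below groups an edge list into one edge list per road type, so that highway segments can form their own graph. It assumes nodes are road segments with a single type label each; the function name and the `"mixed"` bucket for edges joining two different types are hypothetical choices.

```python
from collections import defaultdict


def split_edges_by_road_type(edges, road_type):
    """Group edges by the road type of their endpoints.

    Edges whose endpoints have different types go into the "mixed" group.
    """
    groups = defaultdict(list)
    for u, v in edges:
        key = road_type[u] if road_type[u] == road_type[v] else "mixed"
        groups[key].append((u, v))
    return dict(groups)
```

Whether the mixed edges are dropped, kept as a separate relation, or attached to one of the graphs is a design decision this sketch leaves open.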
Next Meeting: 13 May 2025, 11:00