Meeting notes week 3 - davidlabee/Graph4Air GitHub Wiki

🗓️ Meeting Notes – 29 April 2025, 14:00

👥 Attendees

  • David
  • Pieter
  • Zhendong
  • Jules

🧠 Discussion Points

📌 David’s Presentation – Subject Overview and Edge Testing

  • Summary:
    • David first introduced his thesis topic to Jules, explaining the goal of enhancing road network graphs for better air pollution modeling.
    • He presented five different augmentation strategies developed and tested (road_graph_strategies_subset-2.ipynb).
  • Strategies and Key Insights:
    1. Feature category-based selectivity (cosine similarity on traffic, population, land use, morphology).
    2. Sparse multi-modal agreement (≥3 feature domains agreeing).
    3. Distance-constrained similarity edges (500 m–20 km range).
    4. Top-K strongest similarity edges (K=2).
    5. Soft augmentation with similarity as edge weights.
    • Strategy 2 (Sparse Multi-Modal Agreement) currently shows the best balance between sparsity and meaningful connectivity.
  • Challenges:
    • Avoid over-densifying the graph.
    • Maintain functional similarity while preserving spatial structure.
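
To make the sparse multi-modal agreement rule concrete, here is a minimal sketch of how such candidate edges could be selected. It is not taken from the notebook: the function names, the 0.9 similarity threshold, and the toy feature vectors are assumptions for illustration, and the all-pairs loop is only practical for small graphs.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def agreement_edges(features, sim_threshold=0.9, min_domains=3):
    """Return node pairs (i, j) that are similar in at least `min_domains` domains.

    features: dict mapping a domain name to a list of per-node feature vectors.
    sim_threshold is an assumed value, not one reported in the meeting.
    """
    n = len(next(iter(features.values())))
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # Count the feature domains in which nodes i and j look alike.
            agreeing = sum(
                cosine(vecs[i], vecs[j]) >= sim_threshold
                for vecs in features.values()
            )
            if agreeing >= min_domains:
                edges.append((i, j))
    return edges

# Toy data: three nodes, four feature domains (values are made up).
features = {
    "traffic":    [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]],
    "population": [[1.0, 1.0], [1.0, 1.1], [1.0, 0.0]],
    "land_use":   [[0.0, 1.0], [0.1, 1.0], [1.0, 0.0]],
    "morphology": [[1.0, 2.0], [2.0, 1.0], [1.0, 2.0]],
}
edges = agreement_edges(features)
```

With this toy data only nodes 0 and 1 agree in three domains, so a single edge is added. Requiring agreement across several domains is what keeps the augmented graph sparse.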

📌 Pieter’s Presentation – Baseline GAT and GCN Models

  • Summary:
    • Pieter walked through the baseline models (Baseline_50m_Thesis_(COPY).ipynb).
    • GAT and GCN were trained on the raw 50 m-segment road graph without augmentation.
  • Key Insights:
    • GAT already outperforms GCN in initial experiments.
    • Likely reason: GAT uses an attention mechanism to weight important neighbors more heavily, which improves information aggregation.
  • Challenges:
    • Further analyze why attention benefits this context.
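
The difference between the two aggregation schemes can be illustrated with a small sketch. It is a simplification and not the baseline notebook's code: the GCN step is reduced to a plain neighbor mean, the learned projection matrix of GAT is omitted, and the attention vector `a` and node features are made-up values.

```python
import math

def mean_aggregate(h, neighbors):
    """GCN-style: every neighbor contributes equally (simplified to a plain mean)."""
    dim = len(h[0])
    return [sum(h[j][d] for j in neighbors) / len(neighbors) for d in range(dim)]

def attention_aggregate(h, i, neighbors, a, slope=0.2):
    """GAT-style: neighbors are weighted by softmax-normalised attention scores."""
    def leaky_relu(x):
        return x if x > 0 else slope * x

    # Score e_ij = LeakyReLU(a . [h_i || h_j]); the learned projection W is omitted.
    scores = [leaky_relu(sum(w * x for w, x in zip(a, h[i] + h[j]))) for j in neighbors]
    peak = max(scores)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alpha = [e / total for e in exps]
    dim = len(h[0])
    out = [sum(w * h[j][d] for w, j in zip(alpha, neighbors)) for d in range(dim)]
    return out, alpha

# Node 0 has two neighbors: node 1 resembles it, node 2 does not (made-up features).
h = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
a = [0.0, 0.0, 2.0, -2.0]  # hypothetical attention vector
neighbors = [1, 2]
gcn_out = mean_aggregate(h, neighbors)
gat_out, alpha = attention_aggregate(h, 0, neighbors, a)
```

Here the mean treats both neighbors alike, while the attention weights in `alpha` favor node 1, so the aggregated features stay closer to those of the similar neighbor.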

📌 Discussion – Train/Test Splitting Strategy

  • Summary:
    • Debated how to best split data for training and evaluation.
  • Key Insights:
    • It is unclear whether a traditional random split is the right choice here.
    • Our goal is interpolation across known road segments, not predicting truly unseen segments.
  • Challenges:
    • Decide if a held-out test set is necessary.
    • Alternatives: cross-validation over segments, internal validation (early stopping), etc.
    • Emphasize achieving smooth, accurate visual maps over raw accuracy.
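
One of the alternatives mentioned, cross-validation over segments, could look like the sketch below. The function name, the fold count, and the integer segment IDs are assumptions. In a transductive setup one would presumably keep all segments in the graph and use each validation fold only as a mask over the labels.

```python
import random

def kfold_segments(segment_ids, k=5, seed=0):
    """Yield (train_ids, val_ids) pairs so that each segment is validated exactly once."""
    ids = list(segment_ids)
    random.Random(seed).shuffle(ids)  # seeded shuffle for reproducible folds
    folds = [ids[f::k] for f in range(k)]
    for f in range(k):
        val = folds[f]
        train = [s for g, fold in enumerate(folds) if g != f for s in fold]
        yield train, val

# Example: 10 hypothetical segment IDs split into 5 folds.
splits = list(kfold_segments(range(10), k=5))
```

Averaging the validation error over the folds gives an estimate that uses every measured segment, which may suit an interpolation goal better than a single held-out set.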

📌 Additional Topic – Graph Aggregation Alternatives

  • Summary:
    • Jules and Pieter met earlier to discuss alternative node aggregation methods.
  • Key Ideas:
    • Square-grid aggregation: group segments into grid cells that act as nodes. This could serve as a layer of nodes that captures patterns only observable at a coarse level. Later, this grid layer and the 50 m segment layer could be combined in a hierarchical GNN model that captures both low- and high-resolution patterns.

    • Multi-graph approach: build separate graphs by functional categories (e.g., residential vs. highways). Suggested by Zhendong as faster to implement.

  • Challenges:
    • Time overhead of square-grid aggregation.
    • Deciding how to train/evaluate multiple specialized graphs.
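
As a rough illustration of the square-grid idea, the sketch below assigns each segment to a grid cell by its midpoint and averages one value per cell. The 500 m cell size, the function names, and the coordinates are assumptions; the meeting did not fix any of them.

```python
from collections import defaultdict

def grid_cell(x, y, cell_size):
    """Map a point in metres to the (column, row) index of its square cell."""
    return (int(x // cell_size), int(y // cell_size))

def aggregate_to_grid(midpoints, values, cell_size=500.0):
    """Group segments by grid cell and average one value per cell.

    Returns (segment_to_cell, cell_means), where segment_to_cell[i] is the
    cell of segment i and cell_means maps each cell to its mean value.
    """
    members = defaultdict(list)
    segment_to_cell = []
    for (x, y), value in zip(midpoints, values):
        cell = grid_cell(x, y, cell_size)
        segment_to_cell.append(cell)
        members[cell].append(value)
    cell_means = {cell: sum(vs) / len(vs) for cell, vs in members.items()}
    return segment_to_cell, cell_means

# Four hypothetical segment midpoints (metres) with one feature value each.
midpoints = [(10.0, 20.0), (60.0, 30.0), (510.0, 40.0), (990.0, 480.0)]
values = [10.0, 20.0, 30.0, 50.0]
segment_to_cell, cell_means = aggregate_to_grid(midpoints, values)
```

The `segment_to_cell` mapping is what a hierarchical model would need to link each 50 m segment node to its coarse grid node.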

📅 Next Steps

  • David to benchmark and finalize the best augmentation strategy (currently Sparse Multi-Modal Agreement) and test it on the whole city.
  • Pieter to merge the current 50 m segments into 100 m and 200 m segments (possibly adding segment length as a feature if it varies a lot).
  • Group to:
    • Describe the baseline model in the wiki, addressing the challenges and motivating the choices made.
    • Define the final evaluation methodology (basic holdout vs. cross-validation, with or without early stopping).
    • Look into using multiple graphs, e.g., one graph each for residential areas, highways, etc.

Next Meeting:
6 May 2025, 14:30