Meeting notes week 2 - davidlabee/Graph4Air GitHub Wiki

🗓️ Meeting Notes – 22 April 2025, 9:00

Context: Mobile car-based sensors yield high-resolution NO₂ data at 1 s intervals.
Research Focus: Evaluate how spatial aggregation (10–500 m segments) affects GNN performance.
Key Questions:
- What is the optimal spatial resolution for graph construction?
- Can multi-resolution learning outperform single-scale GNNs?
Methodology:
- Construct graphs at multiple scales (25m, 50m, 100m, 200m, etc.)
- Compare GCN and GAT across resolutions.
- Propose hierarchical or parallel multi-resolution models.
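
A minimal sketch of the aggregation step behind the multi-scale graphs, assuming each 1 s reading has already been matched to a distance along a single road. The function name `aggregate_by_segment`, the `(distance, NO₂)` input format, and the sample values are illustrative assumptions, not part of the project code.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_segment(readings, segment_length_m):
    """Average NO2 readings into fixed-length segments.

    readings: iterable of (distance_along_road_m, no2_value) pairs.
    Returns {segment_index: mean_no2}, where segment i covers
    [i * segment_length_m, (i + 1) * segment_length_m).
    """
    if segment_length_m <= 0:
        raise ValueError("segment_length_m must be positive")
    buckets = defaultdict(list)
    for distance_m, no2 in readings:
        buckets[int(distance_m // segment_length_m)].append(no2)
    return {idx: mean(values) for idx, values in sorted(buckets.items())}


# Hypothetical readings: (distance along the road in metres, NO2 value).
readings = [(5.0, 20.0), (30.0, 30.0), (60.0, 40.0), (110.0, 50.0)]

# One aggregated node set per candidate resolution.
by_scale = {
    length: aggregate_by_segment(readings, length)
    for length in (25, 50, 100, 200)
}
```

Each entry in `by_scale` would then provide the node targets for the graph at that resolution; empty segments simply do not appear, which is one of the effects a finer resolution is likely to expose.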
Expected Contribution:
- First mobile-sensor study using multi-scale GNNs for air quality.
- Performance benchmark across resolutions.
- A framework for combining multiple resolutions in GNNs.

Motivation: Physical adjacency in graphs misses global functional patterns relevant to pollution spread.
Research Question: Can feature similarity-based edges (e.g. land use, traffic, morphology) enhance GNN predictions while maintaining spatial integrity?
Graph Design:
- Base graph: Road network topology (adjacency).
- Augmentation: Add edges based on cosine similarity (e.g. threshold > 0.9) using feature vectors.
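
The augmentation idea above can be sketched as follows. This is an illustration under stated assumptions: the feature vectors, node ids, and the helper names `cosine_similarity` and `augment_edges` are hypothetical, and the 0.9 threshold is only the example value from the notes.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def augment_edges(base_edges, features, threshold=0.9):
    """Return the base edges plus an undirected edge for every node pair
    whose feature similarity exceeds the threshold."""
    edges = {tuple(sorted(edge)) for edge in base_edges}
    nodes = sorted(features)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if cosine_similarity(features[u], features[v]) > threshold:
                edges.add((u, v))
    return edges


# Hypothetical per-segment features (e.g. land use, traffic, morphology).
features = {
    0: [1.0, 0.0, 0.2],
    1: [0.0, 1.0, 0.0],
    2: [0.9, 0.1, 0.2],
    3: [0.0, 0.9, 0.1],
}
base_edges = [(0, 1), (1, 2), (2, 3)]  # road-network adjacency
augmented = augment_edges(base_edges, features)
```

The all-pairs loop is quadratic in the number of nodes, so a real road network would probably need a nearest-neighbour search or blocking step instead.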
Cited Studies:
- Yan et al. (2021), Zhou et al. (2023): Importance of global semantic similarity.
- Briggs et al. (2020): Global correlations matter for accurate environmental modeling.
- Madrid Study, UrbanAir, Chen et al. (2022): Showed hybrid/augmented graphs outperform standard spatial-only models.
Design Challenges:
- Balancing the preservation of spatial locality against the introduction of functionally meaningful global edges.
- Risk of graph densification and over-smoothing.
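
One possible way to keep densification in check is to cap how many similarity edges each node may receive and to track graph density before and after augmentation. The sketch below is an assumption for illustration, not a method agreed in the meeting; `graph_density`, `cap_similarity_edges`, and the candidate scores are hypothetical.

```python
def graph_density(num_nodes, edges):
    """Fraction of all possible undirected edges that are present."""
    if num_nodes < 2:
        return 0.0
    return len(edges) / (num_nodes * (num_nodes - 1) / 2)


def cap_similarity_edges(scored_edges, max_per_node):
    """Greedily keep the highest-scoring candidate edges so that no node
    receives more than max_per_node added edges.

    scored_edges: iterable of (u, v, similarity) triples.
    """
    added_degree = {}
    kept = []
    for u, v, _score in sorted(scored_edges, key=lambda e: e[2], reverse=True):
        if (added_degree.get(u, 0) < max_per_node
                and added_degree.get(v, 0) < max_per_node):
            kept.append((u, v))
            added_degree[u] = added_degree.get(u, 0) + 1
            added_degree[v] = added_degree.get(v, 0) + 1
    return kept


base_edges = {(0, 1), (1, 2), (2, 3)}
# Hypothetical candidate edges that passed the similarity threshold.
candidates = [(0, 2, 0.99), (0, 3, 0.95), (1, 3, 0.97)]
kept = cap_similarity_edges(candidates, max_per_node=1)

density_before = graph_density(4, base_edges)
density_after = graph_density(4, base_edges | set(kept))
```

Whether a per-node cap actually reduces over-smoothing would still need to be checked empirically; the density figures only show how far the augmented graph has moved from the road-network topology.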

Discuss why intersections are not used as nodes, and back this up with scientific literature.
Frame the presentation around the challenges that exist and how this research can tackle them.
For Pieter: A multi-resolution GNN should produce its output on the 50 m segments so that it can be compared with other methods.
For David: Think through your suggestion theoretically, and decide which similarity measure you are going to use.
Look for a good baseline model. Zhendong has two models (a linear model and a random forest) against which to compare the GCN/GAT models. Also find a way to compare with the land use regression model, and consider a baseline graph model that you create yourself.

Both: Look at intersections as nodes and how they could work
Both: Update presentations to include feedback
Pieter: Start experimenting with building the graph
Pieter: Write an email to Jules to request aggregation of features
David: Start experimenting with building the graph
David: Experiment with different similarity metrics and thresholds for graph augmentation
Zhendong: Share the related papers mentioned in the meeting
Zhendong: Find a moment for the next meeting

Next Meeting:
No date yet; to be determined.