Meeting notes week 5 - davidlabee/Graph4Air GitHub Wiki

🗓️ Meeting Notes – 29 April 2025, 14:00

Confirm current aggregation method “works well enough” for initial experiments.
Check the stats (distribution) of the current aggregation and try other options.
Consider an alternative graph construction that keeps highway segments separate from the rest.

Parallelize the similarity-computation function.
Instead of leaving nodes out, try to compute similarity across all node pairs.
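A minimal sketch of how the all-pairs computation could be split across processes. It assumes cosine similarity on per-node feature vectors; the notes do not say which similarity measure the project uses, and the names `cosine`, `similarity_rows` and `pairwise_similarity` are hypothetical.

```python
import math
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_rows(features, rows):
    """Similarity of each node in `rows` against every node with a higher index."""
    n = len(features)
    return [(i, j, cosine(features[i], features[j]))
            for i in rows for j in range(i + 1, n)]


def pairwise_similarity(features, workers=1, chunk=64):
    """Return {(i, j): similarity} for all i < j, optionally using several processes."""
    n = len(features)
    # Split the node indices into blocks; each block is one unit of work.
    chunks = [range(start, min(start + chunk, n)) for start in range(0, n, chunk)]
    job = partial(similarity_rows, features)
    if workers > 1:
        with ProcessPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(job, chunks))
    else:
        results = [job(rows) for rows in chunks]
    return {(i, j): s for block in results for (i, j, s) in block}
```

With `workers=1` the function runs serially, which is useful for checking results on a small graph. When using `workers > 1` from a script, the call probably needs to sit under an `if __name__ == "__main__":` guard on platforms that spawn new processes. Earlier blocks contain more pairs than later ones, so a smaller `chunk` may balance the load better.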

Continue refining baseline GAT/GCN architectures and document choices in the wiki.
External validation: Use Palmes tube data for final validation, and possibly include it in training.
Evaluation Metrics: Report Pearson's r, MAE and RMSE.
Dataset Imbalance: Highway segments dominate; consider penalizing over-represented classes in the loss.
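The metrics and the loss re-weighting idea above can be sketched in plain Python. This is an illustration only: the function names are hypothetical, and inverse-frequency weighting is one assumed way to penalize over-represented classes, not necessarily the one the project will adopt.

```python
import math
from collections import Counter


def regression_metrics(y_true, y_pred):
    """Return Pearson r, MAE and RMSE for two equal-length sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_t, mean_p = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    # Pearson r is undefined when either series is constant.
    pearson = cov / math.sqrt(var_t * var_p) if var_t and var_p else float("nan")
    return {"pearson_r": pearson, "mae": mae, "rmse": rmse}


def inverse_frequency_weights(classes):
    """Per-sample weights proportional to 1 / class count, scaled to average 1."""
    counts = Counter(classes)
    raw = [1.0 / counts[c] for c in classes]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def weighted_mse(y_true, y_pred, weights):
    """Mean squared error where each sample contributes according to its weight."""
    total = sum(w * (p - t) ** 2 for t, p, w in zip(y_true, y_pred, weights))
    return total / sum(weights)
```

With three highway segments and one local segment, each highway sample gets weight 2/3 and the local sample gets weight 2, so both classes contribute equally to the loss in total.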

Added all features to the baseline models.
Created a function for handling missing target (NO2) values in baseline graph construction -> Baseline Graph Design
Organized all the code into separate notebooks:
- Data exploration
- Baseline Graph
- Basic GNN models
- Multi Resolution model
Updated wiki -> Baseline Model Architecture
First trials with multi-scale GNN look promising
Created scoreboard -> Scoreboard

Created a slightly different graph augmentation method
Will present the first test results and the workings of the new graph augmentation method

Check the unit for NO2d data. Is it the same as in the research of the supervisors?
Should we use a mask for the missing values instead of mean imputation + dropping?
What is the effect of the current imputation function on performance?
Put the created Colab notebooks on GitHub as well.
Fill in the Scoreboard page to document results.
Try cross-validation to check for overfitting.
Think about bias and overfitting, and look at ways to prevent them.
Also consider other GNN Python packages.
Do external validation on the Palmes dataset for all models (this could possibly be done during hyperparameter tuning).
If you are satisfied with the current graph structure, look into hyperparameter tuning.
Is there a way to include 'half' of the Palmes measurements in the graph structure (for training)?
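To make the masking question above concrete, here is a small sketch contrasting mean imputation with a masked loss. It assumes missing NO2 targets are stored as NaN; `mean_impute` is a simplified stand-in, not the project's actual imputation function, and all names are hypothetical.

```python
import math


def mean_impute(targets):
    """Replace NaN targets with the mean of the observed targets."""
    observed = [t for t in targets if not math.isnan(t)]
    mean = sum(observed) / len(observed)
    return [mean if math.isnan(t) else t for t in targets]


def observed_mask(targets):
    """True where a target was actually measured, False where it is missing."""
    return [not math.isnan(t) for t in targets]


def masked_mse(targets, preds, mask):
    """MSE over observed nodes only; masked nodes stay in the graph but add no loss."""
    squared = [(p - t) ** 2 for t, p, m in zip(targets, preds, mask) if m]
    if not squared:
        raise ValueError("mask selects no nodes")
    return sum(squared) / len(squared)
```

With a mask, nodes without a measurement can still pass messages to their neighbours, but the model is never trained or scored against an imputed value. In a tensor framework the same idea is usually a boolean index applied to predictions and targets before the loss is computed.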

Refine basic aggregation method using existing graph partitioning algorithms.
Optional: try multiscale graphs where aggregation happens based on road types.
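One possible reading of the optional road-type idea, as a rough sketch: merge neighbouring segments that share a road type into a single coarse node and average their values. This is a simple union-find contraction, not one of the established graph partitioning algorithms mentioned above, and the function and argument names are hypothetical.

```python
from collections import defaultdict


def find(parent, x):
    """Union-find root lookup with path compression."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def coarsen_by_road_type(road_type, edges, values):
    """Merge adjacent segments of the same road type into coarse nodes.

    road_type: label per node; edges: list of (u, v) pairs; values: float per node.
    Returns (cluster id per node, mean value per cluster, set of coarse edges).
    """
    parent = list(range(len(road_type)))
    for u, v in edges:
        if road_type[u] == road_type[v]:
            parent[find(parent, u)] = find(parent, v)

    # Relabel roots as consecutive cluster ids, in order of first appearance.
    ids = {}
    cluster = [ids.setdefault(find(parent, n), len(ids)) for n in range(len(parent))]

    members = defaultdict(list)
    for node, c in enumerate(cluster):
        members[c].append(values[node])
    means = {c: sum(v) / len(v) for c, v in members.items()}

    # Keep one undirected edge between each pair of neighbouring clusters.
    coarse_edges = {tuple(sorted((cluster[u], cluster[v])))
                    for u, v in edges if cluster[u] != cluster[v]}
    return cluster, means, coarse_edges
```

The cluster ids give a mapping from the fine graph to the coarse graph, which is what a multiscale model would need to pool node features up and broadcast them back down.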

For next week, prepare clear comparisons of the graph augmentation model's scores under different parameter settings against the baseline model scores.

Next Meeting: TBD