Thinking about the Model Architecture

Before model training, it is important to clarify the purpose, assumptions, and future considerations behind the workflow. This ensures that choices are made consciously rather than automatically.

📌 What exactly is the goal?

  • Is it interpolation across known road segments? (predicting better values across an already covered network)
  • Or is it true extrapolation to unseen segments? (generalizing to entirely new areas not present in the training graph)
    ➤ If the goal is extrapolation, further experiments with held-out graph partitions or new cities would be needed (future work).
  • External NO₂ validation against Palmes tube measurements (see the paper by Jules and Zhendong) [ask supervisors about this]

📌 Initial idea for Core Workflow:

For both a Graph Convolutional Network (GCN) and a Graph Attention Network (GAT):

  • Train the model on 80% of nodes (train_mask)
    Why? To allow the GNN to learn meaningful patterns from a subset of the graph structure.

  • Evaluate performance on the remaining 20% of nodes (test_mask)
    Why? To estimate how well the model generalizes to unseen parts of the graph, preventing overfitting.

  • Apply the trained model to all nodes for visualization/mapping
    Why? In real-world deployment, predictions are needed for every node so that interpretable spatial maps can be produced.

  • During training, test-node features and edges participate in message passing, but their target values are never used in the loss (see the sketch below).
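
The bullets above describe a transductive node-regression setup. A minimal sketch, assuming PyTorch Geometric and a `Data` object named `data` with `x`, `edge_index`, `y`, `train_mask` and `test_mask`; the layer sizes, learning rate and epoch count are placeholders, not tuned project settings:

```python
# Minimal sketch of the masked training loop described above (assumed PyTorch Geometric setup).
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    def __init__(self, num_features, hidden=64):
        super().__init__()
        self.conv1 = GCNConv(num_features, hidden)
        self.conv2 = GCNConv(hidden, 1)   # single regression target (NO2)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index).squeeze(-1)

model = GCN(num_features=data.num_node_features)   # `data` is assumed to exist (see lead-in)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(200):
    model.train()
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)                               # all nodes join message passing
    loss = F.mse_loss(out[data.train_mask], data.y[data.train_mask])   # loss uses train targets only
    loss.backward()
    optimizer.step()

model.eval()
with torch.no_grad():
    pred = model(data.x, data.edge_index)                              # predictions for every node (mapping)
    test_mse = F.mse_loss(pred[data.test_mask], data.y[data.test_mask])
```

Evaluating `test_mse` on held-out targets keeps the 20% test nodes out of the training loss while still letting their features and edges contribute to message passing.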

📌 Higher-Level Questions to Consider:

  • Do we prioritize smooth, visually realistic maps over raw numerical accuracy?

    • In environmental applications, smoothness could be just as important as minimizing numeric errors.
    • Future work could include metrics beyond MSE/R² (e.g., spatial continuity measures; one such measure is sketched at the bottom of this page).
  • Why might attention (GAT) be beneficial in this context?

    • Road segments likely differ in importance (e.g., a major highway vs a side street).
    • Attention allows the model to learn which neighbors matter more for each node’s NO₂ prediction, rather than treating all neighbors equally, as a plain GCN does (a GAT layer sketch is included at the bottom of this page).
  • Why do we even need a held-out test set?

    • Since GNNs propagate information across the graph, simply measuring training loss could be misleading.
    • A test set ensures an unbiased evaluation of true generalization performance.
  • Why not use cross-validation across segments?

    • Cross-validation would improve robustness by reducing dependency on a single train-test split.
    • Future work: Implement k-fold cross-validation adapted to spatial graphs (e.g., stratified or cluster-based node folds; see the fold sketch at the bottom of this page).
  • Why not use internal validation during training?

    • We could split off a small validation set from training nodes to monitor overfitting earlier.
    • Future work: Add an explicit validation set (val_mask) and implement early stopping based on validation loss (see the early-stopping sketch at the bottom of this page).
  • How do we handle missing values?
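
📌 Code sketches for the questions above:

The following are exploratory sketches rather than project code; they assume PyTorch Geometric and the `data` conventions from the workflow sketch earlier on this page, and any function or variable names not mentioned above are hypothetical.

Spatial continuity (smoothness): one simple measure is the mean absolute difference in predicted NO₂ between connected road segments, which decreases as the map gets smoother.

```python
import torch

def edge_roughness(pred: torch.Tensor, edge_index: torch.Tensor) -> float:
    """Mean absolute prediction difference across edges (hypothetical smoothness metric)."""
    src, dst = edge_index                    # edge_index has shape [2, num_edges]
    return (pred[src] - pred[dst]).abs().mean().item()
```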
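
Attention (GAT): a sketch of the same two-layer architecture with GATConv layers in place of GCNConv; the hidden size and number of attention heads are illustrative, not tuned values.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class GAT(torch.nn.Module):
    def __init__(self, num_features, hidden=64, heads=4):
        super().__init__()
        # Each edge gets learned attention weights, so influential neighbors
        # (e.g. a major highway segment) can contribute more than side streets.
        self.conv1 = GATConv(num_features, hidden, heads=heads)
        self.conv2 = GATConv(hidden * heads, 1, heads=1)

    def forward(self, x, edge_index):
        x = F.elu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index).squeeze(-1)
```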
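
Cluster-based cross-validation folds: one way to adapt k-fold CV to a spatial graph is to cluster node coordinates and hold out one cluster at a time, so that test nodes form spatially coherent regions instead of being scattered at random. `coords` is a hypothetical (num_nodes, 2) array of segment coordinates.

```python
import numpy as np
import torch
from sklearn.cluster import KMeans

def spatial_folds(coords: np.ndarray, k: int = 5):
    """Yield (train_mask, test_mask) pairs, one per spatial cluster."""
    labels = KMeans(n_clusters=k, random_state=0).fit_predict(coords)
    for fold in range(k):
        test_mask = torch.as_tensor(labels == fold)
        yield ~test_mask, test_mask
```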
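
Validation set and early stopping: a sketch that continues the GCN training loop above, assuming a hypothetical `val_mask` split off from the training nodes; `patience` is an illustrative setting.

```python
import torch
import torch.nn.functional as F

# `model`, `optimizer` and `data` are assumed from the GCN sketch above;
# `data.val_mask` is a hypothetical boolean mask over former training nodes.
best_val, patience, wait = float("inf"), 20, 0
best_state = {k: v.clone() for k, v in model.state_dict().items()}
for epoch in range(500):
    model.train()
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    F.mse_loss(out[data.train_mask], data.y[data.train_mask]).backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = F.mse_loss(model(data.x, data.edge_index)[data.val_mask],
                              data.y[data.val_mask]).item()
    if val_loss < best_val:          # improvement: keep weights, reset patience counter
        best_val, wait = val_loss, 0
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
    else:
        wait += 1
        if wait >= patience:         # no improvement for `patience` epochs: stop
            break
model.load_state_dict(best_state)
```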