【Discussion】AI_Research_with_Phd_Da - Bili-Sakura/NOTES GitHub Wiki
Note
Author: Sakura
Date: Beijing, May 18, 2024, 11:30-12:30
Outline:
- Remote Sensing Images (RSIs)
- Journal/Conference on Geoscience and Remote Sensing
- Remote Sensing Tasks
- Datasets
Examples of optical remote sensing images [3].
SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery [2]


- Optical Images:
  - RGB
  - Multi-Band (Landsat/)
- Synthetic Aperture Radar (SAR)
  - Single-Band (X/C/L/S/P)
  - Multi-Band
- ...
Sentinel-1, Sentinel-2, Gaofen, Landsat, Worldview
[!IMPORTANT] Remote Sensing
Domain: Interdisciplinary/Comprehensive/Emerging
Journal: IEEE Transactions on Geoscience and Remote Sensing
Abbreviation: TGRS (CCF-B)
[!IMPORTANT] Remote Sensing
Domain: Interdisciplinary/Comprehensive/Emerging
Journal: ISPRS Journal of Photogrammetry and Remote Sensing
Abbreviation: ISPRS
[!NOTE] Remote Sensing (partially)
Domain: Database/Data Mining/Information Retrieval
Journal: International Journal of Geographical Information Science
Abbreviation: IJGIS (CCF-C)
[!TIP] GeoSpatial Research
Domain: Database/Data Mining/Information Retrieval
Journal: GeoInformatica
Abbreviation: GeoInformatica (CCF-B)
[!CAUTION] Remote Sensing
Journal: Remote Sensing
[!NOTE] This section draws mainly on Li et al.'s work [1].
Overview
Ascending to the 3-D realm, our attention turns toward building height retrieval. This critical task augments our knowledge by providing insights into the vertical dimension of structures. Building height retrieval involves addressing two problems: 1) delineating building footprints and 2) estimating building heights.
Dataset/Challenge ...
Overview
With the building footprints delineated, our focus shifts to a more detailed examination of structures through building facade segmentation. This task delves into the exterior face of buildings, contributing essential information for a more comprehensive understanding of their architectural characteristics. For SAR imagery, there are no studies focusing on extracting building facades. This is due to the side-looking geometry of SAR: building areas in SAR images contain both roofs and facades, which makes it difficult to isolate facade information. For optical imagery, the building facade is usually invisible at the nadir angle. Thus, off-nadir imagery is the primary data source for building facade segmentation.
Dataset/Challenge
Overview
Now, we proceed to roof segment and superstructure segmentation, advancing our analysis to the uppermost regions of buildings. This phase enriches our understanding by capturing the roof structures. Each planar roof segment [see Fig. 4(b)] of the building usually has a specific orientation. Moreover, roofs usually contain some structures [see Fig. 5(b)], e.g., chimneys and windows, which are generally named roof superstructures. Very-high-resolution optical images provide a valuable resource for roof segment and superstructure segmentation, as the details of roof segments and superstructures are visible.
Roof segment segmentation aims to extract individual planar roof segments. One early work (1) relies on a line detection algorithm to detect roof ridges and gutters, from which the planar roof segments of the building (which face in various orientations) can be deduced. Recently, semantic segmentation networks have been used to learn roof segments directly from aerial imagery (2, 3). Roof superstructure segmentation focuses on segmenting the different superstructures on the roof. To extract roof superstructures, early efforts (1, 4) use either contour detection (5) or watershed segmentation (6), while recent studies (3, 7) use semantic segmentation networks.
1. K. Mainzer, S. Killinger, R. McKenna, and W. Fichtner, “Assessment of rooftop photovoltaic potentials at the urban level using publicly available geodata and image recognition techniques,” Sol. Energy, vol. 155, pp. 561–573, Oct. 2017.
2. S. Lee, S. Iyengar, M. Feng, P. Shenoy, and S. Maji, “DeepRoof: A data-driven approach for solar potential estimation using rooftop imagery,” in Proc. 25th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Jul. 2019, pp. 2105–2113.
3. S. Krapf, L. Bogenrieder, F. Netzler, G. Balke, and M. Lienkamp, “RID—Roof information dataset for computer vision-based photovoltaic potential assessment,” Remote Sens., vol. 14, no. 10, p. 2299, May 2022.
4. Y. El Merabet, C. Meurie, Y. Ruichek, A. Sbihi, and R. Touahni, “Building roof segmentation from aerial images using a line- and region-based watershed segmentation technique,” Sensors, vol. 15, no. 2, pp. 3172–3203, Feb. 2015.
5. S. Suzuki and K. Abe, “Topological structural analysis of digitized binary images by border following,” Comput. Vis., Graph., Image Process., vol. 30, no. 1, pp. 32–46, Apr. 1985.
6. L. Vincent and P. Soille, “Watersheds in digital spaces: An efficient algorithm based on immersion simulations,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 13, no. 6, pp. 583–598, Jun. 1991.
7. S. Krapf, B. Willenborg, K. Knoll, M. Bruhse, and T. H. Kolbe, “Deep learning for semantic 3D city model extension: Modeling roof superstructures using aerial images for solar potential analysis,” ISPRS Ann. Photogramm., Remote Sens. Spatial Inf. Sci., vol. 10, pp. 161–168, Oct. 2022.
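To make the watershed idea mentioned above more concrete, here is a minimal sketch of a marker-based, priority-flood watershed on a toy grid. It is not the method of any cited work (in particular, it is a simpler variant than the immersion algorithm of Vincent and Soille). The function name `marker_watershed`, the `edge_strength` grid (standing in for an edge map where the roof ridge is the high-valued column), and the seed positions are all assumptions made up for illustration.

```python
import heapq


def marker_watershed(edge_strength, markers):
    """Grow labelled seeds outward, visiting low-valued pixels first.

    Simplified sketch: every pixel ends up with a seed label and no explicit
    watershed lines are drawn.
    """
    rows, cols = len(edge_strength), len(edge_strength[0])
    labels = [row[:] for row in markers]  # 0 means "not yet labelled"
    heap = []
    order = 0  # insertion counter, breaks ties between equal values
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != 0:
                heapq.heappush(heap, (edge_strength[r][c], order, r, c))
                order += 1
    while heap:
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                # The neighbour inherits the label of the pixel that reached it.
                labels[nr][nc] = labels[r][c]
                heapq.heappush(heap, (edge_strength[nr][nc], order, nr, nc))
                order += 1
    return labels


# Hypothetical edge map: the middle column plays the role of the roof ridge.
edge_strength = [
    [0, 1, 5, 1, 0],
    [0, 1, 5, 1, 0],
    [0, 1, 5, 1, 0],
]
# One hypothetical seed per roof plane (labels 1 and 2).
markers = [
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 2],
    [0, 0, 0, 0, 0],
]
labels = marker_watershed(edge_strength, markers)
for row in labels:
    print(row)
```

The two seeds flood their own side of the ridge. The ridge pixels themselves go to whichever region reaches them first, so a real pipeline would need a rule for boundary pixels.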
xBD Dataset | Papers With Code
Leaderboard: 2D Semantic Segmentation on xBD
[!NOTE] Reference Gupta, R., Hosfelt, R., Sajeev, S., Patel, N., Goodman, B., Doshi, J., Heim, E., Choset, H., & Gaston, M. (2019). xBD: A Dataset for Assessing Building Damage from Satellite Imagery (arXiv:1911.09296). arXiv. https://doi.org/10.48550/arXiv.1911.09296
[1] Q. Li, L. Mou, Y. Sun, Y. Hua, Y. Shi, and X. X. Zhu, “A Review of Building Extraction From Remote Sensing Imagery: Geometrical Structures and Semantic Attributes,” IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–15, 2024, doi: 10.1109/TGRS.2024.3369723.
[2] X. Guo et al., “SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery,” in Computer Vision and Pattern Recognition (CVPR), arXiv, Mar. 2024, doi: 10.48550/arXiv.2312.10115.
[3] F. Meng et al., “Single Remote Sensing Image Super-Resolution via a Generative Adversarial Network With Stratified Dense Sampling and Chain Training,” IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–22, 2024, doi: 10.1109/TGRS.2023.3344112.
Note
Da, L., Liou, K., Chen, T., Zhou, X., Luo, X., Yang, Y., & Wei, H. (2023). Open-TI: Open Traffic Intelligence with Augmented Language Model. ArXiv, abs/2401.00211.
Important
Prompt Engineering Techniques
Question:
Planning: Task -> subtask (LLM)
Solution: few-shot | (fine-tuning)
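A minimal sketch of the few-shot planning idea above: show the LLM a couple of worked task-to-subtask decompositions, then leave the new task open for it to complete. Everything here is an assumption for illustration (the `Example` class, `build_planning_prompt`, the prompt wording, and the traffic-flavoured examples); it is not the prompt used in Open-TI, and no LLM is called.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One worked decomposition shown to the model (hypothetical content)."""
    task: str
    subtasks: List[str]


def build_planning_prompt(examples: List[Example], new_task: str) -> str:
    """Assemble a few-shot prompt that ends where the model should continue."""
    lines = ["Decompose the task into ordered subtasks.", ""]
    for ex in examples:
        lines.append(f"Task: {ex.task}")
        lines.append("Subtasks:")
        lines.extend(f"{i}. {s}" for i, s in enumerate(ex.subtasks, start=1))
        lines.append("")
    # The new task is left open so the model fills in the subtasks.
    lines.append(f"Task: {new_task}")
    lines.append("Subtasks:")
    return "\n".join(lines)


examples = [
    Example(
        task="Simulate traffic for a city district",
        subtasks=["Download the road map", "Generate demand", "Run the simulator"],
    ),
    Example(
        task="Compare two signal control policies",
        subtasks=["Run policy A", "Run policy B", "Compare average travel time"],
    ),
]
prompt = build_planning_prompt(examples, "Visualize congestion on the map")
print(prompt)
```

The resulting string would be sent to whatever LLM is in use. Fine-tuning, the alternative noted above, would use the same task and subtask pairs as training data instead of as in-prompt examples.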
Caution
We found that LLMs show weak capabilities in selecting the appropriate tool among numerous tools based on tool descriptions and task descriptions.