GeoPackage Export - Black-Lights/planetscope-py GitHub Wiki
GeoPackage Export
Professional GeoPackage creation with comprehensive metadata, imagery support, and one-line functions
The GeoPackage Export module provides professional tools for creating standardized GeoPackage files that contain PlanetScope scene polygons, comprehensive metadata, and optionally downloaded imagery with proper spatial referencing. Alongside the original full-control API, the module now offers one-liner functions for simplified workflows.
Overview
The GeoPackageManager and enhanced one-liner functions create professional GeoPackage files including:
- Scene Footprint Polygons: Vector layers with comprehensive metadata attributes
- Multi-Layer Support: Vector polygons and raster imagery in single file
- Professional Schemas: Minimal, standard, and comprehensive attribute schemas
- GIS Software Integration: Direct compatibility with QGIS, ArcGIS, and other tools
- Automatic Styling: Generation of .qml style files for immediate visualization
- Cross-Platform Standards: Standardized schemas for maximum compatibility
- One-Line Functions: Simplified API for common workflows
- Enhanced Planet API Integration: All Planet API search parameters supported
Quick Start
NEW: One-Line Functions (Recommended for Most Users)
from planetscope_py import create_scene_geopackage, create_milan_geopackage
from shapely.geometry import box
milan_roi = box(9.04, 45.40, 9.28, 45.52)
# Simplest usage - create GeoPackage with scene footprints
gpkg_path = create_scene_geopackage(milan_roi, "2025-01-01/2025-01-31")
# Milan area presets with different sizes
milan_gpkg = create_milan_geopackage("last_month", size="large")
# Search and export with comprehensive statistics
from planetscope_py import quick_scene_search_and_export
result = quick_scene_search_and_export(milan_roi, "2025-01-01", "2025-01-31")
print(f"Found {result['scenes_found']} scenes, {result['coverage_ratio']:.1%} coverage")
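The time_period strings above follow a simple "start/end" ISO-date convention. If you build these strings programmatically, a few lines of standard-library code (a sketch, not part of planetscope-py) can validate them before any API call is made:

```python
from datetime import date

def parse_time_period(period):
    """Split a 'YYYY-MM-DD/YYYY-MM-DD' string into start and end dates."""
    start_str, end_str = period.split("/")
    start, end = date.fromisoformat(start_str), date.fromisoformat(end_str)
    if end < start:
        raise ValueError(f"end date precedes start date in {period!r}")
    return start, end

start, end = parse_time_period("2025-01-01/2025-01-31")
print((end - start).days)  # -> 30
```

This catches malformed dates and reversed ranges early, with a clearer error than a failed search.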
Advanced Usage with Full Control
from planetscope_py import GeoPackageManager, GeoPackageConfig, PlanetScopeQuery
from shapely.geometry import Polygon
import geopandas as gpd
# Step 1: Define your region of interest
milan_polygon = Polygon([
[8.7, 45.1], [9.8, 44.9], [10.3, 45.3], [10.1, 45.9],
[9.5, 46.2], [8.9, 46.0], [8.5, 45.6], [8.7, 45.1]
])
# Step 2: Search for scenes
query = PlanetScopeQuery()
results = query.search_scenes(
geometry=milan_polygon,
start_date="2025-01-01",
end_date="2025-01-31",
cloud_cover_max=0.3,
item_types=["PSScene"]
)
# Step 3: Initialize GeoPackage manager with custom configuration
config = GeoPackageConfig(
include_imagery=False, # Set to True only if you have downloaded imagery files
clip_to_roi=True,
attribute_schema="standard"
)
geopackage_manager = GeoPackageManager(config=config)
# Step 4: Create comprehensive GeoPackage
geopackage_path = geopackage_manager.create_scene_geopackage(
scenes=results['features'],
output_path="milan_analysis.gpkg",
roi=milan_polygon
# Note: only include downloaded_files parameter if you have actual imagery files
)
print(f" GeoPackage created: {geopackage_path}")
# Get layer information using GeoPandas
gdf = gpd.read_file(geopackage_path, layer="scene_footprints")
print(f" Features: {len(gdf)}")
print(f" Geometry type: {gdf.geometry.geom_type.iloc[0] if len(gdf) > 0 else 'Unknown'}")
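Because a GeoPackage is a single SQLite database, you can also inspect its layers with only the standard library — handy when GeoPandas is not installed. The gpkg_contents table queried here is mandated by the OGC GeoPackage specification:

```python
import sqlite3

def list_geopackage_layers(gpkg_path):
    """Return (table_name, data_type) pairs from the mandatory gpkg_contents table."""
    with sqlite3.connect(gpkg_path) as conn:
        return conn.execute(
            "SELECT table_name, data_type FROM gpkg_contents"
        ).fetchall()

# e.g. list_geopackage_layers("milan_analysis.gpkg")
# might return [("scene_footprints", "features"), ...]
```

Vector layers report data_type "features"; raster imagery layers report "2d-gridded-coverage" or "tiles".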
One-Line Functions Reference
1. Basic GeoPackage Creation
from planetscope_py import quick_geopackage_export
# Most comprehensive one-liner with ALL Planet API parameters
gpkg_path = quick_geopackage_export(
roi=milan_polygon,
time_period="2025-01-01/2025-01-31",
output_path="milan_scenes.gpkg",
clip_to_roi=True,
schema="comprehensive", # minimal, standard, comprehensive
cloud_cover_max=0.2,
sun_elevation_min=30,
ground_control=True,
quality_category="standard",
item_types=["PSScene"],
# ALL additional Planet API parameters:
visible_percent_min=0.8,
clear_percent_min=0.7,
usable_data_min=0.9,
shadow_percent_max=0.1,
snow_ice_percent_max=0.05,
heavy_haze_percent_max=0.1,
light_haze_percent_max=0.15,
anomalous_pixels_max=0.1,
view_angle_max=25,
off_nadir_max=15,
gsd_min=3.0,
gsd_max=4.5,
satellite_ids=["1001", "1002"],
instrument="PSB.SD",
provider="planet",
processing_level="3B"
)
2. Milan Area Presets
from planetscope_py import create_milan_geopackage
# Milan area with predefined polygons
milan_gpkg = create_milan_geopackage(
time_period="last_month", # or "2025-01-01/2025-01-31"
size="large", # city_center (~100 km²), small (~500 km²),
# medium (~1200 km²), large (~2000+ km²)
cloud_cover_max=0.15,
sun_elevation_min=25,
quality_category="standard"
)
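The size presets correspond roughly to square boxes of the stated area around Milan's centre (9.19°E, 45.46°N). The preset geometries themselves are defined inside the library; this sketch only illustrates how a target area translates into a degree-based bounding box:

```python
import math

def square_box(center_lon, center_lat, area_km2):
    """Approximate (minx, miny, maxx, maxy) of a square of the given area,
    centered on a point. Uses ~111.32 km per degree of latitude."""
    half_side_km = math.sqrt(area_km2) / 2
    dlat = half_side_km / 111.32
    dlon = half_side_km / (111.32 * math.cos(math.radians(center_lat)))
    return (center_lon - dlon, center_lat - dlat,
            center_lon + dlon, center_lat + dlat)

bbox = square_box(9.19, 45.46, 100)  # roughly the city_center preset extent
```

The cosine correction accounts for meridians converging with latitude; without it the box would be too wide at Milan's latitude.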
3. Clipped vs Full Grid Shortcuts
from planetscope_py import create_clipped_geopackage, create_full_grid_geopackage
# Force clipping to ROI shape
clipped_gpkg = create_clipped_geopackage(roi, "last_month", "clipped.gpkg")
# Show full scene footprints (no clipping)
full_gpkg = create_full_grid_geopackage(roi, "last_month", "full.gpkg")
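Conceptually, the clipped variant intersects each scene footprint with the ROI before writing it, while the full-grid variant stores footprints whole. Reduced to bounding boxes (real footprints are polygons, and the library performs proper geometric intersection), the difference looks like this:

```python
def clip_bbox(footprint, roi):
    """Intersect two (minx, miny, maxx, maxy) boxes; None if they are disjoint."""
    minx, miny = max(footprint[0], roi[0]), max(footprint[1], roi[1])
    maxx, maxy = min(footprint[2], roi[2]), min(footprint[3], roi[3])
    if minx >= maxx or miny >= maxy:
        return None
    return (minx, miny, maxx, maxy)

roi = (9.04, 45.40, 9.28, 45.52)    # Milan box from the Quick Start
scene = (9.00, 45.35, 9.20, 45.48)  # a footprint extending past the ROI
print(clip_bbox(scene, roi))        # -> (9.04, 45.4, 9.2, 45.48)
```

Clipping shrinks features to the ROI (useful for area statistics); the full footprints preserve each scene's true extent (useful for coverage visualization).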
4. Search + Export with Statistics
from planetscope_py import quick_scene_search_and_export
# Comprehensive search and export with detailed results
result = quick_scene_search_and_export(
roi=milan_polygon,
start_date="2024-01-01",
end_date="2024-12-31",
output_path="detailed_analysis.gpkg",
cloud_cover_max=0.1,
sun_elevation_min=30,
ground_control=True,
quality_category="standard",
usable_data_min=0.9
)
# Access comprehensive statistics
print(f"Success: {result['success']}")
print(f"Scenes found: {result['scenes_found']}")
print(f"ROI area: {result['roi_area_km2']:.1f} km²")
print(f"Coverage ratio: {result['coverage_ratio']:.1%}")
print(f"GeoPackage: {result['geopackage_path']}")
# Detailed statistics
stats = result['statistics']
if 'cloud_cover' in stats:
    print(f"Cloud cover: {stats['cloud_cover']['mean']:.1%} average")
if 'quality_distribution' in stats:
    print(f"Quality distribution: {stats['quality_distribution']}")
if 'unique_satellites' in stats:
    print(f"Satellites: {stats['unique_satellites']} unique")
5. Batch Processing
from planetscope_py import batch_geopackage_export, validate_geometry
from shapely.geometry import Polygon
# Milan (9.19°E, 45.46°N) - 5000 km²
milan_roi = Polygon([
(8.648, 45.142), # Bottom-left
(9.732, 45.142), # Bottom-right
(9.732, 45.778), # Top-right
(8.648, 45.778), # Top-left
(8.648, 45.142) # Close
])
# Load shapefile, GeoJSON, or WKT files and parse them with validate_geometry() to obtain polygons
rome_roi = validate_geometry(r"C:\path\to\directory\rome.geojson")
florence_roi = validate_geometry(r"C:\path\to\directory\shapefiles\Florence.shp")
# Process multiple ROIs in one call
roi_list = [rome_roi, florence_roi, milan_roi]
results = batch_geopackage_export(
roi_list=roi_list,
time_period="last_month",
output_dir="./multi_city_analysis",
cloud_cover_max=0.2,
clip_to_roi=True
)
# Check results
for roi_name, result in results.items():
    if result['success']:
        validation = result['validation']
        print(f"{roi_name}: {validation['feature_count']} scenes")
    else:
        print(f"{roi_name}: Failed - {result['error']}")
6. Export from Existing Scenes
from planetscope_py import export_scenes_to_geopackage
# If you already have scenes from search
gpkg_path = export_scenes_to_geopackage(
scenes=existing_scenes,
output_path="existing_scenes.gpkg",
roi=roi_polygon,
clip_to_roi=True,
schema="standard"
)
7. Validation and Quality Assurance
from planetscope_py import validate_geopackage_output
# Validate created GeoPackage
validation = validate_geopackage_output("clipped.gpkg")
if validation['valid']:
    print("Valid GeoPackage:")
    print(f"  Features: {validation['feature_count']}")
    print(f"  File size: {validation['file_size_mb']:.1f} MB")
    print(f"  CRS: {validation['crs']}")
    # Additional statistics if available
    if 'aoi_statistics' in validation:
        aoi_stats = validation['aoi_statistics']
        print(f"  Total AOI: {aoi_stats['total_aoi_km2']:.1f} km²")
    if 'cloud_statistics' in validation:
        cloud_stats = validation['cloud_statistics']
        print(f"  Cloud cover: {cloud_stats['mean_cloud_cover']:.1%}")
else:
    print(f"Invalid GeoPackage: {validation['error']}")
Planet API Parameters Reference
The one-liner functions support ALL Planet API search parameters:
Core Parameters
- cloud_cover_max (float): Maximum cloud cover threshold (0.0-1.0)
- item_types (List[str]): Planet item types (default: ["PSScene"])
- sun_elevation_min (float): Minimum sun elevation in degrees
- ground_control (bool): Require ground control points
- quality_category (str): Required quality category ("test", "standard", etc.)
Advanced Quality Filters
- visible_percent_min (float): Minimum visible pixels percentage
- clear_percent_min (float): Minimum clear pixels percentage
- usable_data_min (float): Minimum usable data percentage (0.0-1.0)
- shadow_percent_max (float): Maximum shadow percentage
- snow_ice_percent_max (float): Maximum snow/ice percentage
- heavy_haze_percent_max (float): Maximum heavy haze percentage
- light_haze_percent_max (float): Maximum light haze percentage
- anomalous_pixels_max (float): Maximum anomalous pixels
Geometric Constraints
- view_angle_max (float): Maximum view angle in degrees
- off_nadir_max (float): Maximum off-nadir angle in degrees
- gsd_min (float): Minimum ground sample distance in meters
- gsd_max (float): Maximum ground sample distance in meters
Technical Filters
- satellite_ids (List[str]): Specific satellite IDs to include
- instrument (str): Specific instrument type
- provider (str): Data provider filter
- processing_level (str): Processing level requirement
- clear_confidence_percent_min (float): Minimum clear confidence
- visible_confidence_percent_min (float): Minimum visible confidence
- pixel_resolution_min (float): Minimum pixel resolution
- pixel_resolution_max (float): Maximum pixel resolution
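Internally, numeric parameters like these map onto Planet Data API property filters. A sketch of what that translation looks like — RangeFilter and AndFilter are the Data API's own filter types, though the library's actual filter builder may differ in detail:

```python
def range_filter(field, gte=None, lte=None):
    """Build a Planet Data API RangeFilter dict for one scene property."""
    config = {k: v for k, v in (("gte", gte), ("lte", lte)) if v is not None}
    return {"type": "RangeFilter", "field_name": field, "config": config}

search_filter = {
    "type": "AndFilter",
    "config": [
        range_filter("cloud_cover", lte=0.2),   # cloud_cover_max=0.2
        range_filter("sun_elevation", gte=30),  # sun_elevation_min=30
        range_filter("gsd", gte=3.0, lte=4.5),  # gsd_min / gsd_max
    ],
}
```

"_max" keywords become "lte" bounds and "_min" keywords become "gte" bounds on the corresponding scene property, all combined under a single AndFilter.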
Configuration Options
GeoPackageConfig Settings
from planetscope_py import GeoPackageConfig
# Comprehensive configuration
config = GeoPackageConfig(
# Imagery options
include_imagery=True,
clip_to_roi=True,
imagery_format="GeoTIFF", # or "COG" (Cloud Optimized GeoTIFF)
compression="LZW", # LZW, DEFLATE, or JPEG
overview_levels=[2, 4, 8, 16], # Pyramid levels
# Coordinate system
target_crs="EPSG:4326", # WGS84 (default)
# target_crs="EPSG:3857", # Web Mercator
# target_crs="EPSG:32632", # UTM Zone 32N (for Italy)
# Attribute schema
attribute_schema="comprehensive", # minimal, standard, comprehensive
# File size limits
max_raster_size_mb=100, # Maximum raster size per layer
)
Enhanced Attribute Schema Options
Schema | Fields | Description | Use Case
---|---|---|---
minimal | ~12 | Basic attributes (ID, date, cloud cover, area) | Quick exports, small files
standard | ~35 | Most Planet API fields + area calculations | General analysis workflows
comprehensive | ~50+ | All available Planet metadata + statistics | Detailed analysis, archival
Enhanced Standard Schema (NEW)
The standard schema now includes most Planet API fields as default:
standard_attributes = [
# Core identification
'scene_id', 'acquired', 'satellite_id', 'item_type',
# Quality metrics (enhanced)
'cloud_cover', 'cloud_percent', 'clear_percent', 'visible_percent',
'usable_data', 'sun_elevation', 'sun_azimuth', 'quality_category',
# Spatial information
'area_km2', 'aoi_km2', 'coverage_percentage',
'centroid_lat', 'centroid_lon',
# Technical details
'pixel_resolution', 'gsd', 'view_angle', 'off_nadir',
'instrument', 'strip_id', 'published',
# Confidence metrics
'clear_confidence_percent', 'cloud_confidence_percent',
'shadow_confidence_percent', 'snow_ice_confidence_percent',
'heavy_haze_confidence_percent', 'light_haze_confidence_percent',
# Processing information
'ground_control', 'processing_level', 'provider'
]
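Attributes such as centroid_lat and centroid_lon derive from the footprint geometry. The library computes a true geometric centroid; this sketch just illustrates where the numbers come from, using a plain vertex average that is adequate for small, roughly-convex scene footprints:

```python
def approx_centroid(ring):
    """Average the vertices of a closed polygon ring, returning (lon, lat).

    Good enough for small, roughly-convex footprints; not a true
    area-weighted centroid.
    """
    pts = ring[:-1] if ring[0] == ring[-1] else ring
    lon = sum(p[0] for p in pts) / len(pts)
    lat = sum(p[1] for p in pts) / len(pts)
    return lon, lat

footprint = [(9.0, 45.4), (9.2, 45.4), (9.2, 45.5), (9.0, 45.5), (9.0, 45.4)]
lon, lat = approx_centroid(footprint)
```

Note the closing vertex is dropped before averaging; including it would bias the result toward the first corner.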
Core Functions (Advanced Usage)
1. Creating Scene GeoPackages
# Advanced scene GeoPackage with full control
layer_info = geopackage_manager.create_scene_geopackage(
scenes=scenes,
output_path="scenes.gpkg",
roi=roi,
scene_name_field="id", # Field to use for scene names
downloaded_files=downloaded_file_paths # Optional imagery
)
# Access detailed layer information
print(f"Layer name: {layer_info.name}")
print(f"Feature count: {layer_info.feature_count}")
print(f"Geometry type: {layer_info.geometry_type}")
print(f"CRS: {layer_info.crs}")
print(f"Bounding box: {layer_info.bbox}")
print(f"Created: {layer_info.created}")
2. Adding Imagery to Existing GeoPackages
# Add imagery to existing GeoPackage
success = geopackage_manager.add_imagery_to_existing_geopackage(
geopackage_path="existing_scenes.gpkg",
downloaded_files=new_downloaded_files,
scene_mapping={"scene_001": "file_001.tif", "scene_002": "file_002.tif"}
)
if success:
print("Successfully added imagery to existing GeoPackage")
3. GeoPackage Analysis and Information
# Get comprehensive GeoPackage information
info = geopackage_manager.get_geopackage_info("analysis.gpkg")
print("GeoPackage Analysis:")
print(f" File size: {info['file_size_mb']:.1f} MB")
print(f" Total layers: {info['total_layers']}")
print(f" Created: {info['created']}")
# Layer details
for layer_name, layer_info in info['layer_info'].items():
print(f"\nLayer: {layer_name}")
print(f" Type: {layer_info['layer_type']}")
print(f" Features: {layer_info['feature_count']}")
print(f" Geometry: {layer_info['geometry_type']}")
print(f" CRS: {layer_info['crs']}")
Working with GeoPackage Outputs
Loading in QGIS
# Automatic QGIS style generation
style_path = geopackage_manager.create_qgis_style(
geopackage_path="analysis.gpkg",
layer_name="scene_footprints"
)
print(f"QGIS style file created: {style_path}")
# Manual QGIS loading:
# 1. Open QGIS
# 2. Layer > Add Layer > Add Vector Layer
# 3. Select the .gpkg file
# 4. Choose the layer to load
# 5. Apply the .qml style file if generated
Loading with GeoPandas
import geopandas as gpd
# Load vector layer
scenes_gdf = gpd.read_file("analysis.gpkg", layer="scene_footprints")
print(f"Loaded {len(scenes_gdf)} scene polygons")
# Explore enhanced attributes
print("Available columns:")
for col in scenes_gdf.columns:
if col != 'geometry':
print(f" {col}: {scenes_gdf[col].dtype}")
# Enhanced analysis with new fields
print(f"\nScene statistics:")
print(f" Date range: {scenes_gdf['acquired'].min()} to {scenes_gdf['acquired'].max()}")
print(f" Cloud cover: {scenes_gdf['cloud_cover'].mean():.1%} average")
if 'quality_category' in scenes_gdf.columns:
print(f" Quality distribution: {scenes_gdf['quality_category'].value_counts().to_dict()}")
if 'satellite_id' in scenes_gdf.columns:
print(f" Unique satellites: {scenes_gdf['satellite_id'].nunique()}")
Advanced Features
Custom Metadata Processing
from planetscope_py import MetadataProcessor
# Custom metadata processor for specific attributes
class CustomMetadataProcessor(MetadataProcessor):
def extract_scene_metadata(self, scene):
"""Extract custom metadata fields."""
base_metadata = super().extract_scene_metadata(scene)
# Add custom fields
props = scene.get('properties', {})
base_metadata.update({
'custom_quality_class': self._classify_quality(props),
'acquisition_season': self._get_season(props.get('acquired')),
'daynight_flag': self._classify_daynight(props),
'processing_level': props.get('processing_level', 'unknown')
})
return base_metadata
def _classify_quality(self, props):
"""Custom quality classification."""
cloud_cover = props.get('cloud_cover', 1.0)
sun_elevation = props.get('sun_elevation', 0)
if cloud_cover < 0.05 and sun_elevation > 50:
return 'premium'
elif cloud_cover < 0.15 and sun_elevation > 30:
return 'high'
elif cloud_cover < 0.3:
return 'medium'
else:
return 'low'
# Use custom processor with one-liner
custom_processor = CustomMetadataProcessor()
geopackage_manager = GeoPackageManager(metadata_processor=custom_processor)
Multi-ROI GeoPackages
# Create GeoPackage with multiple regions using one-liners
from shapely.geometry import box
# Define multiple ROIs
rois = {
'milan_center': box(9.15, 45.45, 9.21, 45.50),
'milan_north': box(9.10, 45.50, 9.25, 45.55),
'milan_south': box(9.05, 45.40, 9.20, 45.45)
}
# Process each ROI with one-liner
results = {}
for roi_name, roi_geom in rois.items():
result = quick_scene_search_and_export(
roi=roi_geom,
start_date="2025-01-01",
end_date="2025-01-31",
output_path=f"analysis_{roi_name}.gpkg",
clip_to_roi=True
)
results[roi_name] = result
if result['success']:
print(f"{roi_name}: {result['scenes_found']} scenes")
# Alternative: Batch processing
batch_results = batch_geopackage_export(list(rois.values()), "last_month")
Temporal Grouping with One-Liners
# Create monthly GeoPackages using one-liners
from datetime import datetime
# Define monthly periods
months = [
"2025-01-01/2025-01-31",
"2025-02-01/2025-02-28",
"2025-03-01/2025-03-31"
]
for month_period in months:
start_date = month_period.split('/')[0]
month_key = start_date[:7] # YYYY-MM
result = quick_scene_search_and_export(
roi=roi,
start_date=start_date,
end_date=month_period.split('/')[1],
output_path=f"scenes_{month_key}.gpkg"
)
if result['success'] and result['scenes_found'] >= 5:
print(f"{month_key}: {result['scenes_found']} scenes -> {result['geopackage_path']}")
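Hard-coding period strings like the list above is easy to get wrong (leap-year Februaries, 30- vs 31-day months). The standard library can generate correct month boundaries instead:

```python
import calendar

def month_periods(year, months):
    """Yield 'YYYY-MM-01/YYYY-MM-DD' period strings with correct month-end days."""
    for month in months:
        last_day = calendar.monthrange(year, month)[1]
        yield f"{year:04d}-{month:02d}-01/{year:04d}-{month:02d}-{last_day:02d}"

periods = list(month_periods(2025, [1, 2, 3]))
print(periods[1])  # -> 2025-02-01/2025-02-28  (2024 would give ...-02-29)
```

calendar.monthrange returns (first weekday, number of days), so the second element is the month's last day in any year.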
Quality Assurance and Validation
Comprehensive Validation
# Enhanced validation using one-liner
from planetscope_py import validate_scene_geopackage

def comprehensive_validation(geopackage_path, original_scenes=None):
"""Comprehensive GeoPackage validation."""
# Basic validation
validation = validate_scene_geopackage(geopackage_path)
if not validation['valid']:
return validation
print(" Basic Validation Passed")
print(f" Features: {validation['feature_count']}")
print(f" File size: {validation['file_size_mb']:.1f} MB")
# Extended validation if original scenes provided
if original_scenes:
print("\n Extended Validation:")
# Load and compare
import geopandas as gpd
gdf = gpd.read_file(geopackage_path)
print(f" Original scenes: {len(original_scenes)}")
print(f" GeoPackage features: {len(gdf)}")
# Check for missing scenes
original_ids = {scene['properties']['id'] for scene in original_scenes}
geopackage_ids = set(gdf['scene_id'])
missing_ids = original_ids - geopackage_ids
if missing_ids:
print(f" Missing scenes: {len(missing_ids)}")
else:
print(f" All scenes included")
# Check attribute completeness
null_counts = gdf.isnull().sum()
problematic_columns = null_counts[null_counts > 0]
if len(problematic_columns) > 0:
print(f" Columns with null values: {len(problematic_columns)}")
else:
print(f" No missing attribute values")
return validation
# Use comprehensive validation
validation = comprehensive_validation("analysis.gpkg", original_scenes)
Data Integrity Checks
# Automated integrity checking
def check_geopackage_integrity(geopackage_paths):
"""Check integrity of multiple GeoPackages."""
results = {}
for path in geopackage_paths:
validation = validate_scene_geopackage(path)
results[path] = validation
status = "✅" if validation['valid'] else "❌"
feature_count = validation.get('feature_count', 0)
print(f"{status} {path}: {feature_count} features")
return results
# Check multiple files
geopackage_files = ["milan.gpkg", "rome.gpkg", "florence.gpkg"]
integrity_results = check_geopackage_integrity(geopackage_files)
Export and Reporting
Generate Comprehensive Reports
# Generate analysis report using one-liner statistics
from datetime import datetime

def create_analysis_report(search_result, validation_result):
"""Create comprehensive analysis report."""
report = {
"analysis_summary": {
"success": search_result['success'],
"scenes_found": search_result.get('scenes_found', 0),
"roi_area_km2": search_result.get('roi_area_km2', 0),
"coverage_ratio": search_result.get('coverage_ratio', 0),
"search_parameters": search_result.get('search_parameters', {})
},
"geopackage_validation": validation_result,
"detailed_statistics": search_result.get('statistics', {}),
"temporal_coverage": search_result.get('date_range', {}),
"generated_at": datetime.now().isoformat()
}
return report
# Create report
search_result = quick_scene_search_and_export(roi, "2025-01-01", "2025-01-31")
validation = validate_scene_geopackage(search_result['geopackage_path'])
report = create_analysis_report(search_result, validation)
# Save report
import json
with open("analysis_report.json", "w") as f:
json.dump(report, f, indent=2, default=str)
Export to Multiple Formats
# Convert GeoPackage to other formats using validation info
def export_to_multiple_formats(geopackage_path, output_dir):
"""Export GeoPackage to multiple GIS formats."""
from pathlib import Path
import geopandas as gpd
# Validate first
validation = validate_scene_geopackage(geopackage_path)
if not validation['valid']:
print(f" Cannot export invalid GeoPackage: {validation['error']}")
return False
output_path = Path(output_dir)
output_path.mkdir(exist_ok=True)
# Load data
gdf = gpd.read_file(geopackage_path)
# Export to various formats
formats = {
"shapefile": {"driver": "ESRI Shapefile", "ext": ".shp"},
"geojson": {"driver": "GeoJSON", "ext": ".geojson"},
"kml": {"driver": "KML", "ext": ".kml"}
}
exported_files = {}
for format_name, format_config in formats.items():
try:
export_path = output_path / f"scenes{format_config['ext']}"
gdf.to_file(export_path, driver=format_config['driver'])
exported_files[format_name] = str(export_path)
print(f" Exported to {format_name}: {export_path}")
except Exception as e:
print(f" Failed to export to {format_name}: {e}")
# Export metadata to CSV
csv_path = output_path / "scene_metadata.csv"
metadata_df = gdf.drop('geometry', axis=1)
metadata_df.to_csv(csv_path, index=False)
exported_files["csv"] = str(csv_path)
print(f" Exported metadata to CSV: {csv_path}")
return exported_files
# Use export function
exported = export_to_multiple_formats("analysis.gpkg", "exported_formats")
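One caveat for the shapefile export above: the DBF format truncates attribute names to 10 characters, so long Planet fields such as heavy_haze_percent and heavy_haze_confidence_percent collide after truncation and one silently overwrites the other. A quick pre-flight check in plain Python:

```python
def shapefile_name_collisions(columns):
    """Group column names whose 10-character shapefile truncations collide."""
    truncated = {}
    for name in columns:
        truncated.setdefault(name[:10], []).append(name)
    return {k: v for k, v in truncated.items() if len(v) > 1}

cols = ["heavy_haze_percent", "heavy_haze_confidence_percent", "cloud_cover"]
print(shapefile_name_collisions(cols))
# -> {'heavy_haze': ['heavy_haze_percent', 'heavy_haze_confidence_percent']}
```

Run this on gdf.columns before exporting and rename any flagged fields; GeoJSON and GeoPackage have no such limit.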
Performance Optimization
Large Dataset Handling with One-Liners
# Optimized processing for large datasets
def process_large_area(large_roi, time_period, max_scenes_per_chunk=1000):
"""Process large areas using chunked approach."""
# First, estimate total scenes
test_result = quick_scene_search_and_export(
roi=large_roi,
start_date=time_period.split('/')[0],
end_date=time_period.split('/')[1],
output_path=None # Don't create file yet
)
total_scenes = test_result.get('scenes_found', 0)
print(f"Estimated {total_scenes} scenes for large area")
if total_scenes > max_scenes_per_chunk:
print(f" Large dataset detected - using chunked processing")
# Use memory-efficient minimal schema
result = quick_scene_search_and_export(
roi=large_roi,
start_date=time_period.split('/')[0],
end_date=time_period.split('/')[1],
output_path="large_area_minimal.gpkg",
schema="minimal", # Use minimal schema for large datasets
clip_to_roi=False # Skip clipping for performance
)
else:
# Use standard processing
result = quick_scene_search_and_export(
roi=large_roi,
start_date=time_period.split('/')[0],
end_date=time_period.split('/')[1],
output_path="large_area_standard.gpkg",
schema="standard"
)
return result
# Process large area
large_milan = Polygon([
[8.5, 44.8], [10.5, 44.6], [10.8, 46.2], [8.2, 46.4], [8.5, 44.8]
]) # Very large area around Milan
result = process_large_area(large_milan, "2025-01-01/2025-01-31")
Memory Management
# Memory-efficient configuration for one-liners
def memory_efficient_export(roi, time_period, output_path):
"""Memory-efficient GeoPackage export."""
import psutil
import gc
def get_memory_usage():
process = psutil.Process()
return process.memory_info().rss / 1024 / 1024 # MB
initial_memory = get_memory_usage()
print(f"Initial memory: {initial_memory:.1f} MB")
# Use minimal schema and no clipping for efficiency
result = quick_scene_search_and_export(
roi=roi,
start_date=time_period.split('/')[0],
end_date=time_period.split('/')[1],
output_path=output_path,
schema="minimal", # Minimal schema for memory efficiency
clip_to_roi=False, # No clipping for speed
cloud_cover_max=0.5 # More permissive for faster search
)
final_memory = get_memory_usage()
print(f"Final memory: {final_memory:.1f} MB")
print(f"Memory increase: {final_memory - initial_memory:.1f} MB")
# Force garbage collection
gc.collect()
return result
# Use memory-efficient export
result = memory_efficient_export(roi, "2025-01-01/2025-01-31", "efficient.gpkg")
Error Handling and Troubleshooting
Comprehensive Error Handling with One-Liners
from planetscope_py.exceptions import ValidationError, PlanetScopeError
def robust_geopackage_workflow(roi, time_period, output_path):
"""Robust GeoPackage creation with comprehensive error handling."""
try:
# Validate inputs first
if not hasattr(roi, 'bounds') and not isinstance(roi, (list, dict)):
raise ValidationError("Invalid ROI format")
# Attempt creation with fallback strategies
strategies = [
{"schema": "comprehensive", "clip_to_roi": True},
{"schema": "standard", "clip_to_roi": True},
{"schema": "minimal", "clip_to_roi": False}
]
for i, strategy in enumerate(strategies):
try:
print(f"Attempt {i+1}: {strategy}")
result = quick_scene_search_and_export(
roi=roi,
start_date=time_period.split('/')[0],
end_date=time_period.split('/')[1],
output_path=output_path,
**strategy
)
if result['success'] and result['scenes_found'] > 0:
# Validate the created file
validation = validate_scene_geopackage(result['geopackage_path'])
if validation['valid']:
print(f" Success with strategy {i+1}")
return result
else:
print(f" Strategy {i+1} created invalid file")
continue
else:
print(f" Strategy {i+1} found no scenes")
continue
except Exception as e:
print(f" Strategy {i+1} failed: {e}")
continue
raise PlanetScopeError("All strategies failed")
except ValidationError as e:
print(f" Validation error: {e}")
return {"success": False, "error": str(e)}
except PlanetScopeError as e:
print(f" PlanetScope error: {e}")
return {"success": False, "error": str(e)}
except Exception as e:
print(f" Unexpected error: {e}")
return {"success": False, "error": str(e)}
# Use robust workflow
result = robust_geopackage_workflow(milan_roi, "2025-01-01/2025-01-31", "robust.gpkg")
Common Issues and Solutions
# Diagnostic function for troubleshooting
def diagnose_geopackage_issues(roi, time_period):
"""Diagnose common GeoPackage creation issues."""
print(" GeoPackage Creation Diagnostics")
print("=" * 40)
# Check 1: ROI validation
try:
from planetscope_py.utils import validate_geometry, calculate_area_km2
if hasattr(roi, 'is_valid'):
if roi.is_valid:
area = calculate_area_km2(roi)
print(f" ROI valid, area: {area:.1f} km²")
if area > 10000: # Very large area
print(" Large ROI detected - consider using minimal schema")
elif area < 1: # Very small area
print(" Small ROI detected - may find few scenes")
else:
print(" Invalid ROI geometry")
return False
else:
print(" ROI type unknown - attempting conversion")
except Exception as e:
print(f" ROI validation failed: {e}")
return False
# Check 2: Time period validation
try:
if "/" in time_period:
start_str, end_str = time_period.split("/")
from datetime import datetime
start_date = datetime.fromisoformat(start_str)
end_date = datetime.fromisoformat(end_str)
date_diff = (end_date - start_date).days
print(f" Time period valid: {date_diff} days")
if date_diff > 365:
print(" Long time period - may find many scenes")
elif date_diff < 7:
print(" Short time period - may find few scenes")
else:
print(f" Time period preset: {time_period}")
except Exception as e:
print(f" Time period validation failed: {e}")
return False
# Check 3: Test search
try:
print("\n🔍 Testing scene search...")
test_result = quick_scene_search_and_export(
roi=roi,
start_date=time_period.split('/')[0] if '/' in time_period else "2025-01-01",
end_date=time_period.split('/')[1] if '/' in time_period else "2025-01-31",
output_path=None, # Don't create file
cloud_cover_max=1.0 # Accept all scenes for test
)
if test_result['success']:
scenes_found = test_result['scenes_found']
print(f" Found {scenes_found} scenes")
if scenes_found == 0:
print(" Try expanding time period or relaxing cloud cover")
elif scenes_found > 1000:
print(" Consider using minimal schema for large datasets")
else:
print(" Good scene count for standard processing")
else:
print(f" Scene search failed: {test_result.get('error', 'Unknown error')}")
return False
except Exception as e:
print(f" Test search failed: {e}")
return False
print("\n Diagnostics completed - ready for GeoPackage creation")
return True
# Run diagnostics
if diagnose_geopackage_issues(milan_roi, "2025-01-01/2025-01-31"):
print("Proceeding with GeoPackage creation...")
Debugging and Detailed Logging
# Enable detailed logging for debugging
import logging
from planetscope_py.utils import calculate_area_km2

def debug_geopackage_creation(roi, time_period, output_path):
"""Create GeoPackage with detailed logging."""
# Enable debug logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger('planetscope_py')
print(" Debug Mode: Detailed GeoPackage Creation")
print("=" * 45)
# Step-by-step creation with logging
try:
# Log configuration
print(" Configuration:")
print(f" ROI area: {calculate_area_km2(roi):.1f} km²")
print(f" Time period: {time_period}")
print(f" Output path: {output_path}")
# Create with debug info
result = quick_scene_search_and_export(
roi=roi,
start_date=time_period.split('/')[0],
end_date=time_period.split('/')[1],
output_path=output_path,
schema="standard",
clip_to_roi=True
)
# Log results
print("\n📊 Results:")
print(f" Success: {result['success']}")
if result['success']:
print(f" Scenes found: {result['scenes_found']}")
print(f" Coverage ratio: {result['coverage_ratio']:.1%}")
print(f" File created: {result['geopackage_path']}")
# Validate result
validation = validate_scene_geopackage(result['geopackage_path'])
print(f" Validation: {' Valid' if validation['valid'] else ' Invalid'}")
else:
print(f" Error: {result.get('error', 'Unknown error')}")
return result
except Exception as e:
logger.exception("Debug creation failed")
print(f" Debug creation failed: {e}")
return {"success": False, "error": str(e)}
# Use debug creation
debug_result = debug_geopackage_creation(milan_roi, "2025-01-01/2025-01-31", "debug.gpkg")
Best Practices
1. Choose Appropriate Schema and Functions
# Schema selection guidelines with one-liners
# For quick analysis and small files
quick_gpkg = quick_geopackage_export(roi, "last_month", schema="minimal")
# For standard GIS workflows (RECOMMENDED)
standard_gpkg = create_scene_geopackage(roi, "2025-01-01/2025-01-31")
# For comprehensive analysis and archival
comprehensive_result = quick_scene_search_and_export(
roi, "2025-01-01", "2025-01-31",
schema="comprehensive",
output_path="comprehensive_analysis.gpkg"
)
2. Optimize File Organization
# Organized workflow using one-liners
from datetime import datetime

def create_organized_analysis(roi, time_period, base_dir="analysis_outputs"):
"""Create organized analysis outputs using one-liners."""
from pathlib import Path
base_path = Path(base_dir)
base_path.mkdir(exist_ok=True)
results = {}
# 1. Quick overview (minimal schema)
overview_path = base_path / "overview.gpkg"
overview_result = quick_geopackage_export(
roi=roi,
time_period=time_period,
output_path=str(overview_path),
schema="minimal",
clip_to_roi=False # Full footprints for overview
)
results['overview'] = overview_result
# 2. Detailed analysis (comprehensive schema)
detailed_path = base_path / "detailed_analysis.gpkg"
detailed_result = quick_scene_search_and_export(
roi=roi,
start_date=time_period.split('/')[0],
end_date=time_period.split('/')[1],
output_path=str(detailed_path),
schema="comprehensive",
clip_to_roi=True
)
results['detailed'] = detailed_result
# 3. Quality-filtered version
quality_path = base_path / "high_quality.gpkg"
quality_result = quick_scene_search_and_export(
roi=roi,
start_date=time_period.split('/')[0],
end_date=time_period.split('/')[1],
output_path=str(quality_path),
cloud_cover_max=0.1,
sun_elevation_min=30,
quality_category="standard",
schema="standard"
)
results['quality'] = quality_result
# 4. Export metadata summary
summary_path = base_path / "analysis_summary.json"
summary = {
"overview": {
"scenes": overview_result.get('scenes_found', 0) if isinstance(overview_result, dict) else "N/A",
"file": str(overview_path)
},
"detailed": {
"scenes": detailed_result.get('scenes_found', 0),
"coverage": f"{detailed_result.get('coverage_ratio', 0):.1%}",
"file": str(detailed_path)
},
"quality_filtered": {
"scenes": quality_result.get('scenes_found', 0),
"coverage": f"{quality_result.get('coverage_ratio', 0):.1%}",
"file": str(quality_path)
},
"generated_at": datetime.now().isoformat()
}
import json
with open(summary_path, 'w') as f:
json.dump(summary, f, indent=2)
results['summary_file'] = str(summary_path)
print(" Organized Analysis Created:")
print(f" Overview: {overview_result.get('scenes_found', 'N/A')} scenes")
print(f" Detailed: {detailed_result.get('scenes_found', 0)} scenes")
print(f" Quality: {quality_result.get('scenes_found', 0)} scenes")
print(f" Summary: {summary_path}")
return results
# Create organized analysis
organized_results = create_organized_analysis(milan_roi, "2025-01-01/2025-01-31")
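The helper above splits the `"start/end"` time-period string inline in several places. A small standalone helper (hypothetical, not part of planetscope-py) centralizes that parsing and validates the dates before they reach the API:

```python
from datetime import datetime

def split_time_period(time_period):
    """Split a 'YYYY-MM-DD/YYYY-MM-DD' period into (start_date, end_date).

    Raises ValueError if the string is malformed or the dates are out of order.
    """
    try:
        start, end = time_period.split("/")
    except ValueError:
        raise ValueError(f"Expected 'start/end' format, got: {time_period!r}")
    start_dt = datetime.strptime(start, "%Y-%m-%d")
    end_dt = datetime.strptime(end, "%Y-%m-%d")
    if end_dt < start_dt:
        raise ValueError(f"End date {end} precedes start date {start}")
    return start, end

# Usage mirrors the inline split('/') calls above:
start_date, end_date = split_time_period("2025-01-01/2025-01-31")
```

Failing fast here produces a clearer error than letting a malformed period string propagate into a Planet API search.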
3. Version Control and Documentation
# Add processing metadata using one-liners
import json
from datetime import datetime
from pathlib import Path

from planetscope_py import quick_scene_search_and_export, validate_scene_geopackage

def documented_geopackage_export(roi, time_period, output_path, **kwargs):
    """Create GeoPackage with comprehensive documentation."""
    # Create the GeoPackage
    result = quick_scene_search_and_export(
        roi=roi,
        start_date=time_period.split('/')[0],
        end_date=time_period.split('/')[1],
        output_path=output_path,
        **kwargs
    )
    if result['success']:
        # Create documentation file
        doc_path = Path(output_path).with_suffix('.json')
        documentation = {
            "geopackage_info": {
                "file_path": result['geopackage_path'],
                "created_at": datetime.now().isoformat(),
                "library_version": "planetscope-py v4.0.0"
            },
            "processing_parameters": {
                "roi_area_km2": result.get('roi_area_km2', 0),
                "time_period": time_period,
                "search_parameters": result.get('search_parameters', {}),
                **kwargs
            },
            "results_summary": {
                "scenes_found": result.get('scenes_found', 0),
                "coverage_ratio": result.get('coverage_ratio', 0),
                "statistics": result.get('statistics', {}),
                "date_range": result.get('date_range', {})
            },
            "validation": validate_scene_geopackage(result['geopackage_path'])
        }
        with open(doc_path, 'w') as f:
            json.dump(documentation, f, indent=2, default=str)
        print(f"Documentation created: {doc_path}")
        result['documentation_file'] = str(doc_path)
    return result

# Create documented GeoPackage
documented_result = documented_geopackage_export(
    roi=milan_roi,
    time_period="2025-01-01/2025-01-31",
    output_path="documented_analysis.gpkg",
    schema="comprehensive",
    cloud_cover_max=0.2
)
Complete Workflow Examples
1. Simple Analysis (One-Liner Approach)
# Complete analysis in just a few lines
from planetscope_py import create_scene_geopackage, validate_scene_geopackage
from shapely.geometry import Polygon

# Define ROI
milan_roi = Polygon([
    [8.7, 45.1], [9.8, 44.9], [10.3, 45.3], [10.1, 45.9],
    [9.5, 46.2], [8.9, 46.0], [8.5, 45.6], [8.7, 45.1]
])

# Create GeoPackage (one line!)
gpkg_path = create_scene_geopackage(milan_roi, "2025-01-01/2025-01-31")

# Validate result
stats = validate_scene_geopackage(gpkg_path)
print(f"Created {gpkg_path} with {stats['feature_count']} scenes")
2. Advanced Analysis with Statistics
# Advanced analysis with comprehensive statistics
from planetscope_py import search_and_export_scenes

result = search_and_export_scenes(
    roi=milan_roi,
    start_date="2025-01-01",
    end_date="2025-01-31",
    output_path="advanced_milan_analysis.gpkg",
    cloud_cover_max=0.15,
    sun_elevation_min=25,
    quality_category="standard",
    schema="comprehensive"
)

# Print comprehensive results
if result['success']:
    print("Analysis Results:")
    print(f"  Scenes found: {result['scenes_found']}")
    print(f"  ROI area: {result['roi_area_km2']:.1f} km²")
    print(f"  Coverage: {result['coverage_ratio']:.1%}")

    # Detailed statistics
    stats = result['statistics']
    if 'cloud_cover' in stats:
        cc_stats = stats['cloud_cover']
        print(f"  Cloud cover: {cc_stats['mean']:.1%} avg, {cc_stats['min']:.1%}-{cc_stats['max']:.1%}")
    if 'satellite_distribution' in stats:
        sat_dist = stats['satellite_distribution']
        print(f"  Satellites: {stats['unique_satellites']} unique")
        for sat_id, count in list(sat_dist.items())[:3]:  # Top 3
            print(f"    {sat_id}: {count} scenes")

    # Temporal info
    date_range = result['date_range']
    if date_range['start'] and date_range['end']:
        print(f"  Date range: {date_range['start']} to {date_range['end']}")
        print(f"  Unique dates: {date_range['unique_dates']}")
3. Milan Size Comparison
# Compare different Milan area sizes using presets
from planetscope_py import create_milan_geopackage, validate_scene_geopackage

milan_sizes = ["city_center", "small", "medium", "large"]
time_period = "2025-01-01/2025-01-31"

print("Milan Area Size Comparison")
print("=" * 35)
for size in milan_sizes:
    try:
        result_path = create_milan_geopackage(
            time_period=time_period,
            size=size,
            cloud_cover_max=0.3,
            output_path=f"milan_{size}.gpkg"
        )
        # Validate and get stats
        validation = validate_scene_geopackage(result_path)
        if validation['valid']:
            area_stats = validation.get('aoi_statistics', {})
            total_area = area_stats.get('total_aoi_km2', 0)
            print(f"✅ {size:12}: {validation['feature_count']:3} scenes, {total_area:6.0f} km²")
        else:
            print(f"❌ {size:12}: Failed - {validation['error']}")
    except Exception as e:
        print(f"❌ {size:12}: Error - {e}")
4. Multi-City Batch Processing
# Process multiple cities using batch functions
from planetscope_py import batch_geopackage_export
from shapely.geometry import Polygon, box

# Define multiple city ROIs
cities = {
    "milan": Polygon([
        [8.7, 45.1], [9.8, 44.9], [10.3, 45.3], [10.1, 45.9],
        [9.5, 46.2], [8.9, 46.0], [8.5, 45.6], [8.7, 45.1]
    ]),
    "rome": box(12.3, 41.8, 12.7, 42.0),
    "florence": box(11.1, 43.7, 11.4, 43.9),
    "venice": box(12.2, 45.3, 12.5, 45.5)
}

# Batch process all cities
batch_results = batch_geopackage_export(
    roi_list=list(cities.values()),
    time_period="last_month",
    output_dir="./italian_cities_analysis",
    cloud_cover_max=0.25,
    schema="standard",
    clip_to_roi=True
)

# Print results summary
print("🇮🇹 Italian Cities Analysis Results")
print("=" * 40)
for i, (city_name, roi) in enumerate(cities.items()):
    roi_key = f"roi_{i+1}"
    if roi_key in batch_results:
        result = batch_results[roi_key]
        if result['success']:
            validation = result['validation']
            print(f"✅ {city_name:10}: {validation['feature_count']:3} scenes")
        else:
            print(f"❌ {city_name:10}: {result['error']}")
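Because the batch results are keyed positionally (`roi_1`, `roi_2`, …) in list order, the enumerate-based lookup above is easy to get wrong when ROIs are added or reordered. A small helper (hypothetical, not part of the library) re-attaches the names in one place:

```python
def label_batch_results(batch_results, names):
    """Map positional 'roi_N' batch keys back to ROI names, in insertion order."""
    labeled = {}
    for i, name in enumerate(names, start=1):
        key = f"roi_{i}"
        if key in batch_results:
            labeled[name] = batch_results[key]
    return labeled

# Example with a stubbed batch result (real results come from batch_geopackage_export):
stub = {"roi_1": {"success": True}, "roi_2": {"success": False, "error": "no scenes"}}
named = label_batch_results(stub, ["milan", "rome"])
```

With `named`, the summary loop can iterate `named.items()` directly instead of reconstructing `roi_{i+1}` keys by hand.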
Integration with Other Modules
With Density Analysis
# Combine GeoPackage export with density analysis
from planetscope_py import analyze_roi_density, export_scenes_to_geopackage

# Run density analysis first
density_result = analyze_roi_density(
    roi_polygon=milan_roi,
    time_period="2025-01-01/2025-01-31",
    create_visualizations=True,
    export_geotiff=True
)

# Export scenes used in analysis to GeoPackage
if density_result['scenes_found'] > 0:
    # Get scenes from density result
    scenes = density_result.get('scenes', [])

    # Export to GeoPackage with density metadata
    gpkg_path = export_scenes_to_geopackage(
        scenes=scenes,
        output_path="density_analysis_scenes.gpkg",
        roi=milan_roi,
        clip_to_roi=True,
        schema="comprehensive"
    )
    print(f"Density analysis scenes exported: {gpkg_path}")
    print(f"Density stats: mean={density_result['density_result'].stats['mean']:.1f}")
With Asset Management
# Combine with asset management for imagery download
from planetscope_py import search_and_export_scenes

async def complete_imagery_workflow():
    """Complete workflow with imagery download and GeoPackage export."""
    # Search and export scenes first
    result = search_and_export_scenes(
        roi=milan_roi,
        start_date="2025-01-01",
        end_date="2025-01-15",  # Shorter period for download demo
        cloud_cover_max=0.1,
        quality_category="standard"
    )
    if result['success'] and result['scenes_found'] > 0:
        print(f"Found {result['scenes_found']} high-quality scenes")

        # Download assets (if asset management available)
        try:
            from planetscope_py import AssetManager, PlanetScopeQuery

            # Get scenes from the created GeoPackage
            import geopandas as gpd
            gdf = gpd.read_file(result['geopackage_path'])

            # Convert back to scene format for the asset manager
            scenes_for_download = []
            for _, row in gdf.head(5).iterrows():  # Download first 5 scenes
                scene = {
                    'id': row['scene_id'],
                    'properties': {
                        'id': row['scene_id'],
                        'acquired': row['acquired']
                    }
                }
                scenes_for_download.append(scene)

            # Download assets
            query = PlanetScopeQuery()
            asset_manager = AssetManager(query.auth)
            downloads = await asset_manager.activate_and_download_assets(
                scenes=scenes_for_download,
                clip_to_roi=milan_roi
            )
            downloaded_files = [
                d.downloaded_file_path for d in downloads
                if d.status.name == 'COMPLETED'
            ]
            if downloaded_files:
                # Create a new GeoPackage with imagery
                from planetscope_py import GeoPackageManager, GeoPackageConfig
                config = GeoPackageConfig(
                    include_imagery=True,
                    clip_to_roi=True,
                    attribute_schema="comprehensive"
                )
                manager = GeoPackageManager(config=config)
                imagery_gpkg = manager.create_scene_geopackage(
                    scenes=scenes_for_download,
                    output_path="milan_with_imagery.gpkg",
                    roi=milan_roi,
                    downloaded_files=downloaded_files
                )
                print(f"GeoPackage with imagery: {imagery_gpkg}")
                print(f"Downloaded files: {len(downloaded_files)}")
        except ImportError:
            print("Asset management not available - skipping download")
    return result

# Run the complete workflow (uncomment inside an async context to use)
# result = await complete_imagery_workflow()
Migration from Manual Approach
Before (Manual Approach)
# OLD: Manual approach (your original code)
from datetime import datetime
from shapely.geometry import Polygon
from planetscope_py import PlanetScopeQuery, GeoPackageManager, GeoPackageConfig

# Create polygon
milan_polygon = Polygon([
    [8.7, 45.1], [9.8, 44.9], [10.3, 45.3], [10.1, 45.9],
    [9.5, 46.2], [8.9, 46.0], [8.5, 45.6], [8.7, 45.1]
])

# Search scenes
query = PlanetScopeQuery()
results = query.search_scenes(
    geometry=milan_polygon,
    start_date="2025-01-05",
    end_date="2025-01-25",
    cloud_cover_max=1.0,
    item_types=["PSScene"]
)
scenes = results.get('features', [])

# Create GeoPackage
if scenes:
    config = GeoPackageConfig(
        clip_to_roi=True,
        attribute_schema="standard"
    )
    manager = GeoPackageManager(config=config)
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    output_path = f"milan_large_{timestamp}.gpkg"
    manager.create_scene_geopackage(
        scenes=scenes,
        output_path=output_path,
        roi=milan_polygon
    )
After (One-Liner Approach)
# NEW: One-liner approach (equivalent functionality)
from planetscope_py import create_scene_geopackage
from shapely.geometry import Polygon

# Same polygon
milan_polygon = Polygon([
    [8.7, 45.1], [9.8, 44.9], [10.3, 45.3], [10.1, 45.9],
    [9.5, 46.2], [8.9, 46.0], [8.5, 45.6], [8.7, 45.1]
])

# One-liner equivalent
gpkg_path = create_scene_geopackage(
    milan_polygon,
    "2025-01-05/2025-01-25",
    clip_to_roi=True,
    cloud_cover_max=1.0
)

# Or even simpler with the Milan preset:
# gpkg_path = create_milan_geopackage("2025-01-05/2025-01-25", size="large")
Summary
The enhanced GeoPackage Export module provides both comprehensive control through the original GeoPackageManager class and simplified workflows through the new one-liner functions:
Key Benefits of One-Liner Functions:
- Reduced complexity: 20+ lines → 3 lines
- Built-in error handling: Automatic fallback strategies
- Comprehensive Planet API support: All search parameters available
- Enhanced metadata: Improved standard schema with most Planet API fields
- Automatic validation: Built-in quality checks
- Batch processing: Multiple ROIs in single call
- Preset configurations: Common scenarios pre-configured (Milan areas)
- Detailed statistics: Comprehensive analysis results
When to Use Each Approach:
Use One-Liner Functions When:
- Quick analysis and prototyping
- Standard workflows
- Batch processing multiple areas
- Learning and testing
- Integration with other planetscope-py modules
Use Full GeoPackageManager When:
- Custom metadata processing
- Complex raster inclusion workflows
- Fine-grained control over configuration
- Advanced error handling requirements
- Custom schema definitions
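As one sketch of what "custom schema definitions" can involve: GeoPackage vector attributes map to SQLite storage classes, so a custom field mapping is worth sanity-checking before it reaches GeoPackageManager. The helper below is hypothetical (not part of the planetscope-py API) and only illustrates the kind of validation a custom-schema workflow might add:

```python
# Storage classes a GeoPackage attribute column can reasonably use.
ALLOWED_TYPES = {"TEXT", "INTEGER", "REAL", "BLOB", "BOOLEAN", "DATE", "DATETIME"}

def validate_custom_schema(schema):
    """Return a list of problems with a {field_name: type_name} schema (empty = OK)."""
    problems = []
    for field, type_name in schema.items():
        if not field.isidentifier():
            problems.append(f"invalid field name: {field!r}")
        if type_name.upper() not in ALLOWED_TYPES:
            problems.append(f"unsupported type for {field!r}: {type_name}")
    return problems

# A mapping modeled on the attributes used throughout this guide:
issues = validate_custom_schema({
    "scene_id": "TEXT",
    "cloud_cover": "REAL",
    "acquired": "DATETIME",
})
```

An empty `issues` list means the mapping is safe to hand to whatever custom-schema hook your workflow builds on top of GeoPackageManager.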
Both approaches are fully compatible and can be mixed in the same workflow. The one-liner functions internally use the same robust GeoPackageManager infrastructure, ensuring consistent quality and reliability.