Storage Backend - FeitianTech/postquantum-webauthn-platform GitHub Wiki
- Introduction
- Dual-Layer Architecture
- Core Storage Components
- Data Access Patterns
- Object Naming Conventions
- Local Storage Fallback
- Error Handling and Retry Mechanisms
- Performance Considerations
- Security and Encryption
- Migration Strategies
- Testing and Validation
- Troubleshooting Guide
The PostQuantum WebAuthn Platform implements a dual-layer storage backend that provides flexible, scalable, and resilient persistence for cryptographic credentials, session metadata, device logs, and MDS snapshots. The architecture separates abstract storage interfaces from concrete implementations, enabling seamless switching between local filesystem storage and Google Cloud Storage while maintaining data consistency and performance.
The storage system serves multiple critical functions:
- Secure credential storage for WebAuthn authentication
- Session metadata management for user state persistence
- Device registration logging for audit trails
- Metadata synchronization for authenticator verification
- Backup and disaster recovery capabilities
The storage backend follows a layered design pattern with clear separation between abstraction and implementation:
```mermaid
graph TB
    subgraph "Application Layer"
        A[WebAuthn Routes]
        B[Session Management]
        C[Device Logging]
        D[Metadata Services]
    end
    subgraph "Storage Abstraction Layer"
        E[storage.py<br/>Credential Storage]
        F[session_metadata_store.py<br/>Session Metadata]
        G[device_logs.py<br/>Device Logging]
        H[metadata.py<br/>Metadata Management]
    end
    subgraph "Implementation Layer"
        I[cloud_storage.py<br/>Google Cloud Storage]
        J[Local Filesystem<br/>Fallback Storage]
    end
    subgraph "External Dependencies"
        K[Google Cloud Storage API]
        L[GitHub API]
        M[Filesystem]
    end
    A --> E
    B --> F
    C --> G
    D --> H
    E --> I
    E --> J
    F --> I
    F --> J
    G --> L
    H --> I
    H --> J
    H --> L
    I --> K
    J --> M
```
Diagram sources
- server/server/storage.py
- server/server/cloud_storage.py
- server/server/session_metadata_store.py
Application Layer: Contains route handlers and business logic that utilize storage services for data persistence and retrieval.
Storage Abstraction Layer: Provides unified interfaces for different types of data storage, hiding implementation details from higher layers.
Implementation Layer: Contains concrete implementations for cloud and local storage backends.
External Dependencies: Interfaces with external systems like Google Cloud Storage and GitHub for data persistence.
Section sources
- server/server/storage.py
- server/server/session_metadata_store.py
The credential storage system manages WebAuthn credentials with robust session isolation and cross-platform compatibility:
```mermaid
classDiagram
    class StorageInterface {
        +savekey(name, key, session_id)
        +readkey(name, session_id)
        +delkey(name, session_id)
        +iter_credentials(session_id)
        +list_credentials(session_id)
    }
    class CloudStorage {
        +upload_bytes(blob_name, data, content_type)
        +download_bytes(blob_name)
        +delete_blob(blob_name, missing_ok)
        +list_blob_names(prefix)
        +blob_exists(blob_name)
    }
    class LocalStorage {
        +_local_filename(name, session_id, create)
        +_local_directory(session_id, create)
        +_legacy_local_filename(name)
    }
    class BlobNaming {
        +_credential_blob(name, session_id)
        +_legacy_credential_blob(name)
        +_credential_prefix(session_id)
        +_user_root_prefix(session_id)
    }
    StorageInterface --> CloudStorage : "uses when GCS enabled"
    StorageInterface --> LocalStorage : "uses when GCS disabled"
    StorageInterface --> BlobNaming : "uses for naming"
```
Diagram sources
- server/server/storage.py
- server/server/cloud_storage.py
Session metadata storage provides persistent state management with automatic cleanup and activity tracking:
```mermaid
sequenceDiagram
    participant App as Application
    participant SSM as Session Metadata Store
    participant CS as Cloud Storage
    participant FS as Local Filesystem
    App->>SSM : ensure_session(session_id)
    SSM->>SSM : _using_gcs()
    alt GCS Enabled
        SSM->>CS : touch_last_access(session_id)
        CS-->>SSM : success
    else Local Storage
        SSM->>FS : _local_session_directory(session_id, create=True)
        FS-->>SSM : directory_path
    end
    App->>SSM : write_file(session_id, filename, data)
    SSM->>SSM : _session_blob(session_id, filename)
    alt GCS Enabled
        SSM->>CS : upload_bytes(blob_name, data, content_type)
        CS-->>SSM : success
    else Local Storage
        SSM->>FS : write to local file
        FS-->>SSM : success
    end
    App->>SSM : list_files(session_id)
    alt GCS Enabled
        SSM->>CS : list_blob_names(prefix)
        CS-->>SSM : blob_names[]
    else Local Storage
        SSM->>FS : os.listdir(directory)
        FS-->>SSM : filenames[]
    end
```
Diagram sources
- server/server/session_metadata_store.py
- server/server/storage.py
Section sources
- server/server/storage.py
- server/server/session_metadata_store.py
The storage system supports comprehensive credential lifecycle management:
| Operation | Description | Implementation Pattern |
|---|---|---|
| Save | Persist credential data with session isolation | Pickle serialization + blob upload |
| Load | Retrieve credentials with fallback support | Multi-blob search with legacy compatibility |
| Delete | Remove credentials with cleanup | Both current and legacy blob removal |
| Iterate | Bulk credential enumeration | Streaming blob listing with filtering |
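The save/load/delete patterns in the table above can be sketched for the local-filesystem side. The function and path names below are illustrative stand-ins (the real `storage.py` API uses `savekey`/`readkey`/`delkey` over either backend); the `./data` root is an assumption. Note the load path tries the session-scoped location first and falls back to the legacy root-level file:

```python
import pickle
from pathlib import Path

STORAGE_ROOT = Path("./data")  # assumption: configurable storage root


def _credential_path(name: str, session_id: str) -> Path:
    # Current format: {session_id}/user-data/credentials/{name}_credential_data.pkl
    return STORAGE_ROOT / session_id / "user-data" / "credentials" / f"{name}_credential_data.pkl"


def _legacy_credential_path(name: str) -> Path:
    # Legacy format: root-level file with no session isolation
    return STORAGE_ROOT / f"{name}_credential_data.pkl"


def save_key(name: str, key: object, session_id: str) -> None:
    path = _credential_path(name, session_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(pickle.dumps(key))


def read_key(name: str, session_id: str) -> object:
    # Multi-path search: session-scoped location first, then legacy fallback.
    for path in (_credential_path(name, session_id), _legacy_credential_path(name)):
        if path.exists():
            return pickle.loads(path.read_bytes())
    raise KeyError(name)


def del_key(name: str, session_id: str) -> None:
    # Remove both the current and the legacy file, ignoring missing ones.
    for path in (_credential_path(name, session_id), _legacy_credential_path(name)):
        path.unlink(missing_ok=True)
```

Pickle is convenient for credential objects but should only ever be deserialized from trusted storage, since `pickle.loads` can execute arbitrary code on malicious input.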
Session metadata follows a hierarchical structure with automatic lifecycle management:
| Operation | Purpose | Cleanup Policy |
|---|---|---|
| Touch Access | Update activity timestamp | Automatic pruning after 14 days |
| List Files | Enumerate session contents | Prefix-based filtering |
| Write File | Store arbitrary data | Content-type aware uploads |
| Delete Session | Complete session removal | Recursive blob deletion |
Device logs implement asynchronous upload with structured data preservation:
```mermaid
flowchart TD
    A[Registration Event] --> B[Build Log Payload]
    B --> C[Convert to JSON-Safe Format]
    C --> D[Generate Unique Path]
    D --> E[Upload to GitHub]
    E --> F[Log Upload Status]
    G[Error Handling] --> H[Retry Logic]
    H --> I[Failure Logging]
    B --> G
    E --> G
```
Diagram sources
- server/server/device_logs.py
Section sources
- server/server/storage.py
- server/server/session_metadata_store.py
- server/server/device_logs.py
The storage system employs consistent naming patterns for different data types:
Credential data follows a structured naming convention that ensures uniqueness and organization:
- Current Format: `{session_id}/user-data/credentials/{username}_credential_data.pkl`
- Legacy Format: `{username}_credential_data.pkl` (root level)
- Prefix Structure: `{session_id}/{folder_prefix}/{subdir}/` (unknown)
Session metadata uses a hierarchical structure for organized storage:
- Root Prefix: `{session_id}/user-data/metadata/`
- Access Marker: `.last-access` (hidden file for activity tracking)
- File Extensions: `.json` for metadata, `.meta.json` for metadata info
Device logs implement time-based organization with AAGUID grouping:
- Path Pattern: `logs/{aaguid}/{timestamp}_{shortid}.json`
- Timestamp Format: `YYYYMMDDTHHMMSSZ`
- Short ID: Random 8-character identifier for uniqueness
The system supports configurable bucket management with environment-based configuration:
| Environment Variable | Purpose | Default Behavior |
|---|---|---|
| `FIDO_SERVER_GCS_BUCKET` | Target storage bucket | Required for cloud storage |
| `FIDO_SERVER_GCS_ENABLED` | Enable cloud storage | Disabled by default |
| `FIDO_SERVER_GCS_CREDENTIALS_FILE` | Service account file | Uses default credentials |
| `FIDO_SERVER_GCS_PROJECT` | Target project | Inherits from credentials |
Section sources
- server/server/storage.py
- server/server/session_metadata_store.py
- server/server/device_logs.py
The storage system implements intelligent fallback mechanisms when cloud storage is unavailable:
```mermaid
flowchart TD
    A[Storage Operation] --> B{GCS Enabled?}
    B --> |Yes| C{Bucket Available?}
    B --> |No| D[Use Local Storage]
    C --> |Yes| E[Use Cloud Storage]
    C --> |No| F[Log Warning]
    F --> D
    D --> G[Local Filesystem]
    E --> H[Google Cloud Storage]
```
Diagram sources
- server/server/storage.py
- server/server/cloud_storage.py
Credential Storage Fallback:
- Primary: Cloud storage with session isolation
- Secondary: Local filesystem with session directory creation
- Legacy Support: Backward compatibility with root-level files
Session Metadata Fallback:
- Primary: Cloud storage for scalability
- Secondary: Local filesystem for offline capability
- Cleanup: Automatic pruning of inactive sessions
Device Logging Fallback:
- Primary: GitHub integration for audit trails
- Secondary: Local logging for development
- Asynchronous: Background upload with graceful degradation
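The backend-selection decision shared by all three fallback paths reduces to a small predicate. This is a sketch of the flow shown in the diagram above, with an assumed `bucket_available` probe rather than the real `_using_gcs()` helper:

```python
import logging

logger = logging.getLogger(__name__)


def select_backend(gcs_enabled: bool, bucket_available) -> str:
    """Pick a backend: cloud only when enabled AND reachable, else local."""
    if not gcs_enabled:
        return "local"
    if bucket_available():  # probe connectivity/permissions before committing
        return "gcs"
    logger.warning("GCS enabled but bucket unavailable; falling back to local storage")
    return "local"
```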
The system supports seamless migration between storage backends through:
- Data Export: Serialize all data to portable format
- Validation: Verify data integrity across backends
- Atomic Switch: Replace backend configuration without data loss
- Rollback Capability: Restore previous configuration if needed
Section sources
- server/server/storage.py
- server/server/session_metadata_store.py
The cloud storage layer implements exponential backoff with configurable retry parameters:
```mermaid
sequenceDiagram
    participant Client as Storage Client
    participant Retry as Retry Handler
    participant GCS as Google Cloud Storage
    Client->>Retry : Operation Request
    Retry->>GCS : Attempt Operation
    GCS-->>Retry : Transient Error
    Retry->>Retry : Calculate Delay
    Retry->>Retry : Sleep (Backoff)
    Retry->>GCS : Retry Operation
    GCS-->>Retry : Success/Failure
    alt Max Attempts Reached
        Retry-->>Client : Final Failure
    else Success
        Retry-->>Client : Operation Result
    end
```
Diagram sources
- server/server/cloud_storage.py
The system categorizes exceptions into retryable and non-retryable types:
| Exception Category | Examples | Retry Behavior |
|---|---|---|
| Transient | Network timeouts, rate limits, temporary unavailability | Exponential backoff with max attempts |
| Permanent | Authentication failures, resource not found | Immediate failure with error reporting |
| Unknown | Unexpected errors, system failures | Conservative retry with logging |
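The backoff-and-classification behaviour described above can be sketched in a generic form. Which exception types count as transient is project-specific; `ConnectionError`/`TimeoutError` below stand in for the GCS client's retryable errors, and the parameter defaults are assumptions rather than the values in `cloud_storage.py`:

```python
import random
import time

# Assumption: stand-ins for the cloud client's retryable exception types.
RETRYABLE = (ConnectionError, TimeoutError)


def with_retries(operation, max_attempts=5, base_delay=0.5, max_delay=30.0):
    """Run operation(), retrying transient failures with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RETRYABLE:
            if attempt == max_attempts:
                raise  # retry budget exhausted: surface the final failure
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            time.sleep(delay + random.uniform(0, delay / 2))  # jitter avoids thundering herd
        # Non-retryable exceptions (auth failures, not-found) propagate immediately.
```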
During network outages, the system maintains operational continuity through:
- Graceful Degradation: Continue using local storage when cloud is unavailable
- Queue Management: Buffer operations for later replay
- Health Monitoring: Track connectivity status and alert on persistent failures
- Circuit Breaker: Prevent cascading failures during extended outages
The system provides eventual consistency through:
- Optimistic Concurrency: Allow concurrent operations with conflict resolution
- Idempotent Operations: Ensure repeated operations produce same results
- Transaction Simulation: Use atomic operations where supported
- Conflict Resolution: Merge conflicting updates with conflict detection
Section sources
- server/server/cloud_storage.py
- server/server/storage.py
The storage system implements several strategies to minimize latency:
Connection Pooling: Reuse HTTP connections for multiple operations to reduce handshake overhead.
Caching Strategy: Implement multi-level caching for frequently accessed metadata and credentials.
Batch Operations: Group multiple small operations into larger batches to reduce API call overhead.
Async Processing: Use asynchronous uploads for non-critical operations like device logs.
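The caching strategy mentioned above could be as simple as a time-bounded read-through cache in front of the backend. This is an illustrative sketch, not a cache the platform necessarily ships:

```python
import time


class TTLCache:
    """Tiny time-based read-through cache for frequently accessed metadata."""

    def __init__(self, ttl_seconds: float = 60.0):
        self._ttl = ttl_seconds
        self._entries: dict[str, tuple[float, bytes]] = {}

    def get_or_load(self, key: str, loader):
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry and now - entry[0] < self._ttl:
            return entry[1]        # fresh hit: skip the backend round-trip
        value = loader(key)        # miss or expired: fetch and re-cache
        self._entries[key] = (now, value)
        return value
```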
The system supports different storage classes based on access patterns:
| Storage Class | Use Case | Cost Factor | Performance |
|---|---|---|---|
| Standard | Active credentials and session data | Medium | High throughput |
| Nearline | Historical session metadata | Low | Moderate latency |
| Coldline | Long-term device logs | Very Low | Higher latency |
| Archive | Compliance logs | Lowest | Batch access only |
For bulk operations, the system provides:
- Bulk Upload: Efficiently upload multiple files in single operation
- Parallel Processing: Concurrent processing of independent operations
- Progress Tracking: Monitor long-running batch operations
- Error Recovery: Resume interrupted batch operations
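Parallel processing with per-item error recovery can be sketched with a thread pool. The `upload_fn` parameter is a placeholder for whatever single-blob upload the backend exposes; collecting per-item results is what makes an interrupted batch resumable:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def batch_upload(items, upload_fn, max_workers=8):
    """Upload (name, data) pairs concurrently; return a per-item status map."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(upload_fn, name, data): name for name, data in items}
        for future in as_completed(futures):
            name = futures[future]
            try:
                future.result()
                results[name] = "ok"
            except Exception as exc:  # record failures so the batch can be resumed
                results[name] = f"error: {exc}"
    return results
```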
The architecture supports horizontal scaling through:
- Partitioning: Distribute data across multiple storage units
- Replication: Maintain multiple copies for availability
- Load Balancing: Distribute requests across storage instances
- Auto-scaling: Adjust resources based on demand
Section sources
- server/server/cloud_storage.py
- server/server/session_metadata_store.py
The storage system implements comprehensive security measures:
Encryption in Transit: All data transmitted to/from cloud storage uses TLS 1.2+ encryption.
Encryption at Rest: Supports customer-managed encryption keys for sensitive data.
Access Control: Implements principle of least privilege with role-based access controls.
The system integrates with Google Cloud IAM for fine-grained access control:
```mermaid
graph LR
    A[Service Account] --> B[Storage Bucket]
    B --> C[Blob Operations]
    C --> D[Read Access]
    C --> E[Write Access]
    C --> F[Delete Access]
    G[Environment Variables] --> H[Credential Management]
    H --> I[Automatic Refresh]
    I --> J[Secure Storage]
```
Diagram sources
- server/server/cloud_storage.py
Multiple authentication mechanisms are supported:
| Method | Use Case | Security Level |
|---|---|---|
| Service Account Keys | Production deployments | High |
| Workload Identity | Kubernetes environments | High |
| Default Credentials | Development/testing | Medium |
| OAuth 2.0 Tokens | Interactive applications | Medium |
The system provides comprehensive audit capabilities:
- Operation Logging: Track all storage operations with timestamps
- Access Monitoring: Monitor who accessed what data and when
- Anomaly Detection: Identify unusual access patterns
- Compliance Reporting: Generate reports for regulatory requirements
Section sources
- server/server/cloud_storage.py
- server/server/github_client.py
Successful migration between storage backends requires careful planning:
Pre-Migration Assessment:
- Inventory all stored data and dependencies
- Assess current storage utilization and growth patterns
- Identify potential compatibility issues
- Plan rollback strategy and timeline
Migration Execution:
- Phase 1: Export data from source backend
- Phase 2: Validate data integrity in target backend
- Phase 3: Switch configuration to target backend
- Phase 4: Verify functionality and monitor performance
Post-Migration Validation:
- Test all critical operations
- Verify data completeness and accuracy
- Monitor for performance regressions
- Document lessons learned
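The Phase 2 integrity check above amounts to comparing content digests between backends. In this sketch the two backends are modeled as plain name-to-bytes mappings for illustration; a real migration tool would stream blobs through the respective clients:

```python
import hashlib


def blob_digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_migration(source: dict[str, bytes], target: dict[str, bytes]) -> list[str]:
    """Return the names of blobs that are missing or differ in the target backend."""
    mismatches = []
    for name, data in source.items():
        if name not in target or blob_digest(target[name]) != blob_digest(data):
            mismatches.append(name)
    return mismatches
```

An empty result means every source blob arrived intact, which is the precondition for the Phase 3 configuration switch.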
The system provides utilities for automated migration:
```mermaid
flowchart TD
    A[Migration Command] --> B[Source Validation]
    B --> C[Target Preparation]
    C --> D[Data Transfer]
    D --> E[Integrity Verification]
    E --> F[Configuration Update]
    F --> G[Cleanup Old Data]
    H[Error Handling] --> I[Rollback Option]
    I --> J[Partial Recovery]
```
Diagram sources
- server/server/startup.py
For production environments, zero-downtime migration is achieved through:
- Shadow Mode: Run both backends simultaneously during migration
- Read-Through Write-Back: Read from old backend, write to new backend
- Gradual Cutover: Phase out old backend gradually
- Monitoring: Continuous monitoring during migration process
Section sources
- server/server/startup.py
The storage system includes extensive testing coverage:
- Unit Tests: Individual component testing with mocked dependencies
- Integration Tests: End-to-end testing with real storage backends
- Performance Tests: Load testing and latency measurement
- Compatibility Tests: Cross-platform and cross-version testing
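A unit test with mocked dependencies typically substitutes an in-memory fake for the cloud client. The fake below implements the same method names as the `CloudStorage` surface shown earlier (`upload_bytes`, `download_bytes`, `blob_exists`, `list_blob_names`); the test case itself is a hypothetical example, not one from `tests/test_storage.py`:

```python
import unittest


class InMemoryStorage:
    """Minimal stand-in for the cloud storage client, for use in unit tests."""

    def __init__(self):
        self._blobs: dict[str, bytes] = {}

    def upload_bytes(self, blob_name, data, content_type="application/octet-stream"):
        self._blobs[blob_name] = data

    def download_bytes(self, blob_name):
        return self._blobs[blob_name]

    def blob_exists(self, blob_name):
        return blob_name in self._blobs

    def list_blob_names(self, prefix=""):
        return sorted(n for n in self._blobs if n.startswith(prefix))


class StorageTests(unittest.TestCase):
    def test_roundtrip_and_listing(self):
        store = InMemoryStorage()
        store.upload_bytes("s1/user-data/credentials/alice.pkl", b"blob")
        self.assertEqual(store.download_bytes("s1/user-data/credentials/alice.pkl"), b"blob")
        self.assertEqual(store.list_blob_names("s1/"), ["s1/user-data/credentials/alice.pkl"])
        self.assertFalse(store.blob_exists("missing"))
```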
The testing framework supports multiple scenarios:
```mermaid
graph TB
    A[Test Runner] --> B[Unit Tests]
    A --> C[Integration Tests]
    A --> D[Performance Tests]
    B --> E[Mocked Storage]
    C --> F[Real Storage]
    D --> G[Load Generator]
    H[Environment Setup] --> I[Cloud Emulator]
    H --> J[Local Filesystem]
    H --> K[Network Simulator]
```
Diagram sources
- tests/test_storage.py
- tests/test_cloud_storage.py
The system implements multiple validation approaches:
- Data Integrity Checks: Verify data consistency across storage operations
- Cross-Backend Validation: Compare results between different storage backends
- Performance Benchmarking: Measure and compare performance characteristics
- Security Auditing: Validate access controls and encryption implementation
Section sources
- tests/test_storage.py
- tests/test_cloud_storage.py
Storage Unavailable:
- Verify GCS bucket configuration and permissions
- Check network connectivity to Google Cloud endpoints
- Review service account credentials and expiration
Performance Degradation:
- Monitor API rate limits and adjust retry parameters
- Consider storage class optimization for access patterns
- Evaluate connection pooling and keep-alive settings
Data Corruption:
- Validate data integrity using checksums
- Check for concurrent modification conflicts
- Review backup and recovery procedures
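Checksum-based validation can be sketched by storing a digest alongside each blob and verifying it on read. The sidecar `.sha256` naming and the dict-backed store below are illustrative assumptions:

```python
import hashlib


def upload_with_checksum(store: dict, name: str, data: bytes) -> None:
    # Persist the payload together with its SHA-256 so corruption is detectable.
    store[name] = data
    store[name + ".sha256"] = hashlib.sha256(data).hexdigest().encode()


def download_verified(store: dict, name: str) -> bytes:
    # Recompute the digest on read and refuse to return corrupted data.
    data = store[name]
    expected = store[name + ".sha256"].decode()
    if hashlib.sha256(data).hexdigest() != expected:
        raise ValueError(f"checksum mismatch for {name}")
    return data
```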
The system provides built-in diagnostic capabilities:
- Health Checks: Verify storage backend connectivity and functionality
- Metrics Collection: Monitor operation counts, latencies, and error rates
- Logging: Comprehensive logging with configurable verbosity levels
- Alerting: Configure alerts for critical storage events
For storage-related incidents, follow these recovery steps:
- Assessment: Identify the scope and impact of the issue
- Isolation: Determine affected components and data
- Recovery: Execute appropriate recovery procedures
- Validation: Verify system functionality post-recovery
- Documentation: Record incident details and lessons learned
Section sources
- server/server/startup.py
- server/server/cloud_storage.py