Storage & Persistence - FeitianTech/postquantum-webauthn-platform GitHub Wiki
- Introduction
- Data Models
- Dual Storage Backend Implementation
- Credential Storage and Retrieval
- Session Metadata Management
- Credential Artifacts Storage
- Device Logs and Metadata
- Data Lifecycle and Security
- Conclusion
The post-quantum WebAuthn platform implements a comprehensive storage and persistence system designed to securely manage WebAuthn credentials, session metadata, device logs, and configuration settings. The system features a dual storage backend that supports both local file storage and Google Cloud Storage (GCS), providing flexibility for different deployment environments. This documentation details the data models, access patterns, serialization formats, and implementation specifics of the storage system, with a focus on security, data integrity, and efficient data management.
Section sources
- storage.py
- cloud_storage.py
The storage system manages several key data entities, each with specific attributes and relationships. The primary data models include credential artifacts, session metadata, device logs, and configuration settings.
WebAuthn credentials are stored with comprehensive metadata including user handles, public keys, and registration details. The credential data model includes:
- User Information: Email, username, display name, and user handle
- Credential Metadata: Credential ID, public key, algorithm, sign count, and creation timestamp
- Authenticator Details: AAGUID (Authenticator Attestation GUID), authenticator attachment type, and resident key status
- Security Flags: User presence (UP), user verification (UV), attestation (AT), and extension data (ED) flags
- Attestation Data: Attestation format, attestation statement, and decoded attestation object
The data model also includes client extension outputs and properties specific to the authenticator implementation.
Session metadata tracks temporary authentication state and user session information with the following attributes:
- Session Identifier: Cryptographically secure random token used to identify the session
- Last Access Timestamp: Unix timestamp indicating the most recent activity in the session
- Metadata Entries: Custom metadata statements uploaded during the session
- Cookie Management: HTTP-only, secure cookies with SameSite=None for cross-origin requests
- Cleanup Policy: Automatic cleanup of inactive sessions after 14 days of inactivity
Device registration events are logged with detailed information for auditing and analysis:
- Registration Timestamp: ISO 8601 formatted timestamp in UTC and local time (CST)
- Relying Party Information: RP ID and name
- User Details: User ID (bytes), username, and display name
- Credential Information: Credential ID, COSE public key, sign count, and transports
- Authenticator Data: AAGUID, device name from MDS, and attestation format
- Raw Data: Base64URL-encoded attestation object and client data JSON
The system configuration includes environment variables and constants that control storage behavior:
- Storage Backend Configuration: GCS bucket name, credentials file, and project ID
- Path Configuration: Base paths for credential storage, session metadata, and credential artifacts
- Security Settings: Secret key for Flask sessions and trusted attestation CA subjects
- Metadata Configuration: URLs and paths for FIDO metadata service (MDS) data
```mermaid
erDiagram
    CREDENTIALS {
        string email PK
        string credentialId
        string publicKey
        string algorithm
        int signCount
        string aaguid
        string createdAt
        boolean residentKey
        string authenticatorAttachment
    }
    SESSION_METADATA {
        string sessionId PK
        float lastAccessTimestamp
        string metadataEntries
        string cookieValue
        datetime createdAt
    }
    DEVICE_LOGS {
        string logId PK
        datetime timestamp
        string rpId
        bytes userId
        string userName
        bytes credentialId
        map publicKeyCose
        int signCount
        string aaguid
        string deviceNameMds
        string attestationFormat
        bytes attestationObject
        bytes clientDataJson
    }
    CONFIGURATION {
        string gcsBucket
        string gcsCredentialsFile
        string basePath
        string secretKey
        string rpName
        string rpId
    }
    CREDENTIALS ||--o{ SESSION_METADATA : "stored_in"
    DEVICE_LOGS }o--|| CREDENTIALS : "records"
    CONFIGURATION }|--|| CREDENTIALS : "controls"
    CONFIGURATION }|--|| SESSION_METADATA : "controls"
```
Diagram sources
- storage.py
- session_metadata_store.py
- device_logs.py
- config.py
Section sources
- storage.py
- session_metadata_store.py
- device_logs.py
- config.py
The storage system implements a dual backend architecture that supports both local file storage and Google Cloud Storage, allowing seamless migration between development and production environments.
The local file storage implementation in storage.py provides a file-based persistence mechanism for WebAuthn credentials:
- Directory Structure: Credentials are stored in session-specific directories under `session-credentials/`
- File Naming: Credential files follow the pattern `{username}_credential_data.pkl`
- Data Serialization: Credentials are serialized using Python's pickle module for efficient storage
- Legacy Support: The system maintains backward compatibility with legacy credential storage locations
- Error Handling: Comprehensive error handling for file operations with appropriate fallbacks
The local storage system also includes automatic directory creation and permission management to ensure reliable operation across different deployment environments.
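Putting the points above together, the local backend amounts to pickling a list of credentials into a per-session file. A simplified sketch, assuming the directory layout and file-name pattern described above (the real storage.py additionally handles legacy locations and richer error cases):

```python
import os
import pickle
from typing import Any, List

BASE_DIR = "session-credentials"  # directory name taken from the text above

def _credential_path(session_id: str, username: str, create: bool = False) -> str:
    """Map a session and username to the {username}_credential_data.pkl file."""
    directory = os.path.join(BASE_DIR, session_id)
    if create:
        os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, f"{username}_credential_data.pkl")

def savekey(username: str, credentials: List[Any], session_id: str) -> None:
    """Serialize the credential list with pickle and write it to disk."""
    with open(_credential_path(session_id, username, create=True), "wb") as fh:
        pickle.dump(credentials, fh)

def readkey(username: str, session_id: str) -> List[Any]:
    """Read the credential list back; empty list when nothing is stored."""
    path = _credential_path(session_id, username)
    if not os.path.exists(path):
        return []  # the real implementation would also check legacy locations
    with open(path, "rb") as fh:
        return pickle.load(fh)
```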
The GCS integration in cloud_storage.py provides a scalable, cloud-based storage solution:
- Client Initialization: The system creates a GCS client using service account credentials from environment variables
- Retry Mechanisms: Robust retry logic with exponential backoff for transient failures
- Thread Safety: Thread-safe client initialization using locks to prevent race conditions
- Blob Management: Utilities for uploading, downloading, and deleting blobs with proper error handling
- List Operations: Efficient listing of blob names with prefix-based filtering
The GCS implementation supports multiple authentication methods including service account files, JSON credentials, and default application credentials, providing flexibility for different deployment scenarios.
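The retry pattern described above can be sketched as a small wrapper with exponential backoff. This is an illustrative stand-in for cloud_storage._with_retry, which may classify only certain GCS exceptions as transient:

```python
import time
from typing import Callable, TypeVar

_T = TypeVar("_T")

def with_retry(operation: Callable[[], _T], max_attempts: int = 3,
               base_delay: float = 0.5) -> _T:
    """Run an operation, retrying failures with exponential backoff
    (0.5s, 1s, 2s, ...). The final failure is re-raised to the caller."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * (2 ** (attempt - 1)))
    raise AssertionError("unreachable")
```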
The system implements a clean abstraction layer that seamlessly switches between storage backends:
- Environment Detection: The system checks the `FIDO_SERVER_GCS_ENABLED` environment variable and bucket configuration
- Path Resolution: Dynamic path resolution that maps logical storage locations to physical GCS blobs or file paths
- Fallback Mechanisms: Automatic fallback to legacy storage locations for backward compatibility
- Unified API: A consistent interface for storage operations regardless of the underlying backend
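Backend selection then reduces to inspecting the environment. A hedged sketch — `FIDO_SERVER_GCS_ENABLED` comes from the text above, while the bucket variable name is assumed for illustration:

```python
def select_backend(env: dict) -> str:
    """Pick the storage backend from environment-style settings.
    FIDO_SERVER_GCS_ENABLED is named in the documentation; the bucket
    variable name here is a guess for illustration."""
    enabled = env.get("FIDO_SERVER_GCS_ENABLED", "").strip().lower() in ("1", "true", "yes")
    bucket = env.get("FIDO_SERVER_GCS_BUCKET", "").strip()
    return "gcs" if enabled and bucket else "local"
```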
```mermaid
classDiagram
    class StorageBackend {
        <<interface>>
        +savekey(name : str, key : Any, session_id : Optional[str]) None
        +readkey(name : str, session_id : Optional[str]) List[Any]
        +delkey(name : str, session_id : Optional[str]) None
        +iter_credentials(session_id : Optional[str]) Iterator[Tuple[str, List[Any]]]
        +list_credentials(session_id : Optional[str]) Dict[str, List[Any]]
    }
    class LocalStorage {
        -_LOCAL_CREDENTIAL_BASE : str
        -_local_directory(session_id : str, create : bool) str
        -_local_filename(name : str, session_id : str, create : bool) str
        +savekey(name : str, key : Any, session_id : Optional[str]) None
        +readkey(name : str, session_id : Optional[str]) List[Any]
        +delkey(name : str, session_id : Optional[str]) None
        +iter_credentials(session_id : Optional[str]) Iterator[Tuple[str, List[Any]]]
        +list_credentials(session_id : Optional[str]) Dict[str, List[Any]]
    }
    class CloudStorage {
        -_CLIENT : Optional[storage.Client]
        -_BUCKET : Optional[storage.Bucket]
        -_CLIENT_LOCK : threading.Lock
        +ensure_ready(max_attempts : int, retry_delay : float) None
        +_with_retry(operation : Callable[[], _T], max_attempts : int, base_delay : float) _T
        +build_blob_name(*components : str, prefix : Optional[str]) str
        +upload_bytes(blob_name : str, data : bytes, content_type : Optional[str]) None
        +download_bytes(blob_name : str) Optional[bytes]
        +delete_blob(blob_name : str, missing_ok : bool) None
        +list_blob_names(prefix : str) Iterable[str]
        +blob_exists(blob_name : str) bool
        +blob_updated_timestamp(blob_name : str) Optional[float]
        +savekey(name : str, key : Any, session_id : Optional[str]) None
        +readkey(name : str, session_id : Optional[str]) List[Any]
        +delkey(name : str, session_id : Optional[str]) None
        +iter_credentials(session_id : Optional[str]) Iterator[Tuple[str, List[Any]]]
        +list_credentials(session_id : Optional[str]) Dict[str, List[Any]]
    }
    StorageBackend <|-- LocalStorage
    StorageBackend <|-- CloudStorage
    storage "1" *-- "1..*" StorageBackend
    cloud_storage "1" *-- "1..*" CloudStorage
```
Diagram sources
- storage.py
- cloud_storage.py
Section sources
- storage.py
- cloud_storage.py
The credential storage system provides a comprehensive API for managing WebAuthn credentials throughout their lifecycle.
Credentials are serialized using Python's pickle module, which provides efficient binary serialization of complex Python objects:
- Pickle Protocol: The system uses the default pickle protocol for serialization
- Data Structure: Credentials are stored as lists of credential objects, allowing multiple credentials per user
- Binary Format: The serialized data is stored in binary format with MIME type `application/octet-stream`
- Base64 Encoding: For JSON serialization, binary data is converted to base64 strings using the `convert_bytes_for_json` function
The serialization format preserves all credential attributes including public keys, sign counts, and authenticator metadata.
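A recursive helper in the spirit of `convert_bytes_for_json` might look as follows; whether the real function uses standard or URL-safe base64 should be confirmed against the source:

```python
import base64
from typing import Any

def convert_bytes_for_json(value: Any) -> Any:
    """Recursively replace bytes with base64 strings so the structure can be
    passed to json.dumps. Illustrative sketch; the real helper's exact
    encoding and handling of other types may differ."""
    if isinstance(value, bytes):
        return base64.b64encode(value).decode("ascii")
    if isinstance(value, dict):
        return {k: convert_bytes_for_json(v) for k, v in value.items()}
    if isinstance(value, list):
        return [convert_bytes_for_json(v) for v in value]
    return value
```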
The system implements the following core operations for credential management:
- Saving Credentials: The `savekey` function stores credentials under the specified name and session ID
- Reading Credentials: The `readkey` function retrieves credentials, with fallback to legacy storage locations
- Deleting Credentials: The `delkey` function removes credentials, cleaning up both current and legacy locations
- Listing Credentials: The `list_credentials` function returns all credentials for a session
- Iterating Credentials: The `iter_credentials` function provides an iterator over all stored credentials
When a user completes the registration process, the system saves the credential data as follows:
```mermaid
sequenceDiagram
    participant Client as "Client Application"
    participant Server as "Authentication Server"
    participant Storage as "Storage System"
    Client->>Server : Registration Response
    Server->>Server : Parse attestation response
    Server->>Server : Extract credential data
    Server->>Server : Create credential entry
    Server->>Storage : savekey(username, credentials, session_id)
    Storage->>Storage : Serialize credentials with pickle
    alt GCS Enabled
        Storage->>Storage : Upload to GCS blob
        Storage-->>Server : Confirmation
    else Local Storage
        Storage->>Storage : Write to local file
        Storage-->>Server : Confirmation
    end
    Server-->>Client : Registration Success
    Note over Server,Storage : Credential saved with user handle,<br>public key, and registration metadata
```
Diagram sources
- storage.py
- routes/simple.py
During the authentication process, the system retrieves credentials for verification:
```mermaid
sequenceDiagram
    participant Client as "Client Application"
    participant Server as "Authentication Server"
    participant Storage as "Storage System"
    Client->>Server : Authentication Request
    Server->>Server : Extract username
    Server->>Storage : readkey(username, session_id)
    alt GCS Enabled
        Storage->>Storage : Download from GCS blob
        alt Blob Exists
            Storage-->>Storage : Deserialize credentials
            Storage-->>Server : Credentials
        else Blob Missing
            Storage->>Storage : Try legacy GCS location
            Storage-->>Server : Credentials or empty list
        end
    else Local Storage
        Storage->>Storage : Read from local file
        alt File Exists
            Storage-->>Storage : Deserialize credentials
            Storage-->>Server : Credentials
        else File Missing
            Storage->>Storage : Try legacy file location
            Storage-->>Server : Credentials or empty list
        end
    end
    Server->>Server : Create authentication options
    Server-->>Client : Authentication Options
    Note over Server,Storage : Credentials retrieved with public key<br>for signature verification
```
Diagram sources
- storage.py
- routes/advanced.py
Section sources
- storage.py
- routes/simple.py
- routes/advanced.py
The session metadata system in session_metadata_store.py manages temporary authentication state and user session information.
The session metadata store provides the following functionality:
- Session Creation: The `ensure_session` function creates a session directory or GCS prefix
- File Operations: Functions for reading, writing, and deleting session files
- Timestamp Management: The `touch_last_access` and `resolve_last_access` functions manage session activity timestamps
- Cleanup Operations: Automatic cleanup of inactive sessions after 14 days
- Existence Checks: The `file_exists` function checks whether a session file exists
The system uses a `.last-access` marker file to track the last activity time for each session, enabling efficient cleanup of inactive sessions.
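The marker-file scheme can be sketched in a few lines; the helper names mirror those listed above, but the on-disk format (a plain float here) is an assumption:

```python
import os
import time
from typing import Optional

MARKER = ".last-access"

def touch_last_access(session_dir: str) -> None:
    """Record the current time in the session's marker file."""
    os.makedirs(session_dir, exist_ok=True)
    with open(os.path.join(session_dir, MARKER), "w") as fh:
        fh.write(str(time.time()))

def resolve_last_access(session_dir: str) -> Optional[float]:
    """Read the marker back; None when the session has no recorded activity."""
    try:
        with open(os.path.join(session_dir, MARKER)) as fh:
            return float(fh.read().strip())
    except (OSError, ValueError):
        return None
```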
The session metadata system follows these access patterns:
- Session Isolation: Each session has a unique identifier that isolates its metadata from other sessions
- File-Based Storage: Metadata is stored as individual JSON files in session directories
- Atomic Operations: File writes use temporary files and atomic renames to prevent corruption
- Thread Safety: Operations are protected by locks to ensure thread safety
- Error Resilience: Comprehensive error handling ensures the system continues to operate even if individual operations fail
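The atomic-write pattern from the list above — write to a temporary file, then rename — looks roughly like this in Python (`os.replace` is atomic on both POSIX and Windows):

```python
import json
import os
import tempfile
from typing import Any, Dict

def atomic_write_json(path: str, payload: Dict[str, Any]) -> None:
    """Write JSON via a temporary file plus an atomic rename, so concurrent
    readers never observe a partially written file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(payload, fh)
        os.replace(tmp_path, path)  # atomic swap into the final location
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)  # clean up the temp file on failure
        raise
```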
The session metadata system is used in the following scenarios:
- Metadata Uploads: Users can upload custom metadata statements that are stored in their session
- Session Recovery: The system can recover session state from cookies or Flask session data
- Temporary Storage: Authentication state and intermediate results are stored temporarily during complex operations
- Audit Trails: Session activity is logged for security and debugging purposes
```mermaid
flowchart TD
    Start([Start]) --> CheckSession["Check for existing session"]
    CheckSession --> SessionExists{"Session exists?"}
    SessionExists --> |Yes| UseExisting["Use existing session"]
    SessionExists --> |No| CreateNew["Create new session"]
    CreateNew --> GenerateID["Generate secure session ID"]
    GenerateID --> StoreID["Store ID in Flask session and cookie"]
    StoreID --> InitializeStorage["Initialize storage directory/blob prefix"]
    InitializeStorage --> UseExisting
    UseExisting --> PerformOperations["Perform session operations"]
    PerformOperations --> UpdateTimestamp["Update last access timestamp"]
    UpdateTimestamp --> CheckCleanup["Check if cleanup needed"]
    CheckCleanup --> CleanupNeeded{"Cleanup needed?"}
    CleanupNeeded --> |Yes| Cleanup["Remove inactive sessions"]
    CleanupNeeded --> |No| Continue["Continue operations"]
    Cleanup --> Continue
    Continue --> End([End])
    style Start fill:#f9f,stroke:#333
    style End fill:#f9f,stroke:#333
```
Diagram sources
- session_metadata_store.py
- metadata.py
Section sources
- session_metadata_store.py
- metadata.py
The credential artifacts system in credential_artifacts.py provides specialized storage for advanced credential data.
Credential artifacts include additional information beyond standard WebAuthn credentials:
- Storage ID: Unique identifier for the artifact
- Payload: JSON-serializable data structure containing artifact details
- Timestamps: Creation and update timestamps
- Metadata: Additional context and provenance information
The system uses SHA-256 hashing of the storage ID to generate secure filenames, preventing path traversal attacks.
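Hashing the storage ID means arbitrary input, including path separators and `..` sequences, maps to a fixed-length hex filename. A sketch of the idea (the `.json` suffix is an assumption):

```python
import hashlib

def artifact_filename(storage_id: str) -> str:
    """Derive a filesystem-safe filename from an arbitrary storage ID.
    Because only the SHA-256 hex digest reaches the filesystem, IDs
    containing '/' or '..' cannot escape the artifact directory."""
    digest = hashlib.sha256(storage_id.encode("utf-8")).hexdigest()
    return f"{digest}.json"
```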
The credential artifacts system provides the following operations:
- Storing Artifacts: The `store_credential_artifact` function saves artifact data, optionally merging with existing data
- Loading Artifacts: The `load_credential_artifact` function retrieves artifact data
- Deleting Artifacts: The `delete_credential_artifact` function removes artifact data
- Merging Data: The `_merge_payload` function combines existing and new data when updating artifacts
The system uses thread-safe operations with a reentrant lock to prevent race conditions during concurrent access.
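The merge step can be sketched as a recursive dictionary merge; whether `_merge_payload` actually recurses into nested dictionaries or merges only the top level is an assumption here:

```python
from typing import Any, Dict

def merge_payload(base: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Combine existing artifact data with an update, preferring new values
    and recursing into nested dictionaries. Neither input is mutated."""
    merged = dict(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_payload(merged[key], value)
        else:
            merged[key] = value
    return merged
```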
Credential artifacts are used for:
- Advanced Authentication Data: Storing additional authentication context and metadata
- Device Capabilities: Recording device-specific capabilities and limitations
- Security Policies: Storing security policies and constraints for specific credentials
- Audit Information: Maintaining detailed audit trails for credential usage
```mermaid
classDiagram
    class CredentialArtifacts {
        -_ARTIFACT_DIR : str
        -_USER_FOLDER_PREFIX : str
        -_ARTIFACT_SUBDIR : str
        -_LOCK : threading.RLock
        +store_credential_artifact(storage_id : Any, payload : Dict[str, Any], merge : bool, session_id : Optional[str]) bool
        +load_credential_artifact(storage_id : Any, session_id : Optional[str]) Optional[Dict[str, Any]]
        +delete_credential_artifact(storage_id : Any, session_id : Optional[str]) bool
        -_normalise_storage_id(storage_id : Any) Optional[str]
        -_artifact_path(storage_id : str) str
        -_artifact_filename(storage_id : str) str
        -_user_root_prefix(session_id : str) str
        -_artifact_prefix(session_id : str) str
        -_artifact_blob(storage_id : str, session_id : str) str
        -_using_gcs() bool
        -_ensure_directory() None
        -_read_file(path : str) Optional[Dict[str, Any]]
        -_write_file(path : str, payload : Dict[str, Any]) None
        -_resolve_session_id(session_id : Optional[str]) str
        -_read_record(storage_id : str, session_id : str) Optional[Dict[str, Any]]
        -_write_record(storage_id : str, session_id : str, record : Dict[str, Any]) None
        -_delete_record(storage_id : str, session_id : str) bool
        -_merge_payload(base : Dict[str, Any], update : Dict[str, Any]) Dict[str, Any]
    }
    CredentialArtifacts --> storage : "uses"
    CredentialArtifacts --> cloud_storage : "uses"
    CredentialArtifacts --> config : "uses"
```
Diagram sources
- credential_artifacts.py
Section sources
- credential_artifacts.py
The device logging system in device_logs.py captures detailed information about WebAuthn device registrations.
Device logs include comprehensive information about registration events:
- Event Timestamp: Precise timestamp of the registration event
- Relying Party: RP ID for the registration
- User Information: User ID, name, and display name
- Credential Details: Credential ID, COSE public key, sign count, and transports
- Authenticator Data: AAGUID, device name from MDS, and attestation format
- Raw Data: Base64URL-encoded attestation object and client data JSON
The system uses dataclasses to ensure type safety and consistency in log entries.
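An illustrative dataclass shape for a log entry — the fields mirror the list above, but the actual names and full field set in device_logs.py may differ:

```python
import base64
import json
from dataclasses import asdict, dataclass, field
from typing import List

@dataclass
class RegistrationEvent:
    """Hypothetical log-entry shape modeled on the documented fields."""
    timestamp: str          # ISO 8601 event time
    rp_id: str              # relying party identifier
    user_name: str
    credential_id: bytes
    sign_count: int
    transports: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize to JSON, base64url-encoding the binary credential ID."""
        record = asdict(self)
        record["credential_id"] = base64.urlsafe_b64encode(
            self.credential_id).rstrip(b"=").decode("ascii")
        return json.dumps(record)
```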
The device logging system implements the following features:
- Asynchronous Logging: Log entries are uploaded in a separate thread to avoid blocking the main request
- GitHub Integration: Logs are stored in a GitHub repository for version control and access control
- Data Serialization: Log data is serialized to JSON with proper encoding of binary data
- Error Handling: Robust error handling ensures logging failures do not affect authentication operations
- Privacy Protection: Sensitive data is properly encoded and protected
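The fire-and-forget upload described above can be sketched with a daemon thread; the `upload` callable stands in for the GitHub API call, which is not reproduced here:

```python
import threading
from typing import Callable

def record_registration_event(payload: str,
                              upload: Callable[[str], None]) -> threading.Thread:
    """Start a background upload so logging never blocks the registration
    response. Failures are printed, not raised, since logging must never
    break authentication."""
    def _worker() -> None:
        try:
            upload(payload)
        except Exception as exc:
            print(f"device log upload failed: {exc}")
    thread = threading.Thread(target=_worker, daemon=True)
    thread.start()
    return thread
```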
The metadata system in metadata.py manages FIDO Metadata Service (MDS) data:
- Base Metadata: The system loads and caches base metadata from the FIDO Alliance
- Custom Metadata: Users can upload custom metadata statements that are merged with the base metadata
- Session Metadata: Metadata uploaded during a session is stored temporarily and isolated from other sessions
- Trust Verification: The system verifies the trust chain of metadata statements
- Cache Management: Metadata is cached locally to reduce network requests and improve performance
The metadata system supports both the official FIDO MDS and custom metadata uploads, providing flexibility for testing and development.
```mermaid
sequenceDiagram
    participant Client as "Client Application"
    participant Server as "Authentication Server"
    participant DeviceLogs as "Device Logs System"
    participant GitHub as "GitHub Repository"
    Client->>Server : Registration Request
    Server->>Server : Process registration
    Server->>DeviceLogs : record_registration_event(event)
    DeviceLogs->>DeviceLogs : Build log payload
    DeviceLogs->>DeviceLogs : Create log path
    DeviceLogs->>DeviceLogs : Start upload thread
    DeviceLogs-->>Server : Continue processing
    DeviceLogs->>GitHub : Upload log JSON
    alt Upload Success
        GitHub-->>DeviceLogs : Success response
        DeviceLogs->>Console : Log upload success
    else Upload Failure
        GitHub-->>DeviceLogs : Error response
        DeviceLogs->>Console : Log upload failure
    end
    Note over DeviceLogs,GitHub : Asynchronous logging ensures<br>registration completes regardless<br>of logging success
```
Diagram sources
- device_logs.py
- metadata.py
Section sources
- device_logs.py
- metadata.py
The storage system implements comprehensive data lifecycle management and security measures.
The system manages data lifecycle through the following mechanisms:
- Retention Policies: Session metadata is automatically cleaned up after 14 days of inactivity
- Backup Strategies: GCS provides built-in redundancy and backup capabilities
- Data Migration: The system supports migration between local and cloud storage
- Versioning: GCS object versioning can be enabled for additional protection
- Audit Trails: Device logs provide a complete record of registration events
The cleanup process runs periodically (every 6 hours) to remove inactive sessions, ensuring efficient use of storage resources.
The system implements multiple layers of security protection:
- Encryption at Rest: GCS provides server-side encryption for stored data
- Access Control: GCS IAM policies control access to storage buckets
- Authentication: Service account credentials with limited permissions
- Data Encoding: Binary data is properly encoded for JSON serialization
- Input Validation: Comprehensive validation of input parameters
- Path Sanitization: Proper sanitization of file and blob paths
The system also implements defense-in-depth principles with multiple layers of protection.
The storage system protects against unauthorized access through:
- Environment Variables: Sensitive configuration is stored in environment variables
- Secure Defaults: GCS access is disabled by default to prevent accidental exposure
- Thread Safety: Proper synchronization prevents race conditions
- Error Handling: Comprehensive error handling prevents information leakage
- Input Sanitization: All inputs are validated and sanitized
The system follows the principle of least privilege, granting only the minimum permissions necessary for operation.
Section sources
- storage.py
- cloud_storage.py
- session_metadata_store.py
The storage and persistence system in the post-quantum WebAuthn platform provides a robust, secure, and flexible foundation for managing authentication data. The dual storage backend implementation allows seamless operation in both development and production environments, while the comprehensive data models ensure all necessary information is captured and preserved. The system's focus on security, data integrity, and efficient lifecycle management makes it well-suited for handling sensitive authentication credentials and metadata. By combining local file storage with Google Cloud Storage integration, the system offers the best of both worlds: simplicity for development and scalability for production deployment.