Storage Backend Issues - FeitianTech/postquantum-webauthn-platform GitHub Wiki
Table of Contents
- Introduction
- Storage Architecture Overview
- Local File Storage Issues
- Google Cloud Storage Integration Problems
- Configuration Issues in config.py
- Diagnosing Credential Persistence Failures
- Session Metadata Storage Errors
- Cloud Authentication Problems
- Validating the Storage Abstraction Layer
- Debugging cloud_storage.py Integration
- Interpreting Google Cloud Client Library Error Messages
- Using Test Suites to Isolate Storage Problems
- Troubleshooting Common Issues
Introduction
The Post-Quantum WebAuthn Platform utilizes a dual storage backend system that supports both local file storage and Google Cloud Storage (GCS) for credential persistence and session metadata management. This document provides comprehensive guidance on diagnosing and resolving storage backend issues that may arise in either storage mode. The platform's storage architecture is designed with backward compatibility, supporting legacy credential formats while implementing a session-based credential organization system. Understanding the interaction between the storage.py abstraction layer, cloud_storage.py integration module, and configuration settings in config.py is essential for maintaining reliable credential persistence and session management.
Storage Architecture Overview
The storage architecture in the Post-Quantum WebAuthn Platform consists of a hierarchical system with multiple layers of abstraction. At the core is the storage.py module, which provides a unified interface for credential operations while abstracting the underlying storage mechanism. This module delegates to either local file system operations or Google Cloud Storage through the cloud_storage.py module, depending on configuration. The system supports both backward compatibility with legacy credential formats and a modern session-based organization structure.
```mermaid
graph TD
A[Application Layer] --> B[storage.py Abstraction Layer]
B --> C{Storage Mode}
C --> |GCS Enabled| D[cloud_storage.py]
C --> |Local Storage| E[Local File System]
D --> F[Google Cloud Storage]
E --> G[Local Directory Structure]
H[config.py] --> C
H --> I[Base Path Configuration]
B --> J[Legacy Format Support]
D --> K[Retry Mechanisms]
K --> L[Transient Error Handling]
```
Local File Storage Issues
Local file storage issues typically manifest as credential persistence failures, permission errors, or directory structure problems. The system stores credentials in a directory structure organized by session IDs, with a base path defined in config.py. The _LOCAL_CREDENTIAL_BASE variable points to the "session-credentials" directory relative to the application base path. When storing credentials locally, the system creates session-specific subdirectories and stores credential data as pickle files named "{username}_credential_data.pkl".
Common issues include missing storage directories, insufficient write permissions, and disk space limitations. The _local_directory function in storage.py handles directory creation with os.makedirs(directory, exist_ok=True), but this requires appropriate permissions on the parent directory. If the application lacks write permissions to the base directory or if disk space is exhausted, credential storage operations will fail silently, returning empty results on subsequent read operations.
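The failure points above can be probed directly before involving the application. The following preflight sketch is a hypothetical helper (not part of storage.py): it mirrors the same os.makedirs call and then checks writability and free space.

```python
import os
import shutil
import tempfile

def check_local_storage(base_dir: str, min_free_bytes: int = 1024 * 1024) -> list:
    """Return a list of human-readable problems; empty means healthy."""
    problems = []
    try:
        # Same call storage.py's _local_directory uses; fails if a parent denies access.
        os.makedirs(base_dir, exist_ok=True)
    except PermissionError as exc:
        problems.append(f"cannot create directory: {exc}")
        return problems
    if not os.access(base_dir, os.W_OK):
        problems.append("directory exists but is not writable")
    free = shutil.disk_usage(base_dir).free
    if free < min_free_bytes:
        problems.append(f"only {free} bytes free on the volume")
    return problems

# Example: a freshly created temp directory should pass all checks.
probe = os.path.join(tempfile.mkdtemp(), "session-credentials")
print(check_local_storage(probe))  # [] when everything is healthy
```

Running this against the real base path (rather than a temp directory) distinguishes permission problems from disk-space problems without touching any credential files.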
```mermaid
flowchart TD
Start([Credential Storage Attempt]) --> CheckMode["Check Storage Mode"]
CheckMode --> |Local Storage| CheckDirectory["Verify Directory Exists"]
CheckDirectory --> |Directory Missing| CreateDirectory["Create Directory Structure"]
CreateDirectory --> |Permission Error| Fail["Storage Failure"]
CheckDirectory --> |Directory Exists| CheckPermissions["Verify Write Permissions"]
CheckPermissions --> |Insufficient Permissions| Fail
CheckPermissions --> |Sufficient Permissions| CheckDiskSpace["Check Available Disk Space"]
CheckDiskSpace --> |Insufficient Space| Fail
CheckDiskSpace --> |Sufficient Space| WriteFile["Write Credential File"]
WriteFile --> |Write Failure| Fail
WriteFile --> |Success| Success["Credential Stored"]
```
Google Cloud Storage Integration Problems
Google Cloud Storage integration issues primarily stem from authentication problems, bucket configuration errors, or network connectivity issues. The cloud_storage.py module manages GCS interactions through a singleton pattern with thread-safe client initialization. The _ensure_bucket function validates that GCS is enabled and that the FIDO_SERVER_GCS_BUCKET environment variable is configured. When either condition is not met, the system raises a RuntimeError with a descriptive message.
Authentication to GCS can be configured through multiple methods: service account key files (FIDO_SERVER_GCS_CREDENTIALS_FILE), inline JSON credentials (FIDO_SERVER_GCS_CREDENTIALS_JSON), or default application credentials. Misconfiguration of these authentication methods is a common source of integration problems. The _build_client function attempts to create credentials from these sources in a specific priority order, and failures at any step can prevent successful GCS initialization.
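The priority chain can be sketched in isolation. This stand-alone classifier only reports which source would be used rather than building a real google.cloud.storage client, and the exact precedence (key file before inline JSON before application defaults) is an assumption based on the order listed above; _build_client is the authority.

```python
import json

def resolve_credential_source(env: dict) -> str:
    """Hypothetical classifier mirroring _build_client's source selection."""
    path = env.get("FIDO_SERVER_GCS_CREDENTIALS_FILE", "").strip()
    if path:
        return f"service-account file: {path}"
    raw = env.get("FIDO_SERVER_GCS_CREDENTIALS_JSON", "").strip()
    if raw:
        # Malformed inline JSON fails here, just as it would at client build time.
        info = json.loads(raw)
        return f"inline JSON for {info.get('client_email', '<unknown>')}"
    return "application default credentials"

print(resolve_credential_source({}))  # application default credentials
```

Feeding the process's actual environment to a classifier like this quickly reveals whether a stale FIDO_SERVER_GCS_CREDENTIALS_FILE value is shadowing the credentials you intended to use.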
```mermaid
flowchart TD
Start([GCS Operation]) --> CheckEnabled["Check GCS Enabled"]
CheckEnabled --> |GCS Disabled| Fail["Operation Not Available"]
CheckEnabled --> |GCS Enabled| CheckBucket["Verify Bucket Configured"]
CheckBucket --> |Bucket Not Set| Fail
CheckBucket --> |Bucket Set| BuildClient["Build GCS Client"]
BuildClient --> |Credentials File| CreateFromFile["Create Credentials From File"]
BuildClient --> |Credentials JSON| CreateFromJSON["Create Credentials From JSON"]
BuildClient --> |Project Override| UseProjectOverride["Use Project Override"]
BuildClient --> |Default| UseDefaultCredentials["Use Default Credentials"]
CreateFromFile --> |Failure| Fail
CreateFromJSON --> |Failure| Fail
UseProjectOverride --> |Failure| Fail
UseDefaultCredentials --> |Failure| Fail
BuildClient --> |Success| InitializeBucket["Initialize Bucket Connection"]
InitializeBucket --> |Success| Success["GCS Ready"]
```
Configuration Issues in config.py
Configuration issues in config.py primarily affect storage path definitions and GCS bucket settings. The basepath variable, defined as the absolute path of the server module directory, serves as the root for all relative storage paths. This path is used to construct the _LOCAL_CREDENTIAL_BASE in storage.py and various metadata directories. Incorrect configuration of environment variables that influence storage behavior can lead to credential persistence failures and session management issues.
Key configuration variables include FIDO_SERVER_GCS_ENABLED (boolean flag to enable GCS), FIDO_SERVER_GCS_BUCKET (GCS bucket name), FIDO_SERVER_GCS_CREDENTIALS_FILE (path to service account key file), and FIDO_SERVER_GCS_CREDENTIALS_JSON (inline service account credentials). The _env_flag function in both config.py and cloud_storage.py handles boolean environment variable parsing, accepting various representations of true/false values. Misconfiguration of these variables, such as providing an empty bucket name or invalid credentials path, will prevent successful storage operations.
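The boolean parsing can be illustrated with a minimal sketch. The accepted "true" spellings below are an assumption for illustration; check _env_flag itself for the exact list it accepts.

```python
import os

_TRUE_VALUES = {"1", "true", "yes", "on"}  # assumed spellings; the real set lives in _env_flag

def env_flag(name: str, default: bool = False) -> bool:
    """Minimal sketch of boolean environment-variable parsing."""
    raw = os.environ.get(name)
    if raw is None:
        # Unset variables fall back to the caller's default.
        return default
    return raw.strip().lower() in _TRUE_VALUES

os.environ["FIDO_SERVER_GCS_ENABLED"] = "Yes"
print(env_flag("FIDO_SERVER_GCS_ENABLED"))  # True
```

Note that with this shape any unrecognized spelling (for example "enabled") silently parses as false, which is worth checking first when GCS unexpectedly stays disabled.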
Diagnosing Credential Persistence Failures
Credential persistence failures can occur in both local and cloud storage modes and require systematic diagnosis. The storage.py module provides several functions for credential management: savekey, readkey, delkey, iter_credentials, and list_credentials. When diagnosing persistence issues, first determine whether the system is configured for local or GCS storage by checking the _using_gcs function, which evaluates both gcs_enabled() and the presence of FIDO_SERVER_GCS_BUCKET.
For local storage issues, verify the existence and permissions of the session-credentials directory and its subdirectories. Check that the application has write permissions to the base directory and sufficient disk space. For GCS issues, use the ensure_ready function to validate bucket accessibility. This function attempts to list objects in the bucket with retry logic for transient failures. Monitoring the application logs for warnings from the _list_credential_blob_names function can also reveal issues with listing credential blobs in GCS.
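A useful first step is reproducing the mode decision outside the application. This sketch mirrors the _using_gcs logic described above (feature flag AND bucket name); the true-value spellings are assumptions, as in the previous example.

```python
import os

def using_gcs() -> bool:
    """GCS is active only when the flag parses true AND a bucket is named."""
    enabled = os.environ.get("FIDO_SERVER_GCS_ENABLED", "").strip().lower() in {"1", "true", "yes", "on"}
    bucket = os.environ.get("FIDO_SERVER_GCS_BUCKET", "").strip()
    return enabled and bool(bucket)

os.environ["FIDO_SERVER_GCS_ENABLED"] = "true"
os.environ.pop("FIDO_SERVER_GCS_BUCKET", None)
print(using_gcs())  # False: the flag is set but no bucket is configured
```

The both-conditions requirement explains a common surprise: setting FIDO_SERVER_GCS_ENABLED alone still leaves the platform writing to the local session-credentials directory.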
Session Metadata Storage Errors
Session metadata storage errors are typically related to directory structure issues or permission problems. The SESSION_METADATA_DIR is configured in config.py as "static/session-metadata" relative to the application base path. This directory stores session-specific metadata that supports the credential management system. Errors in this subsystem can prevent proper session recovery and credential association.
The system uses a fallback mechanism for credential location, checking both session-specific locations and legacy locations in the base directory. This is implemented in the _candidate_gcs_blob_names and _list_credential_blob_names functions, which generate potential blob names for both current and legacy storage formats. When diagnosing session metadata issues, check whether the system is correctly resolving session IDs through the _resolve_session_id function, which prioritizes explicit session IDs but falls back to metadata-based session identification.
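The fallback order can be sketched as follows. The directory layout here is an assumption for illustration (the authoritative names come from _candidate_gcs_blob_names), but the pattern is the one described above: session-scoped location first, legacy flat location second.

```python
def candidate_blob_names(username, session_id=None):
    """Generate storage locations to try, newest layout first."""
    # Filename pattern from storage.py; the directory layout is illustrative.
    filename = f"{username}_credential_data.pkl"
    names = []
    if session_id:
        names.append(f"session-credentials/{session_id}/{filename}")
    names.append(filename)  # legacy location in the base directory
    return names

print(candidate_blob_names("alice", "abc123"))
# ['session-credentials/abc123/alice_credential_data.pkl', 'alice_credential_data.pkl']
```

Because reads walk this list in order, a credential that appears "lost" after a session ID change is often still present at the legacy name; listing both candidate locations is the fastest way to confirm.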
Cloud Authentication Problems
Cloud authentication problems in the Post-Quantum WebAuthn Platform typically arise from misconfigured service account credentials or insufficient IAM permissions. The system supports three methods for providing GCS authentication: service account key files, inline JSON credentials, and default application credentials. The most common issue is providing an incorrect path to the service account key file or malformed JSON credentials.
When using service account key files, ensure the file is accessible to the application process and contains valid JSON with the required fields (project_id, private_key, client_email, etc.). For inline JSON credentials, verify that the JSON is well formed and correctly escaped in the environment variable. The _build_client function in cloud_storage.py handles credential creation and will raise exceptions if the credentials cannot be parsed or if required fields are missing.
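A quick pre-check catches both failure modes (malformed JSON and missing fields) before the client library produces a harder-to-read traceback. The field list below matches the fields named above; a real key file contains more.

```python
import json

REQUIRED_FIELDS = ("project_id", "private_key", "client_email")

def validate_key_json(raw: str) -> list:
    """Return a list of human-readable problems; empty means the key looks plausible."""
    try:
        info = json.loads(raw)
    except ValueError as exc:
        return [f"not valid JSON: {exc}"]
    return [f"missing field: {field}" for field in REQUIRED_FIELDS if not info.get(field)]

print(validate_key_json('{"project_id": "demo"}'))
# ['missing field: private_key', 'missing field: client_email']
```

Run it against the file contents for FIDO_SERVER_GCS_CREDENTIALS_FILE or directly against the FIDO_SERVER_GCS_CREDENTIALS_JSON value; escaping mistakes in the latter show up immediately as the "not valid JSON" case.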
Validating the Storage Abstraction Layer
Validating the storage.py abstraction layer requires testing both local and GCS code paths. The module is designed to provide a consistent interface regardless of the underlying storage mechanism. Key functions to validate include savekey, readkey, and iter_credentials, which handle the complete credential lifecycle. The abstraction layer includes backward compatibility features that allow reading from legacy storage locations while writing to the current session-based structure.
To validate the abstraction layer, verify that credential operations succeed when switching between local and GCS storage modes. Test the fallback behavior by creating credentials in one mode and reading them in another. The _candidate_gcs_blob_names function implements the fallback logic by generating blob names for both the current session-specific location and the legacy location, allowing read operations to find credentials regardless of where they were stored.
Debugging cloud_storage.py Integration
Debugging cloud_storage.py integration issues requires understanding the retry mechanisms and error handling patterns implemented in the module. The _with_retry decorator provides exponential backoff retry logic for transient failures, with a default of three attempts and a base delay of 0.5 seconds. This mechanism handles common transient errors such as network timeouts and temporary service unavailability.
When debugging integration issues, examine the _RETRYABLE_EXCEPTIONS tuple, which includes GoogleAPICallError, RetryError, RefreshError, and OSError. These exceptions trigger the retry mechanism, while NotFound errors are propagated immediately since they represent permanent conditions. The list_blob_names function demonstrates the retry pattern, wrapping the GCS list_blobs operation in _with_retry and yielding results from the retried operation.
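The retry shape can be reproduced in isolation. This sketch uses only OSError as the retryable class (the real _RETRYABLE_EXCEPTIONS tuple also covers the Google API and auth exceptions), keeps the documented defaults of three attempts and a 0.5 s base delay, and exposes base_delay so the demo does not actually sleep for 1.5 seconds.

```python
import time

RETRYABLE = (OSError,)  # stand-in for the fuller _RETRYABLE_EXCEPTIONS tuple

def with_retry(func, attempts=3, base_delay=0.5):
    """Exponential backoff: delays of base_delay, then 2x, between attempts."""
    def wrapper(*args, **kwargs):
        for attempt in range(attempts):
            try:
                return func(*args, **kwargs)
            except RETRYABLE:
                if attempt == attempts - 1:
                    raise  # out of attempts: propagate the transient error
                time.sleep(base_delay * (2 ** attempt))
    return wrapper

calls = []

def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise OSError("transient network blip")
    return "ok"

flaky = with_retry(flaky, base_delay=0.01)  # tiny delay for the demo
print(flaky(), len(calls))  # ok 3
```

Note what this shape implies for debugging: two transient failures are invisible except as latency, so a sudden slowdown in GCS operations can itself be a symptom of retries absorbing intermittent errors.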
Interpreting Google Cloud Client Library Error Messages
Interpreting Google Cloud client library error messages requires understanding the exception hierarchy and common error patterns. The cloud_storage.py module handles several types of exceptions from the Google Cloud client libraries: gcs_exceptions.NotFound (when a blob does not exist), gcs_exceptions.GoogleAPICallError (general API call failures), gcs_exceptions.RetryError (failures during retry attempts), and auth_exceptions.RefreshError (authentication token refresh failures).
The module implements specific error handling patterns: NotFound exceptions are converted to None return values for download operations, allowing graceful handling of missing blobs. Other exceptions may trigger retry logic or be propagated to the calling code. When diagnosing issues, check the application logs for warning messages from the _list_credential_blob_names function, which logs exceptions encountered during blob listing operations without failing the entire operation.
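The NotFound-to-None convention is easy to mimic. The NotFound class below is a local stand-in for the google.api_core exception so the snippet runs without the client library installed; `fetch` stands in for a blob download call.

```python
class NotFound(Exception):
    """Stand-in for google.api_core.exceptions.NotFound."""

def download_or_none(fetch):
    """A missing blob is a normal outcome, so NotFound becomes None."""
    try:
        return fetch()
    except NotFound:
        return None

def missing_blob():
    raise NotFound("404 blob does not exist")

print(download_or_none(missing_blob))        # None
print(download_or_none(lambda: b"payload"))  # b'payload'
```

This is why a wrong bucket name often surfaces not as an exception but as every credential silently reading back as absent: the per-blob lookups all resolve to None.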
Using Test Suites to Isolate Storage Problems
The test_storage.py and test_cloud_storage.py suites provide comprehensive test coverage for storage functionality and can be used to isolate and diagnose problems. These test files use mocking and monkeypatching to simulate various storage conditions without requiring actual GCS access or file system modifications.
The test_storage.py suite validates the credential storage helpers, including fallback behavior from new to legacy storage locations, handling of missing credentials, and proper cleanup during deletion. Tests like test_readkey_falls_back_to_legacy_gcs and test_iter_credentials_includes_legacy_entries verify the backward compatibility features. The test_cloud_storage.py suite focuses on GCS integration, testing retry behavior, error handling, and bucket operations.
To use these tests for problem isolation, run them with different configuration scenarios by modifying environment variables. For example, disable GCS by unsetting FIDO_SERVER_GCS_BUCKET and verify that local storage operations work correctly. Then enable GCS with a mock bucket name to test the GCS code paths. The tests use dummy implementations of the Google Cloud client classes, allowing comprehensive testing without actual cloud resources.
```mermaid
flowchart TD
Start([Run Storage Tests]) --> SetupEnvironment["Setup Test Environment"]
SetupEnvironment --> MockModules["Mock External Modules"]
MockModules --> SetEnvironment["Set Environment Variables"]
SetEnvironment --> RunLocalTests["Run Local Storage Tests"]
RunLocalTests --> TestSaveKey["Test savekey Function"]
TestSaveKey --> TestReadKey["Test readkey Function"]
TestReadKey --> TestDeleteKey["Test delkey Function"]
TestDeleteKey --> RunGCSTests["Run GCS Integration Tests"]
RunGCSTests --> TestUpload["Test upload_bytes"]
TestUpload --> TestDownload["Test download_bytes"]
TestDownload --> TestList["Test list_blob_names"]
TestList --> AnalyzeResults["Analyze Test Results"]
AnalyzeResults --> ReportIssues["Report Any Failures"]
```
Troubleshooting Common Issues
Missing Storage Directories
When storage directories are missing, the system may fail to persist credentials. To resolve:
- Verify the basepath is correctly set in config.py
- Check that the session-credentials directory exists in the base directory
- Ensure the application has write permissions to create directories
- Manually create the directory structure if needed: mkdir -p {basepath}/session-credentials
Permission Errors
Permission errors typically occur when the application lacks write access to storage directories:
- Check directory ownership and permissions using ls -la
- Ensure the application process user has write permissions
- Verify that parent directories have execute permissions for directory traversal
- Consider running the application with appropriate user privileges
Misconfigured Service Account Credentials for GCP
When GCP service account credentials are misconfigured:
- Verify the service account key file exists and is accessible
- Check that the JSON credentials are properly formatted
- Ensure the service account has the required IAM roles (Storage Object Admin)
- Validate that the project ID in the credentials matches the intended GCP project
Credential Persistence Failures
For credential persistence failures:
- Check the storage mode (local vs GCS) and corresponding configuration
- Verify the storage location exists and is accessible
- Test basic read/write operations outside the application
- Examine application logs for storage-related warnings or errors
Session Metadata Storage Errors
To resolve session metadata storage errors:
- Verify the SESSION_METADATA_DIR exists and is writable
- Check that session IDs are being properly resolved
- Ensure the metadata recovery feature is properly configured if needed
- Validate that the fallback mechanism works between legacy and current formats