# Backend Logging Guide
This document provides comprehensive guidelines for developers on how to use the standardized logging system in the LCFS backend effectively. Following this guide ensures consistent logging practices, making debugging, monitoring, and analysis easier. It is based on the original "11. Backend Logging Guide" and adapted for the project's use of `structlog`.
## Table of Contents
- Introduction
- Logging Library: Structlog
- Logging Configuration
- Logging Modes
- Logging Practices
- Sample Logs
- Logging Levels
- Correlation ID Usage
- Best Practices and Guidelines
- Using Kibana for Log Analysis (If Applicable)
- FAQs
## 1. Introduction
Effective logging is crucial for diagnosing issues, monitoring application behavior, and gaining insight into system performance. This guide outlines our standardized approach to backend logging using `structlog`, helping you produce logs that are both developer-friendly and suitable for production analysis.

Our logging system, powered by `structlog`, automatically enriches logs with useful metadata such as source information and correlation IDs.
## 2. Logging Library: Structlog
The backend uses `structlog` (declared as a dependency in `backend/pyproject.toml`). Structlog is a powerful library for structured logging in Python, allowing flexible and rich log event creation.
## 3. Logging Configuration
- The primary logging configuration is typically found in `backend/lcfs/logging_config.py` (or a similar path).
- This file sets up `structlog` processors and formatters, and integrates with standard Python logging if needed (a conceptual sketch follows this list).
- Key features handled by the configuration include:
  - Adding timestamps.
  - Including log levels.
  - Capturing caller information (filename, line number, function name).
  - Automatic inclusion of correlation IDs.
  - Processors for development (console-friendly) and production (JSON) output.
  - Sensitive data masking.
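For orientation, here is a minimal sketch of what such a configuration might look like. The real processor chain lives in `backend/lcfs/logging_config.py`; the `is_development` flag below is a hypothetical stand-in for however the project actually detects its environment.

```python
import structlog

def configure_logging(is_development: bool) -> None:
    shared_processors = [
        structlog.contextvars.merge_contextvars,      # pulls in bound context (e.g. correlation_id)
        structlog.processors.add_log_level,           # adds "level" to every event dict
        structlog.processors.TimeStamper(fmt="iso"),  # adds an ISO-8601 "timestamp"
        structlog.processors.CallsiteParameterAdder(  # adds filename / func_name / lineno
            [
                structlog.processors.CallsiteParameter.FILENAME,
                structlog.processors.CallsiteParameter.FUNC_NAME,
                structlog.processors.CallsiteParameter.LINENO,
            ]
        ),
    ]
    renderer = (
        structlog.dev.ConsoleRenderer()               # pretty, colorized output for development
        if is_development
        else structlog.processors.JSONRenderer()      # machine-readable JSON for production
    )
    structlog.configure(processors=shared_processors + [renderer])
```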
## 4. Logging Modes
Our logging system operates in two distinct modes:
### Development Mode
- Purpose: To provide developers with readable and highlighted logs during local development.
- Features:
  - Human-readable, often colorized console output.
  - Syntax highlighting for better readability.
  - Indented and structured output for complex data types.
- Example Output Tool: `structlog.dev.ConsoleRenderer`.
### Non-Development Mode (JSON)
- Purpose: To produce logs suitable for production environments, optimized for log aggregation and analysis tools (e.g., Kibana, Splunk, OpenSearch).
- Features:
  - JSON-formatted logs for easy parsing by log management systems.
  - Compact representation of data.
- Example Output Tool: `structlog.processors.JSONRenderer`.
## 5. Logging Practices
### What to Log
Developers should log information that aids in understanding application behavior and diagnosing issues (a short illustrative sketch follows this list):
- Significant Business Events: e.g., "Compliance report submitted", "User account created".
- Key Data Points/Identifiers: e.g., `report_id`, `user_id`, `transaction_id`, counts, statuses.
- Errors and Exceptions: All caught exceptions or significant error conditions.
- Service Calls: Start and end of important service operations or external API calls, along with their success/failure status.
- Process Milestones: Key steps in long-running or complex processes.
- Configuration Values: Important configuration settings at startup (be mindful of sensitivity).
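A short sketch combining several of these categories (`sync_transactions`, `transaction_count`, and the event names here are hypothetical, not taken from the LCFS codebase):

```python
import structlog

logger = structlog.get_logger(__name__)

def sync_transactions(user_id: int) -> None:
    logger.info("transaction_sync_started", user_id=user_id)  # service call start
    try:
        count = 42  # stand-in for the real work
        # process milestone / success, with key identifiers
        logger.info("transaction_sync_completed", user_id=user_id, transaction_count=count)
    except Exception as e:
        # error condition, with the identifiers needed to diagnose it
        logger.error("transaction_sync_failed", user_id=user_id, error=str(e))
        raise
```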
### How to Log with Structlog
1. Get a Logger Instance: In your Python modules, obtain a `structlog` logger:

   ```python
   import structlog

   logger = structlog.get_logger(__name__)
   ```

2. Log Messages with Key-Value Pairs: Log messages by calling methods on the logger object (e.g., `info`, `error`, `warning`, `debug`), passing structured data as keyword arguments. The first argument is typically the main "event" or message.

   ```python
   logger.info("fuel_supply_created", fuel_supply_id=123, compliance_report_id=456)
   logger.error("payment_processing_failed", order_id=789, reason="Insufficient funds")
   ```

3. No Need to Manually Add Common Metadata: The `structlog` configuration automatically adds timestamps, log levels, source information (filename, line number, function name), and correlation IDs.
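When several log calls share the same context, `structlog`'s `bind()` can attach it once instead of repeating keyword arguments. A minimal sketch (the function and field names are illustrative, not from the LCFS codebase):

```python
import structlog

logger = structlog.get_logger(__name__)

def process_report(report_id: int, user_id: int) -> None:
    # bind() returns a new logger with these key-value pairs attached,
    # so they do not have to be repeated on every call.
    log = logger.bind(report_id=report_id, user_id=user_id)
    log.info("report_processing_started")    # includes both IDs
    # ... do the work ...
    log.info("report_processing_completed")  # includes both IDs too
```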
## 6. Sample Logs
(These are conceptual examples; the actual format depends on `logging_config.py`.)
### Development Mode Logs
(Example from original wiki, adapted for structlog style)
```
2024-11-03 18:50:23 [info     ] Getting fuel supply list for compliance report [your_module] compliance_report_id=123 correlation_id=177b381a-ca37-484d-a3b9-bbb16061775a filename=reports/views.py func_name=get_fuel_supply_list lineno=101
```

`structlog.dev.ConsoleRenderer` often provides more structured, multi-line output for key-value pairs.
### Non-Development Mode Logs (JSON)
(Example from original wiki, adapted for structlog style)
```json
{
  "event": "Getting fuel supply list for compliance report",
  "compliance_report_id": 123,
  "correlation_id": "816dbbdf-11fe-4df5-8dc8-754c07610742",
  "level": "info",
  "logger": "your_module.reports.views",
  "filename": "reports/views.py",
  "func_name": "get_fuel_supply_list",
  "lineno": 101,
  "timestamp": "2024-11-03T18:50:23.123456Z"
}
```
## 7. Logging Levels
Use appropriate logging levels (illustrated after this list):
- `logger.debug(...)`: Detailed information, typically for diagnosing problems. Often disabled in production.
- `logger.info(...)`: Confirmation that things are working as expected; significant lifecycle events.
- `logger.warning(...)`: Indication of an unexpected event or potential problem that doesn't prevent the current operation but might cause issues later.
- `logger.error(...)`: A more serious problem where the software was unable to perform some function.
- `logger.critical(...)`: A very serious error, indicating the program itself may be unable to continue running.
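As a quick illustration of the levels in `structlog`'s key-value style (all event names and fields below are hypothetical):

```python
logger.debug("cache_lookup", key="report:123", hit=False)
logger.info("compliance_report_submitted", report_id=123, user_id=456)
logger.warning("retrying_external_call", service="payment_gateway", attempt=2)
logger.error("report_generation_failed", report_id=123, reason="template missing")
logger.critical("database_unavailable", host="db-primary")
```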
## 8. Correlation ID Usage
- Purpose: A unique ID to trace a single request or operation across multiple log entries, services, or components.
- Automatic Inclusion: The logging infrastructure (via `structlog` processors) should automatically generate, propagate, and include correlation IDs in every log entry (a conceptual sketch follows this list).
- Future Integration: Aim for end-to-end tracing by passing the correlation ID from the frontend through to all backend services.
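Conceptually, this is often wired up with `structlog.contextvars`. The middleware below is a hypothetical sketch, not the project's actual code; the real wiring lives in the app's middleware and `backend/lcfs/logging_config.py`:

```python
import uuid

import structlog

async def correlation_id_middleware(request, call_next):
    # Reuse an incoming header if present, otherwise generate a new ID.
    correlation_id = request.headers.get("x-correlation-id", str(uuid.uuid4()))
    structlog.contextvars.clear_contextvars()
    structlog.contextvars.bind_contextvars(correlation_id=correlation_id)
    # Every log call made while handling this request now carries the ID,
    # provided merge_contextvars is in the processor chain.
    return await call_next(request)
```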
## 9. Best Practices and Guidelines
### Consistency
- Event Names: Use consistent and descriptive event names (the first argument to log methods, e.g., `logger.info("event_name", ...)`), as shown below.
- Key Names: Use consistent key names for common data points (e.g., `user_id`, `report_id`).
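A hypothetical before/after illustration: one canonical event name and key per concept, rather than ad-hoc variants.

```python
# Consistent -- same event name and keys everywhere, easy to search and aggregate:
logger.info("compliance_report_submitted", report_id=123, user_id=456)

# Inconsistent -- free-form message and mixed key styles make querying harder:
logger.info("Report was submitted!", reportID=123, user=456)
```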
### Security and Privacy (Sensitive Data)
- DO NOT Log Sensitive Data: Avoid logging PII (Personally Identifiable Information), passwords, access tokens, API keys, financial details, or any other confidential information directly.
- Automatic Data Masking: The `structlog` configuration should include a processor to automatically mask or censor known sensitive keys (e.g., `password`, `token`, `authorization`). An example processor function from the original wiki:

  ```python
  # In backend/lcfs/logging_config.py (conceptual)
  def censor_sensitive_data_processor(_, __, event_dict):
      sensitive_keys = {'password', 'token', 'secret_key', 'authorization', 'api_key'}
      for key in event_dict:
          if key in sensitive_keys:
              event_dict[key] = '***'  # Or a more robust redaction
      return event_dict
  ```

  Ensure such a processor is active in your `structlog` pipeline.
- Be Mindful: Even with masking, exercise caution. It's best not to log sensitive data at all if avoidable.
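Ordering matters: the censor processor must run before the final renderer, so that masked values, not raw ones, are what get serialized. A conceptual registration (a sketch, not the project's actual chain):

```python
structlog.configure(
    processors=[
        censor_sensitive_data_processor,      # mask sensitive keys first
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),  # render last
    ]
)
```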
### Performance Considerations
- Avoid Excessive Logging: Log necessary information, but avoid overly verbose logging in high-throughput code paths or tight loops, as it can impact performance.
- Deferred Evaluation: `structlog` can be configured to defer string formatting or lambda-based value computation until it is certain a log message will actually be emitted (e.g., based on log level), which can save resources (see the sketch below).
- Asynchronous Logging: For very high-performance scenarios, consider asynchronous logging handlers (though this adds complexity).
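One common way to get cheap level-based filtering, shown here as a sketch rather than the project's actual setup, is `structlog.make_filtering_bound_logger`, which turns calls below the minimum level into near no-ops:

```python
import logging

import structlog

structlog.configure(
    # Calls below INFO return almost immediately; the processor
    # chain never runs for suppressed calls.
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
)

logger = structlog.get_logger(__name__)
logger.debug("expensive_debug_event", payload="...")  # dropped cheaply
```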
## 10. Using Kibana for Log Analysis (If Applicable)
If logs are shipped to an Elasticsearch, Logstash, Kibana (ELK) stack or similar (like OpenSearch):
- Accessing Kibana: Via the OpenShift platform or a direct URL.
- Index Pattern: Typically `app-*` or similar, depending on your log shipping configuration.
- Timestamp Field: Usually `@timestamp`.
- Searching/Filtering: Utilize Kibana Query Language (KQL) or Lucene syntax to search and filter logs.
  - Filter by `correlation_id` to trace a request.
  - Filter by `level` (e.g., `level:error`).
  - Filter by `kubernetes.namespace_name`, `kubernetes.pod_name`, `kubernetes.container_name` for specific services in OpenShift.
- Example Kibana Query (from original wiki):

  ```
  kubernetes.namespace_name:"YOUR_PROJECT_NAME-tools" AND kubernetes.container_name.raw:"lcfs-backend" AND level:"error"
  ```
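For instance, to trace a single request end to end, a query along these lines (using the sample correlation ID from section 6) could be used:

```
correlation_id:"816dbbdf-11fe-4df5-8dc8-754c07610742"
```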
## 11. FAQs
- Q1: Do I need to include source info (filename, line number) in logs?
  A: No, `structlog` is configured to add this automatically.
- Q2: How do I include additional context?
  A: Pass key-value pairs as keyword arguments to the logger methods: `logger.info("event", key1=value1, key2=value2)`.
- Q3: Do I need to manage the correlation ID?
  A: No, this is handled by the logging infrastructure.
- Q4: How do I log exceptions?
  A: Use `logger.exception("event_description")` within an `except` block to automatically include exception info and a stack trace, or log manually with `logger.error("event", error=str(e), exc_info=True)`.

  ```python
  try:
      ...  # some operation
  except ValueError as e:
      logger.error("value_error_occurred", input_data=some_data, error_message=str(e))
      # For the full stack trace (especially in dev, or if needed at error level):
      # logger.exception("value_error_details")
  ```
Adherence to these logging guidelines will greatly improve the observability and maintainability of the LCFS backend. Consult `backend/lcfs/logging_config.py` for the specific `structlog` processor chain details.