Performance Considerations - bcgov/lcfs GitHub Wiki

Performance Considerations

This document outlines performance considerations for the LCFS system and strategies employed or to be considered for optimization.

1. Backend Performance

Framework: FastAPI is a high-performance Python web framework, built on Starlette and Pydantic, designed for speed and efficiency.
Asynchronous Operations: The backend utilizes asyncio with SQLAlchemy (via asyncpg) and aio-pika (for RabbitMQ). This allows for non-blocking I/O operations, significantly improving throughput for I/O-bound tasks by handling many concurrent requests efficiently.
Caching: Redis is used as a caching layer (via fastapi-cache2) to store frequently accessed data and reduce database load. See Caching Strategy for more details.
Load Testing: The backend/performance/ directory contains a locustfile.py, indicating that Locust is used for load testing the backend APIs. This helps identify bottlenecks and validate performance under concurrent user load.
- Further Investigation: Review the locustfile.py and associated readme.md in backend/performance/ to understand the specific scenarios tested and performance benchmarks.
Database Optimization:
- Query Efficiency: Ensure SQLAlchemy queries are optimized. Use EXPLAIN ANALYZE for complex queries to understand their execution plans.
- Indexing: Proper database indexing on frequently queried columns is crucial. This should be reviewed based on query patterns.
- Connection Pooling: SQLAlchemy uses connection pooling by default, which is efficient.
Efficient Data Structures: Using Pydantic for data validation and serialization is generally efficient.

2. Frontend Performance

Build Tool: Vite provides fast cold starts and Hot Module Replacement (HMR) during development, and optimized builds for production (code splitting, tree shaking, asset optimization).
Component Rendering (React):
- Memoization: Use React.memo for functional components and shouldComponentUpdate or PureComponent for class components to prevent unnecessary re-renders.
- Virtualization: For long lists or large tables, consider using windowing libraries (e.g., react-window or react-virtualized) if AG Grid's built-in virtualization isn't sufficient or applicable elsewhere.
- Lazy Loading: Use React.lazy and Suspense to code-split components and load them on demand, improving initial page load time.
State Management: Efficient use of Zustand and React Query. React Query helps avoid redundant data fetching and manages server state effectively.
Bundle Size: Regularly analyze the production bundle size (e.g., using vite-plugin-inspect or rollup-plugin-visualizer) to identify and optimize large dependencies.
Browser Caching: Ensure appropriate HTTP caching headers are set for static assets.
Image Optimization: Serve images in optimized formats (e.g., WebP) and sizes.
Debouncing/Throttling: For user inputs that trigger expensive operations (e.g., API calls in typeaheads), use debouncing or throttling.

3. ETL Performance (Apache NiFi)

Flow Design: Efficient NiFi flow design is critical. Avoid unnecessary processing, use appropriate processors, and optimize processor configurations.
Back Pressure: NiFi has built-in back-pressure mechanisms to prevent overwhelming downstream components or systems. Configure these appropriately.
Concurrent Tasks: Configure the number of concurrent tasks for processors based on available resources and the nature of the task.
Resource Allocation: Ensure NiFi, Zookeeper, and the source/target databases have adequate CPU, memory, and disk I/O resources.
Batching: Process data in batches where appropriate to reduce overhead.

4. Infrastructure & Network

OpenShift Resource Requests/Limits: Properly configure CPU and memory requests and limits for pods in OpenShift to ensure stable performance and efficient resource utilization.
Network Latency: Consider network latency between services, especially between the application and external services like Keycloak or remote databases.
CDN: For frontend static assets, using a Content Delivery Network (CDN) can significantly improve load times for geographically distributed users (though this depends on the OpenShift setup and if it's fronted by a CDN).

General Considerations

Monitoring: Implement comprehensive monitoring (e.g., using Prometheus, Grafana, or OpenShift's built-in monitoring) to track key performance indicators (KPIs) like response times, error rates, resource utilization, and queue lengths. The presence of prometheus-fastapi-instrumentator in the backend suggests Prometheus metrics are available.
Scalability: Design components to be scalable, particularly the backend and database, to handle increasing load. OpenShift provides mechanisms for horizontal pod autoscaling.

Performance is an ongoing concern. Regular testing, monitoring, and profiling are essential to identify and address bottlenecks as the system evolves and load changes.