06 Common General Best Practices - Observatorio-do-Trabalho-de-Pernambuco/documentation GitHub Wiki
This page contains general guidelines and overarching principles that apply across all areas of our Data Engineering projects.
- Modular Code: Write small, focused functions and classes.
- Meaningful Naming: Use descriptive names for variables, functions, and classes.
- Avoid Hard-coded Values: Utilize configuration files or environment variables.
- Coding Style: Follow the agreed-upon style (e.g., PEP8 for Python).
-
Linting/Formatting Tools: Ensure consistency across the team (e.g.,
black
,flake8
,isort
).
- Unit Tests: Write tests for critical functions and modules.
- Integration Tests: Ensure end-to-end workflows function correctly.
- Automation: Incorporate testing into CI pipelines.
- Layered Approach: Separate data ingestion, transformation, and output layers for clarity.
- Scalability: Keep future load/performance needs in mind when designing pipelines or databases.
- Diagrams: Use system or flow diagrams (e.g., UML) for complex architectures.
- ADR (Architecture Decision Records): Document key decisions, including rationale and outcomes.
- Peer Review: Encourage pair programming or at least one reviewer for merges.
- Constructive Feedback: Focus on improving code quality, not criticizing.
- Team Channels: Use Slack, Teams, or similar tools for quick updates.
- Stand-ups & Retros: Keep everyone aligned on progress, issues, and improvements.
- Periodic Review: Schedule times to revisit these best practices.
- Feedback Loop: Encourage team members to suggest improvements or new tools.