Understanding .pre-commit-config.yaml file - inzamamshajahan/github-actions-learn4 GitHub Wiki

This file is crucial for maintaining code quality and consistency before code changes are even committed to the repository.

Documentation for: `my_python_project/.pre-commit-config.yaml`

Overall Purpose:

This file, .pre-commit-config.yaml, configures the pre-commit framework. pre-commit is a tool that manages and executes Git hooks. Hooks are scripts that run automatically at certain points in the Git lifecycle, such as before a commit (pre-commit) or before a push (pre-push).

The primary goal of using pre-commit with this configuration is to automate code quality checks locally on the developer's machine before code is committed. This helps to:

- Enforce Code Style and Formatting: Ensure all committed code adheres to consistent style guidelines (e.g., whitespace, line endings, import order, quote style).
- Catch Simple Errors Early: Identify syntax errors, unused variables, or potential bugs before they reach the main repository or CI/CD pipeline.
- Maintain Readability: Keep the codebase clean and easier for everyone to read and understand.
- Reduce CI Load: By catching issues locally, it can reduce the number of failed CI builds, saving time and resources.
- Improve Code Quality Incrementally: As developers commit, the code is automatically checked and often fixed, leading to a progressively cleaner codebase.

The configuration is written in YAML, a human-readable data serialization language.

Structure of the File:

The file consists of a top-level repos key, which is a list of repositories. Each item in this list specifies a repository containing pre-commit hooks and defines which hooks from that repository to use.
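
As a minimal sketch of that layout, the skeleton below shows where the top-level `repos` key sits relative to a repository entry. The excerpts in the following sections omit this key for brevity; the repository and hook shown here are taken from this project's configuration.

```yaml
# Top-level key: a list with one entry per hook repository.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks  # where the hooks live
    rev: v4.5.0                                           # tag (or commit) to pin to
    hooks:                                                # hooks to enable from that repo
      - id: trailing-whitespace
```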

1. Repository: https://github.com/pre-commit/pre-commit-hooks

-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
    -   id: trailing-whitespace
    -   id: end-of-file-fixer
    -   id: check-yaml
    -   id: check-added-large-files

`repo: https://github.com/pre-commit/pre-commit-hooks`:
- What: Specifies the URL of the repository that provides the hooks. This is the official collection of general-purpose hooks maintained by the pre-commit project.
- Why: These are foundational, widely useful hooks for many types of projects.
`rev: v4.5.0`:
- What: Pins the version of the pre-commit-hooks repository to tag v4.5.0.
- Why: Using a specific tag (a released version) ensures that the behavior of these hooks is consistent and doesn't change unexpectedly if the upstream repository is updated. This is crucial for reproducible builds and local checks.
- Alternatives: Using a specific commit SHA (even more pinned, but less readable) or a branch name like main (not recommended: a branch is a mutable reference, so different machines can end up running different hook versions). Tagged releases are the best practice.
`hooks`: A list of hook definitions from this repository.
- `- id: trailing-whitespace`:
  - Purpose: Automatically removes any trailing whitespace from lines in modified files.
  - Why: Trailing whitespace can be invisible but can create noise in diffs and is generally considered untidy.
  - Args/Alternatives: This hook has an option to preserve Markdown hard line breaks (two trailing spaces) in files with a given extension (`args: [--markdown-linebreak-ext=md]`). For this project, the default (fixing all files) is used.
- `- id: end-of-file-fixer`:
  - Purpose: Ensures that every non-empty file ends with a single newline character. It will add a newline if missing or remove extra newlines at the end.
  - Why: This is a POSIX convention and helps prevent issues with some tools, version control diffs (especially if lines are added to the end of files), and file concatenation.
- `- id: check-yaml`:
  - Purpose: Checks the syntax of YAML files (e.g., this .pre-commit-config.yaml itself, GitHub Actions workflow files like ci-cd.yml).
  - Why: Catches YAML syntax errors early, preventing issues with tools or platforms that consume these files.
- `- id: check-added-large-files`:
  - Purpose: Prevents accidentally committing large files to the Git repository. By default, it flags added files larger than 500 KB.
  - Why: Large binary files can significantly bloat the Git repository, making cloning, fetching, and other operations slow. Such files are often better managed using Git LFS (Large File Storage) or stored in dedicated artifact repositories.
  - Args/Alternatives: Can be configured with `args: ['--maxkb=<size>']` to set a different size threshold.
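
The two configurable hooks above could be customized along these lines. This is a sketch, not this project's configuration: the `md` extension and the 1024 KB threshold are example values chosen for illustration.

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
        # Keep Markdown hard line breaks (two trailing spaces) in .md files.
        args: [--markdown-linebreak-ext=md]
      - id: check-added-large-files
        # Raise the size limit from the 500 KB default to 1024 KB (example value).
        args: ['--maxkb=1024']
```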

2. Repository: https://github.com/astral-sh/ruff-pre-commit

-   repo: https://github.com/astral-sh/ruff-pre-commit
    rev: 'v0.11.10'
    hooks:
    -   id: ruff
        args: [--fix, --exit-non-zero-on-fix]
    -   id: ruff-format

`repo: https://github.com/astral-sh/ruff-pre-commit`:
- What: Specifies the repository providing pre-commit hooks for Ruff. Ruff is an extremely fast Python linter and code formatter, written in Rust.
- Why Ruff? It's designed to be a high-performance replacement for many individual Python linting and formatting tools like Flake8, isort, pydocstyle, pyupgrade, and more, often with significantly faster execution times.
`rev: 'v0.11.10'`:
- What: This pins the ruff-pre-commit hook configuration. The tags of ruff-pre-commit track releases of the ruff tool itself, so this rev makes the hook run ruff version 0.11.10.
- Why pin this? Ensures that the version of Ruff used by pre-commit locally is consistent.
- Note on Versioning & Alignment: It's important that the Ruff version used locally by pre-commit is compatible with, or ideally the same as, the Ruff version used in your CI pipeline (which is installed via `pip install -e .[dev]` based on pyproject.toml). Your CI logs show ruff-0.11.10 being installed from pyproject.toml, so this rev ensures alignment. Keeping tool versions consistent between local hooks and CI is key to avoiding "it works on my machine but fails in CI" (or vice-versa).
- Updating: Ruff releases frequently. Regularly updating ruff (in pyproject.toml) and this rev together to newer stable versions is good practice to get the latest features, bug fixes, and performance improvements.
`hooks`:
- `- id: ruff`:
  - Purpose: Runs the Ruff linter.
  - `args: [--fix, --exit-non-zero-on-fix]`:
    - `--fix`: Instructs Ruff to automatically attempt to fix any fixable linting violations it finds.
    - `--exit-non-zero-on-fix`: If Ruff applies any fixes, the hook will exit with a non-zero status. This causes pre-commit to abort the commit process, signaling that files were changed. The developer then needs to `git add` the modified files and attempt the commit again. This ensures that automated changes are reviewed and explicitly included in the commit.
    - Why this combination? It provides the convenience of auto-fixing while ensuring the developer is aware of and approves the changes made by the tool.
- `- id: ruff-format`:
  - Purpose: Runs the Ruff code formatter. This ensures the code style (line length, quotes, indentation, etc.) is consistent across the project, based on the configuration in pyproject.toml under `[tool.ruff]` (for example `line-length`) and `[tool.ruff.format]` (for example `quote-style`).
  - Behavior: Like the linting hook with `--fix`, if ruff-format modifies files, pre-commit fails the hook and aborts the commit, requiring the developer to stage the changes and re-commit. This enforces that all committed code is correctly formatted.
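
The order of the two hooks matters. Ruff's documentation recommends placing the lint hook before the formatter when `--fix` is enabled, because lint fixes can produce code that then needs reformatting. The annotated excerpt below restates this project's configuration with that reasoning as comments.

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: 'v0.11.10'
    hooks:
      # 1. Lint first: --fix may rewrite code (e.g. remove unused imports).
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]
      # 2. Format second, so any code rewritten by the linter is reformatted too.
      - id: ruff-format
```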

3. Repository: https://github.com/pre-commit/mirrors-mypy

-   repo: https://github.com/pre-commit/mirrors-mypy
    rev: 'v1.7.1'
    hooks:
    -   id: mypy
        args: [--config-file=pyproject.toml]
        additional_dependencies: ['pandas-stubs', 'types-PyYAML', 'pytest']

`repo: https://github.com/pre-commit/mirrors-mypy`:
- What: Specifies the repository that provides a pre-commit compatible way to run Mypy. Mypy is a static type checker for Python. This mirrors-mypy repository makes it easy to integrate Mypy releases into the pre-commit framework.
`rev: 'v1.7.1'`:
- What: Pins the hook to use a version corresponding to Mypy v1.7.1.
- Why: Ensures a consistent version of Mypy is used for local type checking.
`hooks`:
- `- id: mypy`:
  - Purpose: Executes Mypy to perform static type checking on your Python code based on type hints.
  - Why Mypy? It helps catch a class of errors (type mismatches, incorrect API usage based on types) before runtime, which can make code more robust, easier to understand, and simpler to refactor.
  - `args: [--config-file=pyproject.toml]`:
    - This tells Mypy to load its configuration from the `[tool.mypy]` section of your pyproject.toml file. This is the standard way to centralize Mypy's settings (like Python version, plugins, warning flags, paths).
  - `additional_dependencies: ['pandas-stubs', 'types-PyYAML', 'pytest']`:
    - Purpose: Mypy needs type information (often in the form of "stub" files, .pyi) for libraries that don't include inline type hints or don't distribute their own stubs.
    - pandas-stubs: Provides type stubs for the pandas library.
    - types-PyYAML: Provides type stubs for the PyYAML library. PyYAML is not a direct runtime dependency of src/main.py; the entry may be here because PyYAML is a transitive dependency of one of your development tools, or as a safeguard in case YAML processing is done elsewhere in the project.
    - pytest: Installs pytest itself into the hook's environment. pytest ships its own inline type annotations rather than separate stubs, so having it installed lets Mypy resolve `import pytest` and check test code (tests/test_main.py) that uses type hints for pytest-specific features (e.g., fixture types).
    - Why additional_dependencies here? pre-commit runs each hook in an isolated environment. These dependencies are installed into that hook's specific environment so Mypy can find and use their type information during its analysis. They don't necessarily need to be in your main project's dependencies or dev-dependencies in pyproject.toml (though pandas-stubs and types-PyYAML are good candidates for dev-dependencies if you want your IDE to also benefit from them).
    - Impact: Without these, Mypy might report missing library stubs or unresolved imports, or be unable to check code using these libraries effectively.
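
One detail worth knowing when setting `args` on this hook: the upstream mirrors-mypy hook definition ships default arguments (`--ignore-missing-imports` and `--scripts-are-modules`), and specifying `args` in your own configuration replaces those defaults rather than appending to them. The sketch below is a variant, not this project's configuration, that re-adds one of the defaults; the commented-out `types-requests` entry is a hypothetical example of an extra stub package.

```yaml
repos:
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: 'v1.7.1'
    hooks:
      - id: mypy
        # Setting args replaces the hook's defaults, so re-add any you still want.
        args: [--config-file=pyproject.toml, --ignore-missing-imports]
        additional_dependencies:
          - pandas-stubs
          - types-PyYAML
          - pytest
          # - types-requests  # hypothetical: one entry per extra stub package
```

Whether to re-add `--ignore-missing-imports` is a project choice. Leaving it out of `args`, as this configuration does, means missing stubs surface as errors unless `[tool.mypy]` sets `ignore_missing_imports`.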

General Rationale & Alternatives:

Why Pre-commit? It standardizes the use of hooks across different developers and languages, manages the installation of hook environments, and makes it easy to configure and share a common set of checks.
Alternatives to Specific Hooks:
- Formatting/Linting: Before Ruff gained popularity, common choices were black for formatting, flake8 for linting, and isort for import sorting, often as separate hooks. Ruff aims to consolidate many of these.
- Type Checking: Mypy is the most established, but other type checkers like Pyright (used by VS Code's Pylance) or Pytype exist.
Configuration Location: While tool-specific configurations are rightly in pyproject.toml ([tool.ruff], [tool.mypy]), the hook definitions must be in .pre-commit-config.yaml.
Updating Hooks: It's good practice to periodically update the rev for each repository to get the latest bug fixes and features from the hooks. Running `pre-commit autoupdate` does this by moving each rev to the repository's latest tag.
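
For comparison, the pre-Ruff setup mentioned above might look like the sketch below, with one repository entry per tool. The `rev` values are illustrative tags, not versions this project uses; check each repository for current releases or let `pre-commit autoupdate` set them.

```yaml
repos:
  - repo: https://github.com/psf/black
    rev: 24.4.2  # illustrative tag
    hooks:
      - id: black   # formatting
  - repo: https://github.com/PyCQA/isort
    rev: 5.13.2  # illustrative tag
    hooks:
      - id: isort   # import sorting
  - repo: https://github.com/PyCQA/flake8
    rev: 7.0.0  # illustrative tag
    hooks:
      - id: flake8  # linting
```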

In summary, this .pre-commit-config.yaml sets up a robust local development workflow by:

- Cleaning up basic file issues (whitespace, end-of-file).
- Preventing accidental commits of large files or broken YAML.
- Leveraging Ruff for extremely fast Python linting, auto-fixing, and formatting.
- Using Mypy for static type checking, with necessary stubs for key libraries.

This configuration significantly contributes to code quality and developer productivity by automating these checks.
