Understanding the pyproject.toml file - inzamamshajahan/github-actions-learn4 GitHub Wiki
Documentation for: my_python_project/pyproject.toml
Overall Purpose:
The pyproject.toml file is the unified configuration file for modern Python projects. Introduced by PEP 518 and expanded by subsequent PEPs (such as PEP 621 for project metadata and PEP 660 for editable installs with build backends), its primary purpose is to standardize how Python projects declare their build dependencies and, increasingly, how they configure development tools.
In this project, pyproject.toml serves several key functions:
- Defining Build System Requirements: It tells tools like pip what is needed to build your project from source (e.g., setuptools).
- Declaring Project Metadata: It provides core information about your project, such as its name, version, author, dependencies, and Python version compatibility. This metadata is used when packaging your project for distribution.
- Configuring Development Tools: It centralizes the configuration for various tools used in the development lifecycle, such as:
  - setuptools: For packaging.
  - ruff: For linting and formatting.
  - mypy: For static type checking.
  - pytest: For running tests.
  - bandit: For security analysis.
Why pyproject.toml?
- Standardization: It provides a single, standard place for build system and tool configuration, replacing a multitude of older, disparate files (e.g., setup.py for build logic, setup.cfg for declarative metadata, MANIFEST.in, .isort.cfg, .flake8, .mypy.ini, etc.).
- Declarative: For project metadata and many tool configurations, it encourages a declarative approach, which is easier for both humans and tools to read and parse.
- Tool Interoperability: Build frontends (like pip) can understand how to build any project that has a pyproject.toml specifying its build backend.
1. [build-system] Table:
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
#backend-path = ["."]
- Purpose: This section (defined by PEP 518) tells build tools (like pip) what dependencies are needed to build your package and how to invoke the build process.
- requires = ["setuptools>=61.0"]:
  - What: A list of packages that must be installed before the build backend can be invoked.
  - Why: setuptools is the build system chosen for this project. Version 61.0 or higher is specified because setuptools 61.0 introduced support for PEP 621 metadata in the [project] table, along with automatic package discovery that recognizes the src layout.
  - Alternatives: Other build systems like flit, poetry, or hatch. setuptools is a long-standing and widely used system.
- build-backend = "setuptools.build_meta":
  - What: Specifies the Python object (the build backend) that frontends like pip will use to build the project. For setuptools, this is the standard entry point.
  - Why: This allows pip to delegate the build process to setuptools in a standardized way.
- #backend-path = ["."] (commented out):
  - What: If the build-backend object couldn't be imported directly (e.g., if it was part of a script in your project not installed in the build environment), this would tell the frontend where to find it relative to pyproject.toml.
  - Why commented out: For standard backends like setuptools.build_meta, which are installed packages themselves, this is not usually needed.
2. [project] Table (PEP 621 Metadata):
[project]
name = "my_data_project_src_main"
version = "0.1.0"
description = "A simple data transformation project (src/main.py) using pandas and numpy, with CI/CD via GitHub Actions and logging." # Updated description
readme = "README.md"
requires-python = ">=3.8"
license = {text = "MIT"}
authors = [
{name = "Inzamam", email = "[email protected]"},
]
dependencies = [
"pandas",
"numpy"
]
- Purpose: This section, defined by PEP 621, provides standardized, declarative metadata for your project. This is used when building distributable packages (e.g., wheels).
- name = "my_data_project_src_main":
  - What: The canonical name of your project/package. This is how it would be identified on PyPI if published.
  - Why: A unique identifier.
- version = "0.1.0":
  - What: The current version of your project.
  - Why: Essential for version management and dependency resolution. Semantic Versioning (e.g., MAJOR.MINOR.PATCH) is a common practice.
- description = "...":
  - What: A short, one-sentence summary of the project.
  - Why: Used by package indexes and tools to display a brief overview.
- readme = "README.md":
  - What: Specifies the file to be used as the long description for the package (e.g., on PyPI). The content type is usually inferred from the file extension (.md implies Markdown).
  - Why: Provides detailed information to users of the package.
- requires-python = ">=3.8":
  - What: Specifies the minimum Python version required to run this project.
  - Why: Ensures users don't try to install or run the project on incompatible Python versions. pip will enforce this.
- license = {text = "MIT"}:
  - What: Declares the license for the project. Here, it directly states "MIT".
  - Alternatives: Can also be { file = "LICENSE" } if the full license text is in a separate file (a LICENSE file is also present in this project). Both forms are valid.
- authors = [ {name = "Inzamam", email = "[email protected]"} ]:
  - What: Lists the author(s) of the project.
- dependencies = [ "pandas", "numpy" ]:
  - What: A list of runtime dependencies. These are the packages that your project needs to function correctly when it's run.
  - Why: When someone installs your package (e.g., via pip install my_data_project_src_main), these dependencies are installed automatically.
  - Alternatives for versioning: You could specify versions here (e.g., "pandas>=1.0,<2.0", "numpy==1.21.0"). Libraries usually specify looser constraints and let the end-user application pin more specific versions; for applications, pinning here is more common.
3. [project.optional-dependencies] Table:
[project.optional-dependencies]
dev = [
"pytest==8.3.5",
"pytest-cov",
"mypy==1.14.0",
"ruff==0.11.10",
"bandit==1.7.9",
"safety==3.5.1",
"pre-commit==3.5.0",
"types-PyYAML",
"pandas-stubs",
]
- Purpose: Defines optional sets of dependencies. These are not installed by default when someone installs your package but can be requested explicitly.
- dev = [...]:
  - What: Defines an "extra" group named dev. This group lists dependencies that are useful for development but not required for the core functionality of the script.
  - How to install: pip install .[dev] (or pip install -e .[dev] for an editable install).
  - Contents & Pinned Versions:
    - "pytest==8.3.5": Testing framework. Version pinned to 8.3.5.
    - "pytest-cov": Pytest plugin for code coverage. Version not pinned, so pip will pick the latest compatible release.
    - "mypy==1.14.0": Static type checker. Version pinned.
    - "ruff==0.11.10": Linter and formatter. Version pinned. It's good practice to keep this aligned with the rev in .pre-commit-config.yaml if both point to the same tool, and to update them together.
    - "bandit==1.7.9": Security linter. Version pinned.
    - "safety==3.5.1": Dependency vulnerability scanner. Version pinned.
    - "pre-commit==3.5.0": Framework for managing pre-commit hooks. Version pinned.
    - "types-PyYAML": Type stubs for PyYAML.
    - "pandas-stubs": Type stubs for Pandas.
  - Why this group? Separates development tools from runtime dependencies, keeping the core installation lean. Pinning versions for development tools (as done for most here) is good practice for ensuring a consistent development environment across a team and over time.
  - Alternatives:
    - Listing all dev tools in a requirements-dev.txt file (older practice). pyproject.toml is the modern way.
    - Not pinning versions (e.g., just "pytest"). This can lead to unexpected breakages if a new version of a tool has incompatibilities. Pinning offers more stability, but it also means you need to actively manage and update these pins.
4. [tool.setuptools.packages.find] Table:
[tool.setuptools.packages.find]
where = ["src"]
- Purpose: Configuration specific to setuptools (the build backend).
- packages.find: Instructs setuptools on how to automatically discover Python packages in your project.
- where = ["src"]:
  - What: Tells setuptools to look for packages inside the src/ directory.
  - Why the src layout?
    - Prevents the common issue where, if your package code is in the root, import my_package might accidentally import from the local directory during development instead of the installed version, masking import problems.
    - Clearly separates package code from other project files (tests, docs, scripts).
  - How it works: If you have src/my_package_name/__init__.py, setuptools will find my_package_name. Note that packages.find discovers packages (directories), not standalone modules, so a lone src/main.py is not a package it would pick up by itself. An earlier version of this configuration also set py_modules = ["main"] under [tool.setuptools], which is the way to include a single module that is not a full package (a directory with __init__.py). packages.find with where = ["src"] is the more conventional choice if main.py is intended to be part of an importable package structure like my_data_project_src_main.main.
5. [tool.ruff] Table (and sub-tables):
[tool.ruff]
line-length = 200
[tool.ruff.lint]
select = ["E", "W", "F", "I","C", "B", "UP", "PT", "SIM"]
ignore = []
[tool.ruff.format]
quote-style = "double"
indent-style = "space"
- Purpose: Configures the Ruff linter and formatter.
- line-length = 200:
  - What: Sets the maximum allowed line length to 200 characters.
  - Why: This is a project-specific style choice. PEP 8 suggests 79 characters for code and 72 for docstrings/comments, but many projects adopt longer lengths like 88 (Black's default) or 120 for modern screens. 200 is quite long and less common but entirely valid if the team prefers it.
- [tool.ruff.lint]: Linter-specific options.
  - select = ["E", "W", "F", "I", "C", "B", "UP", "PT", "SIM"]:
    - What: A list of rule prefixes or specific rule codes that Ruff should enable.
      - E: pycodestyle errors (e.g., indentation and whitespace issues).
      - W: pycodestyle warnings (e.g., style warnings).
      - F: Pyflakes (checks for logical errors like unused imports/variables).
      - I: isort (import sorting).
      - C: Every rule whose code starts with C, which in Ruff covers McCabe complexity (C90) and flake8-comprehensions (C4).
      - B: flake8-bugbear (finds likely bugs and design problems).
      - UP: pyupgrade (helps upgrade syntax to newer Python versions).
      - PT: flake8-pytest-style (checks for common Pytest anti-patterns).
      - SIM: flake8-simplify (helps simplify code).
    - Why: Provides a comprehensive set of checks for code quality, style, and potential bugs.
  - ignore = []: A list of specific rule codes to disable. Currently empty, meaning all selected rules are active.
- [tool.ruff.format]: Formatter-specific options.
  - quote-style = "double": Enforces the use of double quotes (") for strings where possible.
  - indent-style = "space": Enforces using spaces for indentation (as opposed to tabs). This is standard for Python. (The indent width is 4 spaces, Ruff's default, unless configured otherwise.)
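As a rough illustration of what the selected rule families look for, the snippet below contains a few patterns that Ruff would be expected to report under this configuration. The function names are hypothetical, and the rule codes in the comments (F401, B006, SIM108) are believed to match Ruff's rule index; confirm them by running ruff check on the file.

```python
import os  # F401 (Pyflakes): imported but unused


def append_item(item, bucket=[]):  # B006 (flake8-bugbear): mutable default argument
    bucket.append(item)
    return bucket


def label(value):
    # SIM108 (flake8-simplify): this if/else could be a conditional expression
    if value > 0:
        result = "positive"
    else:
        result = "non-positive"
    return result


# The mutable default is shared between calls, which is the bug B006 guards against.
first = append_item(1)
second = append_item(2)
print(first is second, second)
```

The code runs without error, which is the point: these are problems a linter catches that the interpreter does not.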
6. [tool.mypy] Table:
[tool.mypy]
python_version = "3.8"
warn_return_any = true
warn_unused_configs = true
plugins = ["numpy.typing.mypy_plugin"]
mypy_path = "src"
- Purpose: Configures the Mypy static type checker.
- python_version = "3.8":
  - What: Tells Mypy to assume Python 3.8 syntax and standard library features for type checking.
  - Why: Ensures Mypy checks against the baseline Python version specified in requires-python. This is important because type hints and standard library modules can differ between Python versions.
- warn_return_any = true:
  - What: Issues a warning when a function returns a value of type Any even though its declared return type is something other than Any.
  - Why: Encourages more precise type hinting, as Any effectively bypasses type checking for that part of the code.
- warn_unused_configs = true:
  - What: Warns about per-module sections in the Mypy configuration that do not match any file processed in the run.
  - Why: Helps keep the Mypy configuration clean and detect typos or outdated sections.
- plugins = ["numpy.typing.mypy_plugin"]:
  - What: Enables the Mypy plugin for NumPy.
  - Why: NumPy's dynamic nature can make static type checking challenging. This plugin enhances Mypy's understanding of NumPy arrays and operations, leading to more accurate type checking and fewer false positives/negatives when working with NumPy.
- mypy_path = "src":
  - What: Tells Mypy to look for modules within the src directory. This is crucial for projects using the src layout, as it helps Mypy resolve imports correctly (e.g., from main import ... when checking tests/test_main.py).
  - Why: Ensures Mypy can find and type-check your project's modules correctly.
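The effect of warn_return_any is easiest to see on a small example. In the sketch below (function names are hypothetical), the first function passes through the Any that json.loads returns, which Mypy is expected to flag under this setting; the second validates and rebuilds the value so the return type is concrete.

```python
import json
from typing import Any, Dict


def load_counts_loose(raw: str) -> Dict[str, int]:
    # json.loads is typed as returning Any, so with warn_return_any = true
    # mypy is expected to warn that this function returns Any.
    return json.loads(raw)


def load_counts_checked(raw: str) -> Dict[str, int]:
    # Validating and rebuilding the value gives mypy a concrete type to check.
    data: Any = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return {str(key): int(value) for key, value in data.items()}


counts = load_counts_checked('{"a": 1, "b": 2}')
print(counts)
```

typing.Dict is used rather than dict[str, int] so the annotations stay valid on Python 3.8, the version this configuration targets.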
7. [tool.pytest.ini_options] Table:
[tool.pytest.ini_options]
minversion = "6.0"
addopts = "-ra -q --cov=main --cov-report=term-missing --cov-fail-under=60"
testpaths = ["tests"]
- Purpose: Configures Pytest, the testing framework.
- minversion = "6.0":
  - What: Specifies the minimum required version of pytest.
  - Why: Ensures that the tests are run with a pytest version that supports all the features and syntax used in the tests.
- addopts = "-ra -q --cov=main --cov-report=term-missing --cov-fail-under=60":
  - What: Specifies additional command-line options that pytest should always use when run.
    - -ra: Adds a short summary at the end of the test session. -r controls which outcomes are listed, and a means all of them except passed tests (failed, errored, skipped, xfailed, and so on).
    - -q (quiet): Reduces verbosity during test collection and execution, showing less output per test unless it fails.
    - --cov=main: Enables code coverage reporting (using pytest-cov) specifically for the main module (src/main.py, assuming it is importable as main during the test run).
    - --cov-report=term-missing: Specifies the coverage report format. term-missing shows a summary in the terminal, including which lines were not covered.
    - --cov-fail-under=60: Causes the test suite to fail if the total code coverage is below 60%.
  - Why these options? Automates coverage reporting and enforces a minimum coverage threshold, promoting good testing practices. -ra -q provides a concise yet informative summary.
- testpaths = ["tests"]:
  - What: Tells pytest to look for tests in the tests directory when no path is given on the command line.
  - Why: Standard practice for organizing tests and helps pytest discover them quickly.
8. [tool.bandit] Table:
[tool.bandit]
- Purpose: This section is reserved for configuring Bandit, the security linter.
- Content: Currently empty.
- How Bandit uses it: If Bandit-specific configurations were needed (e.g., excluding certain tests, defining a baseline, setting severity levels for specific issues), they would be added here. For now, Bandit will use its default settings when run via the GitHub Action (bandit -r src -c pyproject.toml). The -c pyproject.toml option tells Bandit to look for this section.
Summary & Best Practices:
- Centralization: This pyproject.toml effectively centralizes build system definition, project metadata, and configurations for key development tools.
- Modern Standards: Adheres to modern Python packaging PEPs.
- Reproducibility: Pinning versions of development dependencies (like pytest, mypy, ruff) in [project.optional-dependencies].dev is a good step towards more reproducible local development and CI environments.
- Clarity for src layout: Configurations like [tool.setuptools.packages.find].where = ["src"] and [tool.mypy].mypy_path = "src" are essential for correctly handling the src layout.
Future Considerations:
- Pinning all dev dependencies: While some are pinned (e.g., pytest==8.3.5), others like pytest-cov are not. For maximum reproducibility of the dev environment, all dev dependencies could be pinned. Tools like pip-tools (with pip-compile) can help manage pinned dependencies from a more abstract list.
- Ruff rev in .pre-commit-config.yaml vs. ruff in pyproject.toml: Ensure the Ruff version used by the pre-commit hook (the rev set for astral-sh/ruff-pre-commit, v0.11.10 here) is consistent with the Ruff version specified in pyproject.toml's dev dependencies (ruff==0.11.10). The tags of the ruff-pre-commit repository mirror Ruff's own release numbers, so rev: v0.11.10 runs ruff 0.11.10 and the two currently match. It's good practice to update both together.
- Bandit Configuration: If specific Bandit checks need to be enabled/disabled or severities adjusted, the [tool.bandit] section could be populated.
This pyproject.toml file sets a strong foundation for a well-managed and high-quality Python project.