Packaging a Python project - lmmx/devnotes GitHub Wiki
- `git init`
- `.gitignore`
  Examples:
  - pyx-sys and mvdef: packaged projects with vim swap files and Python build/dist artifacts (`build`, `dist`, `.eggs`, etc.)
  - dx: a packaged project as above, also with `store/` and `tmp/` data directories
  - emoji-liif: with Jupyter notebooks
- `README.md`
  More details:
  - title e.g. `# myproject` (add a brief description, copy this to the GitHub repo description)
  - `## Usage` (brief note on what usage will look like if you've not written it yet)
  - `## Requirements` list
    - Python 3, and if there are many packages, put them on one line usually
    - These should be the proper human-readable names (the pip names may differ, they go in `requirements.txt`)
    - If there are complex conda environment requirements or different options, write a little backticked block to reproduce it (as here for emoji-liif)
- `LICENSE`
  Details/example:
  - If you haven't pushed to GitHub yet, copy a LICENSE (the text will be detected)
    - E.g. from dx (click 'Raw' to get plain text)
  - GitHub has a LICENSE picker tool at https://github.com/USER/REPO/community/license/new?branch=master
    - E.g. add `&template=agpl-3.0` for the AGPL v3 license (a GPL-family license), or `&template=mit` for the MIT license
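The picker URL above can be assembled programmatically. This is a minimal sketch, not part of the original notes: `license_picker_url` is a hypothetical helper name, and `USER`/`REPO` are the same placeholders used in the URL above.

```python
from urllib.parse import urlencode


def license_picker_url(user, repo, template=None, branch="master"):
    """Build the URL of GitHub's LICENSE picker for a repo (hypothetical helper)."""
    params = {"branch": branch}
    if template is not None:
        # e.g. "agpl-3.0" or "mit", as in the examples above
        params["template"] = template
    query = urlencode(params)
    return f"https://github.com/{user}/{repo}/community/license/new?{query}"


print(license_picker_url("USER", "REPO", template="mit"))
```

If your default branch is `main`, pass `branch="main"` instead.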
- `pyproject.toml`
  More details:
  - Handles the requirements for building binaries/wheels to ship (not required for package users), namely `setuptools`, `wheel`, and `setuptools_scm` for auto-versioning
  - Copy the one for mvdef
  - Versioning is incremented by git tags (see the tags for mvdef for examples). These allow you to do 'releases' (again see mvdef)
    - The call to `git tag` has the format `git tag -a "v$INCREMENTED" -m "$TAG_MESSAGE"` where:
      - `$INCREMENTED` is the next version number (e.g. `0.1.2` is the next after `0.1.1`), and it's preceded by "v"
      - `$TAG_MESSAGE` is a description of the code state at the time it's tagged (similar to a git commit message; an empty or missing git tag message is not allowed).
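To make the tag format concrete, here is a small sketch that computes the next version and builds the matching `git tag` arguments. The function names are hypothetical, and resetting the lower parts to zero on a major or minor bump is an assumption (the usual convention), not something the notes above specify.

```python
def next_version(current, part="micro"):
    """Return the version after `current`, e.g. "0.1.1" -> "0.1.2" for a micro bump."""
    major, minor, micro = (int(n) for n in current.lstrip("v").split("."))
    if part == "major":
        return f"{major + 1}.0.0"  # assumption: lower parts reset to zero
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "micro":
        return f"{major}.{minor}.{micro + 1}"
    raise ValueError(f"unknown version part: {part!r}")


def git_tag_command(current, message, part="micro"):
    """Build the argument list for: git tag -a "v$INCREMENTED" -m "$TAG_MESSAGE"."""
    if not message.strip():
        raise ValueError("an empty git tag message is not allowed")
    return ["git", "tag", "-a", f"v{next_version(current, part)}", "-m", message]


print(git_tag_command("0.1.1", "Fix parser edge case"))
```

The returned list could be passed to `subprocess.run` from inside the repo, but the sketch only builds the command and does not touch git.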
- `echo > requirements.txt` (if you haven't written any code yet, otherwise list requirements for pip)
- `setup.py`
  Details/examples:
  - Handles the metadata for pip installation
  - Copy an example like dx
  - Change the classifiers for 'intended audience', license, and so on, to match your project
  - `find_packages` will auto-detect the package inside `src`
  - If you want command-line entrypoints these can be added (e.g. mvdef has 4 such `console_scripts`)
  - The `version_scheme` interprets the `git tag` to auto-version the package accordingly
    - I use helper functions `retag_republish_major`/minor/micro to automate the simultaneous git tag and PyPi republish steps, see "Rebuild Python package for PyPi and republish" for code/further details
  - Alternatively you can use `setup.cfg` (PEP 517) but note that to be compatible with editable pip installs you will need a minimal `setup.py` (just to call `setuptools.setup()` if `__name__ == "__main__"`)
    - See screed's example of `setup.cfg` and minimal `setup.py` (via)
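As a rough illustration of the pieces listed above, the following sketch collects the keyword arguments a `setup.py` might pass to `setuptools.setup()`. It is not the dx or mvdef file: the `pkgname.cli:main` entry point, the chosen classifiers, and the `"post-release"` version scheme are all assumptions picked for the example.

```python
def build_setup_kwargs(pkgname="pkgname"):
    """Collect keyword arguments for setuptools.setup() (illustrative values only)."""
    return {
        "name": pkgname,
        # The package lives under src/, so map the root package dir there
        "package_dir": {"": "src"},
        # Ship data files found inside the package (e.g. share/data)
        "include_package_data": True,
        # Let setuptools_scm derive the version from the latest git tag;
        # "post-release" is an assumed choice of version_scheme
        "use_scm_version": {"version_scheme": "post-release"},
        "setup_requires": ["setuptools_scm"],
        # Hypothetical command-line entry point: `pkgname` runs pkgname.cli.main()
        "entry_points": {"console_scripts": [f"{pkgname} = {pkgname}.cli:main"]},
        # Change these to match your project
        "classifiers": [
            "Intended Audience :: Developers",
            "License :: OSI Approved :: MIT License",
            "Programming Language :: Python :: 3",
        ],
    }


print(sorted(build_setup_kwargs()))
```

In an actual `setup.py` these would be used as `setup(packages=find_packages("src"), **build_setup_kwargs())`, with `setup` and `find_packages` imported from `setuptools`.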
Paths in the following sections are under `src/pkgname` (replacing 'pkgname' with your package)
- `share` indicates `/src/pkgname/share`
- Handling paths to directories
  More details:
  - Create modules for each section (e.g. `share`, `dataset`, `model`) as well as a top-level `__init__.py`, with any computation for things to be exported into the main namespace there kept in `utils.py` (but do not crowd this directory with anything else!)
- Organising different types of data
  More details:
  - Make a deliberate plan about where your data will go within the package, not just the first place that comes to mind.
    - Does it really need to be elsewhere on your system? (You can exclude it from the distributed package easily)
  - Keep data separate from computation on that data
    - `src` contains both but not in the same subdirectories
  - Keep fixed data files used for computation in `share/data`
    - This will be included in the distributed package via `setup.py` by setting `include_package_data=True` in `setuptools.setup()`
  - Keep artifacts [outputs computed from running your package code] in `store/`
    - This will be excluded from the distributed package via `.gitignore` by the line `**/store/*`
  - Don't put data at the top level (above `src`), outside the package: your code won't be able to interact with it properly (which it otherwise can, by exposing directory paths from `__init__.py` files)
    - as done for dx in `share/data/__init__.py`:
          from . import __path__ as _dir_nspath
          from pathlib import Path as _Path

          __all__ = ["dir_path"]

          dir_path = _Path(list(_dir_nspath)[0])

      which is then provided to the `share.scraper.parse_topics` module:

          from ..data import dir_path as data_dir
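The `dir_path` pattern can be tried out without touching a real project. The sketch below builds a throwaway package in a temporary directory (the name `demo_pkgname` is made up for the demo), writes the same `__init__.py` shown above into `share/data`, and imports it to check that `dir_path` points at the data directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# The same module body as the share/data/__init__.py shown above
INIT_SOURCE = '''\
from . import __path__ as _dir_nspath
from pathlib import Path as _Path

__all__ = ["dir_path"]

dir_path = _Path(list(_dir_nspath)[0])
'''


def make_demo_package(root, pkgname="demo_pkgname"):
    """Create pkgname/share/data under root and return the data directory."""
    pkg_dir = Path(root) / pkgname
    data_dir = pkg_dir / "share" / "data"
    data_dir.mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "share" / "__init__.py").write_text("")
    (data_dir / "__init__.py").write_text(INIT_SOURCE)
    return data_dir


tmp_root = tempfile.mkdtemp()
expected_dir = make_demo_package(tmp_root)
sys.path.insert(0, tmp_root)  # make the throwaway package importable
importlib.invalidate_caches()
data_module = importlib.import_module("demo_pkgname.share.data")
print(data_module.dir_path)
```

Any module in the package can then build file paths from it, e.g. `data_dir / "topics.json"` (a hypothetical file name).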
- Once you've done the initial setup for your project locally, go to GitHub and create a repo, copying the description you put in your README to the repo description
  More details
  - If it's for a specific organisation (e.g. the engine to drive a website) put it there rather than your personal account
  Immediately afterwards I run these steps (comments are my bashrc aliases)
  - There is a gotcha on newer accounts, whose default git branch has been renamed to `main`. Either change your default branch name back to 'master' or modify the commands to suit your default branch name.
  - GitHub added an extra line to their default suggested workflow to explicitly declare the branch name, but this is unnecessary (I think it is just there to alert users to the changed default), as the line which pushes to the branch includes a branch name and implicitly sets it up for use
Code snippet to set up the new package repo
git add --all # gadd
git commit -m "Initial commit" # gcmmtinit
git remote add origin git@github.com:lmmx/pkgname.git # gorigadd pkgname
git push -u origin master # gpushu
git tag -a "v0.0.1" -m "Initial tag"
git push origin "v0.0.1"
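Because of the `master`/`main` gotcha, it can help to generate these commands with the branch name as a parameter. This is a sketch only: `initial_push_commands` is a hypothetical helper, and it assumes the SSH remote form `git@github.com:USER/REPO.git`.

```python
def initial_push_commands(user, pkgname, branch="master", version="0.0.1"):
    """Return the first-push shell commands for a new package repo, in order."""
    tag = f"v{version}"
    return [
        "git add --all",
        'git commit -m "Initial commit"',
        # Assumes an SSH remote on GitHub
        f"git remote add origin git@github.com:{user}/{pkgname}.git",
        # -u sets the upstream, so the branch is named explicitly here
        f"git push -u origin {branch}",
        f'git tag -a "{tag}" -m "Initial tag"',
        f'git push origin "{tag}"',
    ]


for command in initial_push_commands("lmmx", "pkgname", branch="main"):
    print(command)
```

The function only builds strings; nothing is run against git.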
- Create a conda environment for local "editable" pip installation
  More details
  Having set up a git repo and put a working Python package in it, we can install the package to a Python 3.9 conda environment with `pip install -e .` (`-e` for `--editable`)

      conda create -n pkgname_edit python
      conda activate pkgname_edit # hiss pkgname_edit
      pip install -e .