Packaging a Python project - lmmx/devnotes GitHub Wiki

Initial setup for package metadata

  • git init

  • .gitignore

    Examples:

    • pyx-sys and mvdef
      • packaged projects with vim swap files and Python build/dist artifacts (build, dist, .eggs etc.)
    • dx
      • a packaged project as above, also with store/ and tmp/ data directories
    • emoji-liif with Jupyter notebooks
  • README.md

    More details:

    • title e.g. # myproject (add a brief description, copy this to the GitHub repo description)
    • ## Usage (brief note on what usage will look like if you've not written it yet)
    • ## Requirements (a list: Python 3, then the required packages, usually all on one line if there are many)
      • These should be the proper human-readable names (the pip names may differ; those go in requirements.txt)
      • If there are complex conda environment requirements or different options, write a short backticked block to reproduce them (as done for emoji-liif)
  • LICENSE

    Details/example:

    • If you haven't pushed to GitHub yet, copy a LICENSE (the text will be detected)

      • E.g. from dx (click 'Raw' to get plain text)
    • GitHub has a LICENSE picker tool at https://github.com/USER/REPO/community/license/new?branch=master

      • E.g. add &template=agpl-3.0 for the AGPL v3 license, or &template=mit for the MIT license
  • pyproject.toml

    More details:

    • Handles the requirements for building binaries/wheels to ship (not required for package users), namely setuptools, wheel, setuptools_scm for auto-versioning

    • Copy the one for mvdef

    • Versioning is incremented by git tags (see the tags for mvdef for examples). These allow you to do 'releases' (again see mvdef)

      • The call to git tag has the format
        git tag -a "v$INCREMENTED" -m "$TAG_MESSAGE"
        where:
        • $INCREMENTED is the next version number (e.g. 0.1.2 is the next after 0.1.1), and it's preceded by "v"
        • $TAG_MESSAGE is a description of the code state at the time it's tagged (similar to a git commit message; an annotated tag cannot have an empty or missing message).
  • echo > requirements.txt (if you haven't written any code yet, otherwise list requirements for pip)

  • setup.py

    Details/examples:

    • Handles the metadata for pip installation
    • Copy an example like dx
      • Change the classifiers for 'intended audience', license, and so on, to match your project
      • find_packages will auto-detect the package inside src
      • If you want command-line entrypoints these can be added (e.g. mvdef has 4 such console_scripts)
      • The version_scheme interprets the git tag to auto-version the package accordingly
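    • Alternatively you can move the metadata into a declarative setup.cfg (built through the PEP 517 backend named in pyproject.toml), but note that editable pip installs with older pip/setuptools releases still need a minimal setup.py (one that just calls setuptools.setup() if __name__ == "__main__")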
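
The tagging convention above (a "v" prefix followed by the next version number, plus a non-empty message) can be sketched in Python. This is a minimal illustration, assuming three-part major.minor.patch versions and that only the patch number is incremented; the helper names next_patch_tag and tag_command are hypothetical and not part of mvdef or setuptools_scm.

```python
import re
from typing import List

# Assumption: tags look like "v0.1.1" (a "v" followed by major.minor.patch)
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def next_patch_tag(current_tag: str) -> str:
    """Return the tag that follows current_tag, incrementing the patch number."""
    match = TAG_PATTERN.match(current_tag)
    if match is None:
        raise ValueError(f"Unexpected tag format: {current_tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return f"v{major}.{minor}.{patch + 1}"


def tag_command(current_tag: str, message: str) -> List[str]:
    """Build the 'git tag -a' argument list for the next patch version."""
    if not message.strip():
        # An annotated tag needs a non-empty message
        raise ValueError("The tag message must not be empty")
    return ["git", "tag", "-a", next_patch_tag(current_tag), "-m", message]


# Print the command rather than running it, so nothing is tagged by accident
print(" ".join(tag_command("v0.1.1", "Describe the code state here")))
```

The argument list could be passed to subprocess.run from inside the repo, but printing it first makes it easy to check the version before creating the tag.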

Package source code

Paths in the following sections are under src/pkgname (replacing 'pkgname' with your package)

  • e.g. share indicates src/pkgname/share

  • Handling paths to directories

    More details:

    • Create a module for each section (e.g. share, dataset, model) as well as a top-level __init__.py, and put any computation needed for the things exported into the main namespace in utils.py alongside it (but do not crowd this directory with anything else!)
  • Organising different types of data

    More details:

    • Make a deliberate plan about where your data will go within the package, not just the first place that comes to mind.

      • Does it really need to be elsewhere on your system? (You can exclude it from the distributed package easily)
    • Keep data separate from computation on that data

      • src contains both but not in the same subdirectories
    • Keep fixed data files used for computation in share/data

      • This will be included in the distributed package via setup.py by setting include_package_data=True in setuptools.setup()
    • Keep artifacts [outputs computed from running your package code] in store/

      • This will be excluded from the distributed package via .gitignore by the line **/store/*
    • Don't put data at the top level (above src), outside the package: your code won't be able to interact with it properly. Inside the package, directory paths can be exposed from __init__.py files

      • as done for dx in share/data/__init__.py

        from . import __path__ as _dir_nspath
        from pathlib import Path as _Path
        
        __all__ = ["dir_path"]
        
        dir_path = _Path(list(_dir_nspath)[0])

        which is then provided to the share.scraper.parse_topics module:

        from ..data import dir_path as data_dir
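
To see how the dir_path pattern above behaves, the following sketch writes a throwaway package to a temporary directory and imports it. The package name demopkg and the file topics.txt are made up for illustration; only the __init__.py contents are taken from the snippet above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Contents of share/data/__init__.py, copied from the snippet above
DATA_INIT = '''\
from . import __path__ as _dir_nspath
from pathlib import Path as _Path

__all__ = ["dir_path"]

dir_path = _Path(list(_dir_nspath)[0])
'''


def make_demo_package(root: Path, pkgname: str = "demopkg") -> Path:
    """Write a throwaway package with a share/data subpackage under root."""
    data_dir = root / pkgname / "share" / "data"
    data_dir.mkdir(parents=True)
    (root / pkgname / "__init__.py").write_text("")
    (root / pkgname / "share" / "__init__.py").write_text("")
    (data_dir / "__init__.py").write_text(DATA_INIT)
    # A fixed data file living next to the __init__.py that exposes its directory
    (data_dir / "topics.txt").write_text("example\n")
    return data_dir


tmp_root = Path(tempfile.mkdtemp())
expected_dir = make_demo_package(tmp_root)

# Make the throwaway package importable, then import the data subpackage
sys.path.insert(0, str(tmp_root))
importlib.invalidate_caches()
data = importlib.import_module("demopkg.share.data")

# dir_path is a pathlib.Path, so data files can be addressed relative to it
topics_file = data.dir_path / "topics.txt"
print(topics_file.read_text())
```

Because dir_path is derived from the package's own location, the same code keeps working whether the package is installed normally or in editable mode.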

GitHub

Once you've done the initial setup for your project locally, go to GitHub and create a repo, copying the description from your README into the repo description.

More details
  • If it's for a specific organisation (e.g. the engine to drive a website) put it there rather than your personal account

I then immediately run these steps (the comments are my bashrc aliases):

Code snippet to set up the new package repo
git add --all # gadd
git commit -m "Initial commit" # gcmmtinit
git remote add origin git@github.com:lmmx/pkgname.git # gorigadd pkgname
git push -u origin master # gpushu
git tag -a "v0.0.1" -m "Initial tag"
git push origin "v0.0.1"

Local environment

  • Create a conda environment for local "editable" pip installation

    More details

    Having set up a git repo and put a working Python package in it, we can install the package into a Python 3.9 conda environment with pip install -e . (-e for --editable)

    conda create -n pkgname_edit python=3.9
    conda activate pkgname_edit # hiss pkgname_edit
    pip install -e .
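
After the editable install, one quick way to check that the package is visible to the environment (and which version was derived from the git tag) is to query the installed metadata. This is a minimal sketch using the standard library's importlib.metadata (Python 3.8+); the function name installed_version is hypothetical, and "pkgname" stands in for your distribution name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "pkgname" is a placeholder: replace it with the name given in setup.py
found = installed_version("pkgname")
if found is None:
    print("pkgname is not installed in this environment")
else:
    print(f"pkgname {found} is installed")
```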