Python Application Layouts and Packaging - BKJackson/BKJackson_Wiki GitHub Wiki


Python Package Lifecycle

Poetry Python packaging and dependency management made easy

Poetry - Poetry comes with all the tools you might need to manage your projects in a deterministic way. A single tool to manage Python projects from start to finish.
Integrating Python Poetry with Docker

Python Applications

Python Application Layouts: A Reference - Includes simple Python CLI app, as well as Django and Flask apps

Python Modules

Official Python docs on Modules
List Modules, Search Paths, Loaded Modules with pydoc

Tips for __init__.py files

The command import mypackage runs the code in mypackage/__init__.py.

Creating an __init__.py file
What's __init__ for me? - Designing for Python package import patterns, Jan 8, 2019
Making a Python Package - from The Joy of Packaging, a Scipy 2018 Tutorial on packaging from start to finish for both PyPI and conda
Overview of Python Packaging - Great overview, covers source and binary packaging

The code in the __init__.py file will be run automatically when the package is imported. So the import statement will run the code in mypackage/__init__.py.

Packaging and importing your local python modules

Three steps:

* put the .py file in a separate folder  
* add an empty __init__.py file to the folder  
* add that folder to the Python path with sys.path.append  

Example usage:

# add local python functions
import sys

# add the 'src' directory as one where we can import modules
src_dir = os.path.join(PROJ_ROOT, "src")
sys.path.append(src_dir)

# import my method from the source code
from features.build_features import remove_invalid_data 

Python Module Search Paths

If you import a module, let's say "import xyz", the interpreter searches for this module in the following locations and in the order given:

  1. The directory of the top-level file, i.e. the file being executed.
  2. The directories of PYTHONPATH, if this global environment variable of your operating system is set.
  3. The standard installation path Linux/Unix e.g. in /usr/lib/python3.5.

Find out where a module is located after it has been imported

>>> import numpy
>>> numpy.__file__
'/usr/lib/python3/dist-packages/numpy/__init__.py'

Packages vs. Modules

Module: A python module is a single namespace, usually corresponding to a single file, with a collection of values:

  • functions
  • constants
  • class definitions
  • really any old value

Packages:
A package is essentially a module, except that it can have other modules (and even other packages) inside of it. A package usually corresponds to a directory with a __init__.py file in it.

Setuptools and how to setup setup.py

THE setup.py explainer on github
Python Packaging Tutorial: The setup.py file
Running setup.py
capitalize setup example
setup.py vs requirements.txt- requirements.txt is bad for packaging python code
Configuring setup() using setup.cfg files
Setuptools docs

Libraries vs. applications

"While a library tends to want to have wide open ended version specifiers an application wants very specific dependencies. It may not have mattered up front what version of requests was installed but you want the same version to install in production as you developed and tested with locally."
Source: setup.py vs requirements.txt setup.py - abstract dependencies
requirements.txt - concrete dependencies
How to sync setup.py and requirements.txt:
Given a directory with a setup.py inside of it you can write a requirements file that looks like:

--index-url https://pypi.python.org/simple/

-e .  

Now your pip install -r requirements.txt will work just as before. It will first install the library located at the file path . and then move on to its abstract dependencies, combining them with its --index-url option and turning them into concrete dependencies and installing them. Source

Where to put your custom code

Where to put your custom code? - A suggestion for how to manage your personal library of python functions you might use for scripting, data analysis, etc. Make a personal python package.

Source vs. Binary Packages

Source:
A source package is all the source code required to build the package.
Package managers (like pip) can automatically build your package from source.
But:

  • Your system needs the correct tools installed, compilers, build tools, etc.
  • You need to have the dependencies available.

Binary:
A collection of code all ready to run.

  • Everything is already compiled and ready to go. But:
  • It's likely to be platform dependent.
  • May require dependencies to be installed.

Package overview PEP 427 -- The Wheel Binary Package Format 1.0

Native Binary Packages & Scientific Computing

SciPy 2018 Packaging Tutorial: Binaries and Dependencies

Advantage of native binaries over CPython

Greater performance is achieved with native binaries over CPython because:

  1. Tasks are compiled down to minimal processor operations, as opposed to high level programming language instructions that must be interpreted
  2. Parallel computing is not impared by CPython’s Global Interpreter Lock (GIL)
  3. Memory can be managed explicitly and deterministically

Why C?

Many existing scientific codes are written in programming languages other than Python. It is necessary to re-use these libraries since:

  • Resources are not available to re-implement work that is sometimes the result of multiple decades of effort from multiple researchers.
  • The scientific endeavor is built on the practice of reproducing and building on the top of the efforts of our predecessors.

The lingua franca of computing is the C programming language because most operating systems themselves are written in C.

Build host artifacts

These are files required on the host system performing the build. This includes header files, *.h files, which define the C program symbols, i.e. variable and function names, for the native binary with which we want to integrate. This also usually includes the native binaries themselves, i.e. the executable or shared library. An important exception to this rule is libpython, which we do not need on some platforms due to weak linking rules.

Target system artifacts

These are artifacts intended to be run on the target system, typically the shared library C-extension.

Host - Target cross-compiling

When the build host system is different from the target system, we are cross-compiling.

For example, when we are building a Linux Python package on macOS is cross-compiling. In this case macOS is the host system and Linux is the target system.

Python Packaging Authority docs

Python Packaging User Guide
Python Packaging Authority - The Python Packaging Authority (PyPA) is a working group that maintains many of the relevant projects in Python packaging. How To Package Your Python Code
A sample Python project - Github

pip - The Python Package Installer

pip Official docs home
pip User Guide
Installing from local packages with pip

Scipy 2018 Packing Tutorial

How to build python packages and distribute them in a central repository
Note: It is highly recommended to create a dedicated environment before developing or building your distribution.
Publishing to PyPI with twine
Setting up continuous integration with CircleCI

Articles on Python Packages, Projects, and Open Source

Current State of Python Packaging - 2019 - April 26, 2019
How To Package Your Python Code Hello, world! level tutorial.
Python Boilerplate Template
Open Sourcing a Python Project the Right Way Jeff Knupp
A Python Project Skeleton From Learn Python the Hard Way (book)
Structuring Your Python Project From The Hitchhiker's Guide to Python (book)
A Sample Python Repo Kenneth Reitz
Modules & Packaging Tutorial From Computational Statistics in Python (Duke)
The easy (and nice) way to do CLI apps in Python T. Stringer, 2016