Onboarding Guide - Telecominfraproject/oopt-gnpy GitHub Wiki

Onboarding Guide

The gnpy and gnpy-core projects are part of the Open Optical Packet Transport project group under the Telecom Infra Project. This project is led by the Physical Layer Simulation Environment (PSE) working group.

The gnpy and gnpy-core repos seek to develop high-quality, community-driven open source tools for Gaussian Noise modelling of amplified optical links.

This is the onboarding guide for new code contributors to the gnpy and gnpy-core code repositories.

Overall Philosophy

As part of the working group developing gnpy and gnpy-core, we will:

  • deliver high quality open-source software
  • driven by a community of both optical engineering and software development experts
  • developed using industry-standard software development practices

We measure our success in developing software along these lines by the ease of onboarding a new code contributor:

  • how long does it take a new code contributor to find the interesting code in this repo?
  • can a reasonably experienced Python programmer with no prior optical engineering knowledge make sense of the code?
  • can a new code contributor quickly build all code on a brand-new laptop?
  • are tests sufficient that a new code contributor can make changes without worrying about non-local consequences?
  • is documentation sufficient that a party wishing to integrate this code can understand its APIs and other externally-facing behavior without needing to read through the code?
  • are tests and documentation sufficient that a code contributor can implement new features without needing excessive prior coordination with the core development team?
  • is the development process transparent enough that a code contributor can find roadmaps, tasks, and progress without needing excessive prior coordination with the core development team?

Simplifying Assumptions

The development process introduces a number of variables. These repositories rely on the following simplifying assumptions to reduce the number of variables in our development practice.

We pledge to support development on the following platforms:

  • Linux (specifically, Ubuntu 16.04 LTS or equivalent)
  • OS X (specifically, El Capitan or later)
  • Windows (specifically, Windows 10 or later)

We pledge to support deployment on the following platforms:

  • Linux (specifically, Ubuntu 16.04 LTS or equivalent)

We require the following baseline software dependencies for development and deployment:

  • Python 3.6 (or later)
  • numpy 1.1x (or later)
  • scipy 1.x (or later)

We may closely track versions of Python, numpy, and scipy. We will maintain a document of potential backwards-incompatibilities.
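
As a minimal sketch (not part of gnpy), the baseline above can be checked from the interpreter you intend to use. The function name baseline_report is hypothetical. Only the Python 3.6 floor is tested, because the numpy and scipy floors are not stated precisely in this guide; their installed versions are simply reported.

```python
import importlib
import sys


def baseline_report(packages=("numpy", "scipy")):
    """Report the running Python version and the versions of `packages`.

    Hypothetical helper for illustration only; it is not a gnpy API.
    """
    report = {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        # The guide requires Python 3.6 or later.
        "python_ok": sys.version_info >= (3, 6),
    }
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is not installed in *this* interpreter.
            report[name] = None
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for key, value in baseline_report().items():
        print(key, value)
```

A value of None for numpy or scipy usually means the package is missing from the interpreter that ran the script, which may not be the interpreter you expected.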

We support & encourage software dependency management via:

Quick Start

  1. Join the Telecom Infra Project as a member: http://telecominfraproject.com/apply-for-membership/
  2. Join the OOPT mailing list: link needed
  3. Fork this repository & clone locally.
  4. Install as outlined in the docs
  5. Run tests to set baseline.
  6. Make code changes.
  7. Run tests & ensure all pass.
  8. Submit patches to Gerrit (or, if you cannot do that, submit a pull request via Github).

Setting Up Your Python Environment

We strongly recommend the use of Anaconda for setting up your Python environment. Anaconda is a Python distribution that includes up-to-date versions of all of the scientific computing libraries that are used in gnpy.

This will greatly simplify management of your development environment, especially if you are a Windows user.

To install Anaconda:

  1. Go to http://anaconda.com/downloads
  2. Download the Python 3 version for your operating system
  3. Follow the instructions to install Anaconda
  4. (NOTE: On Linux and macOS, the last step of the install process will ask if you want to add Anaconda to your $PATH. You probably do not want this.)
  5. When using Anaconda for application development on Windows: use the special Anaconda Shell in your Start Menu. When using Anaconda for application development on Linux or macOS, either source the activate script in anaconda/bin or always specify the full path to the python binary you want. (See below for more details.)

Notes on PATH, PYTHONPATH and Your Python Environment (Linux, macOS only)

On Linux and macOS, the location of command-line programmes is determined by an environment variable called $PATH.

$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

$PATH is a colon (:) separated list of directories. When you type a command at the prompt, your shell will look through these directories to find an executable file with that name. The which command indicates which executable file will be run when you type a command at the prompt.

For example:

$ which python # where is Python located on my computer?
/usr/bin/python
$ python -V # what version of Python are you?
Python 3.6.5

We can verify the above from within Python:

>>> from sys import executable # where is your executable located?
>>> executable
'/usr/bin/python'
>>> from sys import version_info # what is your version?
>>> version_info 
sys.version_info(major=3, minor=6, micro=5, releaselevel='final', serial=0)
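
The $PATH lookup described earlier can be sketched in a few lines of Python. This is a simplified illustration for Linux and macOS, not what your shell literally executes; the function name find_on_path is hypothetical. The standard library's shutil.which performs the same job and is shown for comparison.

```python
import os
import shutil
import sys


def find_on_path(command, path=None):
    """Return the first executable file named `command` on `path`, else None.

    `path` is a string of directories joined by os.pathsep (":" on Linux
    and macOS); it defaults to the current $PATH.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            # Simplification: empty entries are skipped here.
            continue
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    print(find_on_path("python"))  # our simplified lookup
    print(shutil.which("python"))  # the standard-library equivalent
    print(sys.executable)          # the interpreter actually running this script
```

If the last line printed differs from the first two, the script was started with a Python other than the one your shell would pick by default.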

If you have Anaconda installed, you can specify whether to use the Anaconda Python in two ways. Either:

  1. Specify the full path to the Python binary
  2. "Activate" the environment

Specifying the full path to the Python binary means typing /path/to/anaconda/bin/python instead of python:

$ which python # "system-level Python"
/usr/bin/python
$ python -V # an older version
Python 3.5.1
$ python -c 'from sys import version_info; print(version_info)'
sys.version_info(major=3, minor=5, micro=1, releaselevel='final', serial=0)
$ which /path/to/anaconda/bin/python # "application-development Python" <- this is the one you want
/path/to/anaconda/bin/python
$ /path/to/anaconda/bin/python -V # the latest version
Python 3.6.5
$ /path/to/anaconda/bin/python -c 'from sys import version_info; print(version_info)'
sys.version_info(major=3, minor=6, micro=5, releaselevel='final', serial=0)

Activating the Anaconda environment can save you some typing. At your shell, use source /path/to/anaconda/bin/activate to activate the environment. When you activate the environment, it makes your default Python the Anaconda Python for the duration of that shell.

$ which python # before
/usr/bin/python
$ source /path/to/anaconda/bin/activate # activate Anaconda Python
(base) $ which python # after
/path/to/anaconda/bin/python
(base) $ source /path/to/anaconda/bin/deactivate # deactivate Anaconda Python
$ which python # back to normal
/usr/bin/python

It is not unusual to have multiple versions of Python installed on your computer at the same time.

In fact, this is preferable. On operating systems such as Ubuntu Linux, the "system-level" Python installation is needed to run core functionality for the operating system, such as the package manager. Your application development should not interfere with this system-level operation. Your application development environment should be isolated. This is why we recommend the use of Anaconda: Anaconda installs an isolated application development environment onto your computer that allows you to use the most up-to-date versions of the libraries in a safe and predictable fashion.
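
To see which of several installed Pythons you are actually running, a small sketch like the following can help. The function name describe_interpreter is hypothetical. It assumes two conda conventions: an environment keeps a conda-meta directory under its prefix, and the activate script sets the CONDA_DEFAULT_ENV variable.

```python
import os
import sys


def describe_interpreter():
    """Summarise which Python is running and whether it looks like Anaconda."""
    prefix = sys.prefix
    return {
        "executable": sys.executable,
        "prefix": prefix,
        # Assumption: conda environments contain <prefix>/conda-meta.
        "looks_like_conda": os.path.isdir(os.path.join(prefix, "conda-meta")),
        # Assumption: set by the activate script, absent otherwise.
        "active_conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print("{}: {}".format(key, value))
```

Running this once with the system Python and once with /path/to/anaconda/bin/python should show two different prefixes.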

When importing a library in Python, Python checks its own library path. This is available from within a Python interpreter with the sys.path variable.

>>> from sys import path
>>> path
['', '/usr/lib/python36.zip', '/usr/lib/python3.6', '/usr/lib/python3.6/lib-dynload', '/usr/lib/python3.6/site-packages', '/usr/lib/python3.6/site-packages/linkgrammar']

According to the above, when we try to import a module like math or matplotlib, Python will search for that module in the locations listed above, in order. NOTE: the first entry in the path ('') means the current working directory.

When you install gnpy using python setup.py install, it will be installed to the site-packages/ directory. To verify that an installation was correct, you can do the following in your shell:

$ python -c 'import gnpy'

If you successfully installed gnpy using python setup.py install and you see an ImportError when trying the above, that typically means:

you have multiple installations of Python on your computer (e.g., one version for system software, one version for application development), and you did not install gnpy to the Python you're looking at currently.

Make sure you are installing gnpy to the same Python environment that you are using for development, testing, or production use.
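
One way to diagnose this, sketched below, is to ask the running interpreter where it would load a module from. The function name locate_module is hypothetical; importlib.util.find_spec is part of the standard library and returns None when a top-level module cannot be found.

```python
import importlib.util
import sys


def locate_module(name):
    """Return where the top-level module `name` would be imported from.

    Returns None if the running interpreter cannot find the module.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("gnpy found at:", locate_module("gnpy"))
```

If this prints None for gnpy, compare the interpreter path with the Python you used when running python setup.py install.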

While you are actively developing features, you may not want to rerun python setup.py install every time you make a change. In that case, you can use the PYTHONPATH environment variable to simplify your work.

Without PYTHONPATH:

>>> from sys import path
>>> path # Python will look in the install directories to find `gnpy`
['', '/path/to/anaconda/lib/python36.zip', '/path/to/anaconda/lib/python3.6', '/path/to/anaconda/lib/python3.6/lib-dynload', '/path/to/anaconda/lib/python3.6/site-packages']

Setting PYTHONPATH:

$ which python # check that we're using the right Python
/path/to/anaconda/bin/python
$ PYTHONPATH=/path/to/oopt-gnpy/gnpy python # tell Python to use the code from your git clone

Using PYTHONPATH:

>>> from sys import path
>>> path # Python will first look in the git clone directory for `gnpy`
['', '/path/to/oopt-gnpy/gnpy', '/path/to/anaconda/lib/python36.zip', '/path/to/anaconda/lib/python3.6', '/path/to/anaconda/lib/python3.6/lib-dynload', '/path/to/anaconda/lib/python3.6/site-packages']
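
The effect of PYTHONPATH can also be demonstrated end to end. The sketch below uses a throwaway directory as a stand-in for a git clone and a placeholder package named fake_gnpy_clone_pkg; both are hypothetical and unrelated to the real gnpy layout. A child Python is started with PYTHONPATH set, and it reports where the package was imported from.

```python
import os
import subprocess
import sys
import tempfile


def import_location_with_pythonpath(directory, module_name):
    """Start a child Python with PYTHONPATH=directory and return the
    file that `module_name` was imported from."""
    env = dict(os.environ, PYTHONPATH=directory)
    code = "import {0}; print({0}.__file__)".format(module_name)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


def demo():
    """Return True if the child Python imported the package from the stand-in clone."""
    with tempfile.TemporaryDirectory() as clone:
        package = os.path.join(clone, "fake_gnpy_clone_pkg")
        os.mkdir(package)
        with open(os.path.join(package, "__init__.py"), "w") as handle:
            handle.write("# stand-in package used only for this demo\n")
        location = import_location_with_pythonpath(clone, "fake_gnpy_clone_pkg")
        return os.path.realpath(location).startswith(os.path.realpath(clone))


if __name__ == "__main__":
    print(demo())
```

Note that PYTHONPATH points at the directory that contains the package, not at the package directory itself.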

Code Review & Pull Request Practices

Code contributors will not be allowed to push directly to this repo. Changes to this repo must be made via patches on GerritHub, or, if a contributor has trouble working with Gerrit, via a GitHub pull request.

The purpose of our repository is not to reflect the development process. Instead, it is to reflect how the software transitions from working state to working state while managing internal and external dependencies.

Pull requests must satisfy the following conditions:

  • to the best of your ability, must be rebased into logical commits (i.e., squash commits into those that isolate individual features or changes, rather than mirroring the development process itself; no "typo" commits).
  • where possible, must transition the repository from working state to working state.
  • if they affect existing functionality, must include amendments to all affected tests.
  • if they introduce new functionality, must include corresponding tests.
  • must include appropriate in-repo documentation.
  • must handle any merge conflicts (i.e., it is the responsibility of the party submitting the pull request to address merge conflicts.)
  • must comport with code quality standards listed below.
  • must comport with branching strategy detailed below.

The following users will have the authority to approve pull requests and tag releases:

The above set of code reviewers has been chosen to provide subject matter expertise in each area involved in gnpy (software development, physical simulation & modelling, and industry optical engineering.) The above set of code reviewers is subject to change.

Pull requests will be accepted via a consensus model, with agreement of all reviewing parties above (understanding that reviewing responsibilities will often be delegated for efficiency).

Git commit messages will be encouraged to follow the "seven rules of great Git commit messages":

  • Separate subject from body with a blank line
  • Limit the subject line to 50 characters
  • Capitalize the subject line
  • Do not end the subject line with a period
  • Use the imperative mood in the subject line
  • Wrap the body at 72 characters
  • Use the body to explain what and why vs. how
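
Several of these rules can be checked mechanically. The sketch below is a hypothetical helper, not a tool used by this project. It covers the blank line, subject length, capitalization, trailing period, and body wrapping rules. The imperative mood and the "what and why" rule still need a human reviewer.

```python
def check_commit_message(message):
    """Return a list of rule violations found in a commit message."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["Message has no subject line"]

    problems = []
    subject = lines[0]
    # A body, if present, must be separated from the subject by a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("Separate subject from body with a blank line")
    if len(subject) > 50:
        problems.append("Limit the subject line to 50 characters")
    if not subject[0].isupper():
        problems.append("Capitalize the subject line")
    if subject.endswith("."):
        problems.append("Do not end the subject line with a period")
    if any(len(line) > 72 for line in lines[1:]):
        problems.append("Wrap the body at 72 characters")
    return problems


if __name__ == "__main__":
    print(check_commit_message("fixed typo.\nmore text"))
```

An empty list means none of the mechanical rules were violated.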

Reference:

Branching (gitflow) & Tagging (semver)

This repo will follow a simplified gitflow branching model.

To outline:

  • Anything in the master branch must be deployable.
  • To work on a new feature, create a branch with a descriptive name off of master.
  • Commit to that branch locally, pushing your work to a named branch on Github.
  • Open a pull request when ready for merging.
  • After code review and signoff, code will be merged into master.
  • Once merged, "deploy" immediately.

References:

When code in this repo reaches a v1.0 state, we will begin tagging releases. We will tag and version releases in accordance with the semantic versioning model:

To outline:

  • Versions will have three parts: MAJOR.MINOR.PATCH
  • Bumps in MAJOR will indicate backwards-incompatible, usually API-level, changes.
  • Bumps in MINOR will indicate added functionality that is backwards-compatible.
  • Bumps in PATCH will indicate backwards-compatible bug fixes.
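
The version arithmetic implied by this outline can be sketched as follows. The function name bump is hypothetical, and the sketch handles only plain MAJOR.MINOR.PATCH strings (no pre-release or build suffixes). Lower-order parts reset to zero when a higher-order part is bumped.

```python
def bump(version, part):
    """Return `version` with the given part ('major', 'minor', 'patch') bumped."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Backwards-incompatible change: reset MINOR and PATCH.
        return "{}.0.0".format(major + 1)
    if part == "minor":
        # Backwards-compatible new functionality: reset PATCH.
        return "{}.{}.0".format(major, minor + 1)
    if part == "patch":
        # Backwards-compatible bug fix.
        return "{}.{}.{}".format(major, minor, patch + 1)
    raise ValueError("part must be 'major', 'minor', or 'patch'")


if __name__ == "__main__":
    print(bump("1.4.2", "minor"))
```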

References:

Code Quality: Style Guide

In order to maintain consistent code quality throughout the gnpy codebase, we adopt the following practices:

  • we will rely on Python's PEP-8 as our baseline for code style.
  • we will not mandate specific linting plugins (e.g., flake8) which often introduce busywork.
  • we will not overfocus on style as part of code review: we will encourage a consistent and standard style but we will not reject code on the sole basis of style (except in exceptional circumstances).
  • we will enforce style standards on the basis of locality: code that sits in the same file MUST follow the same code style, code that resides in the same package SHOULD follow the same code style, standalone code is ENCOURAGED to follow a consistent style guide.

We will strongly enforce the following guidelines:

  • spaces must be used for indentation & alignment.
  • code files must be in UTF-8 encoding.
  • lines must be less than 120 characters in length.
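
These three guidelines lend themselves to an automated check. The following is a minimal sketch under stated assumptions, not a tool this project ships: check_source is a hypothetical name, and only leading whitespace is inspected, so tabs used for alignment in the middle of a line are not detected.

```python
def check_source(data, max_length=120):
    """Check raw file contents (bytes) against the strongly enforced guidelines."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]

    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace only: everything before the first non-blank character.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            problems.append("line {}: tab used for indentation".format(number))
        # Lines must be strictly shorter than max_length characters.
        if len(line) >= max_length:
            problems.append(
                "line {}: {} characters (must be less than {})".format(
                    number, len(line), max_length
                )
            )
    return problems


if __name__ == "__main__":
    with open(__file__, "rb") as handle:
        print(check_source(handle.read()))
```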

We will provide additional guidance on numeric-computing-specific guidelines. For example:

  • we will allow both from numpy import arange and import numpy as np-style imports. (The former is preferred in the case of tight loops.)

References:

Code Quality: Code Complexity

Our goal is to produce software that can be understood by a sufficiently motivated optical engineer with a reasonable coding background.

We will make use of Python, numpy, and other language and library features insofar as they allow us to accomplish our goal. We will allow use of features or mechanisms that would not be immediately understandable to our audience, only with adequate provision of testing and documentation, and only in cases where this complexity is well-circumscribed.

Testing

The intention of testing is to help developers produce robust code within a "tight" iteration cycle.

It is through automated testing that new code contributors can be confident in their work.

We will adopt the following forms of testing:

  • doctests for "explanatory" testing
  • pytest for full-scale "unit" and "integration" testing
  • TODO: custom tests for "validation"

Further guidelines will be available via the code-review process and will be detailed here.
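
To illustrate the first two forms of testing, here is a minimal sketch. The function db_to_linear is a hypothetical example written for this guide, not taken from the gnpy codebase. Its docstring carries doctests that double as explanation, and the test_ function follows the naming convention that pytest collects automatically.

```python
import doctest


def db_to_linear(value_db):
    """Convert a power ratio from decibels to linear units.

    >>> db_to_linear(0)
    1.0
    >>> db_to_linear(10)
    10.0
    >>> db_to_linear(20)
    100.0
    """
    return 10 ** (value_db / 10)


def test_db_sum_is_linear_product():
    # Adding gains in dB corresponds to multiplying them in linear units:
    # 3 dB + 7 dB = 10 dB.
    product = db_to_linear(3) * db_to_linear(7)
    assert abs(product - db_to_linear(10)) < 1e-9


if __name__ == "__main__":
    # Runs the docstring examples; pytest would pick up test_* functions itself.
    print(doctest.testmod())
```

The doctests show a reader how the function behaves on easy inputs, while the pytest-style function checks a property that would be awkward to present as an interactive session.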

Reference:

Continuous Integration: We will use Travis-CI for continuous integration. Commits that "break the build" will be the responsibility of the author to resolve.

Documentation

We will adopt Sphinx for documentation: http://www.sphinx-doc.org/en/stable/

We will provide three forms of documentation:

  • ad-hoc text or Sphinx-structured text in function, class, and module docstrings for API-level documentation
  • Sphinx-structured text in a docs/ folder in the repository for supplementary (theoretical, architectural, or non-specific-code-related) documentation
  • pages in the Github wiki (such as this one)

Further guidelines will be available via the code-review process and will be detailed here.
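
As an illustration of Sphinx-structured text in a docstring, the sketch below uses reStructuredText field lists (:param:, :returns:, :raises:). The function span_loss_db and its default attenuation value are hypothetical and chosen only to show the layout; they are not part of gnpy.

```python
def span_loss_db(length_km, attenuation_db_per_km=0.2):
    """Return the total attenuation of a fibre span in dB.

    :param float length_km: span length in kilometres
    :param float attenuation_db_per_km: fibre attenuation coefficient in
        dB/km (the default is an illustrative value only)
    :returns: total span loss in dB
    :rtype: float
    :raises ValueError: if ``length_km`` is negative
    """
    if length_km < 0:
        raise ValueError("length_km must be non-negative")
    return length_km * attenuation_db_per_km


if __name__ == "__main__":
    print(span_loss_db(100, 0.25))
    print(span_loss_db.__doc__)
```

Sphinx's autodoc extension can render such field lists into structured parameter tables, while the docstring stays readable as plain text.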

Reference:

General Python Reference

For code contributors new to Python, there are many resources to quickly get up to speed.

For books, we recommend anything by David Beazley, including:

For online tutorials and videos, we recommend checking out https://pyvideo.org, especially any talks from a PyData or SciPy conference.

Below is a sampling of relevant talks that you can find on PyVideo: