Python Packages and Programming Environments - rmsouza01/AI2-Lab GitHub Wiki

Python Packages and Programming Environments

Python comes preinstalled on many operating systems (OS), such as most Linux distributions and some versions of macOS. You can also install it using software distributions like Anaconda and Miniconda. There are other software distribution options, but Anaconda and Miniconda are very popular, especially in the data science community. Anaconda comes with over 150 data science packages, whereas Miniconda is a minimal installer that includes only conda, Python, and the small set of packages they depend on, so you install any additional libraries yourself.

Managing Python Packages

There are many freely available Python packages that are useful across different domains. In this course, we focus mostly on the packages and functionalities that come natively with Python, but we will also discuss the Numerical Python (NumPy) library, which is used to process multi-dimensional arrays efficiently, and matplotlib, which is used for creating static, animated, and interactive visualizations.

There are two main tools for managing Python packages: pip and conda (the package manager, not to be confused with the Anaconda and Miniconda distributions). Installing a package is as simple as opening a terminal window and typing:

rmsouza@bar:~$ pip install <package name>

or

rmsouza@bar:~$ conda install <package name>

By default, pip installs Python packages from the Python Package Index (PyPI). conda, on the other hand, can install not just Python packages but also Python itself, executable files, and other third-party software. conda searches its configured channels for the software it is going to install: the default channels maintained by Anaconda and, if you add it, conda-forge, a community-maintained channel. These channels offer fewer third-party Python packages than PyPI, but all the main ones should be available there.
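Whichever tool installed a package, you can check from Python whether it is present in the current environment. The sketch below uses the standard-library importlib.metadata module (Python 3.8 or newer). The helper name installed_version is an illustrative choice, and the lookup assumes the package ships standard metadata, which is normally the case for packages installed with pip or conda.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# The package names below are examples taken from this page.
for name in ("pip", "numpy", "matplotlib"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

If a package is reported as not installed, installing it with pip or conda as shown above and running the script again should print its version.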

Virtual Environments

It is very common to work on multiple Python projects with different package requirements. For example, project A requires package X version 1.0, while project B requires the same package but version 2.0. A virtual environment solves this problem: it is an isolated Python environment with its own set of installed packages, so the dependencies required by different projects stay separate.

Creating a new virtual environment for every Python project you work on is generally recommended, so that the dependencies of each project are isolated from the system (i.e., not installed in the global Python installation that comes with the operating system) and from each other. With pip, you can install a package named virtualenv, which is a tool to create isolated Python environments. virtualenv creates a folder that contains all the executables needed to use the packages that a Python project requires. conda has a built-in command for creating virtual environments.
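As a minimal sketch of the same idea, the example below uses venv, a module from the Python standard library that works much like virtualenv (it is a substitute for illustration, not the tool named above). The folder name project_a_env and both helper functions are hypothetical. The environment is created in a temporary directory and without pip so that the example runs quickly and offline.

```python
import sys
import tempfile
import venv
from pathlib import Path


def in_virtual_environment() -> bool:
    """True when the running interpreter belongs to a venv or virtualenv environment.

    Note: this check does not detect conda environments.
    """
    return sys.prefix != sys.base_prefix


def create_environment(env_dir: Path) -> Path:
    """Create an isolated environment in env_dir and return its config file path."""
    # with_pip=False skips installing pip into the new environment.
    venv.create(env_dir, with_pip=False)
    return env_dir / "pyvenv.cfg"


with tempfile.TemporaryDirectory() as tmp:
    config_file = create_environment(Path(tmp) / "project_a_env")
    print("Environment created:", config_file.exists())

print("Running inside a virtual environment:", in_virtual_environment())
```

In day-to-day work you would more likely create the environment once from the terminal and activate it before installing the packages for that project.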

Programming Environments and Extensions

Integrated Development Environment (IDE)

An integrated development environment (IDE) is a software application that provides a place for computer programmers to develop software efficiently. An IDE usually has at least a source code editor and a debugger. IDEs often allow you to install extensions to make your life as a programmer easier.

For example, in this class, you are learning Python programming, and you are programming on your own local computer. You can install extensions that let you code in other programming languages. You may also want to run your code on a remote server (i.e., the cloud); for that purpose, you can install an SSH (secure shell) extension to access the remote server from your locally installed IDE. There are extensions that help you format your code according to the style guide for Python code (PEP 8), extensions that help you find Python syntax mistakes, and extensions that auto-complete your code as you type so that you work more efficiently. Finally, another important extension is the Git extension, which allows you to manage and keep track of your source code history. GitHub is a popular, cloud-based hosting service that lets you manage Git repositories.

All these extensions may seem overwhelming at first. For this course, it is good for you to know that they exist, but our focus will be on learning the Python programming language. Proficiency with these extensions, which can increase the quality of your code and help you become a better programmer, will come with time and experience.
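To give a feel for what a syntax-checking extension does behind the scenes, here is a minimal sketch that uses Python's built-in ast module to parse source code without running it. The function name find_syntax_error and the two sample snippets are illustrative assumptions; real extensions perform many more checks than this.

```python
import ast
from typing import Optional


def find_syntax_error(source: str) -> Optional[str]:
    """Return a short description of the first syntax error, or None if the code parses."""
    try:
        ast.parse(source)
    except SyntaxError as error:
        return f"line {error.lineno}: {error.msg}"
    return None


good = "total = sum([1, 2, 3])\nprint(total)\n"
bad = "def greet(name)\n    print(name)\n"  # the colon after (name) is missing

print(find_syntax_error(good))  # None, because the code parses correctly
print(find_syntax_error(bad))   # reports a problem on line 1
```

The exact wording of the error message depends on the Python version, which is why only the line number is mentioned in the comment.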

There are many IDEs that you can use for Python programming. We recommend the following two options:

  • PyCharm: A powerful, multiplatform IDE designed for coding in Python. As a UCalgary student, you can get a free license!
  • Visual Studio Code: Another powerful, multiplatform IDE with a rich set of extensions, including a Python extension. It is recommended if you will be dealing with projects across multiple programming languages.

PyCharm and Visual Studio Code have very similar graphical interfaces.

[Figure: PyCharm graphical interface]

[Figure: Visual Studio Code graphical interface]

Jupyter Notebooks

Jupyter Notebook is an open-source web application that you can use to create and share documents that contain live code (Python in particular), equations, visualizations, and text. It is often used for educational purposes because it supports interactive computing and lets you alternate between text and code.
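A notebook document is stored as a JSON file in which text cells and code cells alternate. The sketch below builds a tiny notebook structure by hand to illustrate this, assuming the version 4 notebook file layout. The helper functions are hypothetical, and in practice the Jupyter interface creates and edits these files for you.

```python
import json


def markdown_cell(text: str) -> dict:
    """A cell that holds formatted text (and equations)."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(code: str) -> dict:
    """A cell that holds code; outputs are filled in when the cell is run."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": code,
    }


def make_notebook(cells: list) -> dict:
    """Wrap a list of cells in the top-level notebook structure."""
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


notebook = make_notebook([
    markdown_cell("# My first notebook\nThe next cell adds two numbers."),
    code_cell("print(1 + 2)"),
])

# Show the JSON text that would be saved in a .ipynb file.
print(json.dumps(notebook, indent=1))
```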

Jupyter Notebook can be easily installed using either pip or conda. The Jupyter Notebook graphical interface is shown below.

[Figure: Jupyter Notebook graphical interface]

If you have a Google account (e.g., Gmail), you have free access to Google Colab. Google Colab offers functionality similar to Jupyter Notebook, but it is cloud-based, whereas Jupyter Notebook typically runs on your own computer. This means that if you work in Google Colab, you do not have to worry about downloading and installing anything on your computer. All the Python packages you need for this course are already installed there, so if you do not want to worry about installing Python, virtual environments, IDEs, and managing packages, Google Colab is a great free resource to start learning how to code in Python!

Summary

We covered different ways to install Python (e.g., Anaconda and Miniconda) and manage Python packages (pip and conda). We explained that setting up a separate virtual environment for each project is a good way to manage different package versions across your projects. We also covered some IDE options that allow you to install extensions to improve your Python code and increase your productivity.

Since you are in your first year and this might be your first programming course, starting with the Jupyter Notebook environment, either installed on your computer or available online through Google Colab, is a good option. After getting some experience with Python, you can work your way up to using more sophisticated IDEs and extensions.
