Python - kamialie/knowledge_corner GitHub Wiki

Modules

Global variables

The __name__ global variable is set to '__main__' when the module is run directly from the command line. To keep code from running on import while still running it when the module is invoked directly, put the function call inside a test block:

def function():
	pass

if __name__ == '__main__':
	function()

Text

Strings

Format specification mini-language

Encryption

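Built-in hashlib module (one-way hashing rather than encryption)

import hashlib

secret = "password"
binary_secret = secret.encode()  # hash functions operate on bytes
m = hashlib.md5(binary_secret)   # MD5 is shown for illustration; it is not suitable for protecting passwords

m.digest()     # raw bytes
m.hexdigest()  # hexadecimal string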

cryptography third-party tool

$ pip install cryptography

Fernet is a symmetric encryption recipe built on AES (data is encrypted and decrypted with the same key)

from cryptography.fernet import Fernet

# the key is bytes; if it needs to be saved to a file, open the file in binary mode
key = Fernet.generate_key()

f = Fernet(key)
message = b"secret message"
encrypted = f.encrypt(message)
decrypted = f.decrypt(encrypted)

RSA is a popular asymmetric encryption algorithm (the public key encrypts data, the private key decrypts it)

from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.asymmetric import rsa

private_key = rsa.generate_private_key(public_exponent=65537,
										key_size=4096,
										backend=default_backend())

from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.hazmat.primitives import hashes

# the public key is derived from the private key and can be shared
public_key = private_key.public_key()

message = b"secret message"
# named oaep to avoid shadowing the imported padding module
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
					algorithm=hashes.SHA256(),
					label=None)
encrypted = public_key.encrypt(message, oaep)
decrypted = private_key.decrypt(encrypted, oaep)

Regex

Parse CLF

import re

line = '127.0.0.1 - rj [13/Nov/2019:14:43:30] "GET HTTP/1.0" 200'

pattern = r'(?P<IP>\d+\.\d+\.\d+\.\d+)'
pattern += r' - (?P<User>\w+) '
pattern += r'\[(?P<Time>\d\d/\w{3}/\d{4}:\d{2}:\d{2}:\d{2})\]'
pattern += r' (?P<Request>".+")'

matched = re.search(pattern, line)

matched.group('IP')
matched.group('User')
matched.group('Time')
matched.group('Request')

# access_log is assumed to be a string holding the whole log
matched = re.finditer(pattern, access_log)
for m in matched:
	print(m.group('IP'))

Output

The pprint function pretty-prints nested Python objects

from pprint import pprint

pprint(object)

Filesystem

Handling files

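fh = open("file_path", 'r')

all_text = fh.read()
fh.seek(0)  # read() leaves the position at the end of the file; rewind before reading again
list_of_lines = fh.readlines()
fh.close()

# the with statement closes the file automatically
with open("other_path"):
	pass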

Dealing with big files

# process line by line
# automatically deals with different new line characters in OSs
for line in fh:
	pass  # process line

# read by chunks, if binary file (opened with mode 'rb')
chunk = fh.read(1024)

JSON

import json

with open("file.json") as fh:
	data = json.load(fh)  # named data to avoid shadowing the built-in object

with open("result.json", "w") as fh:
	updated_object = {'nested': {'list_of_numbers': [1, 2, 3], 'key': 'value'}}
	json.dump(updated_object, fh)  # writes to the file and returns None

YAML

The most commonly used library for YAML files is PyYAML (install via pip)

import yaml

with open("playbook.yaml") as fh:
	playbook = yaml.safe_load(fh)

with open("result.yaml", "w") as fh:
	# updated_object = ...
	yaml.dump(updated_object, fh)

CSV

import csv

with open("file.csv") as fh:
	off_reader = csv.reader(fh, delimiter=",")
	for _ in range(5):
		print(next(off_reader))

Pandas

import pandas as pd

df = pd.read_csv('sample.csv')

# Get statistical insight
df.describe()

# Show top 3 rows
df.head(3)

# Show single column
df['column']

CLI

When creating a command line tool, insert the following line at the top of the script to avoid calling python explicitly:

#!/usr/bin/env python

Popular Python argument parser modules - argparse (standard library), click and python-fire.

os

Dealing with files:

import os

os.listdir('.')
os.rename('old_name', 'new_name')
os.chmod('file_path', mode)  # mode is an integer, e.g. 0o644
os.mkdir('/path/to/dir')
os.makedirs('/path/to/dir') # recursively
os.remove('file_path')
os.rmdir('/path/to/dir')
os.removedirs('/path/to/dir') # recursively, also removes empty parent directories
os.stat('file_path')

OS related methods:

# Get the current working directory
os.getcwd()

# Change current directory
os.chdir('/tmp')

# Get or set environment variable
os.environ.get('LOGLEVEL')
os.environ['LOGLEVEL'] = 'DEBUG'

# Get current user
os.getlogin()

sys

import sys

# Little or big endian
sys.byteorder

sys.platform

# Execute actions depending on the Python version
if sys.version_info.major < 3:
    print("You need to update your Python version")
elif sys.version_info.minor < 7:
    print("You are not running the latest version of Python")
else:
    print("All is good.")

Simplest way to process arguments from cli - sys.argv:

import sys

print('first argument (script name) -', sys.argv[0])
print('second argument -', sys.argv[1])

subprocess

Used to run applications outside Python: command line tools, Bash scripts, etc. subprocess.run() returns a CompletedProcess instance once the process completes. Pass check=True to raise an exception (CalledProcessError) if the subprocess exits with a non-zero status.

import subprocess

cp = subprocess.run(['ls', '-l'], capture_output=True, universal_newlines=True)
print(cp.stdout)
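
As a small sketch of the check=True behaviour described above: the child commands here are hypothetical and use the current interpreter (sys.executable) only so that the example does not depend on any external tool being installed.

```python
import subprocess
import sys

# A child process that succeeds: output is captured as text
ok = subprocess.run(
    [sys.executable, '-c', 'print("hi")'],
    capture_output=True, text=True, check=True,
)

# A child process that exits with status 3: check=True turns it into an exception
try:
    subprocess.run([sys.executable, '-c', 'import sys; sys.exit(3)'], check=True)
except subprocess.CalledProcessError as error:
    failed_code = error.returncode
```

text=True is the newer alias of universal_newlines=True; either one makes stdout a str instead of bytes.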

argparse

Part of Standard Library.

import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Echo your input')
    parser.add_argument('message', help='Message to echo')
    parser.add_argument('--twice', '-t', help='Do it twice', action='store_true')

    args = parser.parse_args()

    print(args.message)
    if args.twice:
        print(args.message)
  • the description parameter of the constructor is displayed with the help message
  • if the name begins with a dash, -, it is treated as an optional argument, otherwise as a positional argument; parsed values are saved as attributes of the object returned by parse_args()
  • action='store_true' stores the optional argument as a boolean

It is also possible to create a hierarchy of commands like git cli has.
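
One way such a hierarchy might look is sketched below with argparse subparsers. The program name tool and the greet and add subcommands are made up for illustration.

```python
import argparse

def build_parser():
    # Top-level parser; 'tool' is a hypothetical program name
    parser = argparse.ArgumentParser(prog='tool', description='Demo of nested commands')
    subparsers = parser.add_subparsers(dest='command', required=True)

    # tool greet NAME [--shout]
    greet = subparsers.add_parser('greet', help='Print a greeting')
    greet.add_argument('name', help='Who to greet')
    greet.add_argument('--shout', action='store_true', help='Upper-case the output')

    # tool add X Y
    add = subparsers.add_parser('add', help='Add two integers')
    add.add_argument('x', type=int)
    add.add_argument('y', type=int)
    return parser

def run(argv):
    # The chosen subcommand name is stored in args.command (see dest above)
    args = build_parser().parse_args(argv)
    if args.command == 'greet':
        text = f'Hello {args.name}'
        return text.upper() if args.shout else text
    return str(args.x + args.y)

print(run(['greet', 'Tammy', '--shout']))
print(run(['add', '2', '3']))
```

Passing a list to parse_args() instead of relying on sys.argv keeps the sketch easy to try from an interpreter.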

click

Documentation page. Install from PyPI - pip install click. Only --help option is available by default.

import click

@click.command()
@click.option('--greeting', default='Hiya', help='How do you want to greet?')
@click.option('--name', default='Tammy', help='Who do you want to greet?')
def greet(greeting, name):
	print(f"{greeting} {name}")

if __name__ == '__main__':
	greet()
  • click.command indicates that a function should be exposed as a command-line command
  • click.option adds an option and automatically links it to the function parameter of the same name

fire

GitHub page

import fire

def greet(greeting='Hiya', name='Tammy'):
    print(f"{greeting} {name}")

if __name__ == '__main__':
    fire.Fire(greet)

Can enter an interactive mode making all functions and objects available - can be used for debugging and introducing yourself to new code:

$ ./script.py -- --interactive

OOP

Everything in Python is an object (even classes).

The other way to access attributes (except dot notation) is using getattr() and setattr() functions. Information about an object's class is contained in __class__.
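
A minimal sketch of attribute access by name; the Server class and its host and port attributes are hypothetical.

```python
class Server:
    def __init__(self, host):
        self.host = host

srv = Server('localhost')

# Same as srv.host, but the attribute name is a string
host = getattr(srv, 'host')

# The third argument is returned when the attribute is missing
port = getattr(srv, 'port', 8080)

# Same as srv.port = 9090
setattr(srv, 'port', 9090)

# __class__ points at the class that created the instance
class_name = srv.__class__.__name__
```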

type is the metaclass that every class is an instance of; therefore, if the type of a class is requested, type is returned. In all other cases type() returns the class that was used to instantiate the object (when called with three arguments, type() returns a new type object instead). The type of an object is shown together with the module where its class resides - <class '__main__.Class'>.

Class variables are specified in the class definition. Instance variables are specified using the self reference or the instance itself. An instance's dictionary (__dict__) contains all of its own contents (it doesn't include class variables). A class's dictionary includes its variables. Defining an instance variable with the same name as a class variable shadows the latter when accessed through that instance; the class variable remains reachable through the class.
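
The difference between the two dictionaries can be sketched as follows; Counter, step and total are hypothetical names.

```python
class Counter:
    step = 1                # class variable, stored in Counter.__dict__

    def __init__(self):
        self.total = 0      # instance variable, stored in the instance's __dict__

c = Counter()
keys_before = sorted(c.__dict__)   # only instance variables are listed

# Assigning through the instance creates a new instance variable
# that hides the class variable for this instance only
c.step = 5
keys_after = sorted(c.__dict__)
```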

isinstance function can be used to determine if a given object is an instance of a class.

super() returns a proxy to the parent class. It can be used to call the parent's constructor - super().__init__().

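Method Resolution Order (MRO) - when a method is called, it is first searched in the current class, then in its base classes from left to right. Python 3 uses C3 linearization, so a base class shared by several parents is searched only after all of those parents. The first method definition found is executed.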
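
A small diamond hierarchy shows the resolution order in practice. The class names are hypothetical; the order itself can be inspected through the __mro__ attribute.

```python
class Base:
    def who(self):
        return 'Base'

class Left(Base):
    pass

class Right(Base):
    def who(self):
        return 'Right'

class Child(Left, Right):
    pass

# Child -> Left -> Right -> Base -> object
mro_names = [cls.__name__ for cls in Child.__mro__]

# Left does not define who(), so Right is consulted before the shared Base
answer = Child().who()
```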

Special attributes:

  • __name__ (classes) - name of class
  • __class__ (classes and instances) - info about the class to which the instance belongs (can also be obtained with the type() function)
  • __bases__ (classes) - tuple that contains info about the base classes of a class
  • __dict__ (classes and instances) - dictionary (or other type of mapping object) with the object's attributes

Object

id() function returns an integer (identity) that is guaranteed to be unique and constant throughout the object's lifetime (in CPython it is the address of the object in memory; do not rely on that).

is operator is used to check whether both labels, or variables, refer to the same object, while == checks value equality.
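
A short sketch of the difference, using hypothetical variable names:

```python
a = [1, 2, 3]
b = a          # second label for the same object
c = list(a)    # new object with equal contents

same_object = a is b        # identity: both names refer to one object
equal_value = a == c        # equality: contents match
distinct_object = a is c    # but c is a different object
```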

Make a distinct copy of a compound object (list, dictionary, custom object):

import copy

a = [1, "two", [3, 4]]

# shallow copy - only 1 level deep
b = a[:]

# deep copy (recursive)
c = copy.deepcopy(a)

The copy() function from the copy module can be used for universal shallow copying.

Serialization

The pickle module is Python's implementation of data serialization (it can not be used to exchange data with programs written in other languages - use JSON or XML for that). It is not secured against erroneous or maliciously constructed data (don't deserialize data from an untrusted source).

dump() function expects an object to serialize and a file handle in binary mode. load() function reads the file as a stream and loads objects in the same order they were dumped.

import pickle

a_dict = {'one': 1, 'two': 2}
a_list = ['a', 123, [10, 100, 1000]]

with open('data.pckl', 'wb') as fh:
    pickle.dump(a_dict, fh)
    pickle.dump(a_list, fh)

with open('data.pckl', 'rb') as fh:
    data1 = pickle.load(fh)
    data2 = pickle.load(fh)

print(type(data1))
print(data1)
print(type(data2))
print(data2)

dumps() and loads() functions are used when transmitting data over network or to database:

import pickle

a_list = ['a', 123, [10, 100, 1000]]
payload = pickle.dumps(a_list)  # named payload to avoid shadowing the built-in bytes
# 'payload' (a bytes object) can be passed to appropriate driver

# deserialize received bytes object
b_list = pickle.loads(payload)
print('A type of deserialized object:', type(b_list))
print('Contents:', b_list)

Classes and functions are serialized only by their names - no attributes or definition is included. That means environment, where class or function is deserialized, must know about that class or function.


shelve module is built on top of pickle and is used for organizing serialized data. Each object is associated with a key, which must be of type string (underlying dbm requires it). Data is persisted in a file-based database. Shelve object acts similar to a dictionary (len(), in, keys(), items(), update, del). Changes are placed to a buffer and periodically flushed to disk; call sync() method to enforce flushing, close() also flushes the buffer.

import shelve

shelve_name = 'new.shlv'

my_shelve = shelve.open(shelve_name, flag='c')
my_shelve['EUR'] = {'code':'Euro', 'symbol': '€'}
my_shelve['GBP'] = {'code':'Pounds sterling', 'symbol': '£'}
my_shelve.close()

new_shelve = shelve.open(shelve_name)
print(new_shelve['EUR'])
new_shelve.close()

shelve.open() modes (flag):

  • r - open existing shelve, read only
  • w - open existing shelve, read and write
  • c - (default) read and write, create if it doesn't exist
  • n - always create new, read and write

Methods

Class method

A class method refers to the class itself, not an instance of the class. The first argument is cls (by convention), which gives access to class methods and attributes.

class Example:
	__internal_counter = 0

	def __init__(self, value):
		Example.__internal_counter += 1

	@classmethod
	def get_internal(cls):
		return f'# of objects created: {cls.__internal_counter}'

Class method can also be used as an alternative constructor to handle more or different parameters.

class Car:
    def __init__(self, vin):
        print('Ordinary __init__ was called for', vin)
        self.vin = vin
        self.brand = ''

    @classmethod
    def including_brand(cls, vin, brand):
        print('Class method was called')
        _car = cls(vin)
        _car.brand = brand
        return _car

car1 = Car('ABCD1234')
car2 = Car.including_brand('DEF567', 'NewBrand')

Static method

Static method does not require nor expect a parameter indicating class object or class itself. Used as a utility method, when a particular function fits the class context. Can not access nor change the state of object or class.

class Bank_Account:
    def __init__(self, iban):
        print('__init__ called')
        self.iban = iban

    @staticmethod
    def validate(iban):
        if len(iban) == 20:
            return True
        else:
            return False

Magic methods

Operators and function calls are translated to special methods that the class must define. Use dir() and help() functions to discover available magic methods.

On binary operations (such as +) the left operand's method is called, while the right operand becomes the argument of the call.
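
The translation of + into a method call can be sketched with a hypothetical Money class. The reflected method __radd__ is an addition to what the text above covers: Python falls back to it when the left operand's __add__ does not handle the right operand.

```python
class Money:
    def __init__(self, amount):
        self.amount = amount

    def __add__(self, other):
        # Called for: self + other
        if isinstance(other, Money):
            return Money(self.amount + other.amount)
        if isinstance(other, (int, float)):
            return Money(self.amount + other)
        return NotImplemented

    def __radd__(self, other):
        # Called for: other + self, when other does not know how to add Money
        return self.__add__(other)

    def __eq__(self, other):
        # Called for: self == other
        return isinstance(other, Money) and self.amount == other.amount

    def __repr__(self):
        # Called by repr() and by the interactive prompt
        return f'Money({self.amount})'

total = Money(2) + Money(3)
shifted = 1 + Money(2)
```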


Extended function arguments

*args - refers to tuple of additional, not explicitly expected positional arguments; in other words, collects all unmatched positional arguments

**kwargs (key word arguments) - refers to a dictionary of all unexpected arguments that were passed in the form of key=value pairs

To pass extended arguments to the next function, they should be unpacked:

def fun1(*args, **kwargs):
	pass

def fun2(a, b, *args, **kwargs):
	fun1(*args, **kwargs)

Proper order of all types of function arguments:

def fun(a, b, *args, c=20, **kwargs):
	pass
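
To see what ends up in each kind of parameter, a hypothetical describe function with the same signature can simply return everything it received:

```python
def describe(a, b, *args, c=20, **kwargs):
    # args is a tuple of extra positional arguments,
    # kwargs is a dict of extra keyword arguments
    return {'a': a, 'b': b, 'args': args, 'c': c, 'kwargs': kwargs}

result = describe(1, 2, 3, 4, c=5, d=6)
defaults = describe(1, 2)
```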

Decorator

The basic principle is wrapping the original function with a new decorating function (or class). The original function is passed as a parameter to the decorating function. The decorator returns a function that can be called later. Python is able to decorate functions, methods, and classes.

Can be used to perform operations before or after the wrapped object, or to prevent its execution altogether.

Once a function is decorated, its name no longer refers to the original function, but to the object that the decorator returns.

def simple_hello():
    print("Hello from simple function!")

def simple_decorator(function):
    print('We are about to call "{}"'.format(function.__name__))
    return function

@simple_decorator
def simple_hello_d():
	print('Hello')

decorated = simple_decorator(simple_hello)
decorated()

simple_hello_d()

Python allows multiple decorators to be applied to a single object. Execution order is similar to a stack: outer decorator is called first, which then calls inner decorator, which calls the original function. Once original function is finished, inner decorator gets control, and once finished, hands it to the outer decorator.
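
The stack-like order can be observed with a sketch like the one below. The tag decorator factory and the calls list are hypothetical helpers that record the order of events.

```python
import functools

calls = []

def tag(label):
    # Decorator factory: returns a decorator that records when it runs
    def decorator(function):
        @functools.wraps(function)   # keep the original name and docstring
        def wrapper(*args, **kwargs):
            calls.append(f'{label} before')
            result = function(*args, **kwargs)
            calls.append(f'{label} after')
            return result
        return wrapper
    return decorator

@tag('outer')
@tag('inner')
def hello():
    calls.append('hello')
    return 'done'

value = hello()
```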

A decorator can be a class - since a decorator must be callable, decorator class must implement __call__ method. When decorator class is called with an argument, it is passed to class's __init__ method, while decorated function is passed to __call__.
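
A minimal sketch of a decorator class that takes an argument; Repeat and ping are hypothetical names.

```python
class Repeat:
    def __init__(self, times):
        # Receives the decorator argument: @Repeat(3)
        self.times = times

    def __call__(self, function):
        # Receives the decorated function and returns its replacement
        def wrapper(*args, **kwargs):
            return [function(*args, **kwargs) for _ in range(self.times)]
        return wrapper

@Repeat(3)
def ping():
    return 'pong'

replies = ping()
```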

A class can be decorated as well - object creation will go through the decorator and will be extended in some way. Just like with decorated functions, the original class is no longer available under its name - the decorator creates and returns an object.
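
One possible shape of a class decorator is sketched below, assuming a hypothetical track_instances decorator that records every object created. After decoration the name Point refers to the factory function, not to the class.

```python
def track_instances(cls):
    created = []

    def factory(*args, **kwargs):
        # Object creation goes through the decorator
        obj = cls(*args, **kwargs)
        created.append(obj)
        return obj

    factory.created = created   # expose the registry on the returned object
    return factory

@track_instances
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
```

A decorator may also modify the class and return it unchanged in identity; the wrapping variant is shown here because it matches the behaviour described above.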

Abstract class

Acts as a blueprint for other classes - sets required methods that must be implemented (abstract methods). It can also contain methods with definitions; any class with at least one abstract method is an abstract class. It is impossible to instantiate an abstract class - a subclass must override all abstract methods in order to be instantiated.

abc (Abstract Base Classes) module provides helper class (ABC) to create an abstract class and abstractmethod decorator to mark a method as abstract.

import abc

class BluePrint(abc.ABC):
    @abc.abstractmethod
    def hello(self):
        pass

class GreenField(BluePrint):
    def hello(self):
        print('Welcome to Green Field!')


gf = GreenField()
gf.hello()
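
What happens on an attempt to instantiate the abstract class itself can be sketched like this; Shape and Square are hypothetical names.

```python
import abc

class Shape(abc.ABC):
    @abc.abstractmethod
    def area(self):
        ...

    def describe(self):
        # A regular method may live next to abstract ones
        return f'area is {self.area()}'

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

try:
    Shape()                      # abstract method area is not implemented
except TypeError as error:
    message = str(error)

description = Square(2).describe()
```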

Attribute encapsulation

property decorator designates a method, which will be called when encapsulated attribute is read (getter). The name of the method is the same as the name of the attribute. Should be defined before methods for setting and deleting the value.

Setter and deleter methods should be named after the attribute by convention.

Properties are also inherited and can be called as if they were attributes.

class Example:
    def __init__(self):
        self.__var = 0

    @property
    def var(self):
        return self.__var

    @var.setter
    def var(self, value):
        self.__var = value

    @var.deleter
    def var(self):
        self.__var = None

Exceptions

BaseException is the most general exception class. A custom exception class must be derived from it or from any of its subclasses (normally from Exception).

An except clause may specify a variable after the exception name, which is bound to the exception instance. Arguments used to create the object are stored in the args attribute. Different exceptions may have other attributes as well.

try:
	x = 1 / 0
except BaseException as e:
	print(e.args)

Exception chaining occurs when another error was raised while handling the original exception. Latter exception instance has 2 reserved attributes to hold reference to the original exception instance:

  • __context__ - implicit
     l = []
     try:
         print(l[0])
     except Exception as e:
         try:
             print(1 / 0)
         except Exception as f:
             print(f.__context__ is e)
  • __cause__ - explicit
     class UnifiedException(Exception):
         pass

     def print_3_items(l):
         try:
             print(l[0])
             print(l[1])
             print(l[2])
         except IndexError as e:
             raise UnifiedException("big error") from e

     try:
         print_3_items([1, 2])
     except UnifiedException as e:
         print(f"General exception {e} caused by {e.__cause__}")

Each exception object also has a __traceback__ attribute. The print_tb and format_tb functions from the traceback module can be used to print the traceback and to return it as a list of strings, respectively.

import traceback

try:
    print(1 / 0)
except Exception as e:
	traceback.print_tb(e.__traceback__)

Metaclass

A class whose instances are classes themselves. Redirects class instantiation to a dedicated logic; applied when class definition is read, way before class instantiation.

Use cases:

  • logging
  • registering classes at creation time
  • interface checking
  • automatically adding new methods or variables

Classes are instances of the special class type, the default metaclass. Subclasses of type are also metaclasses.

type() function with 3 arguments creates a new class:

  1. name of the class (__name__)
  2. tuple of the base classes from which new class is inherited (__bases__)
  3. dictionary with method definitions and variables for the class body (__dict__)
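
The three arguments listed above can be tried out directly. Dynamic, hello and level are hypothetical names.

```python
def hello(self):
    return f'Hello from {self.__class__.__name__}'

# name, tuple of base classes, dictionary with the class body
Dynamic = type('Dynamic', (object,), {'hello': hello, 'level': 1})

obj = Dynamic()
greeting = obj.hello()
```

The result is equivalent to writing a class statement named Dynamic with a hello method and a level class variable.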

type() is called after the class statement is identified and the class body is read. The __call__ method of type is invoked upon class instance creation; it calls __new__() (creates the class instance in memory) and __init__() (object initialization). Metaclasses usually implement both methods.

A metaclass is derived from type and calls the __new__() method of the parent class, thereby adding custom logic to class creation.

class new_meta(type):
	def __new__(mcs, name, bases, dictionary):
		obj = super().__new__(mcs, name, bases, dictionary)
		obj.custom_attribute = "added by metaclass"
		return obj

class new_class(metaclass=new_meta):
	pass

A metaclass can be used to make sure all classes are equipped with certain methods or attributes, supplying them to those classes that don't define them.
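
A sketch of that idea, assuming a hypothetical EnsureGreet metaclass that adds a default greet method only when the class body does not provide one:

```python
class EnsureGreet(type):
    def __new__(mcs, name, bases, dictionary):
        # Supply a default implementation if the class body lacks one
        if 'greet' not in dictionary:
            dictionary['greet'] = lambda self: f'default greeting from {name}'
        return super().__new__(mcs, name, bases, dictionary)

class WithGreet(metaclass=EnsureGreet):
    def greet(self):
        return 'custom greeting'

class WithoutGreet(metaclass=EnsureGreet):
    pass

custom = WithGreet().greet()
default = WithoutGreet().greet()
```

The check looks only at the class body, so in this simple form a subclass that inherits greet from a parent would still receive the default.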

Standardization

Python Enhancement Proposals types:

  • Standards Track - new language features and implementations
  • Informational - design issues, guidelines and information
  • Process - processes around Python (change proposal, recommendation...)

Popular PEPs:

  • PEP 1 (PEP purpose and guidelines)
  • PEP 8 (Style Guide for Python Code)
  • PEP 20 (The Zen of Python)
     import this
  • PEP 257 (Docstring conventions)
    • docstring is a first statement in a module, function, class or method definition (becomes __doc__ attribute, also accessible by help() function)
  • PEP 483, 484 - type hinting (is not used at runtime, but can be used with type checking or linting tools).

Python documentation generator:

Popular linters:

  • Flake8
  • Pylint
  • Pyflakes
  • Pychecker
  • Mypy
  • Pycodestyle

Popular fixers:

  • Black
  • YAPF
  • autopep8

GUI programming

Popular standards for visual programming in Unix world - GTK, Qt.

Tk (GUI toolkit, UX library) serves as an adapter to multiple standards and OSs, and available in multiple languages. TkInter (Tk Interface) is a Python module.

A modal window grabs the whole application's focus - all other widgets become deaf. For example, messagebox.

Widgets

  • Button
  • Label - non-clickable, text representation
  • Frame - non-clickable, groups widgets to visually separate from other components in the window
  • Checkbutton - solitary check box
  • Radiobutton - selection group (only one can be selected); group is created by binding multiple widgets to the same variable, also each widget must define different value to be distinguished from each other
  • Entry - input field

Coordinates default to upper-left corner (x=0, y=0).

Geometry managers place widgets on the window's interface. They are implemented as widget object methods. Managers should not be mixed within one window (in particular, pack and grid must not be combined in the same container).

| Name | Description | Parameters |
| --- | --- | --- |
| Place | Specify exact location and size | x, y, height, width in pixels |
| Grid | Automatically splits the window's area into rows and columns. Specify general wishes, and the manager will try to deploy widgets according to them. | column (starts at and defaults to 0), row (defaults to the first free row from the top), columnspan - number of neighboring columns that the widget will occupy (defaults to 1), rowspan - same as columnspan for rows |
| Pack | tkinter guesses the intentions and finds the best location for each widget. Widgets are packed sequentially, thus order matters. Default behaviour puts widgets in one column, one below another. | side=s (options: TOP default, BOTTOM, LEFT, RIGHT), fill=f expands the widget differently than the default (NONE default, X - horizontally, Y - vertically, BOTH) |

Change colors using fg, bg, activeforeground, activebackground attributes. All recognized colors in English here - specify in camel case. RGB is also accepted; start with hash - #RRGGBB.

Handler

Callback (or handler) - function designed to be called by something else. Invoking your own handler is strictly prohibited (can confuse the event controller).

Handler used by the button has to be parameterless.

Variables

To organize internal communication between components, special variables are used. They can not be set directly, but rather through methods. Can be used with the Checkbutton widget to get the user's input - 0 for unset, 1 for set. Changing the state of the switch object also changes the state of the related widget.

```python
import tkinter as tk

root = tk.Tk()  # variables need an existing root window
switch = tk.IntVar()
switch.set(1)
```

Useful modules

  • random
  • datetime

References

  • Python for DevOps
  • Python institute