Simplicity

Don't write a test when generating an email in an exception block will do
Don't write a mock when just creating a dictionary of the data you expect will do
Don't write a class when writing a util method will do
Use sqlite when pandas is write-only

Circular Imports

A decorator should do the trick as far as debugging:



import builtins
import functools

# Stack of module names currently being imported (a list keeps import order
# and tolerates the same name appearing twice)
being_imported = []

def detect_circular_import(func):
    @functools.wraps(func)
    def wrapper(name, globals=None, locals=None, fromlist=(), level=0):
        if name in being_imported:
            print("Circular import detected:", name)
            print("Import stack:", list(being_imported))
        being_imported.append(name)
        print("Currently importing:", list(being_imported))
        try:
            return func(name, globals, locals, fromlist, level)
        finally:
            # Pop even when the import raises, so the stack stays accurate
            being_imported.pop()
    return wrapper

# Apply the decorator to __import__
builtins.__import__ = detect_circular_import(builtins.__import__)

# NOTE: Place this code in the __init__.py file of your top-level package
# Example: my_project/__init__.py
# This ensures the import tracker is activated early and applies globally

Code Review

Add TODOs separately
Review tests for holes in logic, not just lines of code
Review negative tests closer than positive tests for oversimplification
Review logging to avoid CVEs
Do not assert string values such as error messages
Do not use a dictionary when a tuple will do
Make sure classic %-style placeholders are used for logging, with the values passed as arguments rather than pre-formatted with the % operator: logging.info("%d members: %s", len(potential_members), potential_members)
Compare name to the canonical naming conventions
Use time data where only binary data is available because timing is everything
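The logging item in the list above can be sketched as follows. This is a minimal example, assuming a hypothetical logger name and sample data; it shows that arguments passed separately are only interpolated when a record is actually emitted.

```python
import io
import logging

# Hypothetical logger name and data, used only for illustration
logger = logging.getLogger("review_example")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example's output out of the root logger

stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))

potential_members = ["ada", "grace"]

# Arguments are passed separately; logging interpolates them only
# when a handler actually emits the record
logger.info("%d members: %s", len(potential_members), potential_members)
# Below the configured level, so this message is never formatted
logger.debug("%d members: %s", len(potential_members), potential_members)

output = stream.getvalue()
print(output)
```

Passing the values as arguments also keeps the message template constant, which makes log lines easier to group and search.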

Archetype: Assertion Modalities in Python

"Assertion" as dual-purpose epistemic and operational primitive.

Modalities Compared

Dimension	Bare Assertion Unit Tests	Inline Assertions in Production Code
Location	Isolated in test modules	Embedded in runtime logic
Purpose	Verification of behavior	Guardrails or sanity checks
Execution Context	Test-time only	Runtime (unless stripped via `python -O`)
Portability	Yes — portable across environments and CI	Risk — may be disabled or misused
Failure Mode	Controlled test failure	Runtime crash, possibly in production
Risk Profile	Low — scoped to test harness	High — may affect users or systems
Auditability	High — test logs, coverage, introspection	Medium — buried in logs, hard to trace
Tooling Compatibility	Full support from pytest and coverage tools	Often invisible to test tooling
Symbolic Lineage	"Proof-by-example", executable specification	"Runtime contracts", defensive programming
Narrative Role	Declarative claim of expected behavior	Imperative checkpoint in operational flow

Epistemic vs Operational Overlay

Bare Unit Test Assertions
- Epistemic stance: "This should always be true under these conditions."
- Temporal scope: Pre-deployment; build-time verification.
- Failure semantics: Indicates flaw in model or implementation.
- Lineage: Hoare logic → unit testing → pytest introspection.
Inline Runtime Assertions
- Epistemic stance: "This must be true right now, or something is dangerously wrong."
- Temporal scope: Runtime; operational enforcement.
- Failure semantics: Indicates breach of invariant, possibly catastrophic.
- Lineage: Design by contract → defensive programming → runtime guards.

Tradeoff Summary

Bare assertions offer safe, portable, and audit-friendly verification, ideal for CI pipelines and literate testing.
Inline assertions provide live invariant enforcement, but carry risk if misused or stripped, especially in production contexts.
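A minimal sketch of the two modalities side by side, assuming a hypothetical apply_discount function; the names are illustrative and not taken from any real code base.

```python
def apply_discount(price: float, discount: float) -> float:
    """Hypothetical function used only to illustrate the two modalities."""
    discounted = price - discount
    # Inline assertion: a runtime guard that is skipped entirely under `python -O`
    assert 0 <= discounted <= price, "discount out of range"
    return discounted


def test_apply_discount():
    # Bare assertion in a test: a claim about expected behavior,
    # checked only when the test suite runs
    assert apply_discount(100.0, 25.0) == 75.0


def test_apply_discount_rejects_negative_result():
    # Only meaningful when assertions are enabled (the default, without -O)
    try:
        apply_discount(10.0, 25.0)
    except AssertionError:
        return
    raise RuntimeError("expected the inline assertion to fail")
```

If the guard must hold in production regardless of interpreter flags, an explicit `if ...: raise ValueError(...)` is the safer choice than the inline assert.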

Data Review

Compare the metadata between two data points
If there is no difference in each key/value pair, compare the data points referenced by ID in each value of each of the keys
If the referenced data has some difference, update the data point (a boolean value probably) and retry the automated job

Idioms

Quotes

a homogeneous tuple of arbitrary length is equivalent to a union of tuples of different lengths

Use tuples to return multiple values, as tuples can hold any number of values.

Product types are associative, not commutative
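The three quotes above can be illustrated with a short sketch. The function name min_max is an assumption made for the example.

```python
from typing import Tuple

def min_max(values: Tuple[int, ...]) -> Tuple[int, int]:
    """Return two values at once as a fixed-length tuple."""
    return min(values), max(values)

# Tuple[int, ...] accepts every length: (1,), (1, 2), (1, 2, 3), ...
low, high = min_max((3, 1, 2))

# Nesting can be regrouped without losing information (associative) ...
left = ((1, "a"), True)
right = (1, ("a", True))
(x1, y1), z1 = left
x2, (y2, z2) = right
same_fields = (x1, y1, z1) == (x2, y2, z2)

# ... but swapping positions gives a different value (not commutative)
swapped_differs = (1, "a") != ("a", 1)
```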

CI/CD

Use verbose arguments everywhere

Delete files at runtime in CI/CD scripts, such as lockfiles

Add files at runtime in CI/CD scripts, such as templated HTML

Function Simplicity

import logging

def foo(arg):
    """
    Keep it simple: do one thing and do it well
    """
    if arg:
        # Call an endpoint using arg
        pass
    else:
        logging.error(arg)
        # raise needs an exception instance, not the raw argument
        raise ValueError(f"Invalid argument: {arg!r}")

Passing

Use pass in an if/else block as a placeholder when you need to distinguish between a function and a procedure

Linting

Use a minimum set of things to exclude from pylint, for example: #pylint: disable=no-member, invalid-name, line-too-long

Use a comment to setup pylint rules for a single file; use pylintrc to do it for all files

Logging

Log all method locals and method args
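One way to do this is sketched below, assuming a hypothetical logger name and a sample function. A decorator logs the arguments, and locals() captures everything defined inside the function.

```python
import functools
import logging

logger = logging.getLogger("locals_example")  # hypothetical logger name

def log_args(func):
    """Log positional and keyword arguments before each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.debug("%s args=%r kwargs=%r", func.__name__, args, kwargs)
        return func(*args, **kwargs)
    return wrapper

@log_args
def total_price(unit_price, quantity, tax_rate=0.0):
    subtotal = unit_price * quantity
    total = subtotal * (1 + tax_rate)
    # locals() is a dict of every local name, including the arguments
    logger.debug("total_price locals=%r", locals())
    return total
```

Logging every local can leak secrets such as tokens or passwords, so this fits debug-level logs that are reviewed before they reach production.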

String Concat

Use ''.join([foo, bar]) instead of foo + bar to distinguish string manipulation from arithmetic and for easier portability to other languages; note that join takes a single iterable of strings

Pattern Matching

Pattern matching treats code as data. For example, an event handler can match on the shape of an event instead of routing it through a dispatch system:

def handle(event):
    match event:
        case {"type": "on_click", "target": "button"}:
            return handle_button_click()
        case {"type": "on_timeout", "duration": d} if d > 5:
            return handle_long_timeout(d)
        case {"type": "callback", "status": "success"}:
            return finalize_callback()

Multiple Dispatch

Use polymorphism to dispatch on the type of the receiver (single dispatch), the simplest relative of the multiple dispatch pattern:

class Animal:
    def speak(self):
        pass

class Dog(Animal):
    def speak(self):
        return "Woof!"

class Cat(Animal):
    def speak(self):
        return "Meow!"

def make_animal_speak(animal: Animal):
    return animal.speak()

Use the split and replace methods in tandem multiple times for simple pattern matching. For example:

# Split the connection string into components
# example:
# postgresql://fakeuser:[email protected]:5432/fakedbname
_, username, password_and_host, port_and_dbname = self.cstring.split(':')
username = username.replace("//","")
password, host = password_and_host.split('@')
port, dbname = port_and_dbname.split('/')
# Copy password to paste during interactive prompt

Use the re module for complex strings, and JSONPath (https://pypi.org/project/jsonpath/) and JSONPointer (https://pypi.org/project/jsonpointer/) for everything else, to avoid complex if/elif/else or case logic

Boxing

Use autoboxing via the box module(https://pypi.org/project/python-box/) to normalize access to nested key/value pairs

Use getattr to access methods within the box-python lib(https://github.com/cdgriffith/Box/blob/master/test/test_box.py#L856)

Nesting

Rewrite repetitive statements that use nested parentheses as decorators

Rewrite statements that use boolean `or` logic as addition (this assumes foo and bar are actual booleans, which add as 1 and 0):

# before
if foo or bar:

# after
if foo + bar >= 1:

Typos

List your assumptions to find typos in terms of:

Prefix spelling mistakes
Suffix spelling mistakes
Hyphenation
Pluralization

Type Checking

Use pydantic + jsonschema

Boolean Algebra

Use multiple dispatch(https://pypi.org/project/multipledispatch/) instead of if/else/elif or case logic to handle method calls which need to handle both structured and unstructured data
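A minimal sketch of the idea using functools.singledispatch from the standard library as a stand-in. It dispatches on the first argument only, whereas the multipledispatch package dispatches on all of them. The function name parse_payload is an assumption made for the example.

```python
import json
from functools import singledispatch

@singledispatch
def parse_payload(payload):
    # Fallback for types with no registered handler
    raise TypeError(f"Unsupported payload type: {type(payload).__name__}")

@parse_payload.register
def _(payload: dict):
    # Structured data: already a mapping
    return payload

@parse_payload.register
def _(payload: str):
    # Unstructured data: try JSON first, else wrap the raw text
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return {"raw": payload}
```

Each handler stays small, and supporting a new input type means registering another function rather than extending an if/elif chain.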

Safe Subset

Use a safe subset

State

Use the truths module to encapsulate boolean logic:

from truths import Truths

# Define your Boolean expressions
expressions = ['(a and b)', 'a and b or x', 'a and (b or x) or d']

# Generate the truth table
my_table = Truths(['a', 'b', 'x', 'd'], expressions)
print(my_table)

Data

Use adapters to maintain API compatibility layers

Use featuretools to generate mock data

Pip vs Poetry

differences:
  - dependency_resolution:
      poetry:
        description: Uses a sophisticated dependency resolver and lockfile (poetry.lock), which can lead to failures (e.g., timeouts, conflicts) in workflows.
        issues:
          - Resolving dependencies may hang indefinitely.
          - Lockfile conflicts across environments.
      pip:
        description: Relies on requirements.txt, which simplifies installation and reduces dependency resolution complexity.
  - keyring_support:
      poetry:
        description: Integrates with keyring for credential management, which can fail if no valid keyring backend is found.
      pip:
        description: Does not rely on keyring, avoiding such issues.
  - virtual_environment_management:
      poetry:
        description: Automatically creates and manages virtual environments, which can cause conflicts in GitHub Actions.
        issues:
          - Virtualenv creation failure due to restricted access.
          - Path mismatches between runners.
      pip:
        description: Does not inherently manage virtual environments and works directly in the provided environment.
  - configuration_complexity:
      poetry:
        description: Requires additional configurations, such as experimental features or disabling keyring in workflows.
      pip:
        description: Simpler setup with fewer configurations needed.
  - caching_dependencies:
      poetry:
        description: More challenging due to reliance on poetry.lock and virtualenvs cache (~/.cache/pypoetry/virtualenvs).
        issues:
          - Improper caching may lead to redundant installations or failures.
      pip:
        description: Simpler caching using ~/.cache/pip, which is more reliable.
  - dependency_sources:
      poetry:
        description: Supports Git-based dependencies and additional repositories, but may cause issues such as Git protocol errors.
      pip:
        description: Handles Git-based dependencies with fewer issues.

summary:
  poetry:
    description: Advanced package management with sophisticated features like dependency resolution and keyring integration, but can complicate GitHub Actions workflows.
  pip:
    description: Simpler and more minimalistic package management, avoiding many of the issues Poetry encounters in workflows.

Install pip-audit

Add pip-audit to your requirements.txt and run it as part of the CI build to find vulnerabilities in installed modules

Poetry branch based development of a private repo

baz.git", branch = "develop" }

Poetry Github token SSH integration

https://stackoverflow.com/questions/68446604/how-to-specify-github-access-token-in-pyproject-toml-with-an-environment-variabl

Verbose poetry

Run poetry install -vvv to turn on verbose mode for troubleshooting errors

Upgrade poetry

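Run poetry self update to upgrade Poetry itself. Run poetry lock --no-update (Poetry 1.x) to refresh poetry.lock without upgrading the locked dependency versions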

Clean up Dependencies

Move devdependencies to dependencies if they are only needed for runtime (boto3, jinja)
Move dependencies to devdependencies if they are only needed for build time and testing (pytest, urllib3)
Make sure there is no overlap between dependencies and devdependencies
Pin dependencies using < if there are compatibility issues

Optimize Deployment

Modules should be zipped to compress the deployed package
Make sure lockfiles are in the deployed package

Optimize Configuration

Make sure variable interpolation is using the correct syntax for XML/JSON/YAML files
Use functions instead of any other data type if there is an option
Make sure quotation marks are consistent (double vs single, unix vs windows)
Use a list instead of a dictionary if the spec requires a list

Use memoization as a poor man's idempotency

Memoize lambda success results to prevent duplicate database entries
Memoize lambda error results to prevent false negatives
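A minimal sketch of the idea, assuming "lambda" refers to an AWS Lambda-style handler. The event shape, the handler name, and the in-memory stores are all hypothetical.

```python
# Hypothetical in-memory store; a real Lambda would need a store that
# survives across invocations (this dict only lives as long as the process)
_results = {}
database = []  # stands in for a real table

def handler(event):
    """Process each event id at most once and replay the stored result."""
    key = event["id"]
    if key in _results:
        return _results[key]
    try:
        database.append(event["payload"])
        result = {"ok": True}
    except Exception as exc:  # memoize failures too, so retries report them
        result = {"ok": False, "error": str(exc)}
    _results[key] = result
    return result

first = handler({"id": "evt-1", "payload": "row"})
second = handler({"id": "evt-1", "payload": "row"})  # duplicate delivery
```

The second call returns the memoized result without touching the database, so a duplicate delivery does not create a duplicate row.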

Use sums of boolean values instead of `and`/`or` clauses

If you have conditions based on multiple boolean values, add up the sum of the boolean values rather than using complex logic:

# Replace this:
if a or b:
# With this (== 1 would mean "exactly one", which is XOR):
if a + b >= 1:

# Replace this:
if a and b:
# With this:
if a + b == 2:
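The equivalences can be checked exhaustively. This sketch assumes the inputs are real booleans, which add as 1 and 0.

```python
from itertools import product

# Check every combination of two boolean inputs
for a, b in product([False, True], repeat=2):
    assert (a + b >= 1) == (a or b)      # at least one is true
    assert (a + b == 2) == (a and b)     # both are true
    assert (a + b == 1) == (a != b)      # exactly one is true (XOR)

flags = [True, False, True]
true_count = sum(flags)  # booleans add as 1 and 0
```

The sum form pays off with three or more flags, where "at least two of these" becomes `sum(flags) >= 2` instead of a long chain of and/or clauses.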

Raise if you do not get the type you expected

Use type to verify assumptions; if they are not true, raise an exception:

    if bool(type(meta) is dict) + bool(type(meta) is list) != 1:
        raise TypeError('Wrong')

Raise `TypeError` or `ValueError` instead of a generic error

Use NameError, TypeError or ValueError to make exceptions more specific for class, type, or value related code respectively:

class MyException(Exception):
    pass
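A minimal sketch of choosing between TypeError and ValueError, assuming a hypothetical parse_age validator; the names are illustrative only.

```python
def parse_age(value):
    """Hypothetical validator used only for illustration."""
    if not isinstance(value, int):
        # Wrong kind of object entirely
        raise TypeError(f"age must be an int, got {type(value).__name__}")
    if value < 0:
        # Right type, unacceptable value
        raise ValueError(f"age must be non-negative, got {value}")
    return value

def describe_error(value):
    try:
        parse_age(value)
    except TypeError as exc:
        return f"type problem: {exc}"
    except ValueError as exc:
        return f"value problem: {exc}"
    return "ok"
```

Callers can then handle each failure differently without parsing error messages.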

Use assert to raise errors for impossible calculations

# no matter what, discounted prices cannot be lower than 0 or higher than the listed price
assert 0 <= price <= product['price']

# check if the value of `a plus b` is less than 3
assert a + b < 3, f'No, the answer is {a + b}, which means someone changed the input types from boolean to integer'

# assert a numeric string is numeric
def add_dollar_sign(numeric_string):
    assert numeric_string.isnumeric(), 'Not a numeric string'
    return '$' + numeric_string

Combine logging with uncaught exceptions to trace unexpected errors

import logging
import sys

def excepthook(*args):
  # args is the (type, value, traceback) triple that exc_info expects
  logging.getLogger().error('Uncaught exception:', exc_info=args)

sys.excepthook = excepthook

assert 1==2, 'Something went wrong'

Use init to share globals

"""
Foo
"""
import os
from foo import Foo
import requests

foo_secrets = get_secret("foo")
foo_token = foo_secrets["token"]
foo = Foo(token=foo_token)

from util.foo import foo

Use comments to determine method scope

Add comments first to avoid scope creep

Add a README and link to it in the comments for mission critical details Money

Use Markdown within comments to add complex mixed content, like tables

Use absolute imports instead of relative imports

Instead of:

from . import hubspot

Use the name of the top-level directory instead:

from foo.bar import hubspot

Even if the name of the directory matches the name of the module:

from foo.hubspot import hubspot

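Use ast.literal_eval (a safe replacement for eval) to convert a string to a list

import ast

foo = '[[1]]'
# literal_eval parses Python literals only and never executes code
bar = ast.literal_eval(foo)
type(bar)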

Check how many times a string is found in a list

lst = [1, 2, 3, 'Alice', 'Alice']

indices = [i for i, item in enumerate(lst) if item == 'Alice']

print(indices)       # positions of each match
print(len(indices))  # number of matches

Convert a one member tuple to a string for use with the DBAPI:

if isinstance(foo, tuple) and len(foo) == 1:
   bar = str(foo).replace(",","")

Convert a multi member tuple to a dict for use with the DBAPI:

   baz = [dict(bar) for bar in foo]

Use return statements instead of assignment statements with the `or` clause

# Prefer this:
return {"foo": foo or "N/A"}

# Over this:
try:
   foo = bar
except Exception as e:
   foo = "N/A"
   print(e)

Use asyncio instead of threads to avoid running out of processes
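A minimal sketch of running several I/O-bound tasks concurrently on a single thread. The fetch coroutine is hypothetical, and asyncio.sleep stands in for real non-blocking I/O.

```python
import asyncio

async def fetch(item: int) -> int:
    # asyncio.sleep stands in for non-blocking I/O such as a network call
    await asyncio.sleep(0.01)
    return item * 2

async def main() -> list:
    # All coroutines share one thread and one event loop
    return await asyncio.gather(*(fetch(i) for i in range(5)))

results = asyncio.run(main())
```

asyncio.gather returns results in the order the coroutines were passed in, regardless of which one finishes first.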

Use specific exceptions when using except clauses

Use library specific exceptions instead of the generic Exception class
Create custom Exception classes when writing user-defined modules
Use Python's built-in Exception classes when you expect a specific exception

Custom Exception

import argparse

class MyArgumentParser(argparse.ArgumentParser):
    def error(self, message):
        raise argparse.ArgumentError(None, message)

parser = MyArgumentParser(description="Example parser")
parser.add_argument('arg', type=str, help='An argument')

try:
    args = parser.parse_args()
except argparse.ArgumentError as e:
    print(f"Argument error: {e}")

Use nested try/except blocks to swallow errors:

try:
    foo
    try:
        bar
    except Exception as e:
        print("inner")
except Exception as e:
    print("outer")

Catch the built-in ConnectionResetError in an except block to implement a retry mechanism (urllib3.exceptions does not define a ConnectionResetError; urllib3 typically surfaces resets as urllib3.exceptions.ProtocolError)

try:
     foo(bar, baz)
except ConnectionResetError:
     print(f"ConnectionResetError encountered. Retrying {retries}/{max_retries}...")
     time.sleep(retry_delay)
     if retries < max_retries:
         foo(bar, baz)
     else:
         raise RuntimeError("Max retries reached. Stopping")
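The retry idea above can be sketched as a self-contained loop. The helper name call_with_retries and the flaky callable are assumptions made for the example.

```python
import time

def call_with_retries(func, max_retries=3, retry_delay=0.0):
    """Call func(), retrying when the connection is reset."""
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except ConnectionResetError:
            if attempt == max_retries:
                raise  # out of attempts: let the caller see the error
            print(f"ConnectionResetError encountered. Retrying {attempt}/{max_retries}...")
            time.sleep(retry_delay)

# Hypothetical flaky callable: fails twice, then succeeds
attempts = {"count": 0}

def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionResetError("connection reset by peer")
    return "ok"

result = call_with_retries(flaky)
```

A loop keeps the retry count in one place and avoids re-calling the function from inside the except block, where a second failure would go unhandled.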

Use raise and except to make non-200 responses throw exceptions
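A minimal sketch of the pattern, assuming a hypothetical Response stand-in and a hypothetical HTTPStatusError; neither belongs to a real HTTP library.

```python
from dataclasses import dataclass

class HTTPStatusError(Exception):
    """Hypothetical exception for responses outside the 200 range."""

@dataclass
class Response:
    # Minimal stand-in for an HTTP client's response object
    status_code: int
    body: str = ""

def raise_for_status(response: Response) -> Response:
    if response.status_code != 200:
        raise HTTPStatusError(f"Unexpected status: {response.status_code}")
    return response

def fetch_body(response: Response) -> str:
    try:
        return raise_for_status(response).body
    except HTTPStatusError as exc:
        # Handle the failure in one place instead of checking codes everywhere
        return f"error: {exc}"
```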

Use finally to print logs and timing statistics

    try:
        foo = bar
    except:
        bar = baz
    finally:
        print(foo)
        print(bar)
        print(baz)

Use the following built-ins to distinguish strings:

int(min("0", "hi")) # digits sort before letters, so min() picks the numeric string and this returns 0

str(max("9999", "A")) # letters sort after digits, so max() picks the alphabetic string and this returns "A"

If you have a list of dictionaries, use max to count the largest:

list(str(int((max(bool(dict([("foo","")])),bool(dict([("bar","")])))))))

If a sequence is length zero, it is falsy, so no need to check > 0

foo = []

if not foo:
    print(f"foo is empty: {len(foo)}")

If you check the length of a dictionary, it will return the number of keys

   foo = {"hi":"mom","my":"name","is":"kid"}
   print(f"number of keys in foo: {len(foo)}")

Use getattr to check for existence in an object

foo = str(getattr(name, 'first_name', None))

Use a dictionary for caching

name_cache = {}

if str(first_name) not in name_cache:
   new_name = search_name(first_name)
   if new_name:
       # Key on the looked-up value, not the literal string "first_name"
       name_cache[str(first_name)] = new_name

Use the `lru_cache` decorator for caching function call results

from functools import lru_cache
from typing import Optional

@lru_cache(maxsize=100)
def get_foo(id: str, version: Optional[str] = None):
    """
    Foo
    """
    if not id:
        raise ValueError(f"ID not found: {id}")

    return registry.get(id, version)

Use .get() and `or` to search dictionaries

# if get() returns None, the or statement will return a string which can be parsed by the `in` operator
if str(foo) not in (bar.get("baz", "") or ""):
    bar["baz"] = str(foo)

Use `split` and `join` rather than string manipulation

number_list = "1 2 3 4 ".strip().split(' ')
# remove extra whitespace between list indices
normalized_number_list = [index.strip() for index in number_list]
# the items are strings, so remove '4' rather than the integer 4
normalized_number_list.remove('4')
# convert list back to a string
foo = " ".join(normalized_number_list)
# use the new length as a separate variable
foo_length = len(normalized_number_list)

Use str(), int() and min() to defensively add numbers

# Normalize all inputs as a string
foo = str("error")
bar = str(1)
# fallback to the numeric value if casting both as an integer fails
try:
    baz = int(foo) + int(bar)
except ValueError:
    baz = int(min(foo, bar))

Logging to stdout and stderr

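import logging

# Configure the format before the first logging call; otherwise that call
# installs a default handler and basicConfig() silently does nothing
FORMAT = "%(asctime)-15s %(clientip)s %(user)-8s %(message)s"
logging.basicConfig(format=FORMAT)
d = {'clientip': '192.168.0.1', 'user': 'fbloggs'}
# Every record needs the extra keys that FORMAT refers to;
# INFO is below the default WARNING level, so this line is not emitted
logging.info('This is the existing protocol.', extra=d)
logging.warning("Protocol problem: %s", "connection reset", extra=d)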
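By default, basicConfig sends every record to stderr. One way to split records between stdout and stderr is sketched below; the logger name and the MaxLevelFilter helper are assumptions made for the example.

```python
import logging
import sys

logger = logging.getLogger("split_streams")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False

class MaxLevelFilter(logging.Filter):
    """Let through only records below a given level."""
    def __init__(self, level):
        super().__init__()
        self.level = level

    def filter(self, record):
        return record.levelno < self.level

# DEBUG and INFO go to stdout
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.addFilter(MaxLevelFilter(logging.WARNING))

# WARNING and above go to stderr
stderr_handler = logging.StreamHandler(sys.stderr)
stderr_handler.setLevel(logging.WARNING)

logger.addHandler(stdout_handler)
logger.addHandler(stderr_handler)

logger.info("written to stdout")
logger.warning("written to stderr")
```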

Fundamentals

isinstance => instanceOf

min/max => type coercion

repr => valueOf / toString

assert => console.assert

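with => context managers

Unlike JavaScript's with statement, which makes access to named references inefficient because the scopes for such access cannot be computed until runtime, Python's with statement does not create a new scope: it calls __enter__ and __exit__ on a context manager.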

Use strict standards for loops:

use a break to get out of a loop, especially if it uses recursive calls
check a loop variable is a list before iterating
check a loop variable has more than the zero index before iterating; otherwise change the type to a dictionary

Use the following pattern to create a switch/case statement:

from collections import namedtuple

Case = namedtuple('Case', ['condition', 'code'])

cases = (Case('i > 0.5',
            """print('greater than 0.5')"""),

         Case('i == 5',
            """print('it is equal to 5')"""),

         Case('i > 5 and i < 6',
            """print('somewhere between 5 and 6')"""))

def switch(cases, **namespace):
    for case in cases:
        if eval(case.condition, namespace):
            exec(case.code, namespace)
            break
    else:
        print('default case')

switch(cases, i=5)

Use the following pattern to create a default value:

li1 = None
li2 = [1, 2, 3]

# li1 is None so li2 is assigned
a = li1 or li2

Gotchas

If you define a method with a mutable default argument (such as a list or dictionary), the default is created once, so its mutations persist, and all future calls will merge the new data with the mutated data.

Use None as a default argument to avoid this.
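A minimal sketch of the gotcha and the fix; the function names are assumptions made for the example.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    # None is a safe sentinel: build a new list on every call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

shared_first = list(append_shared(1))
shared_second = list(append_shared(2))   # still contains 1 from the first call
fresh_first = append_fresh(1)
fresh_second = append_fresh(2)
```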

If you bind a default value to a lambda, the value is captured when the lambda is defined rather than looked up when it is called:

f = lambda x=x: x
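The difference shows up most clearly when lambdas are created in a loop, as in this short sketch:

```python
# Without a default, each lambda looks up i when it is called,
# after the loop has finished
late = [lambda: i for i in range(3)]

# With a default, the current value of i is captured at definition time
early = [lambda i=i: i for i in range(3)]

late_values = [f() for f in late]
early_values = [f() for f in early]
```

Here late_values holds the final loop value three times, while early_values holds one value per iteration.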

Decorating a Python method with staticmethod ensures that self will not be provided as an argument. With classmethod, unlike other methods in Python, the first argument is always the class object.
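A minimal sketch contrasting the two decorators; the Temperature and Reading classes are hypothetical.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    @staticmethod
    def to_fahrenheit(celsius):
        # No self and no cls: behaves like a plain function in the class namespace
        return celsius * 9 / 5 + 32

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # cls is the class itself, so subclasses construct their own type
        return cls((fahrenheit - 32) * 5 / 9)

class Reading(Temperature):
    pass

boiling = Temperature.to_fahrenheit(100)
reading = Reading.from_fahrenheit(212)
```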

Use the following argument to the print function to clear the output buffer: flush=True

import time

for i in range(10):
    print(i, end=" ", flush=True)
    time.sleep(.2)
# End the line once, after the loop
print()

Use raise to bubble up errors raised in a try clause:

try:
    raise NameError('HiThere')
except NameError:
    print('An exception flew by!')
    raise

Use if/else and raise inside a try clause to conditionally bubble up non-programmatic errors in a REST API:

try:
   if foo:
       return jsonify({"foo":str(foo)})
   else:
       raise Exception(f'foo is None so the response cannot be parsed')
except Exception as e:
   return jsonify({"exception": str(e)})

Use the build environment variable and try/except blocks to create fallback responses for QA/Test environments:

try:
    foo(os.getenv('BENV'))
except Exception as e:
    if bool(os.getenv('WORKER')):
        return {"error": str(e), "status": 500}
    else:
        return {"data": [], "status": 500}

Use SQL regexp methods as a first resort, and re as a second resort (JS as a last resort) to do complex string manipulation:

import re

class Solution(object):
    
    def __init__(self):
        self.email_cache = {}
    
    def uniqueLocalName(self, email):
        # Raw string avoids the invalid "\+" escape warning
        sanitized_plus = re.sub(r'\+([^@]+)', '', email)
        unsanitized_local_name, domain_name = re.split('@', sanitized_plus)
        sanitized_local_name = re.sub(r'([^\.]+)\.?', r"\1", unsanitized_local_name)
        sanitized_email = sanitized_local_name + '@' + domain_name
        print(sanitized_email)
                                 
        if sanitized_email not in self.email_cache:
            self.email_cache[sanitized_email] = sanitized_email
            

    def numUniqueEmails(self, emails):
        """
        :type emails: List[str]
        :rtype: int
        """
        for email in emails:
            self.uniqueLocalName(email)
            
        return len(self.email_cache.keys())

Always catch exceptions when creating lists of dictionaries. If an exception happens, assign it to a variable in the except block then return it as a msg key/value pair. Otherwise, use the list index as the msg value:

        try:
            results = None
            payload = []
            results = db_session.execute(query).fetchall()
        except Exception as e:
            print(f'query: {e}')
            results = e
        finally:
            print(f'uuid: {uuid}')

        if isinstance(results, list):
            for idx, row in enumerate(results):
                payload.append(
                    {
                        "id": str(row[0]),
                        "msg": str(idx),
                    }
                )
        else:
                payload.append(
                    {    
                        "id": "",
                        "msg": str(results),
                    }
                )

Use a decorator to create a repeatable method for timing:

import time

def timer(func):
    def wrapper(*args, **kwargs):
        start_time = time.time()
        result = func(*args, **kwargs)
        end_time = time.time()
        print(f"Time taken to run {func.__name__}: {end_time - start_time:.4f} seconds")
        return result
    return wrapper

@timer
def example_method():
    # Simulate a task taking some time
    time.sleep(2)
    print("Method execution complete.")

# Call the method
example_method()

Use an enum class to store constants:

from enum import IntEnum

class Day(IntEnum):
    MONDAY = 0
    TUESDAY = 1
    WEDNESDAY = 2
    THURSDAY = 3
    FRIDAY = 4
    SATURDAY = 5
    SUNDAY = 6

Run things locally to get stack traces that would otherwise be buried in the logs (the needle-in-a-haystack problem).

Run everything locally: the database, the application server, and the web server

flask dev

Use an event loop to control asyncio:

import asyncio

def hello_world(loop):
    """A callback to print 'Hello World' and stop the event loop"""
    print('Hello World')
    loop.stop()

loop = asyncio.new_event_loop()

# Schedule a call to hello_world()
loop.call_soon(hello_world, loop)

# Blocking call interrupted by loop.stop()
try:
    loop.run_forever()
finally:
    loop.close()

Use type hint to generalize type checking:

from typing import Dict, List, Optional

class Node:
    ...

class SymbolTable(Dict[str, List[Node]]):
    def push(self, name: str, node: Node) -> None:
        self.setdefault(name, []).append(node)

    def pop(self, name: str) -> Node:
        return self[name].pop()

    def lookup(self, name: str) -> Optional[Node]:
        nodes = self.get(name)
        if nodes:
            return nodes[-1]
        return None

'''
SymbolTable is a subclass of dict and a subtype of Dict[str, List[Node]].
'''

Use Slack webhooks as a poor man's Persistent Logger

Pydantic

Marshmallow

from marshmallow import Schema, fields

class RSSItemSchema(Schema):
    title = fields.String()
    link = fields.Url()
    description = fields.String()
    pubDate = fields.DateTime()
    guid = fields.String()

from marshmallow import Schema, fields

class SAMLAssertionSchema(Schema):
    issuer = fields.String()
    subject = fields.String()
    audience = fields.String()
    conditions = fields.String()
    authn_statement = fields.String()
    attribute_statement = fields.String()

Abstract Syntax Tree

import ast
import astunparse

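class PrintVisitor(ast.NodeTransformer):
    # NOTE: ast.Print nodes exist only in Python 2's ast module; Python 3's
    # ast.parse rejects print statements with a SyntaxError before any visitor runs
    def visit_Print(self, node):
        # Replace the old print statement with a new print function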
        new_node = ast.Expr(
            value=ast.Call(
                func=ast.Name(id='print', ctx=ast.Load()),
                args=node.values,
                keywords=[],
            )
        )
        return ast.copy_location(new_node, node)

def convert_print_statements(source_code):
    # Parse the source code into an AST
    tree = ast.parse(source_code)

    # Transform the AST
    PrintVisitor().visit(tree)

    # Generate the new source code from the AST
    new_source_code = astunparse.unparse(tree)

    return new_source_code

# Read the old source code
with open('old.py', 'r') as f:
    old_source_code = f.read()

# Convert the print statements
new_source_code = convert_print_statements(old_source_code)

# Write the new source code
with open('new.py', 'w') as f:
    f.write(new_source_code)

List to CSV

Wrap each list item in square brackets so it is written as a single field: writer.writerow([uuid_str])
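A short sketch of why the wrapping matters: writerow iterates over whatever it is given, so a bare string becomes one field per character. The fixed UUID value is an assumption chosen to make the output repeatable.

```python
import csv
import io
import uuid

uuid_str = str(uuid.UUID(int=1))  # fixed value so the output is repeatable

buffer = io.StringIO()
writer = csv.writer(buffer)

writer.writerow([uuid_str])    # one field containing the whole string
writer.writerow(uuid_str[:3])  # a bare string is split into one field per character

rows = buffer.getvalue().splitlines()
```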

Docstrings

Docs as Code

https://www.writethedocs.org/guide/docs-as-code/

Best Practices

Use curly braces to do variable substitution within triple-quoted f-strings (a real docstring cannot be an f-string)

name = "Foo"
bar = f"""
Hi, {name}!
"""
print(bar)

Index out of range

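Use a try/except block, an initial assignment statement, and a nested try/except block to test and log index access:

try:
    foo = None

    try:
        foo = results["data"]
    except (KeyError, IndexError, TypeError) as e:
        logging.warning(f"failed to get results for: {results}")
    finally:
        foo = foo or {"ok": False}
except Exception as e:
    # Outer handler added to complete the block; it catches anything unexpected
    logging.error(f"unexpected error: {e}")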

Formatting

Split the formatting into steps, instead of doing it as a one-liner. For example:

import math

amount = math.pi  # Pi, approximately 3.14159...
absolute_amount = abs(amount)  # Still 3.14159...
formatted_amount = "$" + ("%0.2f" % absolute_amount)
print(formatted_amount)  # Output: "$3.14"

Open Source Ownership and Auditing

Glossary

Grammar

Gotchas

Data Model

pip

__init / sys.path

command-line args

https://opensource.com/article/17/3/python-tricks-artists-interactivity-Python-scripts

Executable Scripts as Modules

Separate the __main__ logic for the module itself. For example:

def my_function():
    # Your function implementation here
    pass

if __name__ == "__main__":
    # Code to run when the script is executed directly
    print("This will only run if you run the script explicitly, not import it")

Activestate

Recipes

CMIS

Reflection

Sample Projects

Sample setup.py

# distutils was removed in Python 3.12; setuptools provides both names
from setuptools import setup, find_packages

setup(
    name="foobarbaz",
    version="0.9.8",
    description="utility belt",
    author="Foo Bar Bazman",
    author_email="foobarbaz@foobarbaz.example.com",
    url="",
    packages=find_packages(),
    package_data={'config': ['README.md']}, # full path: ~/foobarbaz/config/README.md
    install_requires=[
        "hubspot-api-client==3.4.2",
        "python-box>=5.3.0",
        "stripe==2.42.0",
    ],
)

Callable vs Non-Callable Native Properties

my_string = "Hello, World!"
string_methods = [method for method in dir(my_string) if callable(getattr(my_string, method))]
print("String methods:")
for method in string_methods:
    print(method)

Wrapper Method

Use a class rather than a function to expose a method wrapped in your own library. A function returns an object that has to be called first, whereas a class includes the wrapped method as a property:

import bar

def foo():
    return bar

# import foo
# foo = foo()
# baz = foo.bar(True)

import bar

class Foo:
    def __init__(self):
        self.bar = bar

## import Foo
foo = Foo()
baz = foo.bar(True)

Import Error Handling

import traceback
import importlib

def find_bar_variable(module_path):
    try:
        # Import the module dynamically
        module = importlib.import_module(module_path)

        # Check if 'bar' is a callable attribute (method or function)
        if hasattr(module, 'bar') and callable(getattr(module, 'bar')):
            # Call the 'bar' method and print the returned value
            result = getattr(module, 'bar')()
            print(f"Variable returned by 'bar': {result}")
        else:
            print("Method 'bar' not found in the module.")
    except Exception as e:
        print(f"Exception occurred: {e}")
        traceback.print_exc()
    finally:
        # print_stack() prints and returns None, so format the stack instead
        print("Call Stack:\n" + "".join(traceback.format_stack()))

# Example usage:
module_path = 'your_module_name'  # Replace with the actual module name or file path
find_bar_variable(module_path)

Class and Method Definitions

DECOUPLE, Decouple, decouple
Use Class definitions to organize state, define naming conventions, and enforce the order of operations
Use arguments to make things testable

Method Tracing

import sys
from functools import wraps

class TraceCalls(object):
    """ Use as a decorator on functions that should be traced. Several
        functions can be decorated - they will all be indented according
        to their call depth.
    """
    def __init__(self, stream=sys.stdout, indent_step=2, show_ret=False):
        self.stream = stream
        self.indent_step = indent_step
        self.show_ret = show_ret

        # This is a class attribute since we want to share the indentation
        # level between different traced functions, in case they call
        # each other.
        TraceCalls.cur_indent = 0

    def __call__(self, fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            indent = ' ' * TraceCalls.cur_indent
            argstr = ', '.join(
                [repr(a) for a in args] +
                ["%s=%s" % (a, repr(b)) for a, b in kwargs.items()])
            self.stream.write('%s%s(%s)\n' % (indent, fn.__name__, argstr))

            TraceCalls.cur_indent += self.indent_step
            try:
                ret = fn(*args, **kwargs)
            finally:
                # Restore the indentation even if fn raises
                TraceCalls.cur_indent -= self.indent_step

            if self.show_ret:
                self.stream.write('%s--> %s\n' % (indent, ret))
            return ret
        return wrapper

And here's how we can use it:

@TraceCalls()
def iseven(n):
    return True if n == 0 else isodd(n - 1)

@TraceCalls()
def isodd(n):
    return False if n == 0 else iseven(n - 1)

print(iseven(7))

System Level Exception Handling

import sys

def custom_exception_hook(exctype, value, traceback):
    print(f"Caught {exctype.__name__}: {value}")
    # Handle the exception or perform other actions here

# Set the custom exception hook
sys.excepthook = custom_exception_hook

# Your code goes here...
# Any unhandled exceptions will now be caught by the custom_exception_hook.
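
The hook above prints only a one-line summary, so the traceback is lost. A possible variant, sketched here under the assumption that the full traceback should also be written out, builds the hook from a factory so the output stream can be swapped in tests. The `make_exception_hook` name is hypothetical.

```python
import io
import sys
import traceback

def make_exception_hook(stream=sys.stderr):
    """Build a hook that writes a one-line summary plus the full traceback."""
    def hook(exctype, value, tb):
        stream.write(f"Caught {exctype.__name__}: {value}\n")
        traceback.print_exception(exctype, value, tb, file=stream)
    return hook

# Exercise the hook directly instead of crashing the interpreter
buffer = io.StringIO()
hook = make_exception_hook(buffer)
try:
    1 / 0
except ZeroDivisionError:
    hook(*sys.exc_info())

report = buffer.getvalue()
print(report.splitlines()[0])  # Caught ZeroDivisionError: division by zero

# To install it globally: sys.excepthook = make_exception_hook()
```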

tuples

Use tuples to return multiple values, and use tuples of argument types as lookup keys to implement multiple dispatch on a method's signature

https://www.tutorialspoint.com/how-to-concatenate-tuples-to-nested-tuples-in-python
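
Both uses of tuples can be sketched in a few lines. The `min_max` function and the `Combiner` class below are hypothetical examples; the dispatch table keyed by a tuple of types is one simple way to approximate multiple dispatch, not a standard-library feature.

```python
def min_max(values):
    """Return two results at once as a tuple."""
    return min(values), max(values)

low, high = min_max([3, 1, 4])  # tuple unpacking at the call site

class Combiner:
    """Dispatch on the tuple of argument types."""

    def __init__(self):
        # Keys are tuples of types; values are the bound methods to call
        self._handlers = {
            (int, int): self._add_numbers,
            (str, str): self._join_words,
        }

    def combine(self, a, b):
        signature = (type(a), type(b))
        try:
            handler = self._handlers[signature]
        except KeyError:
            raise TypeError(f"no handler for {signature}") from None
        return handler(a, b)

    def _add_numbers(self, a, b):
        return a + b

    def _join_words(self, a, b):
        return f"{a} {b}"

combiner = Combiner()
print(low, high)                           # 1 4
print(combiner.combine(2, 3))              # 5
print(combiner.combine("hello", "world"))  # hello world
```

Because the lookup uses exact types, subclasses (for example `bool` for `int`) are not matched; that is a deliberate simplification in this sketch.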

metaclasses

A metaclass can add behavior to a class, but a plain subclass is often enough. For example, this dict subclass (no metaclass required) adds dot notation to a dictionary:

class DotDict(dict):
    def __getattr__(self, attr):
        # Note: missing keys return None instead of raising AttributeError
        return self.get(attr)

    def __setattr__(self, key, value):
        self[key] = value

    def __delattr__(self, item):
        if item in self:
            del self[item]

which can be used like this:

d = DotDict()
d.foo = 'bar'  # equivalent to d['foo'] = 'bar'
print(d.foo)  # equivalent to print(d.get('foo'))

1st party vs 3rd party libraries

Look for a first party library first
If there is one, try it
If not, look for a third party library second
If there is one, try it
Otherwise, build one yourself
Open source it

2to3

2to3 is one good example of a good use of eval

Installation/Upgrading cpython

pylint

.pylintrc

Disable Reports and all non-error messages

[MESSAGES CONTROL]
disable=all
enable=E

[REPORTS]
reports=no

Bugs

cPython and Stdlib Tests

pytest

TDD

Best Practices

Use PytestReturnNotNoneWarning to mark tests that do not return None as outliers

Use `yield` based fixtures

yield splits a fixture into two phases:

Setup phase: Code before yield runs before the test.
Teardown phase: Code after yield runs after the test completes, even if it fails.

Keep Setup and Cleanup Together

import pytest

# acquire_resource() and release_resource() are placeholders for your own code
@pytest.fixture
def resource():
    # Setup
    obj = acquire_resource()
    yield obj
    # Cleanup
    release_resource(obj)

Use Context Managers when available

import os
import pytest

@pytest.fixture
def temp_file():
    # Write and close the file before yielding so the data is flushed to disk
    with open("temp.txt", "w") as f:
        f.write("data")
    yield "temp.txt"
    os.remove("temp.txt")

Use module or session scope for expensive fixtures

@pytest.fixture(scope="module")
def db_connection():
    conn = setup_db()
    yield conn
    conn.close()

Simplify Teardown

Use a mock to avoid the need for CRUD during teardown
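
A minimal sketch of that idea, using only the standard-library `unittest.mock`: the hypothetical `register_user` function writes through whatever repository object it is given, so the test can hand it a mock and nothing is ever created that would need deleting afterwards. All names here are assumptions made for illustration.

```python
from unittest import mock

def register_user(repo, name):
    """Code under test: persists a user through whatever repo it is given."""
    if not name:
        raise ValueError("name is required")
    return repo.insert({"name": name})

def test_register_user():
    # The mock stands in for the database layer, so no row is ever created
    repo = mock.Mock()
    repo.insert.return_value = 42

    assert register_user(repo, "ada") == 42
    repo.insert.assert_called_once_with({"name": "ada"})
    # Nothing to delete afterwards: there is no teardown phase at all

test_register_user()
```

The trade-off is that the real persistence layer goes untested here, so this pattern complements rather than replaces an integration test.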

Test WorkFlow

Config setup/teardown hooks

Comment code to ID mocks: pytester

Create fixture: mocker

Use fixture: pytest.mark.usefixtures

Capture stdout: capfd/capsys

Rerun failed tests: config.cache

Inspect: pytestconfig (fixture)

Optimize ENV: https://docs.pytest.org/en/stable/reference/reference.html#environment-variables

Examples

unittest lib

Mock Servers

jupyter

numpy

pandas

SQLite

ASCII Table to Markdown Table

#!/usr/bin/env python3
# ascii2md_sqlite.py

"""
Convert an ASCII table (pipe-delimited) into a Markdown table using a
structured intermediate based on sqlite3.

Constraints followed:
- Minimize ad-hoc string/regex parsing by operating on raw bytes for input tokenization.
- Use sqlite3 as the structured store for rows and columns.
- No regular expressions are used.
- Public API: convert_bytes_table(input_bytes) -> markdown_bytes
"""

from typing import List
import sqlite3

# ----- Helpers operating on bytes (minimizes string parsing) -----


def _split_row_bytes(row: bytes) -> List[bytes]:
    """
    Split a pipe-delimited row encoded as bytes into trimmed cell bytes.
    Leading/trailing pipes are tolerated.
    """
    # Drop one leading and one trailing pipe so they do not become empty edge cells.
    row = row.strip(b' ')
    if row.startswith(b'|'):
        row = row[1:]
    if row.endswith(b'|'):
        row = row[:-1]
    # Split on pipe byte, then strip ASCII spaces (32) from ends of each cell.
    parts = row.split(b'|')
    def trim(b: bytes) -> bytes:
        i, j = 0, len(b) - 1
        while i <= j and b[i] == 32:  # space
            i += 1
        while j >= i and b[j] == 32:
            j -= 1
        return b[i:j+1] if i <= j else b''
    return [trim(p) for p in parts]  # keep empty interior cells if present


def _is_separator_line(row: bytes) -> bool:
    """
    Heuristic: a separator line between header and body typically contains
    at least one '-' and only ascii '-', ' ', '|' characters.
    """
    if not row:
        return False
    allowed = set(b"- |")
    s = set(row)
    return s.issubset(allowed) and b'-' in row


# ----- Core converter using sqlite3 -----


def convert_bytes_table(input_bytes: bytes) -> bytes:
    """
    Convert an ASCII pipe-delimited table provided as bytes into a Markdown table (bytes).
    The function:
      - tokenizes rows using bytes operations
      - detects header separator line
      - stores body rows into an in-memory sqlite3 table
      - reads back rows and emits a Markdown table as bytes

    Example input (bytes):
      b"| ColA | ColB |\n|------|------|\n| a1   | b1   |\n| a2   | b2   |"

    Returns markdown bytes (cells padded to the column width, trailing newline added):
      b"| ColA | ColB |\n| ---- | ---- |\n| a1   | b1   |\n| a2   | b2   |\n"
    """
    # Normalize line endings to \n and split into lines (bytes)
    data = input_bytes.replace(b'\r\n', b'\n').replace(b'\r', b'\n')
    lines = data.split(b'\n')

    # Tokenize into rows of cells (bytes). Blank lines and separator lines are skipped.
    tokenized: List[List[bytes]] = []
    for ln in lines:
        if not ln.strip():
            continue
        if _is_separator_line(ln):
            continue
        cells = _split_row_bytes(ln)
        if not cells:
            continue
        tokenized.append(cells)

    if not tokenized:
        return b''

    # Separator lines were dropped above, so the first remaining row is the header
    # whether or not an explicit separator line was present.
    header = tokenized[0]
    body = tokenized[1:]

    ncols = max(len(header), max((len(r) for r in body), default=0))

    # Ensure header and body rows have consistent column counts by padding with empty bytes
    def _pad_row(r: List[bytes]) -> List[bytes]:
        return r + [b''] * (ncols - len(r))

    header = _pad_row(header)
    body = [_pad_row(r) for r in body]

    # Use sqlite in-memory DB to store rows with column names c0..cN-1
    conn = sqlite3.connect(':memory:')
    cur = conn.cursor()
    cols = ', '.join(f'c{i} TEXT' for i in range(ncols))
    cur.execute(f'CREATE TABLE tbl ({cols})')

    insert_q = f'INSERT INTO tbl VALUES ({", ".join("?" for _ in range(ncols))})'
    # The header is kept separate and is not stored in sqlite; only body rows go
    # into the table, which provides structured storage and allows later queries.
    for row in body:
        # Convert bytes to UTF-8 text safely; replace undecodable bytes
        row_text = [cell.decode('utf-8', errors='replace') for cell in row]
        cur.execute(insert_q, row_text)
    conn.commit()

    # Read rows back to generate markdown table
    # Build ASCII header line and separator
    header_texts = [h.decode('utf-8', errors='replace') for h in header]
    # Compute column widths based on header and body
    widths = [len(h) for h in header_texts]
    cur.execute('SELECT * FROM tbl')
    rows_out = cur.fetchall()
    for r in rows_out:
        for i, v in enumerate(r):
            if v is None:
                v = ''
            widths[i] = max(widths[i], len(str(v)))

    # Helper to format a row into markdown bytes
    def _format_row_texts(texts: List[str]) -> bytes:
        cells = []
        for i, t in enumerate(texts):
            padded = t + ' ' * (widths[i] - len(t))
            cells.append(padded)
        line = "| " + " | ".join(cells) + " |"
        return line.encode('utf-8')

    # Build separator using dashes at least 3 or width length
    sep_cells = []
    for w in widths:
        dash_count = max(3, w)
        sep_cells.append('-' * dash_count)
    sep_line = "| " + " | ".join(sep_cells) + " |"
    sep_bytes = sep_line.encode('utf-8')

    # Assemble output lines
    out_lines: List[bytes] = []
    out_lines.append(_format_row_texts(header_texts))
    out_lines.append(sep_bytes)
    for r in rows_out:
        texts = [str(v) if v is not None else '' for v in r]
        out_lines.append(_format_row_texts(texts))

    conn.close()
    return b'\n'.join(out_lines) + b'\n'


# ----- CLI utility for convenience -----


def convert_file_to_markdown(inpath: str, outpath: str) -> None:
    with open(inpath, 'rb') as f:
        data = f.read()
    md = convert_bytes_table(data)
    with open(outpath, 'wb') as f:
        f.write(md)


# ----- Example usage when run as script -----


if __name__ == "__main__":
    import sys
    if len(sys.argv) not in (2, 3):
        sys.stderr.write("Usage: ascii2md_sqlite.py INPUT_FILE [OUTPUT_FILE]\n")
        raise SystemExit(2)
    inp = sys.argv[1]
    out = sys.argv[2] if len(sys.argv) == 3 else None
    with open(inp, 'rb') as fh:
        b = fh.read()
    md = convert_bytes_table(b)
    if out:
        with open(out, 'wb') as fh:
            fh.write(md)
    else:
        sys.stdout.buffer.write(md)

Clipboard Manager

import clipboard
import sqlite3
import time

# Function to create the SQLite database and table
def create_db():
    conn = sqlite3.connect('clipboard.db')
    cursor = conn.cursor()
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS clipboard (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            content TEXT NOT NULL,
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
        )
    ''')
    conn.commit()
    conn.close()

# Function to store clipboard content in the database
def store_clipboard_content(content):
    conn = sqlite3.connect('clipboard.db')
    cursor = conn.cursor()
    cursor.execute('''
        INSERT INTO clipboard (content)
        VALUES (?)
    ''', (content,))
    conn.commit()
    conn.close()

# Monitor clipboard changes and store new content
def monitor_clipboard():
    create_db()
    previous_content = clipboard.paste()
    while True:
        current_content = clipboard.paste()
        if current_content != previous_content:
            store_clipboard_content(current_content)
            previous_content = current_content
        time.sleep(1)  # Check clipboard every second

if __name__ == "__main__":
    monitor_clipboard()
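
Once entries are stored, reading the history back is a single query. The sketch below uses an in-memory database with the same table layout as `create_db()` above so it runs on its own; the `recent_entries` helper is a hypothetical addition, not part of the script.

```python
import sqlite3

def recent_entries(conn, limit=5):
    """Return the newest clipboard rows as (id, content) tuples."""
    cursor = conn.execute(
        "SELECT id, content FROM clipboard ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()

# In-memory database with the same table layout as the script's clipboard.db
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE clipboard (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        content TEXT NOT NULL,
        timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
    )
    """
)
conn.executemany(
    "INSERT INTO clipboard (content) VALUES (?)",
    [("first",), ("second",), ("third",)],
)
latest = recent_entries(conn, limit=2)
print(latest)  # [(3, 'third'), (2, 'second')]
conn.close()
```

Ordering by `id` rather than `timestamp` avoids ties, since `CURRENT_TIMESTAMP` has one-second resolution.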

Simple Print of Dataframe

import pandas as pd

# Sample DataFrame
data = {'Column1': [1, 2, 3], 'Column2': [4, 5, 6]}
df = pd.DataFrame(data)

# Print the number of rows and the version of pandas
print("The number of rows is: %s and the pandas version is: %s" % (len(df), pd.__version__))

CSV Example

import os
import pandas as pd

def concatenate_csv_files(starting_value, file_path):
    # List all files in the given directory
    all_files = os.listdir(file_path)
    
    # Filter files that start with the specified value and end with .csv
    csv_files = [file for file in all_files if file.startswith(starting_value) and file.endswith('.csv')]
    
    # Sort the list of files by name
    csv_files.sort()

    # List to hold DataFrames
    dataframes = []
    
    # Read each CSV file and append the DataFrame to the list
    for file in csv_files:
        df = pd.read_csv(os.path.join(file_path, file))
        dataframes.append(df)
    
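    # Concatenate all DataFrames (pd.concat raises ValueError on an empty list)
    if not dataframes:
        print(f"No CSV files starting with {starting_value!r} found in {file_path}")
        return
    final_df = pd.concat(dataframes, ignore_index=True)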
    
    # Output the final DataFrame to a CSV file
    output_file = os.path.join(file_path, 'output.csv')
    final_df.to_csv(output_file, index=False)
    
    print(f"Concatenated {len(csv_files)} files into {output_file}")

# Example usage
concatenate_csv_files('data_', '/path/to/your/csv/files')

Stack/Queue

# ---
# title: "SQLite-stack/queue (Python) - Usage Notes"
# usage:
#   init: "Creates schema; call before enqueue/receive/delete"
#   enqueue: "Accepts JSON-serializable payloads; returns row id"
#   receive:
#     description: "FIFO receive with visibility timeout; returns receipt_handle"
#     params:
#       max_messages: "int"
#       visibility: "seconds"
#   delete: "Delete message by receipt_handle; returns bool"
#   change_visibility: "Adjust visibility for a held receipt_handle; returns bool"
#   peek: "Non-destructive peek at earliest n messages (ignores visibility)"
#   stats: "Returns counts: total, visible now, locked"
#   purge: "Delete all messages; useful for tests"
# concurrency:
#   notes: "WAL enabled; conditional WHERE clauses avoid races; single-writer pattern recommended under heavy load"
# auditability:
#   notes: "Persistent rows include payload, attempts, receipt_handle, timestamps for forensic inspection"
# ---

"""
SQLite-stack/queue (Python 3.10+ with pattern matching)

A compact Python module that uses SQLite as a durable stack/queue store to emulate
core SQS semantics: enqueue, receive (with visibility timeout), delete by
receipt handle, change visibility, and approximate stats. Uses a single table as
the persistent queue. Connections run in autocommit mode, so each statement is
atomic on its own and conditional WHERE clauses guard against double delivery.
Designed for localhost use, auditability, and easy inspection.

This version uses structural pattern matching (available since Python 3.10) for
the simple CLI driver and for interpreting command results in a concise, modern style.
"""

from __future__ import annotations

import sqlite3
import uuid
import json
import time
import sys
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Optional

DEFAULT_DB = "sqs_emu.db"
DEFAULT_VISIBILITY = 30  # seconds

_SCHEMA = """
CREATE TABLE IF NOT EXISTS queue (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    enqueued_at REAL NOT NULL,
    payload TEXT NOT NULL,
    visible_at REAL NOT NULL,
    receipt_handle TEXT NULL,
    attempts INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX IF NOT EXISTS idx_visible_at ON queue(visible_at);
"""

@dataclass
class Message:
    id: int
    payload: Any
    receipt_handle: Optional[str]
    attempts: int
    enqueued_at: float

@contextmanager
def conn_ctx(path: str = DEFAULT_DB):
    con = sqlite3.connect(path, isolation_level=None, detect_types=sqlite3.PARSE_DECLTYPES)
    try:
        con.execute("PRAGMA journal_mode=WAL;")
        con.execute("PRAGMA synchronous=NORMAL;")
        yield con
    finally:
        con.close()

def init(path: str = DEFAULT_DB) -> None:
    with conn_ctx(path) as c:
        c.executescript(_SCHEMA)

def _now() -> float:
    return time.time()

def enqueue(payload: Any, path: str = DEFAULT_DB) -> int:
    """Push a message (any JSON-serializable) onto the queue. Returns row id."""
    payload_text = json.dumps(payload, separators=(",", ":"))
    now = _now()
    with conn_ctx(path) as c:
        cur = c.execute(
            "INSERT INTO queue (enqueued_at, payload, visible_at) VALUES (?, ?, ?)",
            (now, payload_text, now),
        )
        return cur.lastrowid

def _reclaim_expired(con: sqlite3.Connection, now: Optional[float] = None) -> None:
    """Clear receipt_handle for expired messages so they become visible again."""
    if now is None:
        now = _now()
    con.execute(
        "UPDATE queue SET receipt_handle = NULL WHERE receipt_handle IS NOT NULL AND visible_at <= ?",
        (now,),
    )

def receive(max_messages: int = 1, visibility: int = DEFAULT_VISIBILITY, path: str = DEFAULT_DB) -> List[Message]:
    """
    Receive up to max_messages messages.
    Returns list of Message dataclasses: {id, payload, receipt_handle, attempts, enqueued_at}
    """
    now = _now()
    results: List[Message] = []
    with conn_ctx(path) as c:
        _reclaim_expired(c, now)

        cur = c.execute(
            "SELECT id, payload, attempts, enqueued_at FROM queue WHERE receipt_handle IS NULL AND visible_at <= ? ORDER BY id LIMIT ?",
            (now, max_messages),
        )
        rows = cur.fetchall()
        for row in rows:
            row_id, payload_text, attempts, enq = row
            receipt = str(uuid.uuid4())
            new_visible = now + visibility
            updated = c.execute(
                "UPDATE queue SET receipt_handle = ?, visible_at = ?, attempts = attempts + 1 WHERE id = ? AND receipt_handle IS NULL",
                (receipt, new_visible, row_id),
            ).rowcount
            if updated:
                results.append(
                    Message(
                        id=row_id,
                        payload=json.loads(payload_text),
                        receipt_handle=receipt,
                        attempts=attempts + 1,
                        enqueued_at=enq,
                    )
                )
        return results

def delete(receipt_handle: str, path: str = DEFAULT_DB) -> bool:
    """Delete message by receipt_handle. Returns True if deleted."""
    with conn_ctx(path) as c:
        cur = c.execute("DELETE FROM queue WHERE receipt_handle = ?", (receipt_handle,))
        return cur.rowcount > 0

def change_visibility(receipt_handle: str, visibility: int, path: str = DEFAULT_DB) -> bool:
    """Change visibility for the message holding this receipt_handle. Returns True if updated."""
    now = _now()
    new_visible = now + visibility
    with conn_ctx(path) as c:
        cur = c.execute("UPDATE queue SET visible_at = ? WHERE receipt_handle = ?", (new_visible, receipt_handle))
        return cur.rowcount > 0

def peek(n: int = 10, path: str = DEFAULT_DB) -> List[Dict[str, Any]]:
    """Non-destructive peek at earliest n messages (ignores visibility)."""
    with conn_ctx(path) as c:
        cur = c.execute("SELECT id, payload, visible_at, receipt_handle, attempts FROM queue ORDER BY id LIMIT ?", (n,))
        return [
            {"id": r[0], "payload": json.loads(r[1]), "visible_at": r[2], "receipt_handle": r[3], "attempts": r[4]}
            for r in cur.fetchall()
        ]

def stats(path: str = DEFAULT_DB) -> Dict[str, int]:
    """Return simple counts: total, visible now, locked."""
    now = _now()
    with conn_ctx(path) as c:
        total = c.execute("SELECT COUNT(*) FROM queue").fetchone()[0]
        visible = c.execute("SELECT COUNT(*) FROM queue WHERE visible_at <= ? AND receipt_handle IS NULL", (now,)).fetchone()[0]
        locked = c.execute("SELECT COUNT(*) FROM queue WHERE receipt_handle IS NOT NULL").fetchone()[0]
        return {"total": total, "visible": visible, "locked": locked}

def purge(path: str = DEFAULT_DB) -> None:
    """Delete all messages (useful for tests)."""
    with conn_ctx(path) as c:
        c.execute("DELETE FROM queue")
        c.execute("VACUUM")

# Simple CLI driver demonstrating pattern matching usage
def _print_message(msg: Message) -> None:
    print(f"id={msg.id} receipt={msg.receipt_handle} attempts={msg.attempts} enqueued_at={msg.enqueued_at}")
    print(json.dumps(msg.payload, indent=2))

def _handle_command(argv: Iterable[str]) -> int:
    args = list(argv)
    if not args:
        print("usage: <cmd> [args...]  (init|enqueue|receive|delete|peek|stats|purge)")
        return 1
    cmd, *rest = args
    match cmd:
        case "init":
            init()
            print("initialized")
            return 0
        case "enqueue":
            payload = json.loads(rest[0]) if rest else {"task": "noop"}
            rowid = enqueue(payload)
            print("enqueued id:", rowid)
            return 0
        case "receive":
            max_messages = int(rest[0]) if rest else 1
            visibility = int(rest[1]) if len(rest) > 1 else DEFAULT_VISIBILITY
            msgs = receive(max_messages=max_messages, visibility=visibility)
            match msgs:
                case []:
                    print("no messages")
                case _:
                    for m in msgs:
                        _print_message(m)
            return 0
        case "delete":
            if not rest:
                print("delete requires receipt_handle")
                return 2
            ok = delete(rest[0])
            print("deleted" if ok else "not found")
            return 0
        case "peek":
            n = int(rest[0]) if rest else 10
            items = peek(n=n)
            for it in items:
                print(json.dumps(it, indent=2))
            return 0
        case "stats":
            s = stats()
            print(json.dumps(s, indent=2))
            return 0
        case "purge":
            purge()
            print("purged")
            return 0
        case _:
            print("unknown command:", cmd)
            return 3

if __name__ == "__main__":
    raise SystemExit(_handle_command(sys.argv[1:]))

arguments

https://linux.die.net/diveintopython/html/power_of_introspection/optional_arguments.html

special methods

https://www.pythonlikeyoumeanit.com/Module4_OOP/Special_Methods.html

return values

https://bugs.python.org/issue30220

internal boxing and unboxing

https://mail.python.org/pipermail/python-dev/2015-March/138669.html

switch/case method dispatch table

Performance Profiling

Debugging Modules

Recipes

site-packages

String Formatting

Async

Decorators / Mixins

Patterns

Parsers

BeautifulSoup

import requests
from bs4 import BeautifulSoup
import random
import string

# Helper to generate random text
def random_text(length=8):
    return ''.join(random.choices(string.ascii_letters + string.digits, k=length))

# Start session and fetch page
session = requests.Session()
response = session.get("https://example.com")
soup = BeautifulSoup(response.text, "html.parser")

# Find the first form
form = soup.find("form")
if not form:
    raise ValueError("No form found")

# Extract form action and method
action = form.get("action")
method = form.get("method", "get").lower()

# Prepare form data
form_data = {}
for input_tag in form.find_all("input"):
    name = input_tag.get("name")
    input_type = input_tag.get("type", "text")
    
    if name in ["user", "username"]:
        form_data[name] = random_text()
    elif name == "password":
        form_data[name] = random_text()
    elif name == "captcha":
        form_data[name] = random_text()
    elif input_type == "submit":
        submit_name = name or "submit"
        submit_value = input_tag.get("value", "Submit")
        form_data[submit_name] = submit_value
    elif name:
        # Preserve default value if present; inputs without a name are not submitted
        form_data[name] = input_tag.get("value", "")

# Submit the form
target_url = requests.compat.urljoin(response.url, action)
if method == "post":
    submit_response = session.post(target_url, data=form_data)
else:
    submit_response = session.get(target_url, params=form_data)

print("Form submitted. Response status:", submit_response.status_code)

Quiz

Multiprocessing

Lazy Eval

JWT

WSGI/ASGI

Callables

SSI

Sample Code

References

ipinfo API Design

use Geo::IPinfo;
my $access_token = 'your_api_token_here';
my $ipinfo = Geo::IPinfo->new($access_token);
my $ip_address = '216.239.36.21';
my $details = $ipinfo->info($ip_address);
my $city = $details->city;  # Emeryville
my $loc = $details->loc;    # 37.8342,-122.2900

ASCII Art

+-------------------+       +-------------------+       +-------------------+
|                   |       |                   |       |                   |
|   Cron Job        | ----> |   Shell Script    | ----> |   Boto3 Script    |
|                   |       |                   |       |                   |
+-------------------+       +-------------------+       +-------------------+
        |                           |                           |
        |                           |                           |
        |                           |                           |
        v                           v                           v
+-------------------+       +-------------------+       +-------------------+
|                   |       |                   |       |                   |
|   Execute Shell   | ----> |   Execute Boto3   | ----> |   Interact with   |
|   Script          |       |   Script          |       |   SQLite Database |
|                   |       |                   |       |                   |
+-------------------+       +-------------------+       +-------------------+

Prompt

Other than lists, strings, tuples, dictionaries, integers, classes, functions, and metaclasses, what other data structures are in Python core without the use of module import statements?

Utility Functionality

lib 2to3

https://github.com/python/cpython/issues/84540

Definition

def run_if_main():
    # Your code here
    print("This code runs only if the script is executed directly.")

if __name__ == "__main__":
    run_if_main()

Usage

# another_module.py

from my_utils import run_if_main

# Other code here

# Call the utility function
run_if_main()
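
One caveat with the utility as written: the `__name__` check lives in `my_utils`, so it reflects how `my_utils` was loaded, and `run_if_main()` itself runs whenever it is called. A possible variant, sketched here with a hypothetical signature, takes the caller's `__name__` and the function to run as arguments:

```python
def run_if_main(module_name, func, *args, **kwargs):
    """Call func only when module_name is "__main__".

    Returns func's result, or None when the caller was imported.
    """
    if module_name == "__main__":
        return func(*args, **kwargs)
    return None

def main():
    return "ran as script"

# A caller would normally pass its own __name__; literal values are used
# here so that both branches are visible.
print(run_if_main("__main__", main))        # ran as script
print(run_if_main("another_module", main))  # None
```

In a real module the call would read `run_if_main(__name__, main)`.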

Gotchas

Non-determinism

Most nondeterministic aspect of Django: Database operations when using Django ORM.
High level of abstraction in Django ORM can lead to unexpected behaviors.
Differences in database backends can cause nondeterminism.
Lazy evaluation can lead to unpredictable results.
Handling of transactions and concurrency in Django ORM is not always straightforward.
Order of query results can be nondeterministic if not explicitly ordered.
Operations like save() and delete() can have side effects that are not immediately obvious.
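
The ordering point is not specific to Django: SQL gives no row-order guarantee without ORDER BY, and in the ORM the corresponding fix is an explicit `order_by()`. The sketch below uses the standard-library `sqlite3` module rather than Django so that it runs on its own; the table and column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO member (name) VALUES (?)",
    [("carol",), ("alice",), ("bob",)],
)

# Without ORDER BY the database may return rows in any order,
# so code and tests must not depend on the sequence.
unordered = conn.execute("SELECT name FROM member").fetchall()

# With ORDER BY (plus a unique tie-breaker) the sequence is fully determined.
ordered = conn.execute("SELECT name FROM member ORDER BY name, id").fetchall()
print(ordered)  # [('alice',), ('bob',), ('carol',)]

# Comparing as sets is a safe way to check unordered results
unordered_names = {row[0] for row in unordered}
conn.close()
```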

Interceptors

Modules

API	Description	Example Usage
Proxy API (`Proxy` class)	Intercepts object property access and function calls.	`proxy = Proxy(target, handler)`
Decorator Pattern (`@decorator`)	Wraps functions to modify behavior before execution.	`@interceptor def function(): pass`
Monkey Patching (`module.function = custom_function`)	Overrides built-in functions dynamically.	`sys.stdout.write = interceptor_function`
Event System (`observer pattern`)	Captures and modifies events dynamically.	`event.listen(object, event, callback)`
Reflect API (`inspect` module)	Provides fine-grained control over object attributes.	`inspect.getmembers(target)`
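
Of the approaches in the table, the decorator is the easiest to show in isolation. This is a minimal sketch: the `interceptor` name matches the table's example, while the `calls` attribute used to record invocations is an assumption made for the demonstration.

```python
import functools

def interceptor(func):
    """Wrap func so every call is recorded before it runs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls.append((args, kwargs))  # runs before the real function
        return func(*args, **kwargs)
    wrapper.calls = []
    return wrapper

@interceptor
def add(a, b):
    return a + b

print(add(1, 2))      # 3
print(add(a=3, b=4))  # 7
print(add.calls)      # [((1, 2), {}), ((), {'a': 3, 'b': 4})]
print(add.__name__)   # add
```

`functools.wraps` keeps the wrapped function's name and docstring, which matters for logging and debugging.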

psycopg2

API	Description	Example Usage
Type Casters (`register_type`)	Registers a typecaster that controls how PostgreSQL values are converted to Python objects.	`psycopg2.extensions.register_type(custom_type)`
Logging Connections (`psycopg2.extras.LoggingConnection`)	Logs each SQL statement as it is executed.	`conn = psycopg2.connect(dsn, connection_factory=LoggingConnection); conn.initialize(logger)`
Custom Cursors (`cursor_factory`)	Wraps cursor executions for interception.	`conn.cursor(cursor_factory=CustomCursor)`
Execute Interceptors (cursor subclass)	Modifies queries before sending them to the database.	`class CustomCursor(psycopg2.extensions.cursor): def execute(self, query, vars=None): ...`

Doc

https://www.psycopg.org/docs/advanced.html

SQLAlchemy

Session Lifecycle

from sqlalchemy.orm import sessionmaker
from contextlib import contextmanager

@contextmanager
def session_scope(engine):
    Session = sessionmaker(bind=engine)
    session = Session()
    try:
        yield session
        session.commit()
    except:
        session.rollback()
        raise
    finally:
        session.close()

Events

API	Description	Example Usage
Event Listeners (`event.listen`)	Hooks into query execution to modify SQL behavior.	`event.listen(engine, "before_execute", callback)`
Session Events (`before_flush`)	Modifies transactions before committing.	`event.listen(Session, "before_flush", callback)`
Hybrid Properties (`@hybrid_property`)	Intercepts attribute access in ORM models.	`@hybrid_property def modified_attr(self): return self.value * 2`
Custom Query Classes (`Query` subclass)	Overrides query methods for controlled execution.	`class CustomQuery(Query): def filter(self, *criterion): return super().filter(*criterion)`
Reflection (`inspect` module)	Enables deep introspection of ORM models.	`inspect(User).columns`

Tool	Use Case
TypeScript	Static type checking in JavaScript applications
mypy	Static type checking for Python
zod	Runtime validation for TypeScript
pydantic	Runtime validation for Python
Hypothesis	Property-based testing in Python
fast-check	Property-based testing in JavaScript
Rust	Strict type enforcement at compile time
Haskell	Strong type safety through functional programming

Session Debugging

https://stackoverflow.com/questions/69307554/why-would-my-pytest-tests-hang-before-dropping-my-sqlalchemy-db

Design Patterns

Abstract Factory

Python Module	Date Created
abc	2008

Builder

Python Module	Date Created
dataclasses	2017

Factory Method

Python Module	Date Created
abc	2008
multiprocessing	2008

Prototype

Python Module	Date Created
copy	2001
pickle	1994
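
A short sketch of the Prototype idea with the `copy` module: new objects are made by cloning an existing one rather than by calling the constructor again. The `Report` class is a hypothetical example.

```python
import copy

class Report:
    def __init__(self, title, sections):
        self.title = title
        self.sections = sections

prototype = Report("Template", ["intro", "summary"])

shallow = copy.copy(prototype)   # new object, shares the sections list
deep = copy.deepcopy(prototype)  # new object, independent sections list

prototype.sections.append("appendix")
print(shallow.sections)  # ['intro', 'summary', 'appendix']
print(deep.sections)     # ['intro', 'summary']
```

`deepcopy` is usually the safer choice when the prototype holds mutable state.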

Singleton

Python Module	Date Created
threading.local	2003
logging.getLogger	1996
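
The singleton-like behavior of `logging.getLogger` is easy to observe: repeated calls with the same name return the same object. The logger names below are arbitrary.

```python
import logging

first = logging.getLogger("app.db")
second = logging.getLogger("app.db")
other = logging.getLogger("app.web")

print(first is second)  # True: same name, same Logger object
print(first is other)   # False: a different name gets its own logger
```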

Adapter

Python Module	Date Created
functools.wraps	2007

Bridge

Python Module	Date Created
abc	2008

Composite

Python Module	Date Created
xml.etree.ElementTree	2001

Decorator

Python Module	Date Created
functools.wraps	2007

Facade

Python Module	Date Created
os	1991
shutil	1999

Flyweight

Python Module	Date Created
sys.intern	1991
functools.lru_cache	2011
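
As a rough illustration of Flyweight with `functools.lru_cache`, the sketch below shares one immutable object per distinct argument instead of rebuilding it. The `Glyph` type and `glyph` function are hypothetical.

```python
import functools
from collections import namedtuple

Glyph = namedtuple("Glyph", ["char", "byte_width"])

@functools.lru_cache(maxsize=None)
def glyph(char):
    """Build the shared object for one character (pretend this is costly)."""
    return Glyph(char, len(char.encode("utf-8")))

a1 = glyph("a")
a2 = glyph("a")  # served from the cache, not rebuilt
b = glyph("b")

info = glyph.cache_info()
print(a1 is a2)     # True
print(info.hits)    # 1
print(info.misses)  # 2
```

The cached objects should be immutable, since every caller receives the same instance.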

Proxy

Python Module	Date Created
urllib.request.ProxyHandler	1995

Chain of Responsibility

Python Module	Date Created
logging	1996

Command

Python Module	Date Created
cmd	2000
argparse	2007

Interpreter

Python Module	Date Created
ast	2006

Iterator

Python Module	Date Created
collections.abc.Iterator	2008
itertools	2001

Mediator

Python Module	Date Created
socketserver	1996

Memento

Python Module	Date Created
pickle	1994
json	2002

Observer

Python Module	Date Created
logging	1996

State

Python Module	Date Created
enum	2014

Strategy

Python Module	Date Created
operator	1994
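
With the `operator` module, a strategy can be just a function stored in a dictionary. This sketch assumes a hypothetical `combine` helper that picks the strategy by name.

```python
import operator

# Each strategy is a plain callable taking two numbers
STRATEGIES = {
    "sum": operator.add,
    "product": operator.mul,
    "difference": operator.sub,
}

def combine(a, b, strategy):
    """Apply the named strategy; unknown names raise ValueError."""
    try:
        func = STRATEGIES[strategy]
    except KeyError:
        raise ValueError(f"unknown strategy: {strategy!r}") from None
    return func(a, b)

print(combine(6, 3, "sum"))         # 9
print(combine(6, 3, "product"))     # 18
print(combine(6, 3, "difference"))  # 3
```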

Template Method

Python Module	Date Created
abc	2008

Visitor

Python Module	Date Created
ast.NodeVisitor	2006
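
`ast.NodeVisitor` dispatches to a `visit_<NodeType>` method for each node it walks. The sketch below collects the names of functions called in a snippet of source code; the `CallCollector` class and the sample source are made up for the example.

```python
import ast

class CallCollector(ast.NodeVisitor):
    """Collect the names of all plain function calls in the source."""

    def __init__(self):
        self.called = []

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name):
            self.called.append(node.func.id)
        self.generic_visit(node)  # keep walking into nested calls

tree = ast.parse("print(len(items))\nresult = sorted(items)")
collector = CallCollector()
collector.visit(tree)
print(collector.called)  # ['print', 'len', 'sorted']
```

Calling `generic_visit` inside the handler is what lets the visitor reach `len` nested inside `print`.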

PyPa - sgml/signature GitHub Wiki

