Python_Basics - RicoJia/notes GitHub Wiki

========================================================================

PR Review

========================================================================

  1. If you feel like there are a lot of params, put them in a kwargs dict
  2. If you have a diagnostic "xray" script that needs to be squished into a unit test file, add a switch so it won't run automatically
  3. Avoid "double negatives" like disable_something = False

========================================================================

Basics

========================================================================

  1. cache invalidation, naming things, and off-by-one errors.

    • As the joke goes, the two toughest problems in CompSci are cache invalidation, naming things, and off-by-one errors.
  2. Scope, see code

    • test if scope
    • create_global, read_global
  3. For loop

  4. Except

    1. Catching Exception (instead of a bare except:) lets KeyboardInterrupt propagate, since it derives from BaseException, not Exception. see code
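A minimal sketch of the difference (the function name is illustrative):

```python
# Catching Exception (rather than a bare except:) lets KeyboardInterrupt
# propagate, because KeyboardInterrupt derives from BaseException, not Exception.
def handle(exc):
    try:
        raise exc
    except Exception:
        return "caught"

print(handle(ValueError("boom")))   # caught

try:
    handle(KeyboardInterrupt())
except KeyboardInterrupt:
    print("propagated")             # KeyboardInterrupt escaped the handler
```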
  5. Naming: __var introduces "name-mangling"

    • externally, __var will adopt a different name (with _classname at the front), to avoid child class's naming conflicts
      class Foo:
          def __init__(self): 
              self.__bar = 23
          def print_bar(self):
              print(self.__bar)
              
      f = Foo()
      f.print_bar()  #fine
      print(f.__bar)  # AttributeError: it's been mangled to _Foo__bar (check dir(f))
    • single-underscore names like _func are skipped by import *; double-underscore attributes like self.__func are protected by name mangling
  6. help() is a really good function

  7. line profiler https://github.com/pyutils/line_profiler

    • Cprofile, pstats
      import cProfile, pstats
      
      def run_test():
          pr = cProfile.Profile()
          pr.enable()
          BT = BlobTracker(algorithm_config=data["algorithm_config"],
                           zone_geometry=data["zone_geometry"],
                           on_entry=lambda *args, **kw: None,
                           on_exit=lambda *args, **kw: None,
                           get_image_callback=lambda *args, **kw: None,
                           debug=False)
          for (timestamp, heatmap) in data["heatmaps"]:
              BT.update(heatmap, timestamp)
          pr.disable()
          stats = pstats.Stats(pr)
          stats.dump_stats('blob_tracker_test.prof')
      • mprof run roslaunch ..., or use a launch prefix (<node pkg="my_package" type="my_node" name="my_node_instance" launch-prefix="mprof" />) to profile a single node
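Since the snippet above depends on BlobTracker, here is a self-contained cProfile/pstats sketch (the workload function is made up) that runs as-is:

```python
import cProfile
import io
import pstats

def work():
    # deliberately cheap workload so the profile is quick
    return sum(i * i for i in range(10_000))

pr = cProfile.Profile()
pr.enable()
work()
pr.disable()

# sort by cumulative time and print the top entries
s = io.StringIO()
pstats.Stats(pr, stream=s).sort_stats("cumulative").print_stats(3)
print(s.getvalue()[:200])
```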
  8. PySimpleGUI

  9. math

Philosophy

  1. flaws of the python repo:
    1. so many files share the same name; if you're not familiar with the repo, you'll be easily confused. Prefer Repo_Module.py

========================================================================

Keywords and Operators

========================================================================

  1. del, see code

  2. return keyword: automatically return none

  3. with keyword

    • a context manager has __enter__ and __exit__
    • with first calls __enter__, then the code inside, then __exit__
  4. see code

    • test is
  5. Keyword in:

    1. test if a container has a value:
      a = [1,2,3]
      3 in a
    2. iterate over a container: for i in ls
  6. any see if anything in an iterable is true

    ls = [1,2,3,4,5,5,4]
    any(num>10 for num in ls)
  7. / and //

    • / always gives a float: 17/5 == 3.4
    • // gives the floored integer: 17//5 == 3
  8. and boolean and. & bitwise. not boolean not

    • (100 and 150) == 150 is True: and returns its last operand when all operands are truthy.
  9. Precedence: and, or have different precedence!

    • ==
    • not
    • and
    • or
      • (1 or 3) evaluates to 1 (or returns the first truthy operand). Weird logic, don't write it this way!
  10. Assignment =

    • Different from C++ references: re-assigning a name only rebinds that name; the originally referenced object doesn't change. see code

    • Function pass by reference: link

      • variables are passed by object reference ("pass by assignment").
      • However, the caller only sees changes made to mutable objects; rebinding the parameter has no outside effect
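The mutation-vs-rebinding rule above in a few lines (function names are illustrative):

```python
def mutate(ls):
    ls.append(4)      # mutating the object: visible to the caller

def rebind(ls):
    ls = [0]          # rebinding the local name: invisible to the caller

a = [1, 2, 3]
mutate(a)
rebind(a)
print(a)              # [1, 2, 3, 4]
```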

========================================================================

Functions

========================================================================

  1. args

    • (arg:type) called "type hints", won't be enforced.
    • Syntax:
      def headline(text: str, align: bool = True) -> str:   #note, there's a hint for return type as well
    • style:
      1. space around =
      2. space around ->
      3. no space before :, space after :
    • any number of args using *args, **kwargs TODO: https://www.geeksforgeeks.org/args-kwargs-python/
    • * collects the remaining positional args into a tuple, so you can iterate over them
    • *args is basically a tuple of the extra positional arguments
      def func(arg1, *args):
          for i in args:
              print(i)      # prints "haha", then "lal"
      func(1, "haha", "lal")
    • kwargs, see code
      1. kwargs is just a dictionary
    • Typically args.flag defaults to False
    • test_default_arg():
      1. use immutables only as default args. mutables, such as list, is shared with future function calls, hence they can be changed
        • This is really tricky!
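The mutable-default pitfall above can be reproduced directly (function names are illustrative):

```python
def append_bad(item, ls=[]):      # default list is created ONCE, at def time
    ls.append(item)
    return ls

def append_good(item, ls=None):   # use an immutable sentinel instead
    if ls is None:
        ls = []
    ls.append(item)
    return ls

append_bad(1)
print(append_bad(2))     # [1, 2]  -- the default list is shared across calls
append_good(1)
print(append_good(2))    # [2]
```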
  2. test_control_flow():

    1. decorator that wraps a generator function, which launches an output queue
      • test() is the generator class here
      • send(None) to start generator
    2. Return Async, which takes in lambda as a computation func and a callback
  3. nested function and nonlocal

    • nested function can access members
      class Foo:
          def __init__(self):
              self.var = 100
          def bar(self):
              def baz():
                  self.var = 999
              baz()
      
      f = Foo()
      f.bar()
      print(f.var)
  4. closure

  • once data is passed into a function object, its value gets saved. Three conditions must be met:
    1. you have a nested function inside an outer func
    2. the outer func returns the nested function as an object
    3. the nested func accesses a variable of the outer func
  • e.g,
    def outer_func(msg): 
      def print_msg(): 
        print(msg)      #nested function can access variable from the outer function. By-default, it's read-only
      return print_msg
    
    another_func = outer_func("hola")
    del outer_func    # Everything in python is an object, a function object here
    another_func()    # see "hola", which is saved, even tho the original outer_func is deleted
  • When to use closure:
    • when you need to save some input data, but just need one function, this is better than class
      func1 = outer_func("hola")
      func2 = outer_func("holi")
  • You can see what's in the closure:
    func1.__closure__   # returns a closure attribute object
    func1.__closure__[0].cell_contents   # see "hola"
  1. decorators:

    1. New examples
      1. A decorator does not execute a function with extra args; it's a function that returns a wrapped function
        • This is called "metaprogramming": writing a program that modifies another program
        • decorators heavily use closures and return a callable
        • in Python, an object with __call__ is a callable
      2. func(a)(b) is actually calling a nested function
        • a decorator is just syntactic sugar for Foo = decorate(Foo) (if decorate takes Foo), or Foo = decorate(arg1)(Foo) (if the nested func takes Foo and decorate takes arg1)
      3. With functools.wraps, we get:
        1. __name__ being "some_func" instead of "wrapper"
        2. access to the wrapped function via __wrapped__
      4. Of course wraps is optional; a decorator works without it
      5. chaining decorators: the topmost decorator's wrapper runs first, like func1(func2(...))
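A runnable sketch of the decorator + functools.wraps points above (decorator and function names are made up):

```python
import functools

def shout(func):
    @functools.wraps(func)          # copies __name__, __doc__, and sets __wrapped__
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper

@shout                              # sugar for: greet = shout(greet)
def greet(name):
    return f"hola {name}"

print(greet("rico"))                # HOLA RICO
print(greet.__name__)               # greet (would be 'wrapper' without wraps)
print(greet.__wrapped__("rico"))    # hola rico (the original, unwrapped function)
```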
  2. partial, see code

    1. How it works: (apart from kwargs support). partial is returning a wrapper with extended args
    2. equivalent implementation
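partial and a rough hand-rolled equivalent (a sketch that ignores some edge cases; names are illustrative):

```python
from functools import partial

def power(base, exp):
    return base ** exp

square = partial(power, exp=2)      # freeze exp=2
print(square(5))                    # 25

# roughly equivalent implementation: a wrapper with extended args
def my_partial(func, *frozen_args, **frozen_kwargs):
    def wrapper(*args, **kwargs):
        return func(*frozen_args, *args, **{**frozen_kwargs, **kwargs})
    return wrapper

cube = my_partial(power, exp=3)
print(cube(2))                      # 8
```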
  3. test_closure_as_class():

    1. You can make a fake class by using a function:
      1. In the current function, you can get the functions using sys._getframe(1).f_locals
      2. You can add attributes to the class by adding to self.dict
      3. callable(value) to see if the value is a callable
    2. This method can be a bit faster than a conventional class because a plain function skips the self attribute lookups

========================================================================

OOP

========================================================================

  1. New code

    • test_sort_by_attr
      1. by default, sorted is ascending order. operator.attrgetter
    • test_abstract_method
      1. an abstract method is a parent-class method with no implementation; a parent class with at least one abstract method is an abstract class, same as in C++
      2. the class must inherit from ABC, else there is no error on instantiation
        • that's why @abstractmethod alone "doesn't seem to do anything": without ABC, it isn't enforced
  2. Making private variables with a _ prefix (not really "private"; Python doesn't have private members)

    class MyClass: 
        def __init__(self): 
          self._temperature = 123
    • Also, self._temperature and self.temperature are two different variables.
  3. property

    • motivation: you need getter and setter functions, but you've had a lot of code that gets and sets a variable directly. property will make those direct getting and setting go thru the new getter and setter functions
    • e.g,
      class foo: 
          def get_property(self): 
              # just return something
              return 100
          def set_property(self, val): 
              # will be called when setting something
              pass
          temperature = property(get_property, set_property)
          
      f = foo()
      f.temperature = 99
      print(f.temperature)
    • if you want to use the @property, (@ is called a decorator)
      • we can reuse the same function name as the variable, so property can create the property object accordingly
        # Making Getters and Setter methods
        class Celsius:
            def __init__(self, temperature=0):
                self.temperature = temperature   # goes through the setter below
        
            def to_fahrenheit(self):
                return (self._temperature * 1.8) + 32
        
            # getter method
            @property
            def temperature(self):      #note: the name needs to be changed
                print("getter")
                return self._temperature
        
            # setter method
            @temperature.setter
            def temperature(self, value):   # name needs to match the outside variable
                print("setter")
                if value < -273.15:
                    raise ValueError("Temperature below -273.15 is not possible.")
                self._temperature = value
        
        human = Celsius(37) #in init we're already calling the setter method
        print(human.temperature)    # calls the temperature getter
        # print(human.to_fahrenheit())  # reads _temperature directly
        human.temperature = -12    # calls the temperature setter
      • READ ONLY properties (getter only functions) are great for encapsulation
        @property
        def entries(self) -> dict:
            return self._entries
        for bounds in self.entries.values()
  4. You can add an attribute to an instance even after instantiation, for more flexibility

    class foo:
        pass
    
    f = foo()
    f.temperature = 800
    print(f.temperature)
  5. Static Method

    class WindowManager(object):
        def get_start_goal(self, mp):
            WindowManager.show_map(mp, 1)
    
        @staticmethod
        def show_map(mp: np.ndarray, wait_time=0):
            cv2.imshow(WINDOW_NAME, mp)
            cv2.waitKey(wait_time)
    WindowManager.show_map(some_random_map, 1)

Inheritance

  1. super() gives access to the parent class; call super().__init__() to run the parent ctor

    class BagWriter01(BagWriter):
        def __init__(self, stream):
            super(BagWriter01, self).__init__(stream)       # BagWriter's __init__
  2. New-style class: in Python 3 there's no difference. In Python 2 it unified built-in types and user-defined types

    #new style
    class Obj(object):
    
    #old style
    class Obj:
  3. Base class function can call child class functions.

    # Parent class calls function in derived class
    class A:
        def foo(self):
            self.bar()
    
        def bar(self):
            print("from A")
    
    class B(A):
        def foo(self):
            super().foo()
    
        def bar(self):
            print("from B")
    
    B().foo()   #calls "from B"

Misc

  1. Helper functions:
  • isinstance: checks if an object is an instance of class/type
    isinstance(some_list, list) #checks if this is a list
  1. see a class internal functions using dir:
class Bday: 
  pass    #pass works in class too
bday = Bday()
print(dir(bday))    # this will print __class__, __init__....
#if Bday is an iterable, you will see __iter__ and __next__

========================================================================

Resource Management

========================================================================

  1. context manager:
    1. motivation: we want to open a file and close the file automatically. File is a resource, you can't open so many files at the same time.
      • can be used for opening/closing a connection as well.
    2. contextmanager can manage that:
      class ContextManager():
        def __init__(self):
          ...   # initialize the context manager
        def __enter__(self):
          ...   # open a file; the return value is bound by `as`
        def __exit__(self, exc_type, exc_value, traceback):
          ...   # close the file
      
      with ContextManager() as manager:   # this creates the manager, then triggers __enter__()
        # stuff AFTER __enter__()
    3. use contextmanager decorator so any function following the following pattern will act as a context manager:
      from contextlib import contextmanager
      @contextmanager
      def some_func():
        #stuff you'd put into __enter__
        yield     #this is like a separator
        #stuff you'd put into __exit__
      with some_func() as manager:
        #stuff
      • the object you yield is what `as` binds to:
        def some_func():
          #...
          yield SOME_OBJ
          #...
        with some_func() as manager:  # manager is SOME_OBJ
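The @contextmanager pattern above as a runnable sketch (the resource name is made up; try/finally plays the __exit__ role even on exceptions):

```python
from contextlib import contextmanager

@contextmanager
def managed(name):
    print(f"open {name}")        # the __enter__ half
    try:
        yield name.upper()       # the yielded value is bound by `as`
    finally:
        print(f"close {name}")   # the __exit__ half, runs even on exceptions

with managed("log") as handle:
    print(handle)                # LOG
```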

========================================================================

Iterables

========================================================================

  1. "Iterable": everywhere, like list, tuple, etc. see code

    1. You have to get an iterator, Then you can do next()
    2. or for i in calls the iterator inside an iterable
    3. Once we have reached the bottom of an iterator, raises StopIteration
    4. you must have __iter__ and __next__ for iterators
  2. Generator: an iterator that can be consumed only once. see code

    1. Use yield, which is like return but makes the function produce a generator object, iterable only once
      • values are not stored in memory all at once; they are generated on the fly
    2. When exhausted, it raises StopIteration
    3. So use for i in ...; the for loop handles StopIteration for you
    4. Design Patterns:
      1. Good for stuff that's generated indefinitely, real time
      2. Good for search, which decouples search process from the upper stream code
    5. Elegant way to reduce and transform data. max(), sum(), join(), no need to create a list
      • alternatively, min(dic, key = ...)
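The single-use and reduce points above in a small sketch (the generator name is illustrative):

```python
def countdown(n):
    while n > 0:
        yield n                   # values are produced lazily, one at a time
        n -= 1

gen = countdown(3)
print(next(gen))                  # 3
print(list(gen))                  # [2, 1]  -- the rest; a generator is single-use
print(list(gen))                  # []      -- already exhausted
print(sum(countdown(100)))        # 5050, reduced without building a list
```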

Functions that supports iterables

  1. higher order functions:
    • map: apply a transform to the old iterable, and store in a new iterable
    • filter
      • apply a predicate to the old iterable, and keep the "true" ones in a new iterable
        filter(function, sequence)
  2. unpack, see code
    • star expression

========================================================================

Data Types

========================================================================

Basic Types

  1. msg_buf = b'' this is bytes.

    • 'A' != b'A': in Python 3, bytes is not the same as str. b'' is a bytes literal, used to represent low-level data
    • number to bytes: (1024).to_bytes(number_of_bytes, BYTEORDER), where BYTEORDER is "little" or "big"
    • Concatenate raw bytes:
      raw_bytes = b''
      raw_bytes += (1024).to_bytes(num_of_bytes, BYTEORDER)
      
    • 0xff is the same as 0xFF in Python
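The to_bytes round trip and concatenation above, runnable:

```python
raw = (1024).to_bytes(2, "big")       # big-endian, 2 bytes
print(raw)                            # b'\x04\x00'
print(int.from_bytes(raw, "big"))     # 1024

buf = b""
buf += raw + (1).to_bytes(1, "big")   # concatenation just appends raw bytes
print(len(buf))                       # 3
```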
  2. the ptr of the same object may change

    • id: each python object has an id. When a variable is assigned a new value, it becomes a different object.
      >>> x = 888
      >>> hex(id(x))
      '0x107550050'
      >>> x = 888
      >>> hex(id(x))
      '0x10736aef0'
    • exception: small ints in [-5, 256] are cached singletons, so their ids won't change. None is a singleton too
    • python tries to abstract away pointers, as exposing them is against the zen of python. Python also favors usability over speed
    • modules in Python are singletons
  3. time complexity: list find and pop(0) are O(n) (pop() from the end is O(1)); set membership is O(1)

  4. mutables & immutables

    • Immutables: data inside cannot change
      1. int
      2. str
      3. tuple
      4. frozenset
    • Mutables: data can change over time
      1. list
      2. set
      3. dict
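The distinction in a few lines:

```python
t = (1, 2, 3)
try:
    t[0] = 9                      # tuples are immutable
except TypeError:
    print("tuples are immutable")

ls = [1, 2, 3]
ls[0] = 9                         # lists are mutable
print(ls)                         # [9, 2, 3]

s = "abc"
print(s.upper(), s)               # str methods return NEW objects; s is unchanged
```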

String

  1. misc see code

    • ljust
    • str.split
    • if str.endswith("s")
    • str_.center()
    • test_or():
      1. or returns the first "truthy" operand, or the last operand if none are truthy. It "short-circuits": evaluation stops at the first truthy value
        • so if "sdf" is the first "truthy" value, it is returned
      2. and returns the first "falsy" operand, or the last operand if all are truthy
      3. great for handling corner cases, where you might have None or an empty list
    • startswith
      1. you can search for strings that start with one of several prefixes, but you need to pass a tuple
  2. Read pickled string:

    input_dict = pickle.loads(payload)

List

  1. create a list with n elements: ls = ["ls1"] * 10
    • [None] * 3 gives [None, None, None]
    • But with mutables you create references to the same object: ls = [[]] * 2 holds 2 refs to the same list, so ls[0].append(1) gives [[1], [1]]
    • To avoid this, use ls = [[] for _ in range(2)]
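The shared-reference pitfall, runnable:

```python
rows = [[]] * 2                   # two references to the SAME inner list
rows[0].append(1)
print(rows)                       # [[1], [1]]

rows = [[] for _ in range(2)]     # two independent lists
rows[0].append(1)
print(rows)                       # [[1], []]
```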

  2. Loop through multiple lists at the same time

    ls1 = [1,2,3]
    ls2 = [4,5,6,7]
    for i, j in zip(ls1, ls2): 
      print (i,j)   #you'll see 1,4; 2,5; 3,6; 
  3. Indexing list[:2][::-1]

    • search for index list.index("CONTENT")
    • ls[::-1]
  4. Operations:

    • add items: list.append(item)
      • to concatenate, don't do list.append(another_list) (that nests a list); do list += another_list
      • insert at the beginning: ls.insert(0, item)
      • difference between append and +=: append adds one element; += extends with an iterable
    • Remove Items
      • list.clear()
      • list.remove("content")
      • list.pop(INDEX)
    • Duplicates
      ls = [1,2,3]
      [ls] * 3    # [[1,2,3], [1,2,3], [1,2,3]], but all 3 are references to the same list (fine for immutables like int)
      [list(ls) for _ in range(3)]  # these are independent copies
      (1,) * 3        # same thing for tuples

========================================================================

Multithreading

========================================================================

  1. because of the GIL, only one thread executes Python bytecode at a time

    # foo_thread = threading.Thread(target=start_foo, args=())
    # foo_thread.start()
    foo_thread.join()
    • Python multithreading still beats single-threaded code for I/O-bound work: while one thread waits (e.g. in sleep()), another can execute!
  2. Many operations on Python data structures are effectively atomic because of the GIL, as long as they compile to a single bytecode store/read. a += 1 is NOT atomic: it compiles to separate LOAD / ADD / STORE ops, as dis shows

    import dis 
    a = 0
    def foo(): 
        global a
        a += 1
    dis.dis(foo)
    # see separate ops like: 
    #   LOAD_GLOBAL   0 (a)
    #   LOAD_CONST    1 (1)
    #   BINARY_ADD
    #   STORE_GLOBAL  0 (a)
  3. concurrent.futures

    • ThreadPoolExecutor and ProcessPoolExecutor (the latter launches processes)
    • threads are still subject to the GIL
    • example:
      from concurrent.futures import ThreadPoolExecutor
      from time import sleep
      
      def return_after_2_secs():
          sleep(2)
          return "holi"
      
      pool = ThreadPoolExecutor(3)
      fut = pool.submit(return_after_2_secs)
      print(fut.done())       # False: still sleeping
      print(fut.result())     # blocks, then returns "holi"
  4. RLock- reentrant lock, or recursive lock. Can be locked by the same thread without block, but also need to be unlocked by the same thread.

    import threading 
    lk = RLock()
    with lk: 
  5. threading.Event(): like a condition variable. It has a flag inside; wait() blocks until another thread calls event.set()

    import threading
    import time
    def foo(ev):
        print(f"flag: {ev.is_set()}")
        ev.wait(20) # timeout in seconds
        print(f"flag: {ev.is_set()}")
    ev = threading.Event()
    th1 = threading.Thread(name="Th1", target=foo, args=(ev,))
    th1.start()
    time.sleep(3)
    ev.set()  # set event flag to true, then ev will stop waiting
    #reset the event flag to false
    ev.clear()
  6. multiprocess: see code

    • test_multiprocess_queue
      1. Queue(max_size)
      2. queue.put(), queue.get() by default will block the main thread and wait for an item to come
      3. queue.put_nowait(), queue.get_nowait() will not block
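The blocking vs non-blocking queue calls above in a single-process sketch (no worker process, just the queue API):

```python
import queue
from multiprocessing import Queue

q = Queue(maxsize=1)
q.put("a")                 # put() blocks if the queue is full
print(q.get())             # get() blocks until an item is available

try:
    q.get_nowait()         # raises queue.Empty instead of blocking
except queue.Empty:
    print("empty")
```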
  7. threadpool executor

Lessons Learned

  1. If a thread stops responding, it may be dead (e.g. killed by an uncaught exception); check for that first

========================================================================

Low level

========================================================================

Logging and Warning

  1. You can set what level of log gets printed to the console. By default it's WARNING, so logging.info("something") may not print

  2. A filter can be attached to a logger or handler; it filters records based on the logger name

    • So if the record's logger name doesn't match the filter's name (or a child of it), no log comes out
    • names are hierarchical: module_name.child_name
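The name-based filtering above as a runnable sketch (logger and handler names here are made up):

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(logging.Filter("app.db"))   # only records named app.db (or children) pass

for name in ("app.db", "app.ui"):
    lg = logging.getLogger(name)
    lg.addHandler(handler)
    lg.setLevel(logging.INFO)

logging.getLogger("app.db").info("db message")   # passes the name filter
logging.getLogger("app.ui").info("ui message")   # dropped: name doesn't match
print(stream.getvalue())                         # only the db message
```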
  3. StructLog: https://www.structlog.org/en/stable/standard-library.html#suggested-configurations

    • fully format the string;
    • only pass in the logging dict without formatting;
    • or having logging.log to use StructLog formatting at the same time.
  4. Other loggers,

    • Sentry: Sentry is an event logging platform primarily focused on capturing and aggregating exceptions
    • Struct Log: structlog makes structured logging in Python easy by augmenting your existing logger. It allows you to split your log entries up into key/value pairs and build them
    • JSON Logging https://github.com/bobbui/json-logging-python
  5. See ELK logs.

    • How to push a log and have it seen?

OS

  1. signals:

    • SIGUSR
      os.killpg(os.getppid(), signal.SIGUSR2)
  2. print in sighandler

    • due to its implementation, print is not safe to call from a signal handler (it can deadlock). issue. Workaround:
      def thread_print(str_msg):
          t = threading.Thread(target=print, args=(str_msg,), kwargs={"flush": True})
          t.start()
          t.join()

Import

  1. relative import

    └── project
        ├── package1
        │   ├── module1.py
        │   └── module2.py
        └── package2
            ├── __init__.py
            ├── module3.py
            ├── module4.py
            └── subpackage1
                └── module5.py
    
  2. If we have the above class and func definitions:

    package1/module2.py contains a function, function1.
    package2/__init__.py contains a class, class1.
    package2/subpackage1/module5.py contains a function, function2.
    
    1. import function1 into package1/module1.py:
       from .module2 import function1
    2. import class1 and function2 into package2/module3.py:
       from . import class1    # . means the current package, and __init__ is automatically imported as a module
       from .subpackage1.module5 import function2
    3. import function1 into module5.py:
       from ...package1.module2 import function1     # .. and ... climb additional levels
  3. Cautions:

    1. relative imports (., .., ...) only work inside a package, not in a script run directly
    2. if launched as a single file, __name__ becomes "__main__", so it carries no package-structure info
      • Python doesn't look at the file system structure, so you cannot do from .. import in such a script
      • If a file is imported by another file, __name__ is its dotted module path
      • Why we need if __name__ == "__main__": otherwise the program would run on import

OS

  1. Command line argument

    import sys
    print(sys.argv[1])   # prints the first argument, e.g. -t; be careful with leading -
  2. simulate key strokes

  3. remove a file:

    import os
    os.remove(file)
  4. Get file path

    path = os.path.dirname(os.path.abspath(__file__)) + "/test.png"
    • basename
      import os
      os.path.exists("path")
      os.path.basename("/home/jj")  #basename is jj
  5. remove a file: os.remove(pickled_bag)

  6. print to stderr: sys.stderr.write("something")

  7. glob - find files with wild card:

    for name in glob.glob("/path/*md")

time

  1. time.time(), float, seconds, since epoch
  2. time.strftime("%Y, %m, %d") #returns string that corresponds to the formatted string
  3. perf_counter(): a higher-resolution clock than time(); better for measuring intervals
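The three calls above side by side:

```python
import time

t0 = time.perf_counter()            # high-resolution clock for measuring intervals
time.sleep(0.01)
elapsed = time.perf_counter() - t0
print(f"slept ~{elapsed:.3f}s")

print(time.strftime("%Y, %m, %d"))  # formatted local time as a string
print(time.time() > 1_000_000_000)  # True: float seconds since the 1970 epoch
```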

========================================================================

Applications

========================================================================

  1. Read lines:

    with open("/var/log/diligent/rico_test.log", "r") as f:
        while line:= f.readline():        
            di = json.loads(line)
  2. How to test the validity of YAML?

    import yaml
    with open(input_file, 'r') as file: #file handler
      data = yaml.safe_load(file)
    with open(output_file, 'w') as file:  # without 'w' you'll see an "object not writable" error
      yaml.dump(data, file)
    • yaml.dump(data, file, sort_keys=False) keeps the original key order (keys are sorted by default)
  3. catching a python exception, then doing something else with it

    # Method 1
    try:
      do_Stuff()
    except:
      #do something else
    
    #Method 2
    try:
      do_Stuff()
    except BaseException as error:
      print(error)
    • Before Python 3.11, a try block had a small setup cost even when nothing was raised; 3.11 introduced "zero-cost" exceptions. Raising is still expensive
  4. colored text

    from termcolor import colored
    print(colored('hello', 'red'))

========================================================================

Testing

========================================================================

  1. mock
  • patch
    • to substitute an object or function in a test (so you don't have to actually call something, like an RTSP call)
    • Set up:
      # src.py
      from some_module import some_func
      def func():
          return some_func()
      
      # test_src.py
      from unittest import mock    # python3
      # mock.patch("src.some_func")     # IMPORTANT: patch where it's used, not where it's defined
      # mock.patch("src.some_func", return_value='some_value')    # return the value
      • side_effect
        # mock.patch("src.some_func", side_effect=ITERABLE)    # return the next value of the iterable on each call
        # mock.patch("src.some_func", side_effect=SOME_EXCEPTION)    # raise the exception every time
        # mock.patch("src.some_func", side_effect=some_callable)    # call it; its return value is used
      • import sequence: import function to be patched -> import patcher function, then make a patch -> start the patch -> import the big process you want to run.
    • mock test
      1. decorator method
      from src import func
      @mock.patch('src.some_func', return_value=b'lol')
      def test_func(mock_some_func):   # the mock object is injected as an argument
        actual_res = func()   # the mock now acts as some_func inside func
        assertIn(actual_res, "actual_result")
      1. context manager TODO
      2. inline
      class TestExamples(TestCase):
        def setUp(self):    #in Python Unit Test, setUp is called first
          self.patcher = mock.patch('src.some_func', return_value = b'lol')
          self.patcher.start()
      
        def test_func(self):
          actual_res = func()   #Now mock is acting as some_func in func
          assertIn(actual_res, "actual_result")
      
        def tearDown(self): #tearDown is called last
          self.patcher.stop()
  1. Pytest

    1. prefer python3 -m pytest over the bare py.test entry point, which may resolve to python2
    • runs files that start with test_
    • or end in _test. Test functions must start with test, not necessarily test_
    • run pytest:
      1. pytest runs all tests in the current directory
      2. pytest test_blah.py
      3. python3 -m module_name works when module_name is on the python module path: python3 -m pytest test_blah.py
    1. pytest fixture
      • code that runs before each test requesting it; the return value is passed in as the test's argument
      • marked as fixture: @pytest.fixture
      • e.g,
          import pytest
          class Fruit:
              def __init__(self, name):
                  self.name = name
              def __eq__(self, other):
                  return self.name == other.name
        
          @pytest.fixture
          def my_fruit():
              return Fruit("apple")
          @pytest.fixture
          def fruit_basket(my_fruit):
              return [Fruit("banana"), my_fruit]
        
          def test_my_fruit_in_basket(my_fruit, fruit_basket):
              assert my_fruit in fruit_basket
      • If you want to have class level setup / teardown:
        class TestMapIdInitialization:
            @classmethod
            def setup_class(cls):
    2. Good practice:
    • try to use plain assert for checks
    1. pytest: to enable printing: pytest myfile -s

    2. @pytest.mark.skip will just skip; run pytest -rs to print the reason

      @pytest.mark.skip(reason="Looks like encoding.py and video.py have been deprecated. Need to confirm with Santi")
      def test_extract_keyframes():

========================================================================

Lessons Learned

========================================================================

Bad Design of Python - Reduced Visibility!

  1. stage.bags_port.connect_to(bags) - names are not consistent
  2. conditions can be combined:
    # example 1 - uploading.py
    if feed_id in self.__config.witness_feeds:
        self.__executor.submit(self.__upload, bag_unit)
    elif not safe_to_discard:
        self.__executor.submit(self.__upload, bag_unit)
    
    # example 2 - uploading.py
    logging.basicConfig(level=logging.DEBUG)        # need logging level to be above debug
    if len(self.__relevance_score_history) < self.__relevance_score_history.maxlen:
        logging.debug("can't discard bag since relevance history is not full")
        return False
    has_any_relevance = any(self.__relevance_score_history)
    if has_any_relevance:
        logging.debug("can't discard bag, there's relevance in history")
        return False
  3. Ipython
    • breakpoint: b file:line

========================================================================

Profiling

========================================================================

  1. py-spy

    sudo env "PATH=$PATH" py-spy record -o ~/Desktop/logger.speedscope.json --format speedscope --pid $(ps aux | grep "python test_logger.py" | grep -v grep | tr -s ' ' | cut -d' ' -f2)
    • speedscope separates threads: the calls you see on a thread really happened on that thread
    • Total time is the time spent in a function and all its callees. Self time excludes the callees
    • Sandwich view is a very convenient aggregation. One confusing point: the same function can appear multiple times, because sandwich mode aggregates by call stack, so each distinct call path shows up under the same function name
  2. pdb: python -m pdb xxx.py

    • b lineno, or b filename:lineno, b function
    • try: tb (temporary break, which is deleted after first time it gets run)
    • cl lineno: clear breakpoint
    • w: where (current stack); u: up the stack; d: down the stack; h: help
    • s (step), n (next), r (return), c (continue): same as in gdb
    • l: list source code
    • run: restart debugging
    • q: quit

========================================================================

Common Code

========================================================================

  1. Math

    sys.maxsize: 2**31 - 1 on 32-bit builds, 2**63 - 1 on 64-bit builds
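    A quick check of the two values (Python ints themselves are unbounded, so exceeding sys.maxsize is fine):

    ```python
    import sys

    # On a 64-bit CPython build, sys.maxsize is 2**63 - 1 (the largest
    # value a Py_ssize_t can hold); on 32-bit builds it is 2**31 - 1.
    print(sys.maxsize)
    assert sys.maxsize in (2**31 - 1, 2**63 - 1)

    # Plain ints have no upper bound, so this is perfectly legal:
    print(sys.maxsize + 1)
    ```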
  2. tools_and_hardware

    • counter
      1. Counter is a dict subclass that counts how many times each item has shown up
        • you can feed it another iterable via update()
      2. supports + and - on Counters; subtraction drops counts that become zero or negative
    • test_serial
      1. convert an int to bytes using int.to_bytes()
      2. serial write, read, etc.
        • ord(char) -> Unicode code point (int)
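    A short Counter sketch covering the points above (counting, update(), and +/- arithmetic):

    ```python
    from collections import Counter

    c = Counter("abracadabra")          # counts each character
    print(c.most_common(2))             # [('a', 5), ('b', 2)]
    c.update(["a", "z"])                # incorporate another iterable
    print(c["a"])                       # 6

    # Arithmetic: + adds counts; - subtracts and drops non-positive counts
    x = Counter(a=3, b=1)
    y = Counter(a=1, b=2)
    print(x + y)                        # Counter({'a': 4, 'b': 3})
    print(x - y)                        # Counter({'a': 2})  ('b' dropped)
    ```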
  3. other data structures see code

    • string related

      1. ljust(20, "O") returns a 20-char string, left-justified and padded with "O"
      2. to split a string into a list of words, based on delim
      3. str.strip() remove trailing/ leading spaces
      4. See if start with, end with
      5. find substring start index
      6. Print number with certain digits f"{12.456:10.1f}"
      7. note: don't write f"{arr[""]}" — reusing the outer quote character inside an f-string's braces is a syntax error (before Python 3.12)
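    The string operations above in one runnable sketch:

    ```python
    s = "hello"
    print(s.ljust(20, "O"))            # 'helloOOOOOOOOOOOOOOO' (20 chars)
    print("a,b,c".split(","))          # ['a', 'b', 'c']
    print("  padded  ".strip())        # 'padded'
    print(s.startswith("he"), s.endswith("lo"))  # True True
    print(s.find("ll"))                # 2 (index of first match, -1 if absent)
    print(f"{12.456:10.1f}")           # '      12.5' (width 10, 1 decimal)

    # Quote clash: f"{arr[""]}" is a syntax error before Python 3.12;
    # use a different quote character inside the braces instead.
    arr = {"k": 1}
    print(f"{arr['k']}")               # 1
    ```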
    • deque

      1. Natural choice for a FIFO queue: append and popleft are O(1), while list.pop(0) is O(N)
        • cons: internally it's a doubly linked list of blocks, so it uses more memory than a list's contiguous array
      2. Uses:
        1. if no length specified, the queue is unbounded
        2. popleft(): the left most element, pop(): the right most element
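    A deque sketch showing both the bounded and unbounded uses described above:

    ```python
    from collections import deque

    q = deque(maxlen=3)        # bounded: oldest item drops off when full
    for i in range(5):
        q.append(i)            # O(1) append on the right
    print(q)                   # deque([2, 3, 4], maxlen=3)

    fifo = deque([1, 2, 3])    # unbounded FIFO
    print(fifo.popleft())      # 1  (O(1); list.pop(0) would be O(N))
    print(fifo.pop())          # 3  (right end)
    ```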
    • heapq

      1. By default, returns the smallest element (min-heap)
        • heapify rearranges the list in place so the first element is the smallest
        • heapq.heappop(li) pops and returns the smallest element
      2. nlargest, nsmallest use a heap, but if N == 1 they just call min()/max(), and as N approaches len(list) they sort first
        • use nsmallest(n, iterable, key=...) to select the smallest items by a key
      3. NOT threadsafe.
      4. Can work with tuple
      5. Bugs:
        • Error: "truth value of an array with more than one element is ambiguous". link. When heap entries tie on the first element, the comparison falls through to the np arrays, which don't define a single truth value
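    A heapq sketch covering the points above; the unique-counter tie-breaker avoids ever comparing payloads:

    ```python
    import heapq

    li = [5, 1, 4, 2]
    heapq.heapify(li)                  # in place; li[0] is now the smallest
    print(heapq.heappop(li))           # 1

    # nlargest / nsmallest with a key; for N == 1, min()/max() is cheaper
    people = [("bob", 30), ("amy", 25), ("cal", 35)]
    print(heapq.nsmallest(1, people, key=lambda p: p[1]))  # [('amy', 25)]

    # Tuples compare element by element, so (priority, payload) works —
    # but ties on priority fall through to comparing payloads. A unique
    # counter in the middle guarantees the payloads are never compared.
    h = []
    heapq.heappush(h, (2, 0, "low"))
    heapq.heappush(h, (1, 1, "high"))
    print(heapq.heappop(h)[2])         # 'high'
    ```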
    • JoinableQueue vs Queue

      put(block=True) will block until a free slot is available; with block=False (or an expired timeout), queue.Full is raised instead.
      
      • queue.Queue is a threadsafe data structure. see code
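    A minimal sketch of the blocking vs. non-blocking put behavior:

    ```python
    import queue

    q = queue.Queue(maxsize=1)         # thread-safe FIFO
    q.put("a")                         # put(block=True) is the default
    try:
        q.put("b", block=False)        # queue is full, so this raises
    except queue.Full:
        print("queue is full")
    print(q.get())                     # 'a'
    ```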
  4. OOP

    • print class name
    • inheritance
    • class variable
  5. file related, see examples

    • zipfile
    • argparse
    • os path join, etc.
  6. Functions

    • closure in python: must use non local
    • func
    • typing
    • test_get_attribute
      1. Accessing any member of class A goes through A.__getattribute__(), so referencing an attribute inside your own __getattribute__ override re-enters it and causes a RecursionError
      2. calling super().__getattribute__(name) delegates to object's implementation, which does the real lookup without re-entering your override
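    A minimal sketch of the recursion trap and the super() escape hatch:

    ```python
    class Logged:
        def __init__(self):
            self.x = 42

        def __getattribute__(self, name):
            # Writing `self.x` here would call __getattribute__ again and
            # recurse forever. super().__getattribute__ delegates to
            # object's implementation, which performs the real lookup
            # without re-entering this override.
            return super().__getattribute__(name)

    obj = Logged()
    print(obj.x)   # 42
    ```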
  7. Iterable Basics

    • iterator_basics
      1. Iterable is something you can iterate over, like a list, dictionary, using the iterator inside them.
        • Called Iterable protocol
      2. an iterator must have both __iter__ and __next__
        • Use next() on iterators (call iter() on an iterable to get one)
        • Once an iterator is exhausted, next() raises StopIteration
      3. an iterable is an object with __iter__(), which returns an iterator
        • Use for i in ... to loop over it
      4. dict is an iterable, but iter(di) gives you the keys
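    A sketch of the iterator protocol: this class is both an iterable (has __iter__) and its own iterator (has __next__):

    ```python
    class Countdown:
        """Iterable AND its own iterator: defines __iter__ and __next__."""
        def __init__(self, n):
            self.n = n

        def __iter__(self):
            return self

        def __next__(self):
            if self.n <= 0:
                raise StopIteration   # signals the end of iteration
            self.n -= 1
            return self.n + 1

    print(list(Countdown(3)))         # [3, 2, 1]

    d = {"a": 1, "b": 2}
    print(next(iter(d)))              # 'a' -- iterating a dict yields keys
    ```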
    • generator_basics
      1. yield is like return, but calling the function returns a generator object, which can be iterated only once. Generators are iterators
        • values are not all stored in memory at once; they are produced on the fly
        • a generator function gives you next() for free, like a hand-written iterator, but is easier to write
      2. When exhausted, it raises StopIteration
      3. So use for i in ...: the for loop calls iter() on the iterable, then next() on the resulting iterator, and handles StopIteration for you
      4. Design Patterns:
        1. Good for stuff that's generated indefinitely, real time
        2. Good for search, which decouples search process from the upper stream code
    • test_coroutine_basic_idea
      1. A function with yield can be constructed as a generator object
      2. To start the generator object, you need to call next.
        • a generator object has send(), which gives bi-directional communication to/from the generator
        • Note that next() is essentially send(None), so you're only retrieving the yielded value back
      3. Each send() (including next()) resumes from the current yield, delivers the sent value, executes, and pauses at the next yield
      4. The basic idea of a coroutine is to pause a function and come back into it. Two functions can use yield for bi-directional communication, with the generator pausing at each yield
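    A small coroutine sketch: the caller pushes values in with send() and gets the running average back at each yield:

    ```python
    def running_average():
        total, count, avg = 0.0, 0, None
        while True:
            x = yield avg            # pause; receive a value from send()
            total += x
            count += 1
            avg = total / count

    gen = running_average()
    next(gen)                        # "prime" it: advance to the first yield
    print(gen.send(10))              # 10.0
    print(gen.send(20))              # 15.0
    ```

    The priming next() is required: send() with a non-None value only works once the generator is paused at a yield.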
  8. Iterable Related

    • Set

      1. Basics: add, remove; intersection (&), union (|), difference (-)
        • discard will not raise an error for a missing element; remove will
        • pop() removes and returns an arbitrary element
        • does not support +=
      2. Set comprehension
      3. Can be used to remove duplicates from a sequence of hashable items
      4. frozenset
        1. does not support indexing,
        2. but can be used to unpack
    • Dictionary

      • Theory. Python Dictionary implementation

        1. It's a hash table

        2. Keys and values may be stored separately, so returning them is O(1), github explanation, Source code

      • dict operations

        1. dict to list - need to convert items, values (values-view object) to list explicitly
          • filter
        2. Sort a dictionary based on key
          1. Using operator.itemgetter is a bit faster. itemgetter is a callable that calls __getitem__
          2. you can do sorted(dic, key = lambda k : dic[k])
        3. max, min can use itemgetter as well
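    A sketch of sorting a dict by value with itemgetter, the lambda equivalent, and min with the same key:

    ```python
    from operator import itemgetter

    prices = {"apple": 3, "pear": 1, "plum": 2}

    # Sort items by value; itemgetter(1) is a callable wrapping __getitem__
    print(sorted(prices.items(), key=itemgetter(1)))
    # [('pear', 1), ('plum', 2), ('apple', 3)]

    # Equivalent lambda form, sorting just the keys by their value
    print(sorted(prices, key=lambda k: prices[k]))   # ['pear', 'plum', 'apple']

    # min/max accept the same key
    print(min(prices.items(), key=itemgetter(1)))    # ('pear', 1)
    ```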
      • dictionary_basics see code

        1. if no value is found, get() returns a default (None by default), while dict[key] raises KeyError
        2. delete an element
          • pop(key, default) doesn't throw; pop(key) without a default raises KeyError for a missing key
        3. when you do list(dict), iter(dict), they operate on keys.
          • next(iter(dict)): need iter() because a dict is an iterable, not an iterator
          • my_dict.values() gives you a "values-view" object, not a list. Similarly, my_dict.items() and my_dict.keys() give "items-view" and "keys-view" objects
        4. merging 2 dicts
        5. Sort dictionary by value (ascending order) and return items in a list
      • test_dict_less_known_features

        1. to use an np array as a dictionary key, convert it with arr.tobytes() (arrays are unhashable)
        2. Find min, max of keys, or values:
          • zip(keys(), values()), zip(values(), keys())
          • just find the min key or min value.
          • just return the value of the min key.
        3. Finding commonalities between two dicts: items-view and keys-view objects support set operations, but values-view objects don't, because values can have duplicates.
          • Make a new dict with certain elements removed
      • default_dict:

        1. default value being 0
        2. default value being list
        3. provide custom default value
      • chainmap

        1. ChainMap keeps references to multiple dictionaries and behaves as one. Changes to each underlying dict show through the ChainMap; writes on the ChainMap go to the first dict
          • if a key repeats, the value from the first dictionary wins
          • alternative: update, but that creates a totally new dict
        2. new_child() pushes a fresh dict on the front, which is useful for modeling nested variable scopes
          • or use .parents to get a new ChainMap without the first map, for searching outer scopes
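    A ChainMap sketch showing first-dict priority, where writes land, and new_child/.parents:

    ```python
    from collections import ChainMap

    defaults = {"color": "red", "user": "guest"}
    overrides = {"user": "rico"}

    cm = ChainMap(overrides, defaults)   # first dict wins on duplicate keys
    print(cm["user"], cm["color"])       # rico red

    cm["color"] = "blue"                 # writes always go to the FIRST map
    print(overrides)                     # {'user': 'rico', 'color': 'blue'}

    # new_child pushes a fresh scope; .parents searches everything but the first
    inner = cm.new_child({"user": "tmp"})
    print(inner["user"])                 # tmp
    print(inner.parents["user"])         # rico
    ```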
      • iterable_extra_state():

        1. Have extra state in the object: make iterable and add the state to it
          • enumerate(ls, start_index)
    • List

      1. None in list
      2. Sort
      3. list reverse.
      4. Initialize 2D list, do not use *, use list comprehension
      5. unpack a list: if the number of targets doesn't match the number of elements, a ValueError is raised. Can also unpack directly into u, v
      6. find average
    • zip

      • Note zip creates an iterator (so it can be iterated only once)
    • range

      • you can access range object like list
      • create a set using range
    • deep copy

    • tuple

      1. can print type,
      2. del tuple[1] won't work
      3. ==, < do work, by position
      4. Quirk about tuple: a SINGLE-element tuple needs a trailing ',' (e.g. (1,)), but no comma is needed in other cases (e.g. the empty tuple ())
    • namedtuple

      1. Still a tuple, can be unpacked, but instead of [1], you use field name
      2. Can use _replace() to make a new namedtuple with new fields
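    A short namedtuple sketch covering field access, unpacking, and _replace():

    ```python
    from collections import namedtuple

    Point = namedtuple("Point", ["x", "y"])
    p = Point(1, 2)
    x, y = p                      # still a tuple, so it unpacks
    print(p.x, y)                 # 1 2

    p2 = p._replace(y=10)         # returns a NEW namedtuple (immutable)
    print(p2)                     # Point(x=1, y=10)
    ```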
  9. misc see code

    • ignore warning
  10. prompt window. see code
