Development Notes - duxiaoyao/pdp GitHub Wiki

excerpts from math as code

Don't compare floats with exact == equality! Use math.isclose, numpy.testing.assert_almost_equal, or abs(x - y) < 10 ** -epsilon, where epsilon is the number of decimal digits you need.
Some useful functions: math.sqrt, np.sqrt, math.pow, np.power, math.floor, math.ceil, np.floor, np.ceil, np.dot, np.cross, np.sum, np.prod, np.cumsum, np.cumprod, np.linalg.norm, np.linalg.det, np.linspace
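The float-comparison advice above can be illustrated with a small standard-library sketch. The helper name `nearly_equal` and the default of 9 decimal digits are assumptions made for illustration, not part of the original notes.

```python
import math


def nearly_equal(x: float, y: float, digits: int = 9) -> bool:
    """Absolute-tolerance check: abs(x - y) < 10 ** -digits (hypothetical helper)."""
    return abs(x - y) < 10 ** -digits


total = 0.1 + 0.2

exact = (total == 0.3)             # False: 0.1 + 0.2 carries binary rounding error
close = math.isclose(total, 0.3)   # True: default relative tolerance is 1e-09
manual = nearly_equal(total, 0.3)  # True: the difference is far below 1e-09
```

math.isclose uses only a relative tolerance by default, so comparisons against 0.0 need an explicit abs_tol argument.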

matplotlib notes

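Using a non-interactive backend

For batch jobs that generate figures in the background, and in other cases where figures do not need to be displayed, use a non-interactive backend to avoid the keyboard and mouse freezing.
Agg is a non-interactive backend that can only write to files.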

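import matplotlib as mpl
import matplotlib.pyplot as plt

# decorator: applies to the whole function (make_plot is a placeholder name)
@mpl.rc_context({'backend': 'Agg'})
def make_plot(data):
    plt.plot(data)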

# context manager
with mpl.rc_context({'backend': 'Agg'}):
    plt.plot(data)

# set rc property
mpl.rcParams['backend'] = 'Agg'

# change the property in .venv/lib/python3.10/site-packages/matplotlib/mpl-data/matplotlibrc

# Setting the [MPLBACKEND](https://matplotlib.org/stable/users/faq/environment_variables_faq.html#envvar-MPLBACKEND) environment variable

# API
mpl.use('Agg')  # check the active backend with mpl.get_backend()

plt.switch_backend('Agg')

Improving figure resolution

import matplotlib as mpl
import matplotlib.pyplot as plt

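# decorator: applies to the whole function (make_plot is a placeholder name)
@mpl.rc_context({'figure.dpi': 300})
def make_plot(data):
    plt.plot(data)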

# context manager
with mpl.rc_context({'savefig.dpi': 300}):
    plt.plot(data)

# set rc property
mpl.rcParams['savefig.dpi'] = mpl.rcParams['figure.dpi'] * 3

# change the property in .venv/lib/python3.10/site-packages/matplotlib/mpl-data/matplotlibrc

# API
plt.figure(dpi=300)

fig = plt.figure()
fig.dpi = 100

fig.savefig('Plot1.jpg', dpi=300)

plt.savefig('Plot1.jpg', dpi=300)
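Why a higher dpi gives a sharper image: the pixel dimensions of a saved raster figure are roughly the figure size in inches multiplied by the dpi. The sketch below does only that arithmetic in plain Python. The helper name `pixel_size` is hypothetical; 6.4 x 4.8 inches and 100 dpi are matplotlib's default figsize and figure.dpi. Options such as bbox_inches='tight' change the final size, so treat the numbers as an approximation.

```python
def pixel_size(figsize_inches, dpi):
    """Approximate pixel dimensions of a raster figure (hypothetical helper)."""
    width_in, height_in = figsize_inches
    return round(width_in * dpi), round(height_in * dpi)


default_figsize = (6.4, 4.8)  # matplotlib's default figure size in inches

on_screen = pixel_size(default_figsize, 100)  # default figure.dpi
saved = pixel_size(default_figsize, 300)      # e.g. savefig(..., dpi=300)
print(on_screen, saved)
```

Tripling the dpi triples both the width and the height in pixels, so the saved file holds nine times as many pixels.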

IPython & Jupyter

%load_ext autoreload
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="darkgrid")
import dask as ds
from distributed import Client
from dynaconf import settings
import talib
from gm.api import *
from qtr.trend123 import index, basic, td, tv, pool, ohlcv, ma, signal, trade

%load_ext line_profiler

%lprun -f generate_trades  generate_trades(signal_root_path)  
%lprun -f process_bar process_bars(book, sp, sl, bt_date2signals, s_idx_down_index, df_td_bt)
%lprun -f detect_close -f buy process_bars(book, sp, sl, bt_date2signals, s_idx_down_index, df_td_bt)

File compatibility between Linux and Windows

  1. View text file in binary
    xxd file  
    xxd -b file  
    hexdump file    
    
  2. Check file type: file *.csv
  3. Convert from one encoding to another: iconv
  4. Convert line endings between Linux (LF, line feed, \n) and Windows (CR+LF, carriage return + line feed, \r\n)
    sed 's/$/\r/' file    # LF -> CRLF (GNU sed)
    sed 's/\r$//' file    # CRLF -> LF (GNU sed)
    
  5. Convert UTF-8 to UTF-8 with BOM (MS Office requires this). The first command adds the BOM only if it is missing; the second always prepends one
    sed -i '1s/^\(\xef\xbb\xbf\)\?/\xef\xbb\xbf/' *.csv
    sed -i '1s/^/\xef\xbb\xbf/' *.csv
    
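  6. Garbled Chinese characters when opening a CSV file in Excel
    Cause: the file has no BOM, so Excel does not recognize it as UTF-8 and decodes it with a different encoding
    Fix: 1) add a BOM at the start of the file, or 2) open Excel and import the file via Data > From Text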
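Items 4 to 6 above can also be done in Python when sed is not available. This is a minimal sketch that works on bytes; the function names `to_crlf`, `to_lf`, and `add_utf8_bom` are hypothetical helpers, not part of the original notes.

```python
import codecs


def to_lf(data: bytes) -> bytes:
    """Convert Windows line endings (CRLF) to Linux line endings (LF)."""
    return data.replace(b"\r\n", b"\n")


def to_crlf(data: bytes) -> bytes:
    """Convert to CRLF; normalizing to LF first avoids doubling existing CRs."""
    return to_lf(data).replace(b"\n", b"\r\n")


def add_utf8_bom(data: bytes) -> bytes:
    """Prepend the UTF-8 BOM (EF BB BF) unless the data already starts with it."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


sample = "名称,值\n甲,1\n".encode("utf-8")
for_excel = add_utf8_bom(to_crlf(sample))
```

When writing a new file from Python, opening it with encoding="utf-8-sig" adds the BOM automatically.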

split & combine a large file

split -n 3 -d data.tgz data.tgz.
cat data.tgz.* > data.tgz.combined
md5sum data.tgz data.tgz.combined
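The same split, combine, and checksum round trip can be sketched in Python. The function names are hypothetical, the whole file is read into memory for simplicity, and the chunk boundaries are not guaranteed to match those chosen by split -n 3; only the two-digit numeric suffixes follow split -d.

```python
import hashlib
from pathlib import Path


def split_file(path: Path, parts: int) -> list[Path]:
    """Write `parts` pieces named <name>.00, <name>.01, ... next to `path`."""
    data = path.read_bytes()
    chunk = max(1, -(-len(data) // parts))  # ceiling division
    pieces = []
    for i in range(parts):
        piece = path.with_name(f"{path.name}.{i:02d}")
        piece.write_bytes(data[i * chunk:(i + 1) * chunk])
        pieces.append(piece)
    return pieces


def combine_files(pieces: list[Path], target: Path) -> Path:
    """Concatenate the pieces in sorted name order, like `cat data.tgz.*`."""
    with target.open("wb") as out:
        for piece in sorted(pieces):
            out.write(piece.read_bytes())
    return target


def md5_of(path: Path) -> str:
    """Hex MD5 digest of a file, comparable to the output of md5sum."""
    return hashlib.md5(path.read_bytes()).hexdigest()
```

A matching digest for the original and the combined file indicates that the round trip preserved the content.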

Common Linux text-processing commands

cut -d , -f2 EC.csv | sed -e "s/\r//" | tail -n +2 > X.txt
tail -n +2 N1_3-N2_15/N3_3-BLIST.csv > B.txt
paste -d , X.txt B.txt | sed -e "s/\r//" > XB.csv
paste -d , EC.csv N3_10-BLIST.csv Y_0.80-Z_2.00-BS_BSRE.csv | sed -e "s/\r//g" > CFFEX.IF_60MIN-N1_5-N2_50-N3_10-N_25-Y_0.80-Z_2.00.csv
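To make the cut / tail / paste steps above easier to follow, here is a rough Python sketch of the same idea: take one column without its header, strip carriage returns, and join two columns line by line. The function names and the sample rows are made up for illustration, and, like cut, the code splits on plain commas without handling quoted CSV fields.

```python
from itertools import zip_longest


def strip_eol(line: str) -> str:
    """Drop trailing CR/LF characters (the role of the sed step above)."""
    return line.rstrip("\r\n")


def cut_field(lines, field, sep=",", skip_header=True):
    """Like `cut -d, -fN | tail -n +2`: 1-based field, optional header skip."""
    body = lines[1:] if skip_header else lines
    return [strip_eol(line).split(sep)[field - 1] for line in body]


def paste(left, right, sep=","):
    """Like `paste -d,`: join line by line, padding the shorter input with ''."""
    return [
        f"{strip_eol(a)}{sep}{strip_eol(b)}"
        for a, b in zip_longest(left, right, fillvalue="")
    ]


# Made-up sample data with Windows line endings in the first input
ec_lines = ["date,x\r\n", "2020-01-01,1.5\r\n", "2020-01-02,2.5\r\n"]
b_lines = ["b\n", "10\n", "20\n"]

x_column = cut_field(ec_lines, 2)
b_column = cut_field(b_lines, 1)
xb_rows = paste(x_column, b_column)
```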

find . -type f -name "*.csv" ! -name "EC.csv" -cmin +5 -exec gzip -f "{}" \;
find . -type f -name "*.csv" ! -name "EC.csv" -cmin +5 | parallel gzip -f {}
find . -type f -name "*.csv" ! -name "EC.csv" -cmin +5 | xargs -n 1 -P 0 gzip -f

find . -name "*.csv.gz" -exec rename 's/\.csv\.gz$/\.csv/' '{}' \;
find . -name "report.xlsx" -o -name "signals.csv.gz" -o -name "*-trades.csv.gz" -o -name "*-summaries.csv.gz" -o -name "*-positions.csv.gz"  | tar zcvf bt-report.tgz -T -

git empty blob

How to fix GIT object file is empty error and the gist
git: remove dangling blobs

cleanup:

git fsck --full
git clean -f
git gc --prune=all
git gc --auto