Factor Preprocessing Methods and Multi-Factor Combination - ChannelCMT/OFO GitHub Wiki

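Contents

  1. Factor comparison and screening
  2. Common factor preprocessing methods: sign adjustment, winsorization, industry and market-cap neutralization, standardization
  3. Multi-factor combination methods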
from jaqs_fxdayu.data import DataView 
import warnings

warnings.filterwarnings("ignore")
dataview_folder = '../Factor'
dv = DataView()
dv.load_dataview(dataview_folder)
dv.add_formula("momentum", "Return(close_adj, 20)", is_quarterly=False, add_data=True)
Dataview loaded successfully.
symbol 000001.SZ 000002.SZ 000008.SZ 000009.SZ 000012.SZ 000024.SZ 000027.SZ 000039.SZ 000046.SZ 000059.SZ ... 601998.SH 603000.SH 603160.SH 603288.SH 603699.SH 603799.SH 603833.SH 603858.SH 603885.SH 603993.SH
trade_date
20140102 -0.100735 -0.085812 -0.057592 -0.006342 -0.100442 -0.051708 -0.068143 0.012426 -0.074534 -0.089580 ... -0.065375 0.104574 NaN NaN NaN NaN NaN NaN NaN -0.084892
20140103 -0.111690 -0.102975 -0.052910 -0.040881 -0.116740 -0.078923 -0.082474 0.048699 -0.091097 -0.111111 ... -0.075426 0.105497 NaN NaN NaN NaN NaN NaN NaN -0.091437
20140106 -0.121896 -0.137255 -0.095643 -0.059129 -0.165380 -0.111576 -0.106164 0.011311 -0.098121 -0.134470 ... -0.085575 0.132137 NaN NaN NaN NaN NaN NaN NaN -0.123726
20140107 -0.118271 -0.138051 -0.109342 -0.060228 -0.174342 -0.122535 -0.104991 0.039841 -0.095745 -0.139847 ... -0.088020 0.076545 NaN NaN NaN NaN NaN NaN NaN -0.118594
20140108 -0.115124 -0.144175 -0.159346 -0.063224 -0.179235 -0.160665 -0.093103 0.066347 -0.081023 -0.156604 ... -0.085575 0.118630 NaN NaN NaN NaN NaN NaN NaN -0.127941
20140109 -0.079439 -0.126464 -0.149474 -0.050273 -0.171525 -0.145109 -0.087873 0.109015 -0.075107 -0.143411 ... -0.081481 0.118595 NaN NaN NaN NaN NaN NaN NaN -0.157895
20140110 -0.070755 -0.134818 -0.147799 -0.048087 -0.186020 -0.158701 -0.097002 0.088464 -0.066667 -0.141199 ... -0.077114 0.032493 NaN NaN NaN NaN NaN NaN NaN -0.150665
20140113 -0.075697 -0.152225 -0.117032 -0.038419 -0.189342 -0.207055 -0.083333 0.074930 -0.069414 -0.193490 ... -0.062344 0.109781 NaN NaN NaN NaN NaN NaN NaN -0.113264
20140114 -0.080784 -0.142349 -0.100105 -0.044662 -0.089120 -0.187791 -0.063063 0.123944 -0.039560 -0.177007 ... -0.062814 0.100159 NaN NaN NaN NaN NaN NaN NaN -0.094225
20140115 -0.082547 -0.133011 -0.092827 -0.015385 -0.077739 -0.182381 -0.060000 0.153846 -0.040000 -0.162921 ... -0.078481 0.209846 NaN NaN NaN NaN NaN NaN NaN -0.095679
20140116 -0.083856 -0.129540 -0.105152 -0.045259 -0.070071 -0.178147 -0.047016 0.177477 -0.042411 -0.148571 ... -0.081013 0.235830 NaN NaN NaN NaN NaN NaN NaN -0.081413
20140117 -0.081600 -0.130221 -0.129474 -0.064725 -0.098927 -0.168452 -0.043956 0.160142 -0.024943 -0.172147 ... -0.084184 0.230942 NaN NaN NaN NaN NaN NaN NaN -0.101538
20140120 -0.051217 -0.091720 -0.110410 -0.049834 -0.074390 -0.141717 -0.042831 0.183333 -0.011494 -0.152174 ... 0.002793 0.216201 NaN NaN NaN NaN NaN NaN NaN -0.097638
20140121 -0.049372 -0.073171 -0.117284 -0.044199 -0.053528 -0.131671 -0.037175 0.226483 0.000000 -0.139721 ... -0.042328 0.280011 NaN NaN NaN NaN NaN NaN NaN -0.090625
20140122 0.007692 -0.026820 -0.109645 -0.011086 -0.008750 -0.061973 -0.024074 0.222865 0.013793 -0.109091 ... -0.034211 0.282502 NaN NaN NaN NaN NaN NaN NaN -0.070203
20140123 -0.001706 -0.053097 -0.085263 -0.017544 -0.009828 -0.081773 -0.025878 0.137441 -0.037445 -0.115152 ... -0.037037 0.199393 NaN NaN NaN NaN NaN NaN NaN -0.075000
20140124 0.018341 -0.001282 -0.038803 0.016484 0.015152 -0.039098 -0.018587 0.117726 -0.028139 -0.103659 ... -0.021448 0.246027 NaN NaN NaN NaN NaN NaN NaN -0.063665
20140127 -0.030638 -0.047323 -0.053512 0.007600 -0.017478 -0.086402 -0.034926 0.056916 -0.039216 -0.117172 ... -0.042216 0.162772 NaN NaN NaN NaN NaN NaN NaN -0.077399
20140128 -0.020443 -0.027990 -0.049889 0.006515 0.003727 -0.064851 -0.033088 0.059133 -0.024609 -0.112903 ... -0.023810 0.114058 NaN NaN NaN NaN NaN NaN NaN -0.074419
20140129 -0.056327 -0.061021 -0.048889 0.000000 -0.009828 -0.093840 -0.043636 0.057873 -0.020045 -0.116232 ... -0.038760 0.220854 NaN NaN NaN NaN NaN NaN NaN -0.078582
20140130 -0.067866 -0.076345 -0.047778 -0.003191 -0.025767 -0.099318 -0.047532 0.017442 -0.040268 -0.146586 ... -0.064767 0.181957 NaN NaN NaN NaN NaN NaN NaN -0.073899
20140207 -0.046102 -0.054847 -0.027933 0.045902 0.007481 -0.082219 -0.018727 0.015267 -0.013667 -0.127083 ... -0.042105 0.205879 NaN NaN NaN NaN NaN NaN NaN -0.052716
20140210 -0.008569 0.004011 0.038778 0.160970 0.097754 -0.010989 0.021073 0.046711 0.013889 -0.054705 ... 0.016043 0.212243 NaN NaN NaN NaN NaN NaN NaN 0.013289
20140211 0.031814 0.016151 0.052443 0.171271 0.095618 0.005350 0.028846 0.007024 0.032941 -0.020045 ... 0.120643 0.125982 NaN NaN NaN NaN NaN NaN NaN 0.014950
20140212 0.015306 0.013477 0.089915 0.203600 0.125166 0.033003 0.039924 0.050032 0.011601 -0.008949 ... 0.208556 0.080123 NaN NaN NaN NaN NaN NaN NaN 0.032040
20140213 0.033841 -0.006702 0.092822 0.250863 0.115020 -0.004338 0.040462 0.011972 0.039443 0.009050 ... 0.330645 0.055781 NaN NaN NaN NaN NaN NaN NaN 0.055556
20140214 0.021997 -0.001355 0.107011 0.259472 0.149584 0.003861 0.066406 0.048765 0.041475 0.018018 ... 0.309973 0.108285 NaN NaN NaN NaN NaN NaN NaN 0.071304
20140217 0.031897 0.017956 0.108876 0.252283 0.181818 0.042750 0.075435 0.053420 0.062937 0.017937 ... 0.279255 0.054795 NaN NaN NaN NaN NaN NaN NaN 0.043697
20140218 -0.005119 0.005533 0.080796 0.299886 0.072427 0.012622 0.053846 -0.013158 0.054920 0.002217 ... 0.257373 0.038174 NaN NaN NaN NaN NaN NaN NaN 0.026846
20140219 0.028278 0.027894 0.090698 0.252232 0.067688 0.041351 0.061896 -0.036137 0.064815 0.038031 ... 0.417582 -0.021494 NaN NaN NaN NaN NaN NaN NaN 0.044369
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
20171120 0.273458 0.114539 0.003382 -0.106933 0.073761 NaN -0.051084 0.062434 -0.076167 -0.037791 ... -0.012598 -0.095861 0.072122 0.074252 0.116348 -0.059919 0.034426 -0.205057 0.033520 -0.126404
20171121 0.268657 0.177407 -0.003378 -0.105077 0.090686 NaN -0.046512 0.098352 -0.075887 -0.033203 ... -0.010955 -0.081181 0.053217 0.093856 0.123353 -0.060377 0.044169 -0.205060 0.037089 -0.153740
20171122 0.339840 0.184834 -0.004484 -0.104460 0.132927 NaN -0.049536 0.073584 -0.069767 -0.027079 ... 0.020408 -0.086637 0.068177 0.108031 0.118350 -0.030201 -0.048057 -0.236038 0.048184 -0.141667
20171123 0.283542 0.136942 -0.065539 -0.129070 0.092457 NaN -0.065015 0.027455 -0.045455 -0.075914 ... 0.001585 -0.109250 0.163450 -0.007020 0.080305 -0.052761 -0.055597 -0.243516 -0.007785 -0.148098
20171124 0.259516 0.156364 -0.071506 -0.096085 0.141983 NaN -0.057632 0.030655 -0.025894 -0.065789 ... 0.000000 -0.109810 0.208555 -0.009451 0.071470 -0.041119 -0.065847 -0.248355 0.014144 -0.078404
20171127 0.205017 0.087647 -0.108719 -0.076638 0.124069 NaN -0.050394 0.029714 -0.015365 -0.076493 ... 0.019293 -0.073933 0.088818 -0.035897 0.062900 -0.104015 -0.069122 -0.247189 0.000000 -0.045190
20171128 0.187175 0.061809 -0.091503 -0.066258 0.115942 NaN -0.051643 0.003638 -0.012870 -0.047214 ... 0.009646 -0.078550 0.121669 -0.012093 0.036179 -0.119827 -0.038106 -0.248399 -0.038624 -0.047420
20171129 0.212281 0.160206 -0.088428 -0.045735 0.144242 NaN -0.053208 0.012671 0.003851 -0.018112 ... 0.011364 -0.088722 0.004145 -0.011621 0.041572 -0.157026 -0.074108 -0.242310 -0.037657 -0.033103
20171130 0.159445 0.060102 -0.075245 -0.045226 0.086288 NaN -0.036220 -0.010660 -0.014175 -0.006843 ... 0.039216 -0.073933 -0.014040 -0.031875 0.036261 -0.121046 -0.065931 -0.243670 -0.048193 0.008559
20171201 0.141352 0.090103 -0.070346 -0.024453 0.084524 NaN -0.036335 0.015377 -0.029601 0.006958 ... 0.042414 -0.051938 -0.004686 -0.035088 0.024650 -0.111065 -0.049722 -0.225288 -0.062633 0.028986
20171204 0.179078 0.123452 -0.071817 -0.052363 0.102259 NaN -0.045741 -0.011946 -0.050761 -0.018465 ... 0.055738 -0.061303 -0.090380 -0.077368 -0.012379 -0.059388 -0.102398 -0.269209 -0.005780 0.014514
20171205 0.115772 0.114183 -0.062569 -0.085025 0.046484 NaN -0.065831 -0.050197 -0.053571 -0.087324 ... 0.055105 -0.111872 -0.110107 -0.103594 -0.028479 -0.093309 -0.099602 -0.273266 -0.013889 -0.042029
20171206 0.078318 0.113242 -0.051339 -0.093284 0.058683 NaN -0.070203 -0.030378 -0.043871 -0.081056 ... 0.033926 -0.120986 -0.065739 -0.095816 -0.018095 -0.108374 -0.086371 -0.269365 -0.018248 -0.047688
20171207 0.040552 0.082400 -0.068462 -0.088639 0.034606 NaN -0.067083 -0.037365 -0.032258 -0.078154 ... 0.035599 -0.124720 -0.112329 -0.108496 -0.022608 -0.126953 -0.139220 -0.269252 -0.027201 -0.072674
20171208 0.064228 0.083969 -0.062153 -0.090683 0.060680 NaN -0.054859 -0.071258 -0.029525 -0.090226 ... 0.037582 -0.113619 -0.047843 -0.096609 -0.001064 -0.117551 -0.119264 -0.247828 -0.034261 -0.023669
20171211 0.046512 0.106061 -0.034792 -0.069913 -0.030905 NaN -0.045455 -0.033157 -0.032092 -0.080299 ... 0.029126 -0.097358 -0.085663 -0.047806 0.004787 -0.034339 -0.098188 -0.246069 -0.013591 0.010432
20171212 0.005405 0.037544 -0.061590 -0.052564 -0.029834 NaN -0.047393 -0.018031 -0.020699 -0.082464 ... 0.024631 -0.069124 -0.069708 -0.021772 0.012299 0.085595 -0.077405 -0.242025 -0.022615 0.014793
20171213 0.017829 0.052743 -0.084895 -0.045103 -0.003386 NaN -0.048742 0.006317 -0.019506 -0.060136 ... 0.027915 -0.090701 0.006163 0.007490 0.023555 0.132361 -0.070957 -0.217561 0.013204 0.056250
20171214 -0.007634 0.040428 -0.073579 -0.063226 -0.010169 NaN -0.039683 0.020088 0.006729 -0.071636 ... 0.028286 -0.101592 0.003662 -0.015745 0.015508 0.143427 -0.044972 -0.213025 0.013841 0.108766
20171215 -0.034901 0.007266 -0.056561 -0.025401 -0.022196 NaN -0.016103 0.050813 0.069307 -0.062937 ... -0.001631 -0.070356 0.004673 0.005391 0.030769 0.113165 0.008809 -0.144016 0.009729 0.097682
20171218 -0.105263 -0.015225 -0.053933 -0.046053 -0.074324 NaN -0.013051 0.019422 0.007979 -0.060423 ... -0.022329 -0.067470 -0.048306 -0.021110 -0.004276 -0.013904 -0.005952 -0.095776 -0.035811 0.059486
20171219 -0.080969 -0.082730 -0.038418 -0.044855 -0.074157 NaN -0.013008 0.068248 0.010596 -0.045455 ... -0.020570 -0.059438 -0.024976 -0.029272 -0.004264 -0.024461 0.019060 -0.085810 -0.021592 0.075286
20171220 -0.121854 -0.120000 -0.040541 -0.064220 -0.127018 NaN -0.014658 0.077909 -0.006579 -0.076541 ... -0.047692 -0.072347 -0.087828 -0.005519 0.010689 -0.051383 0.049189 -0.111150 -0.032645 0.035599
20171221 -0.056446 -0.050479 -0.041855 -0.052069 -0.087973 NaN 0.003311 0.137718 -0.033462 -0.051724 ... -0.022152 -0.072772 -0.171216 0.063826 0.026587 -0.032383 0.079497 -0.082461 0.052782 0.036683
20171222 -0.071429 -0.061321 -0.007001 -0.051181 -0.114684 NaN 0.016529 0.113846 -0.050633 -0.048290 ... -0.022117 -0.073191 -0.221484 0.063208 0.027115 -0.057867 0.075556 -0.080550 0.040446 -0.038806
20171225 -0.048816 -0.036485 0.028986 -0.040161 -0.105960 NaN 0.004975 0.143232 -0.024707 -0.073737 ... -0.028391 -0.113580 -0.171475 0.099632 0.031233 0.048498 0.099036 -0.077581 0.058182 -0.016794
20171226 -0.002920 -0.008130 0.049161 -0.055191 -0.109307 NaN 0.001650 0.144485 -0.027379 -0.074331 ... -0.007962 -0.113115 -0.176979 0.042646 0.044190 -0.026917 0.054350 -0.059463 0.086194 -0.060029
20171227 -0.038350 -0.089592 0.044311 -0.067358 -0.136653 NaN -0.001653 0.079580 -0.046036 -0.092233 ... -0.004815 -0.115512 -0.158704 0.070140 0.037726 -0.011718 0.065846 -0.062373 0.057971 -0.064194
20171228 -0.012706 -0.016656 0.021226 -0.055263 -0.108814 NaN -0.014706 0.124166 -0.026144 -0.076772 ... -0.028302 -0.111934 -0.148128 0.089162 0.032805 -0.042601 0.055350 -0.074094 0.102755 -0.008487
20171229 0.023077 0.010739 0.018626 -0.046174 -0.072448 NaN -0.006557 0.153458 -0.010610 -0.059230 ... -0.029734 -0.114473 -0.122397 0.099081 0.051394 -0.069258 0.045153 -0.066275 0.160972 -0.030986

977 rows × 488 columns

import numpy as np

def mask_index_member():
    df_index_member = dv.get_ts('index_member')
    mask_index_member = ~(df_index_member >0) # signal filter: True where the stock is not an index constituent
    return mask_index_member

def limit_up_down():
    # Tradability conditions: not suspended and not at the daily price limit
    trade_status = dv.get_ts('trade_status')
    mask_sus = trade_status == 0
    # Limit-up
    dv.add_formula('up_limit', '(close - Delay(close, 1)) / Delay(close, 1) > 0.095', is_quarterly=False, add_data=True)
    # Limit-down
    dv.add_formula('down_limit', '(close - Delay(close, 1)) / Delay(close, 1) < -0.095', is_quarterly=False, add_data=True)
    can_enter = np.logical_and(dv.get_ts('up_limit') < 1, ~mask_sus) # not limit-up and not suspended
    can_exit = np.logical_and(dv.get_ts('down_limit') < 1, ~mask_sus) # not limit-down and not suspended
    return can_enter,can_exit

mask = mask_index_member()
can_enter,can_exit = limit_up_down()
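
For readers without the DataView at hand, the two limit flags above can be approximated in plain pandas. This is a minimal sketch on made-up prices: `close`, `suspended`, `can_enter_toy`, and `can_exit_toy` are hypothetical names, and it assumes `Delay(close, 1)` in the formulas above means the previous trading day's close.

```python
import pandas as pd

# Hypothetical toy prices: rows are trade dates, columns are symbols.
close = pd.DataFrame(
    {"AAA": [10.0, 11.0, 11.0, 9.9], "BBB": [20.0, 20.2, 20.0, 20.1]},
    index=[20140102, 20140103, 20140106, 20140107],
)
# Hypothetical suspension flags (True = suspended on that date).
suspended = pd.DataFrame(False, index=close.index, columns=close.columns)
suspended.loc[20140106, "BBB"] = True

# (close - previous close) / previous close; the first row is NaN.
daily_ret = close / close.shift(1) - 1
up_limit = daily_ret > 0.095      # approximate limit-up flag
down_limit = daily_ret < -0.095   # approximate limit-down flag

can_enter_toy = ~up_limit & ~suspended   # can buy: not limit-up, not suspended
can_exit_toy = ~down_limit & ~suspended  # can sell: not limit-down, not suspended
print(can_enter_toy)
print(can_exit_toy)
```

Comparisons against NaN evaluate to False, so the first date is treated as tradable in this sketch.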

Next, we compare and screen five factors: pb, pe, ps, float_mv, and momentum.

from jaqs_fxdayu.research.signaldigger import multi_factor

ic = dict()
factors_dict = {signal:dv.get_ts(signal) for signal in ["pb","pe","ps","float_mv","momentum"]}
for period in [5, 15, 30]:
    ic[period]=multi_factor.get_factors_ic_df(factors_dict,
                                              price=dv.get_ts("close_adj"),
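                                              high=dv.get_ts("high_adj"), # optional
                                              low=dv.get_ts("low_adj"),# optional
                                              n_quantiles=5,# number of quantile groups
                                              mask=mask,# filter condition
                                              can_enter=can_enter,# whether a position can be entered
                                              can_exit=can_exit,# whether a position can be exited
                                              period=period,# holding period
                                              benchmark_price=dv.data_benchmark, # benchmark price; optional. If omitted, the holding-period return is computed as an absolute return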
                                              commission = 0.0008,
                                              )
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
import pandas as pd
ic_mean_table = pd.DataFrame(data=np.nan,columns=[5,15,30],index=["pb","pe","ps","float_mv","momentum"])
ic_std_table = pd.DataFrame(data=np.nan,columns=[5,15,30],index=["pb","pe","ps","float_mv","momentum"])
ir_table = pd.DataFrame(data=np.nan,columns=[5,15,30],index=["pb","pe","ps","float_mv","momentum"])
for signal in ["pb","pe","ps","float_mv","momentum"]:
    for period in [5, 15, 30]:
        ic_mean_table.loc[signal,period]=ic[period][signal].mean()
        ic_std_table.loc[signal,period]=ic[period][signal].std()
        ir_table.loc[signal,period]=ic[period][signal].mean()/ic[period][signal].std()

print(ic_mean_table)
print(ic_std_table)
print(ir_table)
                5         15        30
pb       -0.039948 -0.069184 -0.106428
pe       -0.038036 -0.065607 -0.098353
ps       -0.032231 -0.057777 -0.087181
float_mv  0.006833  0.021287  0.044382
momentum -0.041551 -0.053251 -0.047145
                5         15        30
pb        0.231587  0.259397  0.245520
pe        0.210134  0.220244  0.210795
ps        0.176345  0.193792  0.188749
float_mv  0.222908  0.229546  0.229144
momentum  0.207719  0.215057  0.209887
                5         15        30
pb       -0.172496 -0.266712 -0.433481
pe       -0.181008 -0.297881 -0.466578
ps       -0.182774 -0.298140 -0.461889
float_mv  0.030655  0.092735  0.193688
momentum -0.200034 -0.247614 -0.224622
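
The three tables above are, in order, the mean IC, the standard deviation of IC, and their ratio. The page does not define IC itself. A common definition, assumed here, is the per-date Spearman rank correlation between factor values and forward returns over the holding period. The sketch below uses random toy prices and a hypothetical helper `rank_ic_series`; `get_factors_ic_df` may differ in details such as masking, tradability filters, and commission.

```python
import numpy as np
import pandas as pd

def rank_ic_series(factor: pd.DataFrame, price: pd.DataFrame, period: int) -> pd.Series:
    """Per-date Spearman correlation between factor values and forward returns."""
    # Return earned from date t to t + period, stamped at date t.
    fwd_ret = price.shift(-period) / price - 1
    # Rank only the stocks that have both a factor value and a forward return.
    valid = factor.notna() & fwd_ret.notna()
    f_rank = factor.where(valid).rank(axis=1)
    r_rank = fwd_ret.where(valid).rank(axis=1)
    # Pearson correlation of ranks, row by row, is Spearman's rho.
    return f_rank.corrwith(r_rank, axis=1)

# Toy data: 30 dates, 6 symbols, random-walk prices.
rng = np.random.default_rng(0)
log_ret = rng.normal(0.0, 0.02, size=(30, 6))
price = pd.DataFrame(100 * np.exp(log_ret.cumsum(axis=0)), columns=list("ABCDEF"))

period = 5
foresight = price.shift(-period) / price - 1            # a "perfect" factor
noisy = foresight + rng.normal(0.0, 0.05, size=foresight.shape)

ic = rank_ic_series(noisy, price, period).dropna()
print(ic.mean(), ic.std(), ic.mean() / ic.std())
```

The last `period` dates have no forward return yet, so their IC is NaN and is dropped before averaging.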

Visual comparison

%matplotlib inline
ic_mean_table.plot(kind="barh",xerr=ic_std_table,figsize=(15,5))
<matplotlib.axes._subplots.AxesSubplot at 0x18b5268e710>

  • IC_IR: the mean IC divided by the standard deviation of IC
  • As a rule of thumb, a factor with |IC_IR| > 0.6 is considered sufficiently stable
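
As a worked check of this definition, the snippet below recomputes IC_IR for the 30-day holding period from the mean and standard deviation printed in the tables above, then applies the 0.6 threshold. The dictionary names are illustrative only.

```python
# Values copied from the 30-day column of the printed IC mean and IC std tables.
ic_mean_30 = {"pb": -0.106428, "pe": -0.098353, "ps": -0.087181,
              "float_mv": 0.044382, "momentum": -0.047145}
ic_std_30 = {"pb": 0.245520, "pe": 0.210795, "ps": 0.188749,
             "float_mv": 0.229144, "momentum": 0.209887}

# IC_IR = mean(IC) / std(IC)
ic_ir_30 = {name: ic_mean_30[name] / ic_std_30[name] for name in ic_mean_30}
passes = {name: abs(value) > 0.6 for name, value in ic_ir_30.items()}

for name, value in ic_ir_30.items():
    print(f"{name:9s} IC_IR={value: .4f}  |IC_IR|>0.6: {passes[name]}")
```

With these numbers, none of the five factors reaches |IC_IR| > 0.6 at the 30-day horizon; pe and ps come closest at roughly 0.46.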
%matplotlib inline
ir_table.plot(kind="barh",figsize=(15,5))
<matplotlib.axes._subplots.AxesSubplot at 0x18b4fb3fd68>

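Factor preprocessing

Keep momentum, ps, pe, and pb for further processing and try to build a combined factor.

  • Per the analysis above, these factors have a negative relationship (IC) with stock returns at every holding period tested, so first flip their signs so that all of them correlate positively
  • Winsorization
  • Standardization: z-score, rank
  • Industry and market-cap neutralization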
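
To make the first three steps concrete, here is a minimal plain-pandas sketch applied to each date's cross-section. The helper names (`winsorize_cs`, `zscore_cs`, `rank_cs`) are hypothetical. The quantile convention (clip at alpha/2 and 1 - alpha/2) and the 0-1 rank scaling are assumptions, and may not match the `process` module exactly.

```python
import numpy as np
import pandas as pd

def winsorize_cs(df: pd.DataFrame, alpha: float = 0.05) -> pd.DataFrame:
    """Clip each date's cross-section to its [alpha/2, 1 - alpha/2] quantiles."""
    lower = df.quantile(alpha / 2, axis=1)
    upper = df.quantile(1 - alpha / 2, axis=1)
    return df.clip(lower=lower, upper=upper, axis=0)

def zscore_cs(df: pd.DataFrame) -> pd.DataFrame:
    """Subtract the cross-sectional mean and divide by the cross-sectional std."""
    return df.sub(df.mean(axis=1), axis=0).div(df.std(axis=1), axis=0)

def rank_cs(df: pd.DataFrame) -> pd.DataFrame:
    """Rank within each date and scale to [0, 1]; only the ordering survives."""
    n = df.notna().sum(axis=1)
    return df.rank(axis=1).sub(1).div(n - 1, axis=0)

# Toy factor values: rows are dates, columns are symbols.
raw = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0, 100.0], [5.0, np.nan, 3.0, 1.0, 2.0]],
    index=[20140102, 20140103], columns=list("ABCDE"),
)
signal = -1 * raw                           # flip the sign of a negative-IC factor
signal = winsorize_cs(signal, alpha=0.4)    # wide alpha so clipping shows on 5 points
z = zscore_cs(signal)
r = rank_cs(signal)
print(signal, z, r, sep="\n\n")
```

The outlier -100 on the first date is pulled in to the lower quantile bound before standardization, so it no longer dominates the cross-sectional mean and standard deviation.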
from jaqs_fxdayu.research.signaldigger import process

factor_dict = dict()
index_member = dv.get_ts("index_member")
for name in ["pb","pe","ps","momentum"]:
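    signal = -1*dv.get_ts(name) # flip the sign
    signal = process.winsorize(factor_df=signal,alpha=0.05,index_member=index_member) # winsorize; assign the result so that it takes effect
    signal = process.standardize(signal,index_member) # z-score standardization: keeps both ordering and distribution information
#     signal = process.rank_standardize(signal,index_member) # rank within each cross-section and scale to 0-1 (keeps ordering information only)
#     # Industry and market-cap neutralization
#     signal = process.neutralize(signal,
#                                 group=dv.get_ts("sw1"),# industry classification standard
#                                 float_mv = dv.get_ts("float_mv"), # float market cap; may be None, in which case no market-cap neutralization is done
#                                 index_member=index_member,# whether to consider only index constituents when processing
#                                 )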
    factor_dict[name] = signal
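
The neutralization step is commented out above. One common way to neutralize, assumed here and not confirmed as what `process.neutralize` does, is to regress each date's factor values on industry dummies and log float market cap and keep the residual. The helper `neutralize_one_date` and the toy inputs are hypothetical.

```python
import numpy as np
import pandas as pd

def neutralize_one_date(factor: pd.Series, industry: pd.Series,
                        float_mv: pd.Series) -> pd.Series:
    """Residual of an OLS regression of the factor on industry dummies and log market cap."""
    data = pd.concat({"f": factor, "ind": industry, "mv": float_mv}, axis=1).dropna()
    # One dummy column per industry; together they act as per-industry intercepts.
    X = pd.get_dummies(data["ind"], dtype=float)
    X["log_mv"] = np.log(data["mv"].astype(float))
    y = data["f"].astype(float).to_numpy()
    beta, *_ = np.linalg.lstsq(X.to_numpy(), y, rcond=None)
    resid = pd.Series(y - X.to_numpy() @ beta, index=data.index)
    return resid.reindex(factor.index)

# Toy cross-section for a single date.
symbols = list("ABCDEFGH")
industry = pd.Series(["bank"] * 4 + ["tech"] * 4, index=symbols)
float_mv = pd.Series([1e9, 5e9, 2e10, 8e10, 3e9, 6e9, 4e10, 9e10], index=symbols)
factor = pd.Series([0.3, -0.1, 0.8, 1.2, -0.5, 0.1, 0.4, 0.9], index=symbols)

resid = neutralize_one_date(factor, industry, float_mv)
print(resid)
```

By construction the residual sums to zero within each industry and has zero linear exposure to log market cap, which is the sense in which the factor is "neutral" to both.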

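Multi-factor combination

Combining the screened factors usually involves the following steps:

  • When factors are strongly homogeneous (highly correlated), first orthogonalize them with the Gram-Schmidt method and use the orthogonalized residuals as the factors. This step is optional: orthogonalization can break a factor's economic rationale and discard some information
  • Weight the factors to combine them. Common schemes are: equal weights; weights from the rolling mean IC over a time window; weights from the rolling IC_IR over a time window; weights that maximize the previous holding period's IC_IR; weights that maximize the previous holding period's IC
  • Note: computing IC requires the next period's stock returns, so the dynamic weighting schemes actually use IC values from the previous period and earlier (shifted back by holding_period) to compute the current weights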
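
The note about shifting by holding_period can be sketched as follows for the rolling mean IC scheme. `rolling_ic_weights` is a hypothetical helper and the IC values are made up. Normalizing so that the absolute weights sum to 1 is an assumption, not necessarily what `combine_factors` does.

```python
import numpy as np
import pandas as pd

def rolling_ic_weights(ic_df: pd.DataFrame, rollback_period: int,
                       holding_period: int) -> pd.DataFrame:
    """Weights proportional to the rolling mean IC, using only ICs already observable."""
    # The IC stamped at date t needs returns up to t + holding_period, so on
    # date t only ICs stamped at t - holding_period or earlier are known.
    known_ic = ic_df.shift(holding_period)
    mean_ic = known_ic.rolling(rollback_period).mean()
    # Normalize so that the absolute weights sum to 1 on each date.
    return mean_ic.div(mean_ic.abs().sum(axis=1), axis=0)

# Made-up IC series for two factors over 12 dates.
ic_df = pd.DataFrame(
    {"value": 0.10, "momentum": np.linspace(0.01, 0.12, 12)},
    index=range(12),
)
weights = rolling_ic_weights(ic_df, rollback_period=3, holding_period=2)
print(weights)
```

With a window of 3 and a holding period of 2, the first usable weights appear on the fifth date and are built from the ICs of the first three dates only, which is how look-ahead is avoided.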
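# When factors are strongly homogeneous, orthogonalize them with the Gram-Schmidt method and use the orthogonalized residuals as the factors
new_factors = multi_factor.orthogonalize(factors_dict=factor_dict,
                           standardize_type="rank",# standardization applied to the input factors: "rank" (rank standardization) or "z_score" (z-score standardization)
                           winsorization=False,# whether to winsorize the input factors
                           index_member=index_member) # whether to process index constituents only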
new_factors
{'momentum': symbol      000001.SZ  000002.SZ  000008.SZ  000009.SZ  000012.SZ  000024.SZ  \
 trade_date                                                                     
 20140102     0.715719   0.511706        NaN   0.354515   0.290970   0.709030   
 20140103     0.668896   0.491639        NaN   0.351171   0.260870   0.655518   
 20140106     0.722408   0.488294        NaN   0.354515   0.230769   0.645485   
 20140107     0.725753   0.488294        NaN   0.384615   0.190635   0.605351   
 20140108     0.745819   0.498328        NaN   0.367893   0.200669   0.471572   
 20140109     0.816054   0.531773        NaN   0.371237   0.234114   0.511706   
 20140110     0.856187   0.505017        NaN   0.394649   0.227425   0.474916   
 20140113     0.842809   0.484950        NaN   0.444816   0.220736   0.334448   
 20140114     0.812709   0.418060        NaN   0.290970   0.414716   0.341137   
 20140115     0.792642   0.451505        NaN   0.397993   0.464883   0.337793   
 20140116     0.792642   0.428094        NaN   0.317726   0.481605   0.294314   
 20140117     0.822742   0.461538        NaN   0.257525   0.414716   0.394649   
 ...               ...        ...        ...        ...        ...        ...   
 20171221     0.759197   0.351171   0.792642        NaN        NaN        NaN   
 20171222     0.759197   0.351171   0.795987        NaN        NaN        NaN   
 20171225     0.698997   0.301003   0.812709        NaN        NaN        NaN   
 20171226     0.779264   0.334448   0.795987        NaN        NaN        NaN   
 20171227     0.250836   0.715719   0.197324        NaN        NaN        NaN   
 20171228     0.799331   0.304348   0.809365        NaN        NaN        NaN   
 20171229     0.832776   0.321070   0.779264        NaN        NaN        NaN   
 
 symbol      000027.SZ  000039.SZ  000046.SZ  000059.SZ    ...      601998.SH  \
 trade_date                                                ...                  
 20140102          NaN   0.668896   0.622074        NaN    ...       0.812709   
 20140103          NaN   0.852843   0.541806        NaN    ...       0.795987   
 20140106          NaN   0.816054   0.628763        NaN    ...       0.822742   
 20140107          NaN   0.852843   0.658863        NaN    ...       0.809365   
 20140108          NaN   0.892977   0.695652        NaN    ...       0.829431   
 20140109          NaN   0.892977   0.719064        NaN    ...       0.806020   
 20140110          NaN   0.866221   0.765886        NaN    ...       0.819398   
 20140113          NaN   0.846154   0.769231        NaN    ...       0.856187   
 20140114          NaN   0.862876   0.769231        NaN    ...       0.829431   
 20140115          NaN   0.903010   0.755853        NaN    ...       0.795987   
 20140116          NaN   0.926421   0.749164        NaN    ...       0.789298   
 20171219          NaN        NaN        NaN        NaN    ...       0.257525   
 20171220          NaN        NaN        NaN        NaN    ...       0.210702   
 20171221          NaN        NaN        NaN        NaN    ...       0.816054   
 20171222          NaN        NaN        NaN        NaN    ...       0.816054   
 20171225          NaN        NaN        NaN        NaN    ...       0.866221   
 20171226          NaN        NaN        NaN        NaN    ...       0.829431   
 20171227          NaN        NaN        NaN        NaN    ...       0.147157   
 20171228          NaN        NaN        NaN        NaN    ...       0.826087   
 20171229          NaN        NaN        NaN        NaN    ...       0.812709   
 
 symbol      603000.SH  603160.SH  603288.SH  603699.SH  603799.SH  603833.SH  \
 trade_date                                                                     
 20140102     0.959866        NaN        NaN        NaN        NaN        NaN   
 20140103     0.949833        NaN        NaN        NaN        NaN        NaN   
 20140106     0.953177        NaN        NaN        NaN        NaN        NaN   
 20140107     0.916388        NaN        NaN        NaN        NaN        NaN   
 20140108     0.913043        NaN        NaN        NaN        NaN        NaN   
 20140109     0.946488        NaN        NaN        NaN        NaN        NaN   
 20140110     0.946488        NaN        NaN        NaN        NaN        NaN   
 20140113     0.976589        NaN        NaN        NaN        NaN        NaN   
 ...               ...        ...        ...        ...        ...        ...  
 20171212          NaN   0.973244        NaN        NaN   0.986622   0.916388   
 20171213          NaN   0.976589        NaN        NaN   0.986622   0.923077   
 20171214          NaN   0.976589        NaN        NaN   0.986622   0.939799   
 20171215          NaN   0.976589        NaN        NaN   0.986622   0.953177   
 20171218          NaN   0.973244        NaN        NaN   0.979933   0.953177   
 20171219          NaN   0.979933        NaN        NaN   0.976589   0.953177   
 20171220          NaN   0.963211        NaN        NaN   0.979933   0.953177   
 20171221          NaN   0.023411        NaN        NaN   0.026756   0.053512   
 20171222          NaN   0.026756        NaN        NaN   0.023411   0.053512   
 20171225          NaN   0.013378        NaN        NaN   0.030100   0.086957   
 20171226          NaN   0.023411        NaN        NaN   0.026756   0.056856   
 20171227          NaN   0.976589        NaN        NaN   0.973244   0.936455   
 20171228          NaN   0.016722        NaN        NaN   0.026756   0.063545   
 20171229          NaN   0.016722        NaN        NaN   0.026756   0.063545   
 
 symbol      603858.SH  603885.SH  603993.SH  
 trade_date                                   
 20140102          NaN        NaN   0.605351  
 20140103          NaN        NaN   0.548495  
 20140106          NaN        NaN   0.521739  
 20140107          NaN        NaN   0.565217  
 20140108          NaN        NaN   0.511706  
 20140109          NaN        NaN   0.438127  
 20140110          NaN        NaN   0.521739  
 20140113          NaN        NaN   0.648829  
 20140114          NaN        NaN   0.655518  
 20140115          NaN        NaN   0.591973  
 20140116          NaN        NaN   0.632107  
 20140117          NaN        NaN   0.632107  
 20140120          NaN        NaN   0.638796  
 20140121          NaN        NaN   0.759197  
 20140122          NaN        NaN   0.782609  
 20140123          NaN        NaN   0.752508  
 20140124          NaN        NaN   0.709030  
 20140127          NaN        NaN   0.354515  
 ...               ...        ...        ...  
 20171225     0.287625        NaN   0.953177  
 20171226     0.384615        NaN   0.963211  
 20171227     0.615385        NaN   0.033445  
 20171228     0.274247        NaN   0.969900  
 20171229     0.337793        NaN   0.969900  
 
 [977 rows x 488 columns],
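
For intuition about what orthogonalization does on a single date, here is a minimal sketch of sequential Gram-Schmidt on one cross-section of factor exposures. `gram_schmidt` and the random exposures are hypothetical, and `multi_factor.orthogonalize` may use a different variant or ordering.

```python
import numpy as np
import pandas as pd

def gram_schmidt(exposures: pd.DataFrame) -> pd.DataFrame:
    """Orthogonalize factor columns in order for one cross-section (rows = stocks)."""
    X = exposures - exposures.mean()   # demean so orthogonal means uncorrelated
    out = pd.DataFrame(index=X.index, columns=X.columns, dtype=float)
    for i, name in enumerate(X.columns):
        v = X[name].copy()
        for prev in X.columns[:i]:
            q = out[prev]
            # Remove the part of v already explained by an earlier factor.
            v = v - (v @ q) / (q @ q) * q
        out[name] = v
    return out

# Toy exposures: pb and pe share a common component, momentum is independent.
rng = np.random.default_rng(1)
base = rng.normal(size=50)
exposures = pd.DataFrame({
    "pb": base + 0.3 * rng.normal(size=50),
    "pe": base + 0.3 * rng.normal(size=50),
    "momentum": rng.normal(size=50),
})
ortho = gram_schmidt(exposures)
print(exposures.corr().round(3))
print(ortho.corr().round(3))
```

The first column is left unchanged apart from demeaning, and each later column keeps only what the earlier ones do not explain, so the result depends on the column order. This is one way to see why orthogonalization can discard information and blur a factor's economic meaning.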

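Using the factors before orthogonalization, combine them with each of the following schemes: equal weights; rolling mean IC weights over a time window; rolling IC_IR weights over a time window; weights that maximize the previous holding period's IC_IR; and weights that maximize the previous holding period's IC. Then test how each combined factor performs.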

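# rollback_period is the number of days in the rolling window, i.e. how many past periods of data are used to compute the current factor weights. A window of at least half a year is usually recommended for reasonably stable results

# Multi-factor combination: configuration for dynamic weighting
props = {
    'price':dv.get_ts("close_adj"),
    'high':dv.get_ts("high_adj"), # optional
    'low':dv.get_ts("low_adj"),# optional
    'ret_type': 'return',# alternatives are upside_ret/downside_ret, in which case the combined factor is optimized for potential upside or downside
    'benchmark_price': dv.data_benchmark,  # if empty, absolute return is computed; otherwise relative return
    'period': 30, # 30-day holding period
    'mask': mask,
    'can_enter': can_enter,
    'can_exit': can_exit,
    'forward': True,
    'commission': 0.0008,
    "covariance_type": "shrink",  # covariance matrix estimation method; can also be "simple"
    "rollback_period": 120}  # rolling window in days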
comb_factors = dict()
for method in ["equal_weight","ic_weight","ir_weight","max_IR","max_IC"]:
    comb_factors[method] = multi_factor.combine_factors(factor_dict,
                                                        standardize_type="rank",
                                                        winsorization=False,
                                                        weighted_method=method,
                                                        props=props)
    print(method)
    print(comb_factors[method].dropna(how="all").head())
equal_weight
symbol      000001.SZ  000002.SZ  000008.SZ  000009.SZ  000012.SZ  000024.SZ  \
trade_date                                                                     
20140102     0.762542   0.819398        NaN   0.143813   0.397993   0.628763   
20140103     0.745819   0.822742        NaN   0.187291   0.401338   0.648829   
20140106     0.712375   0.842809        NaN   0.190635   0.434783   0.678930   
20140107     0.705686   0.849498        NaN   0.190635   0.458194   0.702341   
20140108     0.678930   0.842809        NaN   0.204013   0.444816   0.769231   

symbol      000027.SZ  000039.SZ  000046.SZ  000059.SZ    ...      601998.SH  \
trade_date                                                ...                  
20140102          NaN   0.464883   0.364548        NaN    ...       0.765886   
20140103          NaN   0.431438   0.377926        NaN    ...       0.749164   
20140106          NaN   0.441472   0.351171        NaN    ...       0.715719   
20140107          NaN   0.404682   0.347826        NaN    ...       0.719064   
20140108          NaN   0.408027   0.324415        NaN    ...       0.712375   

symbol      603000.SH  603160.SH  603288.SH  603699.SH  603799.SH  603833.SH  \
trade_date                                                                     
20140102          0.0        NaN        NaN        NaN        NaN        NaN   
20140103          0.0        NaN        NaN        NaN        NaN        NaN   
20140106          0.0        NaN        NaN        NaN        NaN        NaN   
20140107          0.0        NaN        NaN        NaN        NaN        NaN   
20140108          0.0        NaN        NaN        NaN        NaN        NaN   

symbol      603858.SH  603885.SH  603993.SH  
trade_date                                   
20140102          NaN        NaN   0.344482  
20140103          NaN        NaN   0.347826  
20140106          NaN        NaN   0.364548  
20140107          NaN        NaN   0.351171  
20140108          NaN        NaN   0.367893  

[5 rows x 488 columns]
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
ic_weight
symbol      000001.SZ  000002.SZ  000008.SZ  000009.SZ  000012.SZ  000024.SZ  \
trade_date                                                                     
20140812     0.775920   0.826087        NaN   0.297659   0.655518   0.839465   
20140813     0.755853   0.789298        NaN   0.311037   0.642140   0.869565   
20140814     0.762542   0.799331        NaN   0.307692   0.652174   0.849498   
20140815     0.762542   0.852843        NaN   0.153846   0.595318   0.906355   
20140818     0.765886   0.913043        NaN   0.083612   0.585284   0.929766   

symbol      000027.SZ  000039.SZ  000046.SZ  000059.SZ    ...      601998.SH  \
trade_date                                                ...                  
20140812     0.829431   0.698997        NaN        NaN    ...       0.836120   
20140813     0.775920   0.705686        NaN        NaN    ...       0.856187   
20140814     0.846154   0.719064        NaN        NaN    ...       0.852843   
20140815     0.792642   0.719064        NaN        NaN    ...       0.826087   
20140818     0.749164   0.719064        NaN        NaN    ...       0.836120   

symbol      603000.SH  603160.SH  603288.SH  603699.SH  603799.SH  603833.SH  \
trade_date                                                                     
20140812     0.070234        NaN        NaN   0.207358        NaN        NaN   
20140813     0.083612        NaN        NaN   0.197324        NaN        NaN   
20140814     0.080268        NaN        NaN   0.204013        NaN        NaN   
20140815     0.090301        NaN        NaN   0.183946        NaN        NaN   
20140818     0.050167        NaN        NaN   0.163880        NaN        NaN   

symbol      603858.SH  603885.SH  603993.SH  
trade_date                                   
20140812          NaN        NaN   0.153846  
20140813          NaN        NaN   0.157191  
20140814          NaN        NaN   0.311037  
20140815          NaN        NaN   0.301003  
20140818          NaN        NaN   0.270903  

[5 rows x 488 columns]
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
ir_weight
symbol      000001.SZ  000002.SZ  000008.SZ  000009.SZ  000012.SZ  000024.SZ  \
trade_date                                                                     
20140812     0.769231   0.859532        NaN   0.311037   0.658863   0.842809   
20140813     0.732441   0.819398        NaN   0.331104   0.655518   0.876254   
20140814     0.732441   0.819398        NaN   0.331104   0.665552   0.849498   
20140815     0.739130   0.872910        NaN   0.170569   0.605351   0.919732   
20140818     0.732441   0.933110        NaN   0.073579   0.595318   0.936455   

symbol      000027.SZ  000039.SZ  000046.SZ  000059.SZ    ...      601998.SH  \
trade_date                                                ...                  
20140812     0.829431   0.732441        NaN        NaN    ...       0.816054   
20140813     0.769231   0.739130        NaN        NaN    ...       0.839465   
20140814     0.839465   0.749164        NaN        NaN    ...       0.826087   
20140815     0.785953   0.745819        NaN        NaN    ...       0.816054   
20140818     0.715719   0.752508        NaN        NaN    ...       0.819398   

symbol      603000.SH  603160.SH  603288.SH  603699.SH  603799.SH  603833.SH  \
trade_date                                                                     
20140812     0.080268        NaN        NaN   0.234114        NaN        NaN   
20140813     0.093645        NaN        NaN   0.247492        NaN        NaN   
20140814     0.090301        NaN        NaN   0.240803        NaN        NaN   
20140815     0.110368        NaN        NaN   0.227425        NaN        NaN   
20140818     0.080268        NaN        NaN   0.207358        NaN        NaN   

symbol      603858.SH  603885.SH  603993.SH  
trade_date                                   
20140812          NaN        NaN   0.140468  
20140813          NaN        NaN   0.147157  
20140814          NaN        NaN   0.321070  
20140815          NaN        NaN   0.307692  
20140818          NaN        NaN   0.247492  

[5 rows x 488 columns]
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
max_IR
symbol      000001.SZ  000002.SZ  000008.SZ  000009.SZ  000012.SZ  000024.SZ  \
trade_date                                                                     
20140813     0.331104   0.468227        NaN   0.682274   0.230769   0.645485   
20140814     0.374582   0.478261        NaN   0.682274   0.237458   0.561873   
20140815     0.414716   0.652174        NaN   0.387960   0.160535   0.705686   
20140818     0.408027   0.739130        NaN   0.173913   0.163880   0.765886   
20140819     0.501672   0.765886        NaN   0.120401   0.210702   0.826087   

symbol      000027.SZ  000039.SZ  000046.SZ  000059.SZ    ...      601998.SH  \
trade_date                                                ...                  
20140813     0.625418   0.779264        NaN        NaN    ...       0.461538   
20140814     0.755853   0.795987        NaN        NaN    ...       0.458194   
20140815     0.678930   0.799331        NaN        NaN    ...       0.451505   
20140818     0.591973   0.799331        NaN        NaN    ...       0.461538   
20140819     0.662207   0.792642        NaN        NaN    ...       0.481605   

symbol      603000.SH  603160.SH  603288.SH  603699.SH  603799.SH  603833.SH  \
trade_date                                                                     
20140813     0.371237        NaN        NaN   0.147157        NaN        NaN   
20140814     0.371237        NaN        NaN   0.170569        NaN        NaN   
20140815     0.391304        NaN        NaN   0.130435        NaN        NaN   
20140818     0.280936        NaN        NaN   0.100334        NaN        NaN   
20140819     0.026756        NaN        NaN   0.183946        NaN        NaN   

symbol      603858.SH  603885.SH  603993.SH  
trade_date                                   
20140813          NaN        NaN   0.100334  
20140814          NaN        NaN   0.418060  
20140815          NaN        NaN   0.341137  
20140818          NaN        NaN   0.224080  
20140819          NaN        NaN   0.257525  

[5 rows x 488 columns]
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
max_IC
symbol      000001.SZ  000002.SZ  000008.SZ  000009.SZ  000012.SZ  000024.SZ  \
trade_date                                                                     
20140221     0.030100   0.324415        NaN   0.903010   0.494983   0.127090   
20140224     0.020067   0.163880        NaN   0.956522   0.568562   0.076923   
20140225     0.193980   0.672241        NaN   0.451505   0.428094   0.521739   
20140226     0.341137   0.903010        NaN   0.120401   0.471572   0.849498   
20140227     0.471572   0.799331        NaN   0.170569   0.571906   0.849498   

symbol      000027.SZ  000039.SZ  000046.SZ  000059.SZ    ...      601998.SH  \
trade_date                                                ...                  
20140221          NaN   0.464883   0.511706        NaN    ...       0.387960   
20140224          NaN   0.464883   0.401338        NaN    ...       0.244147   
20140225          NaN   0.749164   0.240803        NaN    ...       0.317726   
20140226          NaN   0.983278   0.280936        NaN    ...       0.013378   
20140227          NaN   0.866221   0.180602        NaN    ...       0.160535   

symbol      603000.SH  603160.SH  603288.SH  603699.SH  603799.SH  603833.SH  \
trade_date                                                                     
20140221     0.311037        NaN        NaN        NaN        NaN        NaN   
20140224     0.571906        NaN        NaN        NaN        NaN        NaN   
20140225     0.040134        NaN        NaN        NaN        NaN        NaN   
20140226     0.461538        NaN        NaN        NaN        NaN        NaN   
20140227     0.193980        NaN        NaN        NaN        NaN        NaN   

symbol      603858.SH  603885.SH  603993.SH  
trade_date                                   
20140221          NaN        NaN   0.381271  
20140224          NaN        NaN   0.451505  
20140225          NaN        NaN   0.170569  
20140226          NaN        NaN   0.294314  
20140227          NaN        NaN   0.254181  

[5 rows x 488 columns]

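Compare the factors before and after combination over a 30-day holding period (the comparison is restricted to dates from September 2014 onward)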

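period = 30
ic_30 = multi_factor.get_factors_ic_df(comb_factors,
                                       price=dv.get_ts("close_adj"),
                                       high=dv.get_ts("high_adj"),  # optional
                                       low=dv.get_ts("low_adj"),  # optional
                                       n_quantiles=5,  # number of quantile groups
                                       mask=mask,  # filter mask
                                       can_enter=can_enter,  # whether a position can be entered
                                       can_exit=can_exit,  # whether a position can be exited
                                       period=period,  # holding period
                                       benchmark_price=dv.data_benchmark,  # benchmark price; optional. If omitted, holding-period returns are absolute returns
                                       commission=0.0008,
                                       )
# Append the single-factor ICs from ic[30] with their sign flipped (float_mv dropped)
ic_30 = pd.concat([ic_30, -1 * ic[30].drop("float_mv", axis=1)], axis=1)
ic_30.head()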
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 48%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 48%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 48%
Nan Data Count (should be zero) : 0;  Percentage of effective data: 56%
            equal_weight  ic_weight  ir_weight  max_IR  max_IC        pb        pe        ps  momentum
trade_date
20140102             NaN        NaN        NaN     NaN     NaN       NaN       NaN       NaN       NaN
20140103       -0.046945        NaN        NaN     NaN     NaN -0.053375 -0.018784 -0.004749 -0.050374
20140106       -0.075316        NaN        NaN     NaN     NaN -0.085169 -0.053065 -0.018863 -0.065761
20140107        0.027397        NaN        NaN     NaN     NaN  0.026080  0.023327  0.056947  0.013767
20140108        0.131549        NaN        NaN     NaN     NaN  0.084499  0.081695  0.158560  0.132101
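
Each row above is the IC of one trading day. As a rough illustration of what such a value measures, the sketch below computes a cross-sectional rank IC on toy data, assuming that IC means the Spearman correlation between factor values and the returns over the following holding period. The helper names `forward_returns` and `daily_rank_ic` are hypothetical and not part of jaqs_fxdayu; `get_factors_ic_df` additionally handles masks, entry/exit limits and commission, none of which is modelled here.

```python
import numpy as np
import pandas as pd


def forward_returns(price: pd.DataFrame, period: int) -> pd.DataFrame:
    """Return earned over the next `period` rows, aligned to the entry date."""
    return price.shift(-period) / price - 1.0


def daily_rank_ic(factor: pd.DataFrame, fwd_ret: pd.DataFrame) -> pd.Series:
    """Spearman correlation per date, computed as Pearson correlation of ranks."""
    ic = {}
    for date in factor.index:
        pair = pd.concat([factor.loc[date], fwd_ret.loc[date]],
                         axis=1, keys=["f", "r"]).dropna()
        # Too few valid stocks on this date: leave the IC undefined
        ic[date] = pair["f"].rank().corr(pair["r"].rank()) if len(pair) >= 3 else np.nan
    return pd.Series(ic, name="ic")


# Toy data (not from the dataview): 4 stocks, 3 trading days
dates = [20140102, 20140103, 20140106]
price = pd.DataFrame({"A": [10.0, 10.1, 10.6],
                      "B": [10.0, 10.2, 10.5],
                      "C": [10.0, 10.3, 10.4],
                      "D": [10.0, 10.4, 10.3]}, index=dates)
factor = pd.DataFrame({"A": [1.0] * 3, "B": [2.0] * 3,
                       "C": [3.0] * 3, "D": [4.0] * 3}, index=dates)

ic = daily_rank_ic(factor, forward_returns(price, period=1))
print(ic)
```

On the first day the factor ordering matches the next-day returns, on the second day it is exactly reversed, and the last day has no forward return, so its IC is undefined.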
ic_30_mean = dict()
ic_30_std = dict()
ir_30 = dict()
for name in ic_30.columns:    
    ic_30_mean[name]=ic_30[name].loc[20140901:].mean()
    ic_30_std[name]=ic_30[name].loc[20140901:].std()
    ir_30[name] = ic_30_mean[name]/ic_30_std[name]
import datetime

trade_date = pd.Series(ic_30.index)
trade_date = trade_date.apply(lambda x: datetime.datetime.strptime(str(x), '%Y%m%d'))
ic_30.index = trade_date
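
The same summary can be written without explicit loops. The sketch below uses a small made-up IC table (the column names `factor_a` and `factor_b` and all values are illustrative only) to show the mean, standard deviation and IR (mean divided by standard deviation) from a cut-off date onward, followed by a vectorised conversion of the integer `yyyymmdd` index with `pd.to_datetime`.

```python
import pandas as pd

# Made-up daily IC values indexed by integer yyyymmdd dates
ic = pd.DataFrame(
    {"factor_a": [0.10, 0.20, 0.30, 0.40],
     "factor_b": [0.05, -0.05, 0.05, -0.05]},
    index=[20140829, 20140901, 20140902, 20140903],
)
ic.index.name = "trade_date"

# Label-based slice on the sorted integer index: keep 2014-09-01 onward
recent = ic.loc[20140901:]

summary = pd.DataFrame({"ic_mean": recent.mean(), "ic_std": recent.std()})
summary["ir"] = summary["ic_mean"] / summary["ic_std"]
print(summary)

# Convert the integer index to datetimes in one call
ic.index = pd.to_datetime(ic.index.astype(str), format="%Y%m%d")
```

Note that `DataFrame.std` uses the sample standard deviation (`ddof=1`) by default, the same as the `Series.std` call in the loop above.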

Visual comparison

pd.Series(ic_30_mean).plot(kind="barh",xerr=pd.Series(ic_30_std),figsize=(15,5))
<matplotlib.axes._subplots.AxesSubplot at 0x18b5240e828>

print(ic_30_mean["equal_weight"])
print(ic_30_mean["ic_weight"])
print(ic_30_mean["pe"])
0.11461587810097988
0.10435470638726971
0.1067541063545408
pd.Series(ir_30).plot(kind="barh",figsize=(15,5))
<matplotlib.axes._subplots.AxesSubplot at 0x18b5086f9b0>

print(ir_30["equal_weight"])
print(ir_30["ic_weight"])
print(ir_30["pe"])
0.5528241142805751
0.48673093039146453
0.4986503963545165
ic_30[["equal_weight","ic_weight","pe"]].plot(kind="line",figsize=(15,5),)
<matplotlib.axes._subplots.AxesSubplot at 0x18b4ff6bf28>

ic_30.loc["2017-01-03":, ["equal_weight", "ic_weight", "pe"]].plot(kind="line", figsize=(15, 5))
<matplotlib.axes._subplots.AxesSubplot at 0x18b4fcc6940>

View the detailed report for the equal-weight composite factor

import matplotlib.pyplot as plt
from jaqs_fxdayu.research.signaldigger.analysis import analysis
from jaqs_fxdayu.research import SignalDigger

obj = SignalDigger()
obj.process_signal_before_analysis(signal=comb_factors["equal_weight"],
                                   price=dv.get_ts("close_adj"),
                                   high=dv.get_ts("high_adj"),  # optional
                                   low=dv.get_ts("low_adj"),  # optional
                                   n_quantiles=5,  # number of quantile groups
                                   mask=mask,  # filter mask
                                   can_enter=can_enter,  # whether a position can be entered
                                   can_exit=can_exit,  # whether a position can be exited
                                   period=30,  # holding period
                                   benchmark_price=dv.data_benchmark,  # benchmark price; optional. If omitted, holding-period returns are absolute returns
                                   commission=0.0008,
                                   )
obj.create_full_report()
plt.show()
Nan Data Count (should be zero) : 0;  Percentage of effective data: 56%


Value of signals of Different Quantiles Statistics
               min       max      mean       std  count    count %
quantile                                                          
1         0.000000  0.538462  0.103221  0.060599  53388  20.145655
2         0.180602  0.628763  0.308244  0.060081  53003  20.000377
3         0.371237  0.695652  0.510054  0.059086  52990  19.995472
4         0.565217  0.849498  0.708706  0.057843  53003  20.000377
5         0.755853  1.000000  0.904544  0.056265  52626  19.858119
Figure saved: E:\2018_Course\HighSchool\Final\5_因子研发工具实操Richard\returns_report.pdf
Information Analysis
                 ic
IC Mean       0.120
IC Std.       0.205
t-stat(IC)   17.957
p-value(IC)   0.000
IC Skew      -0.128
IC Kurtosis  -0.719
Ann. IR       0.584
Figure saved: E:\2018_Course\HighSchool\Final\5_因子研发工具实操Richard\information_report.pdf



<matplotlib.figure.Figure at 0x18b506e5048>
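
The quantile ranges in the statistics table above overlap (quantile 1 reaches up to 0.538 while quantile 2 starts at 0.181). That is what one would expect if stocks are bucketed separately on each trading day rather than against fixed global cut-offs. The sketch below shows one simple way to do per-date, equal-count bucketing; `assign_quantiles` is a hypothetical helper, and the binning rule inside jaqs_fxdayu may differ in how it treats ties and bin edges.

```python
import numpy as np
import pandas as pd


def assign_quantiles(signal: pd.DataFrame, n_quantiles: int) -> pd.DataFrame:
    """Bucket each row (one trading day) into 1..n_quantiles by rank."""
    rank = signal.rank(axis=1, method="first")   # 1..k within the day, NaN stays NaN
    count = signal.notna().sum(axis=1)           # k: number of valid stocks that day
    return np.ceil(rank.mul(n_quantiles).div(count, axis=0))


# Toy signal: 6 stocks, 2 days; S6 has no value on the second day
nan = float("nan")
signal = pd.DataFrame(
    [[0.10, 0.20, 0.30, 0.40, 0.50, 0.60],
     [0.50, 0.60, 0.70, 0.80, 0.90, nan]],
    index=[20140102, 20140103],
    columns=["S1", "S2", "S3", "S4", "S5", "S6"],
)

quantiles = assign_quantiles(signal, n_quantiles=3)
print(quantiles)
```

A signal value of 0.50 lands in the top bucket on the first day and in the bottom bucket on the second, which is how the pooled min/max ranges of different quantiles come to overlap.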

print(analysis(obj.signal_data,is_event=False,period=30))
{'ic':                 return_ic  upside_ret_ic  downside_ret_ic
IC Mean      1.199666e-01      -0.025340     2.590128e-01
IC Std.      2.054836e-01       0.203069     1.710594e-01
t-stat(IC)   1.795679e+01      -3.838066     4.657144e+01
p-value(IC)  2.933009e-62       0.000132    6.274188e-247
IC Skew     -1.281287e-01       0.369542    -4.648359e-01
IC Kurtosis -7.191112e-01      -0.728975    -1.038014e-01
Ann. IR      5.838256e-01      -0.124786     1.514168e+00, 'ret':              long_ret  long_short_ret  top_quantile_ret  bottom_quantile_ret  \
t-stat       5.009707       11.970392         27.353712           -21.213261   
p-value      0.000000        0.000000          0.000000             0.000000   
skewness    -0.049712        0.305483          2.104621             1.352262   
kurtosis     4.585943        1.671780         13.188646             6.368882   
Ann. Ret     0.034979        0.085021          0.097492            -0.105558   
Ann. Vol     0.075573        0.076875          0.287875             0.404814   
Ann. IR      0.462853        1.105960          0.338663            -0.260757   
occurance  946.000000      946.000000      52626.000000         53388.000000   

              tmb_ret  all_sample_ret  
t-stat      12.331461       -2.447519  
p-value      0.000000        0.014390  
skewness     0.245940        1.495618  
kurtosis     1.438639        9.227692  
Ann. Ret     0.203389       -0.004609  
Ann. Vol     0.178518        0.341296  
Ann. IR      1.139320       -0.013503  
occurance  946.000000   265010.000000  , 'space':                long_space  top_quantile_space  bottom_quantile_space  \
Up_sp Mean       0.128720            0.126579               0.136456   
Up_sp Std        0.085865            0.140843               0.158185   
Up_sp IR         1.499101            0.898725               0.862640   
Up_sp Pct5       0.041861            0.004368               0.004635   
Up_sp Pct25      0.076599            0.036603               0.038489   
Up_sp Pct50      0.103032            0.085234               0.090519   
Up_sp Pct75      0.143446            0.165011               0.176643   
Up_sp Pct95      0.331293            0.391718               0.421635   
Up_sp Occur    946.000000        52626.000000           53388.000000   
Down_sp Mean    -0.137471           -0.108866              -0.191665   
Down_sp Std      0.088789            0.202787               0.282507   
Down_sp IR      -1.548294           -0.536849              -0.678443   
Down_sp Pct5    -0.343109           -0.384268              -1.000800   
Down_sp Pct25   -0.147208           -0.097760              -0.171967   
Down_sp Pct50   -0.109330           -0.046730              -0.086841   
Down_sp Pct75   -0.089392           -0.019800              -0.039916   
Down_sp Pct95   -0.063714           -0.003965              -0.008188   
Down_sp Occur  946.000000        52626.000000           53388.000000   

                tmb_space  all_sample_space  
Up_sp Mean       0.320615          0.130071  
Up_sp Std        0.162529          0.143170  
Up_sp IR         1.972659          0.908508  
Up_sp Pct5       0.152553          0.004635  
Up_sp Pct25      0.215860          0.038288  
Up_sp Pct50      0.269578          0.088612  
Up_sp Pct75      0.355337          0.172445  
Up_sp Pct95      0.648456          0.395784  
Up_sp Occur    946.000000     265010.000000  
Down_sp Mean    -0.247340         -0.152250  
Down_sp Std      0.110376          0.253253  
Down_sp IR      -2.240885         -0.601178  
Down_sp Pct5    -0.477266         -1.000800  
Down_sp Pct25   -0.304182         -0.133021  
Down_sp Pct50   -0.211650         -0.063759  
Down_sp Pct75   -0.167795         -0.027543  
Down_sp Pct95   -0.121475         -0.005339  
Down_sp Occur  946.000000     265010.000000  }
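
The figures in the `ic` block are tied together by simple arithmetic. The check below assumes that the t-statistic is a one-sample t-test of the daily ICs against zero, and that the number of IC observations is n = 946 (taken from the `occurance` row, which is an assumption). Under those assumptions the printed `return_ic` values are reproduced, and the reported `Ann. IR` equals the plain ratio of IC mean to IC standard deviation.

```python
import math

# Values copied from the return_ic column of the output above
ic_mean = 1.199666e-01
ic_std = 2.054836e-01
n_obs = 946  # assumption: number of daily IC values, read from the 'occurance' row

ir = ic_mean / ic_std            # compare with the printed Ann. IR of 5.838256e-01
t_stat = ir * math.sqrt(n_obs)   # compare with the printed t-stat(IC) of 1.795679e+01

print(round(ir, 4), round(t_stat, 3))
```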

Next, test the absolute-return performance of the equal-weight composite factor

obj.process_signal_before_analysis(signal=comb_factors["equal_weight"],
                                   price=dv.get_ts("close_adj"),
                                   high=dv.get_ts("high_adj"),  # optional
                                   low=dv.get_ts("low_adj"),  # optional
                                   n_quantiles=5,  # number of quantile groups
                                   mask=mask,  # filter mask
                                   can_enter=can_enter,  # whether a position can be entered
                                   can_exit=can_exit,  # whether a position can be exited
                                   period=30,  # holding period
                                   # benchmark_price=dv.data_benchmark,  # benchmark omitted here, so holding-period returns are absolute returns
                                   commission=0.0008,
                                   )
obj.create_full_report()
plt.show()
Nan Data Count (should be zero) : 0;  Percentage of effective data: 56%


Value of signals of Different Quantiles Statistics
               min       max      mean       std  count    count %
quantile                                                          
1         0.000000  0.538462  0.103221  0.060599  53388  20.145655
2         0.180602  0.628763  0.308244  0.060081  53003  20.000377
3         0.371237  0.695652  0.510054  0.059086  52990  19.995472
4         0.565217  0.849498  0.708706  0.057843  53003  20.000377
5         0.755853  1.000000  0.904544  0.056265  52626  19.858119
Figure saved: E:\2018_Course\HighSchool\Final\5_因子研发工具实操Richard\returns_report.pdf
Information Analysis
                 ic
IC Mean       0.120
IC Std.       0.205
t-stat(IC)   17.957
p-value(IC)   0.000
IC Skew      -0.128
IC Kurtosis  -0.719
Ann. IR       0.584
Figure saved: E:\2018_Course\HighSchool\Final\5_因子研发工具实操Richard\information_report.pdf



<matplotlib.figure.Figure at 0x18b51124cf8>
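
The Information Analysis figures are identical in the two reports (IC mean 0.120 with and without the benchmark). A minimal sketch of why that is plausible, assuming the benchmark-relative return is simply the stock's holding-period return minus the benchmark's return over the same window: the benchmark return is the same for every stock on a given day, so subtracting it shifts all returns equally and leaves their cross-sectional ranking, and hence a rank-based IC, unchanged. The prices below are made up.

```python
import pandas as pd

dates = [20140102, 20140103, 20140106]
price = pd.DataFrame({"A": [10.0, 11.0, 12.1],
                      "B": [20.0, 21.0, 21.0]}, index=dates)
benchmark = pd.Series([100.0, 105.0, 105.0], index=dates)
period = 1

# Absolute holding-period return, aligned to the entry date
abs_ret = price.shift(-period) / price - 1.0
# Benchmark return over the same window
bench_ret = benchmark.shift(-period) / benchmark - 1.0
# Excess return: subtract the same benchmark value from every stock on each date
excess_ret = abs_ret.sub(bench_ret, axis=0)

print(excess_ret)
print(abs_ret.rank(axis=1).equals(excess_ret.rank(axis=1)))
```

The return-based statistics (annualised return, volatility and so on) do change, since they depend on the level of returns and not only on their ordering.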

Save the Quantile 5 stock selection to an Excel file

# Keep only quantile-5 rows, pivot symbols into columns, and encode membership as 1/0
excel_data = obj.signal_data[obj.signal_data['quantile']==5]["quantile"].unstack().replace(np.nan, 0).replace(5, 1)
print(excel_data.head())
excel_data.to_excel('./equal_weight_quantile_5.xlsx')
symbol      000001.SZ  000002.SZ  000012.SZ  000024.SZ  000027.SZ  000039.SZ  \
trade_date                                                                     
20140103          0.0        1.0        0.0        0.0        0.0        0.0   
20140106          0.0        1.0        0.0        0.0        0.0        0.0   
20140107          0.0        1.0        0.0        0.0        0.0        0.0   
20140108          0.0        1.0        0.0        0.0        0.0        0.0   
20140109          0.0        1.0        0.0        0.0        0.0        0.0   

symbol      000063.SZ  000069.SZ  000100.SZ  000157.SZ    ...      601919.SH  \
trade_date                                                ...                  
20140103          1.0        0.0        0.0        1.0    ...            0.0   
20140106          1.0        0.0        0.0        1.0    ...            0.0   
20140107          0.0        0.0        0.0        1.0    ...            0.0   
20140108          0.0        0.0        0.0        1.0    ...            0.0   
20140109          0.0        0.0        0.0        1.0    ...            0.0   

symbol      601928.SH  601933.SH  601939.SH  601988.SH  601989.SH  601991.SH  \
trade_date                                                                     
20140103          0.0        0.0        0.0        1.0        0.0        1.0   
20140106          0.0        0.0        0.0        1.0        0.0        1.0   
20140107          0.0        0.0        0.0        0.0        0.0        1.0   
20140108          0.0        0.0        0.0        0.0        0.0        0.0   
20140109          0.0        0.0        0.0        1.0        0.0        0.0   

symbol      601992.SH  601997.SH  601998.SH  
trade_date                                   
20140103          1.0        0.0        0.0  
20140106          1.0        0.0        0.0  
20140107          1.0        0.0        0.0  
20140108          1.0        0.0        0.0  
20140109          1.0        0.0        0.0  

[5 rows x 244 columns]
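
The output has 244 columns rather than 488 because only symbols that appear in quantile 5 at least once survive the filter. The sketch below reproduces the reshaping on toy data, assuming `signal_data` is indexed by a (`trade_date`, `symbol`) MultiIndex with a `quantile` column, which is what the `unstack()` call above implies. It uses `notna().astype(int)` as an alternative to the two `replace` calls.

```python
import pandas as pd

# Toy stand-in for obj.signal_data (structure assumed, values made up)
index = pd.MultiIndex.from_product(
    [[20140103, 20140106], ["000001.SZ", "000002.SZ", "000063.SZ"]],
    names=["trade_date", "symbol"],
)
signal_data = pd.DataFrame({"quantile": [1, 5, 5, 3, 5, 2]}, index=index)

# Keep the top quantile, pivot symbols into columns, mark membership as 1/0
top = signal_data.loc[signal_data["quantile"] == 5, "quantile"]
membership = top.unstack("symbol").notna().astype(int)
print(membership)
```

Here `000001.SZ` never reaches quantile 5, so it gets no column at all, while `000063.SZ` is marked 1 on the first day and 0 on the second.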