Online WOE Conversion - shuiwanghuohuo/scorecard_wiki GitHub Wiki

value2woe(data_rdd, filter_evidence, black_flag="label", type="rdd", exclude_column=[], not_in_list=["None", "NaN", "NA", "nan", None, "-999", "-999.0", -999, "-1111", "-1111.0", -1111])
Convert indicator (feature) values to WOE (weight of evidence).

Parameter Description
---------------------
data_rdd : pyspark.rdd.PipelinedRDD
    RDD obtained by converting a Spark DataFrame.

filter_evidence : pandas.core.frame.DataFrame
    Result produced by the feature_select function, filtered down to the columns you need. These columns are converted to WOE.

black_flag : string,(default="label")
    Name of the label column.

type : string,(default="rdd")
    Type of the returned data. Use "rdd" if further computation will run on the cluster; otherwise "df" can be used.

exclude_column : list,(default=[])

not_in_list : list,(default=["None", "NaN", "NA", "nan",None, "-999", "-999.0", -999,"-1111","-1111.0",-1111])
    List of null markers. Any value that appears in this list is treated as null.
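
A minimal sketch of how a null list like this could be applied, assuming a plain Python membership test. `is_missing` is a hypothetical helper used only for illustration, and the way value2woe checks nulls internally is not documented here.

```python
# Default null markers, copied from the signature above.
NOT_IN_LIST = ["None", "NaN", "NA", "nan", None,
               "-999", "-999.0", -999, "-1111", "-1111.0", -1111]

def is_missing(value, not_in_list=NOT_IN_LIST):
    """Return True if value should be treated as null (hypothetical helper)."""
    # `in` compares by identity first, then by equality, so the float
    # -999.0 matches the int -999 and the string "-999" matches itself.
    return value in not_in_list

print(is_missing("NA"))          # string marker
print(is_missing(-999.0))        # float equal to the int marker -999
print(is_missing(float("nan")))  # a float NaN never equals anything
```

Under this assumption a real float NaN is not caught by the list, because NaN compares unequal to every value; only the strings "NaN" and "nan" are matched. If the data can contain float NaN, a separate `math.isnan` check would be needed.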

Return
------
woe_data : Spark RDD or DataFrame containing the label column and the selected indicator columns
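
To illustrate the idea of replacing values with WOE, here is a small pure-Python sketch for a single column. It is not the implementation of value2woe. It makes several assumptions: each distinct value is its own bucket, WOE is taken as ln(bad share / good share) with label 1 meaning "bad", 0.5 is added to each count to avoid log(0), and unseen buckets map to 0.0. The names `fit_woe`, `apply_woe` and `MISSING` are hypothetical. How value2woe derives its buckets from filter_evidence is not described on this page.

```python
import math
from collections import defaultdict

MISSING = "__missing__"  # hypothetical bucket name for null values

def fit_woe(values, labels, not_in_list):
    """Build a bucket -> WOE mapping for one column (illustrative only)."""
    counts = defaultdict(lambda: [0, 0])  # bucket -> [good_count, bad_count]
    for v, y in zip(values, labels):
        bucket = MISSING if v in not_in_list else v
        counts[bucket][1 if y == 1 else 0] += 1
    good_total = sum(c[0] for c in counts.values())
    bad_total = sum(c[1] for c in counts.values())
    woe = {}
    for bucket, (good, bad) in counts.items():
        # 0.5 smoothing keeps empty cells finite (an assumption of this sketch).
        bad_share = (bad + 0.5) / (bad_total + 0.5)
        good_share = (good + 0.5) / (good_total + 0.5)
        woe[bucket] = math.log(bad_share / good_share)
    return woe

def apply_woe(values, woe, not_in_list):
    """Replace each raw value with its bucket's WOE; unseen buckets get 0.0."""
    return [woe.get(MISSING if v in not_in_list else v, 0.0) for v in values]

nulls = ["NA", None, -999]
values = ["a", "a", "b", "b", "NA", None]
labels = [1, 0, 0, 0, 1, 1]
woe = fit_woe(values, labels, nulls)
converted = apply_woe(values, woe, nulls)
print(woe)
print(converted)
```

In this toy data, "NA" and None fall into the same bucket and therefore receive the same WOE, which mirrors the role of not_in_list above. On a cluster, the same lookup could be applied per record with an RDD `map`.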