Online WOE Transformation - shuiwanghuohuo/scorecard_wiki GitHub Wiki
value2woe(data_rdd, filter_evidence, black_flag="label", type="rdd", exclude_column=[], not_in_list=["None", "NaN", "NA", "nan", None, "-999", "-999.0", -999, "-1111", "-1111.0", -1111])
Converts feature values to WOE (weight of evidence).
Parameter Description
---------------------
data_rdd : pyspark.rdd.PipelinedRDD
RDD obtained by converting a Spark DataFrame.
filter_evidence : pandas.core.frame.DataFrame
Result produced by the feature_select function, filtered down to the columns you need; these columns are the ones converted to WOE.
black_flag : string,(default="label")
Name of the label column.
type : string,(default="rdd")
Type of the returned data. Use "rdd" if later computation needs to run on the cluster; otherwise "df" can be used.
exclude_column : list,(default=[])
not_in_list : list,(default=["None", "NaN", "NA", "nan",None, "-999", "-999.0", -999,"-1111","-1111.0",-1111])
List of null values; any value that appears in this list is treated as null.
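
To illustrate how a null-value list like `not_in_list` behaves, here is a minimal sketch in plain Python. The helper `is_missing` is hypothetical and is not part of the library; it assumes the check is a simple membership test, which may differ from what `value2woe` does internally.

```python
# Default null-value list, copied from the value2woe signature.
NOT_IN_LIST = ["None", "NaN", "NA", "nan", None,
               "-999", "-999.0", -999, "-1111", "-1111.0", -1111]


def is_missing(value, not_in_list=NOT_IN_LIST):
    """Hypothetical helper: True if `value` appears in the null-value list."""
    # Membership uses ==, so the float -999.0 matches the int -999.
    return value in not_in_list


samples = ["-999", -999.0, None, "NA", 0, "", 35]
flags = [is_missing(v) for v in samples]
print(flags)
```

Under this assumption, a real float NaN such as `float("nan")` is not matched, because NaN never compares equal to anything and the list only holds the strings "NaN" and "nan". If the data can contain float NaN values, they may need separate handling.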
Return
------
woe_data : Spark RDD or DataFrame containing the label column and the (selected) feature columns
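
For readers new to WOE, the idea of replacing each feature value with its weight of evidence can be sketched in plain Python. This is an illustration only and makes several assumptions: the functions `build_woe_map` and `to_woe` are hypothetical, the feature is categorical, WOE is defined as ln(bad share / good share) (some scorecard tools use the inverse), and 0.5 is added to each count to avoid taking the log of zero. The actual `value2woe` relies on the information in `filter_evidence`, whose layout is not described on this page.

```python
import math
from collections import Counter


def build_woe_map(values, labels):
    """Hypothetical helper: return {category: woe} for one categorical feature.

    labels use 1 for bad and 0 for good; both classes must be present.
    """
    bad = Counter(v for v, y in zip(values, labels) if y == 1)
    good = Counter(v for v, y in zip(values, labels) if y == 0)
    bad_total, good_total = sum(bad.values()), sum(good.values())
    woe_map = {}
    for category in set(bad) | set(good):
        # Add 0.5 to each count so a category seen in one class only
        # does not produce log(0).
        bad_share = (bad[category] + 0.5) / bad_total
        good_share = (good[category] + 0.5) / good_total
        woe_map[category] = math.log(bad_share / good_share)
    return woe_map


def to_woe(values, woe_map, default=0.0):
    """Replace each value with its WOE; unseen values get `default`."""
    return [woe_map.get(v, default) for v in values]


values = ["a", "a", "b", "b"]
labels = [1, 0, 0, 0]
woe_map = build_woe_map(values, labels)
converted = to_woe(["a", "b", "c"], woe_map)
print(woe_map, converted)
```

In this toy data the category "a" is over-represented among bad samples, so its WOE is positive, while "b" appears only among good samples and gets a negative WOE.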