Write a PySpark script. The source is a CSV that looks like this:
Column A,Combined Values
source_system,"Derived: 'CL' (constant), ifrs9.committed_unutilized_v.LCPF_LOAD_COUNTRY (hardcoded as 'Unused Limits'), ifrs9.committed_unutilized_v.source_system, ifrs9.deals_funded_unfunded_report.deal_type"
facility_id,"ifrs9.committed_unutilized_v.LCPF_CMTMNT_REF, ifrs9.deals_funded_unfunded_report.facility_ref, ifrs9.deals_funded_unfunded_report_all_lg.facility_ref"
customer_id,"ifrs9.committed_unutilized_v.LCPF_CSTMR_MNMNC, ifrs9.deals_funded_unfunded_report.customer_equation_id_new, ifrs9.deals_funded_unfunded_report.customer_id_new, ifrs9.deals_funded_unfunded_report_all_lg.customer_id or UNFUN_CUST.customer_equation_id"
branch,"RAW_DATA_VAULT.BRANCHES.CAPF_BRANCH_NUMBER, RAW_DATA_VAULT.BRANCHES.CAPF_BRNCH_NUMBER, ifrs9.deals_funded_unfunded_report.account_branch_number_new, ifrs9.deals_funded_unfunded_report_all_lg.account_branch_number"
basic,"ifrs9.committed_unutilized_v.LCPF_CSTMR_MNMNC, ifrs9.deals_funded_unfunded_report.external_account_number_new, ifrs9.deals_funded_unfunded_report_all_lg.external_account_number"
suffix,"ifrs9.committed_unutilized_v.SUFFIX, ifrs9.deals_funded_unfunded_report.account_suffix_new, ifrs9.deals_funded_unfunded_report_all_lg.account_suffix"
The PySpark script has to read this file and parse the "Combined Values" column, whose entries are in the format schema.table_name.column_name.
It should then check whether each referenced column actually exists in that table of the respective schema.
P.S. A single cell can have multiple Hive columns, so make sure all of them are read.
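A minimal sketch of one way to do this, assuming the cluster's Hive metastore is reachable through spark.catalog (hence enableHiveSupport) and that PySpark's built-in CSV reader is enough to handle the quoted multi-value cells. The file path, app name, and the regex used to pick out schema.table.column tokens are illustrative assumptions, not part of the original spec:

```python
import re

from pyspark.sql import SparkSession
from pyspark.sql.utils import AnalysisException

# Hypothetical input path; point this at the real mapping file.
MAPPING_CSV = "/path/to/column_mapping.csv"

# Captures fully qualified schema.table.column tokens; free text such as
# "Derived: 'CL' (constant)" has no dotted triple and is skipped.
TOKEN_RE = re.compile(r"\b(\w+)\.(\w+)\.(\w+)\b")

spark = (
    SparkSession.builder
    .appName("hive-column-existence-check")
    .enableHiveSupport()  # so spark.catalog can see the Hive metastore
    .getOrCreate()
)

# Spark's CSV reader honours the double quotes around "Combined Values",
# so commas inside a cell do not split it into extra columns.
mapping = spark.read.option("header", True).csv(MAPPING_CSV)

# The mapping file is small, so collecting it to the driver is fine.
rows = mapping.collect()

# Cache the column list per table so each table is described only once.
columns_cache = {}  # (schema, table) -> set of column names, or None if missing


def table_columns(schema, table):
    """Return the lower-cased column set for schema.table, or None if absent."""
    key = (schema.lower(), table.lower())
    if key not in columns_cache:
        try:
            cols = spark.catalog.listColumns(table, schema)
            columns_cache[key] = {c.name.lower() for c in cols}
        except AnalysisException:
            columns_cache[key] = None
    return columns_cache[key]


results = []
for row in rows:
    target = row["Column A"]
    combined = row["Combined Values"] or ""
    # A single cell can hold several schema.table.column references;
    # findall picks up every one of them.
    for schema, table, column in TOKEN_RE.findall(combined):
        cols = table_columns(schema, table)
        if cols is None:
            status = "TABLE_NOT_FOUND"
        elif column.lower() in cols:
            status = "COLUMN_EXISTS"
        else:
            status = "COLUMN_MISSING"
        results.append((target, f"{schema}.{table}.{column}", status))

report = spark.createDataFrame(
    results, ["target_column", "source_reference", "status"]
)
report.show(truncate=False)
```

Two deliberate choices in this sketch: column names are compared case-insensitively because Hive identifiers are case-insensitive, and two-part references like UNFUN_CUST.customer_equation_id are skipped, since they lack a schema and so don't fit the stated schema.table_name.column_name format. If those should be checked against a default schema instead, the regex and lookup would need a small extension.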