Audit Columns - ja-guzzle/guzzle_docs GitHub Wiki
Maintaining audit (control) columns in data tables across all layers is important for both the product support and development teams, because these columns help troubleshoot issues. Below is the set of columns you should always maintain, using a standard naming convention so that they stay consistent and readable.
- Use a consistent prefix for all audit columns, e.g. "w_"
- Capture Guzzle runtime audit IDs, such as batch id and job instance id, which help with troubleshooting
- Capture the source system, table, or file name when ingesting data into the staging layer
| Sr. | Audit Column Name | Data Type | Purpose | Guzzle Parameter/Transformation | Partitioned Column | Applicable Data Layers | Comments |
|---|---|---|---|---|---|---|---|
| 1. | w_refresh_ts | timestamp | Captures the system timestamp of the data refresh | None | No | STG, FND, PLP | Can be mapped using current_timestamp |
| 2. | w_job_instance_id | bigint | Captures the run-time job instance id generated by Guzzle for each job execution | ${job_instance_id} | No | STG, FND, PLP | Can be traced in the Guzzle-maintained run-time audit table job_info |
| 3. | w_batch_id | bigint | Captures the run-time batch id generated by Guzzle for each batch context execution | ${batch_id} | Likely | STG, FND, PLP | Can be traced in the Guzzle-maintained run-time audit table batch_control |
| 4. | w_src_file_name | string | Captures the name of the source file processed by a job into the target table | None | No | STG, FND | Additional columns src_filename, src_control_filename, and src_fullfilepath are made available by Guzzle if the source property "Include Source File Name As Column" is checked in the Guzzle ingestion job |
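
As a rough illustration of the table above, the sketch below keeps the audit columns in one Python dictionary and derives two things from it: a SQL select-list fragment for a given layer, and a check for audit columns missing from a table. It is not part of Guzzle. The helper names `audit_select_list` and `missing_audit_columns` are hypothetical, the `${job_instance_id}` and `${batch_id}` placeholders are filled in by hand with `string.Template` only to show the resulting expressions, and mapping `w_src_file_name` from `src_filename` is an assumption.

```python
from string import Template

# Audit columns from the table: name -> (mapping expression, applicable layers).
# The ${...} placeholders mirror the Guzzle parameters named in the table.
# Assumption: w_src_file_name is mapped from the src_filename column.
AUDIT_COLUMNS = {
    "w_refresh_ts": ("current_timestamp", {"STG", "FND", "PLP"}),
    "w_job_instance_id": ("cast(${job_instance_id} as bigint)", {"STG", "FND", "PLP"}),
    "w_batch_id": ("cast(${batch_id} as bigint)", {"STG", "FND", "PLP"}),
    "w_src_file_name": ("src_filename", {"STG", "FND"}),
}


def audit_select_list(layer, job_instance_id, batch_id):
    """Return a SQL select-list fragment with the audit columns for one layer."""
    params = {"job_instance_id": job_instance_id, "batch_id": batch_id}
    parts = []
    for name, (expr, layers) in AUDIT_COLUMNS.items():
        if layer in layers:
            # Substitute the placeholders the way a resolved parameter would appear.
            parts.append(f"{Template(expr).substitute(params)} as {name}")
    return ",\n".join(parts)


def missing_audit_columns(layer, table_columns):
    """List the audit columns expected for a layer but absent from a table."""
    present = set(table_columns)
    return [
        name
        for name, (_, layers) in AUDIT_COLUMNS.items()
        if layer in layers and name not in present
    ]


if __name__ == "__main__":
    print(audit_select_list("STG", job_instance_id=101, batch_id=7))
    print(missing_audit_columns("PLP", ["customer_id", "w_batch_id"]))
```

Keeping the column list in one place means the mapping expressions and the schema check cannot drift apart when an audit column is added or renamed.
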