Ingestion - ja-guzzle/guzzle_docs GitHub Wiki
- Are the column names referenced in the Ingestion module case sensitive in each of the following places?
Source Section:
- Column mapping for JSON/XML and delimited file
- Column names in grok/regexp parser for Text file
- SQL and filters for Hive, Delta and JDBC
Schema Section:
- Column name
- Validate SQL and Transform SQL
Target Section:
- Partition columns
- Are the table names / file names referenced in the Ingestion module case sensitive in each of the following places?
Source Section:
- table name / SQL for JDBC
- file name pattern for file sources
Schema Section:
- table names in sub-queries in Validate SQL and Transform SQL
Target Section:
- table name for hive/jdbc/delta
- file name for hive/jdbc/delta
Reject Section:
- table name for hive/jdbc/delta
- file name for hive/jdbc/delta
- When JDBC is read with parallelism, is the validation threshold applied at the total pull level or for each partition?
Answer: It is applied at the total pull level. The entire JDBC feed is treated as one data frame, even though the data is read from the JDBC source by multiple executors as per the config.
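The difference between the two scopes can be illustrated with a minimal plain-Python sketch. This is not Guzzle's implementation: the function names, the modelling of partitions as lists of booleans, and the treatment of the threshold as a maximum count of failed records are all assumptions made for illustration.

```python
from typing import List

# Hypothetical model: each partition is a list of booleans,
# where True means the record passed validation.
Partition = List[bool]


def count_failures(partition: Partition) -> int:
    """Number of records in one partition that failed validation."""
    return sum(1 for passed in partition if not passed)


def exceeds_threshold_pull_level(partitions: List[Partition], threshold: int) -> bool:
    """Pull-level check: failures from all partitions are summed first."""
    total_failures = sum(count_failures(p) for p in partitions)
    return total_failures > threshold


def exceeds_threshold_per_partition(partitions: List[Partition], threshold: int) -> bool:
    """Per-partition check, shown only for contrast with the pull-level one."""
    return any(count_failures(p) > threshold for p in partitions)


# Three partitions read in parallel, each containing two failed records.
partitions = [
    [True, False, False],
    [False, True, False],
    [False, False, True],
]
threshold = 3  # assumed here to be a maximum count of failed records

pull_level = exceeds_threshold_pull_level(partitions, threshold)
per_partition = exceeds_threshold_per_partition(partitions, threshold)
```

With these inputs the pull-level check trips, because the six failures across all partitions exceed the threshold of three. A per-partition check would not trip, since no single partition has more than two failures.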
- Why don't we support a zero failure threshold? Some users want to reject the data if there is even one failure.
Answer: We do support a zero failure threshold. It works when it is specified using the editor.
- What happens when the rejection section is not specified, but the "Schema and Validation" section has validation rules defined and some records fail validation?
Answer: The valid records are written to the target and the invalid records are ignored.
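The behaviour described in this answer can be sketched in plain Python. This is only an illustration under stated assumptions, not Guzzle code: `route_records`, the list-based target and reject sinks, and the sample validation rule are hypothetical, and the branch that writes to a reject sink when one is configured is an assumed counterpart that the answer above does not describe.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]


def route_records(
    records: List[Record],
    is_valid: Callable[[Record], bool],
    target: List[Record],
    reject: Optional[List[Record]] = None,
) -> None:
    """Write valid records to the target; handle invalid ones by config."""
    for record in records:
        if is_valid(record):
            target.append(record)
        elif reject is not None:
            # Assumed behaviour when a reject sink is configured.
            reject.append(record)
        # With no reject sink, the invalid record is simply ignored.


# Hypothetical validation rule: amount must not be negative.
def amount_is_non_negative(record: Record) -> bool:
    return int(record["amount"]) >= 0


records = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": -5},
    {"id": 3, "amount": 7},
]

target: List[Record] = []
route_records(records, amount_is_non_negative, target)  # no reject sink given
```

After this call `target` holds only the records with ids 1 and 3. The record with id 2 is not written anywhere, which mirrors the case where no rejection section is configured.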