Testing and Recon Approach for Data project - ja-guzzle/guzzle_docs GitHub Wiki
Background
We often have to deal with recon and data testing in the form of unit testing, SIT and UAT.
Various approaches have been followed, and each has been contextual: what the developer felt was best in terms of being fool-proof, what the customer agrees to and understands, and so forth.
Reasons why Cucumber is perfect for testing the business rules
Your source data does not have all the scenarios which you coded for.
You can't expect a BA to write the SQL, which can itself have recon bugs. It's easy to mimic source and target data. Let's ask a tester how many times his recon query was wrong.
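To illustrate the point about mimicking source and target data, here is a minimal sketch in plain Python rather than in Cucumber itself. The business rule (`apply_rule`), the column names and the scenario rows are all hypothetical and exist only for illustration. In a real Cucumber setup the rows would more likely sit in a Gherkin table, and the step definitions would call the actual transformation.

```python
# Hypothetical business rule (an assumption for this sketch): orders of 1000
# or more are tiered "HIGH", everything else "NORMAL". A missing amount is
# treated as 0.
def apply_rule(row):
    amount = row.get("amount") or 0
    return {
        "order_id": row["order_id"],
        "tier": "HIGH" if amount >= 1000 else "NORMAL",
    }


# Given: hand-written source rows covering scenarios the real data may lack,
# each paired with the target row we expect the rule to produce.
SCENARIOS = [
    # (description, mimicked source row, expected target row)
    ("large order", {"order_id": 1, "amount": 1500}, {"order_id": 1, "tier": "HIGH"}),
    ("boundary value", {"order_id": 2, "amount": 1000}, {"order_id": 2, "tier": "HIGH"}),
    ("small order", {"order_id": 3, "amount": 10}, {"order_id": 3, "tier": "NORMAL"}),
    ("missing amount", {"order_id": 4, "amount": None}, {"order_id": 4, "tier": "NORMAL"}),
]


def run_scenarios(scenarios, rule):
    """Return the descriptions of scenarios whose output differs from the expectation."""
    failures = []
    for description, source_row, expected in scenarios:
        actual = rule(source_row)   # When: the rule is applied to the source row
        if actual != expected:      # Then: the result must equal the expected target row
            failures.append(description)
    return failures


if __name__ == "__main__":
    # An empty list means every scenario passed.
    print(run_scenarios(SCENARIOS, apply_rule))
```

Because the expected target rows are written by hand next to the source rows, nobody has to write a recon query for these cases, and scenarios missing from the real source data (such as the missing amount) can still be exercised.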
Approaches
Use the Guzzle recon framework, which has some similar capability -
The other option is to pass a source query and a target query and do those 5 counts. To avoid going through the recon framework, we came up with a utility during the SP project, along with a finalized approach for this kind of check that accounts for duplicate records etc. in the source.
Use a notebook to perform these checks.
Write an ingestion job in Guzzle which has this recon rule built in and writes the results to a target table that we can report on. We can also use template-based ETL in the processing module.
Use the Cucumber framework.
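A count-based source/target check of the kind described in the second option could be sketched as follows. This is not the utility from the SP project, and this page does not list which five counts are meant. The five counts below (source rows, target rows, matched, source-only, target-only) are an assumed, commonly used set, and `recon_counts` is a hypothetical name. Rows are compared as multisets, so duplicate source records are counted rather than silently collapsed.

```python
from collections import Counter


def recon_counts(source_rows, target_rows):
    """Compare the results of a source query and a target query.

    Each row must be hashable (for example a tuple of column values).
    The returned counts are an assumed set, not a documented standard.
    """
    src = Counter(source_rows)
    tgt = Counter(target_rows)
    return {
        "source_count": sum(src.values()),
        "target_count": sum(tgt.values()),
        # Multiset intersection: a row duplicated in the source only matches
        # as many times as it also appears in the target.
        "matched": sum((src & tgt).values()),
        # Multiset difference keeps the surplus copies on each side.
        "source_only": sum((src - tgt).values()),
        "target_only": sum((tgt - src).values()),
    }


if __name__ == "__main__":
    # Made-up rows standing in for query results; (2, "b") is duplicated in the source.
    source = [(1, "a"), (2, "b"), (2, "b"), (3, "c")]
    target = [(1, "a"), (2, "b"), (4, "d")]
    print(recon_counts(source, target))
```

In this sketch `matched + source_only` always equals `source_count`, and `matched + target_only` always equals `target_count`, which gives a quick sanity check on the recon itself. The same function could be called from a notebook with rows fetched from the two queries.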
Interim conclusion
I foresee that we will need a combination of recon and unit testing, like Cucumber. But I agree that we need something practical: a standard which can remain the standard for the foreseeable future and does not become overkill.