Dataflow from staging to presentation - Ayush810/NYC-TAXI-Yellow GitHub Wiki
Now we have to bring data from staging to presentation so to that we will use dataflows to get the data from staging schema and we will perform transform operations on the data and then drop it to presentation schema DBO
but before this we need to create a table in the presentation dbo schema
use this code to create table
CREATE TABLE dbo.nyctaxi_yellow
(
vendor varchar(50),
tpep_pickup_datetime date,
tpep_dropoff_datetime date,
pu_borough varchar(100),
pu_zone varchar(100),
do_borough varchar(100),
do_zone varchar(100),
payment_method varchar(50),
passenger_count int,
trip_distance FLOAT,
total_amount FLOAT
);
Now create the dataflow
once you are done with the data transformation part be on the latest step of your transformation - > go to HOME -> Add data destination select your -> project warehouse -> select your shcema -> select table -> select Append data now you data has been landed into presentation layer wait for the dataflow refresh once it is completed the data will start reflecting in presentaion layer tables
Now we have to refresh this dataflow on demand so lets create a pipeline for that
create pipeline name = pl_pres_processing_nyctaxi
add a dataflow activity in pipeline
rename it = process to presentation got to settings -> select your dataflow name
now i want to add the metadaata and rows processed by my dataflow inside the metadata _processing_log table
so to do that i have to create a stored procedure
CREATE PROCEDURE metadata.insert_presentation_metadata
@pipeline_run_id VARCHAR(255),
@table_name VARCHAR(255),
@processed_date DATETIME2
AS
INSERT INTO metadata.processing_log (pipeline_run_id, table_processed, rows_processed, latest_processed_pickup, processed_datetime)
SELECT
@pipeline_run_id AS pipeline_id,
@table_name AS table_processed,
COUNT(*) AS rows_processed,
MAX(tpep_pickup_datetime) AS latest_processed_pickup,
@processed_date AS processed_datetime
FROM dbo.nyctaxi_yellow;
Now add a stored procedure activity to the pipeline
Apply the follwoing properties as shown in below image
Save + run (pipeline)