How to Create a Link and Satellite Quickly and Efficiently - OptimalBI/optimal-data-engine-mssql GitHub Wiki

The following is a sample script, which will do all the necessary work to create a Link and Satellite combination with minimal effort and fiddling with Configuration.

In addition, the script can be set to add a Satellite to an existing Link.

Modify this script as required to suit the standards of your installation.

The script requires a Stage Table to exist, which it uses to obtain the necessary meta data for creating all Configuration details.

The Code can be found at GitHub. It is a part of the ODE Admin project which is supposed to be installed as an extra database on the same server where ODE Config is located. If you don't want to install the project, replace $(ConfigDatabase) in code with the ODE Config database name before execution.

It expects a Schedule and a suitable Source System to exist in your Configuration (there are example snippets for creating these in the script). This script will add source table to the schedule. If the source table has dependencies on other Data Vault tables to be executed first, hierarchy should be added manually.

Further, it will create a Release for the Satellite (and if there is one, the Hub), to simplify releasing the newly created Configuration to Test / Production.

Script outputs every step of the execution. It includes link and satellite keys, list of columns, schedule record etc. These may be required in case of further modifications to configured objects. Otherwise most of the output could be ignored.

Note that the script will create all the necessary Configuration, but it doesn’t create the Link and Satellite table(s) in the Vault.

The last part of the script produces a set of useful scripts which can be used to:

  • Create the Tables in the Data Vault
  • Populate the tables by manual one-off load
  • Select from the tables to check that they have been created and populated as expected.

Code example:

USE [ODE_Admin]
GO

DECLARE @Hub_key_list [dbo].[dv_link_detail_list]
INSERT INTO @Hub_key_list VALUES
('Customer', 'Customer', 'CustomerID', 'CustomerID')
,('Order', 'Order', 'OrderID', 'SalesOrderID')

EXECUTE [dbo].[ODE_link_sat_config] 
   @SatelliteOnly = 'N'
  ,@SprintDate = 20180329
  ,@ReleaseReference = 'US-304'
  ,@ReleaseSource = 'Jira'
  ,@StageDatabase = 'ODE_Stage'
  ,@StageSchema = 'Stage'
  ,@StageTable = 'Link_Customer_Orders_Product'
  ,@StageSourceType = 'BespokeProc'
  ,@StageLoadType = 'Full'
  ,@StagePassLoadTypeToProc = 0
  ,@SourceDataFilter = NULL
  ,@LinkName = 'Customer_Orders_Product'
  ,@SatelliteName = 'Link_Customer_Makes_Order'
  ,@VaultName = 'ODE_Vault'
  ,@FullScheduleName = 'Full_load'
  ,@IncrementScheduleName = NULL
  ,@Hub_key_list = @Hub_key_list
  ,@SourceSystemName = 'AdvWorks'
  ,@SouceTableSchema = 'Sales'
  ,@SourceTableName = 'SalesOrderHeader'
GO

Provide the following parameters to the procedure execution:

  • SatelliteOnly - only acceptable values are 'Y' and 'N'. When set to "N", the script will create a Link and Satellite combination. "Y" will cause the script to create a Satellite and hook it up to the specified Link.
  • SprintDate - The Start Date of the current Sprint in Integer yyyymmdd. Script will create a new release sequence number based on that. In case if this release numbering system doesn't work for you, provide any other integer number.
  • ReleaseReference - Put the User Story and/or Task numbers for the Satellite which you are building.
  • ReleaseSource - system the release reference number refers to e.g. Jira, Rally
  • StageDatabase - the name of the Stage Database which holds the table on which the Load will be modeled. This Stage table needs to exist already.
  • StageSchema - the schema name of the Stage table
  • StageTable - the table name. Note that in case of Bespoke proc, its name should match the stage table name with the "usp_" prefix.
  • StageSourceType - the type of the process that loads the data into the stage table. The only acceptable values are: 'BespokeProc' - in case if the stage table is populated by the stored procedure that ODE needs to execute to populate the stage table; 'ExternalStage' - in case if some other ETL tool populates the stage table and ODE can rely on it (also you can use it in case if you use the stage view).
  • StageLoadType - the type of data load. The only acceptable values are: 'Full' - in case if the satellite will always be populated by a full load process; 'Delta' - in case if data will be loaded in portions, e.g. log type data (note that this type of load doesn't support deletions).
  • StagePassLoadTypeToProc - whether the Stage stored procedure takes the load type parameter. 0 = Don't Pass it on, 1 = Pass the parameter value "Delta" or "Full" to the stage Procedure. You can implement the different code behaviour depending on the type of load. E.g. you can use it if you implement Delta type of load. Only applicable to BespokeProc source type.
  • SourceDataFilter - provide a SQL statement of the conditions that need to be applied to the source dataset. This code will be added to the WHERE statement when SSIS package is generated. Only applicable to SSISPackage source type. In case of stored procedure this field could be used for the documentation purpose.
  • LinkName - Provide a link name you are creating or existing link name in case if you are adding a satellite to existing link. Note that there's no need to add prefix "l_".
  • SatelliteName - name of the satellite. Note that there's no need to add prefix "s_"
  • VaultName - the name of the vault where the Link and Satellite will be created.
  • FullScheduleName - put a Full schedule name, e.g. 'Full_Load' This schedule needs to exist prior to running this script. Otherwise leave NULL if job doesn't require to be scheduled.
  • IncrementScheduleName - put an Incremental schedule name, e.g. 'Increment_Load' This schedule needs to exist prior to running this script. Otherwise leave NULL if job doesn't require to be scheduled.
  • Hub_key_list - the list of fields for mapping between existing hub columns and link columns. This is a table of mapping parameters, you should provide mapping for every hub column in case of multi-part business key. Format of this custom table is:
    • a hub key as it will be shown at link, e.g. your actual hub could be "Party", but you can name this key as "Customer" in a link. This is mostly required for Same-As links. We recommend to name hub keys as they are in hubs for the Vault consistency.
    • a hub actual name
    • hub column name - as mentioned above, in case of multi-part business key there should be a record for each of the parts
    • a column name as it is in source table Here is an example of code:
      INSERT INTO @Hub_key_list VALUES 
      ('Customer', 'Party', 'PartyID', 'CUST_ID')
      , ('Customer', 'Party', 'Party_KeyType', 'CUST_TPE')
  • SourceSystemName - The name of the source system as at the table dv_source_system. If the source system is new for this ensemble, it should be created manually first.
  • SouceTableSchema - The schema of the source table. In case of SSIS package source, this should be a schema of the source CDC function.
  • SourceTableName - The source table name as it is in the stage database.

You may also want to check articles about general theory on link implementation in ODE, how to implement Same-As or hierarchical link and Why do Queries, used to load Links, often use DISTINCT?

⚠️ **GitHub.com Fallback** ⚠️