General: DeFi: Dex Swaps - FlipsideCrypto/fsc-evm GitHub Wiki
Building and Deploying DEX Swaps Models
Decentralized Exchanges or DEXs, are permissionless marketplaces where tokens can be bought or sold, without a centralized intermediary. For EVMs, activity occurs on Liquidity Pools, which are smart contracts comprised of token pairs that enable automated market making and token swaps. This guide will detail how to 1) Research DEX protocols and 2) Build the Silver/Gold DBT model layers for DEX Swaps.
Research Phase
Please reference the Research Phase section in How to Research and Curate EVM Protocols for official documentation on the research process. The following information will be relevant to DEX Swaps.
Summary:
- In order to properly build a Swap model for a DEX protocol, you'll need to identify the Liquidity Pools for the protocol. Please see the DEX Liquidity Pools guide for more information on researching and building those models.
Best Practices, Tips & Tricks:
- Swap events generally occur directly on the liquidity pool address/contract.
- In cases where a swap does not occur on a liquidity pool, we may be able to parse swap events via the router contract for the protocol.
- The majority of DEX Swap protocols are based on Uniswap deployments and share the same contract structure, emitted events and event topics. For this reason, we may use other DEX Swaps models as a starting point for researching new protocols.
- Depending on the protocol, the token/amount in/out may be labeled within the emitted events as
0or1. While0typically indicatesinand1out, there may be other logic that is required to determine the proper swap flow. Please see ethereum.silver_dex.fraxswap_swaps for an example (e.g. DEX protocols based on theuniswap v2deployment).- While swap events are similarly structured between the majority of decentralized exchanges, certain protocols may refer to token and amount in/out differently (e.g. token0 vs tokenA vs fromToken).
- In rare cases, certain protocols (such as Curve) may not have the required values emitted in events. You may need to reference transfers tables or traces to successfully parse all relevant columns that make up a swap.
Silver Layer Modeling
This guide outlines the process of creating the Silver Layer DBT models for tracking DEX Swap events or traces on EVM-compatible blockchains, along with the applicable DBT tests for maintaining data integrity. This includes both Protocol Specific and Curated Complete models.
Please reference the Silver Layer Modeling section in How to Research and Curate EVM Protocols for more details on the modeling process. The following information is related directly to DEX Swaps only.
1. Protocol Specific Models
Summary:
Built with silver.logs and/or silver.traces, the Silver models are designed to capture essential information about each and every token swap that occurs on the DEXs liquidity pools, including token and amount in/out, sender address, recipient address, and other relevant values. The token_in, token_out, amount_in, and amount_out are the most important columns to extract, as these will be crucial for analyzing each swap.
Best Practices, Tips & Tricks:
- The DEX models are some of the first curated models deployed by Flipside, and therefore may have non-standardized structures or naming conventions. Over time, these models may be adapted to new standards.
- Some protocols may have different deployed versions or topic_0 / events for swaps on same or different contract addresses. If the structure of the event is different, we typically deploy multiple models for each version (e.g.
silver_dex.kyberswap_v1_dynamic_swaps,silver_dex.kyberswap_v1_static_swapsandsilver_dex.kyberswap_v2_elastic_swaps.
Implementation Steps, Variables & Notes:
- Set up the model configuration
- Use
incrementalmaterialization with adelete+insertstrategy. - Define a unique key to identify each record that the incremental strategy should apply to. The unique key defined in the config will be different than the unique column / unique row in the table. This is because the model must delete+insert on a broader key to properly handle block reorg scenarios. Please see the Block Reorg guide for more information. We typically use
block_numberin swap models because it allows us to reload blocks in their entirety, rather than reloading by transaction or event and potentially missing, loading or deleting incorrect records. - Cluster by columns with low cardinality, such as
block_timestamp::DATE, to improve query performance. - Add relevant tags for model categorization (typically tagged as
curatedandreorg) so that the model may be picked up by the relevant scheduled workflows.
- Use
{{ config(
materialized = 'incremental',
incremental_strategy = 'delete+insert',
unique_key = "block_number",
cluster_by = ['block_timestamp::DATE'],
tags = ['curated','reorg']
) }}
- Create a CTE for liquidity pools
- Each swap typically occurs on a liquidity pool contract, so we pull these through without incremental logic to inner join in subsequent CTEs.
- While creating a separate CTE for pools is technically unnecessary, having it stand alone generally improves model readability.
WITH pools AS (
SELECT
pool_address,
token0,
token1
FROM
{{ ref('silver_dex__fraxswap_pools') }}
),
-
Create a CTE for Swap events
- Manually parse event data to extract relevant information (token and amount in/out, sender address, recipient address, etc.).
- For incremental runs, filter data based on the latest
_inserted_timestamp, with a12 hourlookback.
- For incremental runs, filter data based on the latest
- Manually parse event data to extract relevant information (token and amount in/out, sender address, recipient address, etc.).
-
Add the final SELECT statement
- Select all relevant fields from the Swaps events CTE.
- If necessary, add
WHERE token_in <> token_outto handle potential situations where a token is swapped for the same token, which may incorrectly lead to an overstatement of voluming or inconsistent reporting of true swap activity. - Adjust the token and amount 0/A/From or 1/B/To as needed to get the final token and amount in/out column values.
- Manually add
Swapas event_name and the protocol name as theplatform, where necessary.
-
Create the
.ymlfile and add table/column level tests- At minimum tests should include, but are not limited to, the following;
- Uniqueness on 1+ columns (typically
_log_idbut may vary by model) not_nulltests ontoken_in,token_out,senderandtx_to
- Uniqueness on 1+ columns (typically
- At minimum tests should include, but are not limited to, the following;
version: 2
models:
- name: silver_dex__fraxswap_swaps
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- _LOG_ID
columns:
- name: BLOCK_NUMBER
...
Examples, References & Sources:
2. Complete Models
Summary:
The complete_dex_swaps model aggregates all protocol specific DEX swaps models together into one, allowing centralized joins on silver.contracts and ez_prices_hourly for token metadata, hourly price data, streamlined logic and standardized columns. This includes platform names, versions, USD amounts, symbols, decimals and other relevant columns for all swap events that occur on DEX liquidity pools.
Best Practices, Tips & Tricks:
- Protocol Versions: If there are multiple versions of a protocol, coming from separate models, each should have it's own CTE.
- The name for the
platformcolumn should include both theprotocolandversionseparated by a-(dash). You do not need to include,v1, but instead begin incrementing atv2and beyond. - The
versioncolumn should represent a version specific to the CTE/model being referenced. For example, if there are multiple CTEs/models required for Pancakeswap v2, you would addpancakeswap-v2as theplatformandv2-ammas theversionin the first CTE, thenpancakeswap-v2as theplatformandv2-mmas theversionin the next CTE. This is required because thedelete+insertincremental strategy uses bothplatformandversionto identify new incremental records.
- The name for the
- When adding new Protocols/CTEs/models to the
completemodel, please pay attention to the number of columns being pulled through, the order of those columns and the data type/format and names of those columns, as this will materially impact the output of the model.- More specifically, pay attention to how
pool_nameis handled in thecomplete_dex_liquidity_poolsmodel that gets joined in here or ifpool_nameis NULL/missing entirely. Unique logic is typically required for DEXs that are not built off liquidity pool models and instead use events emitted through swap router contracts.
- More specifically, pay attention to how
Implementation Steps, Variables & Notes:
- Set up the model configuration
- Use
incrementalmaterialization with adelete+insertstrategy. - Define a unique key to identify each record that the incremental strategy should apply to. The unique key defined in the config will be different than the unique column / unique row in the table. This is because the model must delete+insert on a broader key to properly handle block reorg scenarios. Please see the Block Reorg guide for more information. We typically use ['block_number','platform','version'] in the complete swaps model because it allows us to reload blocks, platforms and versions in their entirety, rather than reloading by transaction or event and potentially missing, loading or deleting incorrect records.
- Cluster by columns with low cardinality, such as
block_timestamp::DATEandplatform, to improve query performance. - Add Search Optimization (SO) in the post_hook with
EQUALITYandSUBSTRING. Please only add SO on non-number columns, where applicable. - Add relevant tags for model categorization (typically tagged as
['curated','reorg','heal']) so that the model may be picked up by the relevant scheduled workflows.
- Use
{{ config(
materialized = 'incremental',
incremental_strategy = 'delete+insert',
unique_key = ['block_number','platform','version'],
cluster_by = ['block_timestamp::DATE','platform'],
post_hook = "ALTER TABLE {{ this }} ADD SEARCH OPTIMIZATION ON EQUALITY(tx_hash, origin_function_signature, origin_from_address, origin_to_address, contract_address, pool_name, event_name, sender, tx_to, token_in, token_out, symbol_in, symbol_out), SUBSTRING(origin_function_signature, pool_name, event_name, sender, tx_to, token_in, token_out, symbol_in, symbol_out)",
tags = ['curated','reorg','heal']
) }}
- Create CTEs for each Protocol Specific model/version, where applicable.
- Select all columns from the source table
- Use standardized column names, as each of these CTEs will be unioned together towards the end of the model.
- Add columns with
NULLvalues in cases where the column exists in other Protocol Specific CTEs.- Columns for tokens and amounts may require additional minor transformations, specifically in the case of
uniswap v3type DEXs.
- Columns for tokens and amounts may require additional minor transformations, specifically in the case of
- Add columns with
- Hardcode the
platformandversion, if not done already. - For incremental runs, filter data based on the latest
_inserted_timestamp, with theLOOKBACKvariable, defaulted to4 hours.- The name used to reference `var('HEAL_MODELS') should match the name of the CTE.
...
verse AS (
SELECT
block_number,
block_timestamp,
tx_hash,
origin_function_signature,
origin_from_address,
origin_to_address,
contract_address,
event_name,
amount_in_unadj,
amount_out_unadj,
token_in,
token_out,
sender,
tx_to,
event_index,
platform,
'v1' AS version,
_log_id,
_inserted_timestamp
FROM
{{ ref('silver_dex__verse_swaps') }}
{% if is_incremental() and 'verse' not in var('HEAL_MODELS') %}
WHERE
_inserted_timestamp >= (
SELECT
MAX(_inserted_timestamp) - INTERVAL '{{ var("LOOKBACK", "4 hours") }}'
FROM
{{ this }}
)
{% endif %}
),
univ3 AS (
SELECT
block_number,
block_timestamp,
tx_hash,
origin_function_signature,
origin_from_address,
origin_to_address,
pool_address AS contract_address,
'Swap' AS event_name,
CASE
WHEN amount0_unadj > 0 THEN ABS(amount0_unadj)
ELSE ABS(amount1_unadj)
END AS amount_in_unadj,
CASE
WHEN amount0_unadj < 0 THEN ABS(amount0_unadj)
ELSE ABS(amount1_unadj)
END AS amount_out_unadj,
CASE
WHEN amount0_unadj > 0 THEN token0_address
ELSE token1_address
END AS token_in,
CASE
WHEN amount0_unadj < 0 THEN token0_address
ELSE token1_address
END AS token_out,
sender,
recipient AS tx_to,
event_index,
'uniswap-v3' AS platform,
'v3' AS version,
_log_id,
_inserted_timestamp
FROM
{{ ref('silver__univ3_swaps') }}
{% if is_incremental() and 'univ3' not in var('HEAL_MODELS') %}
WHERE
_inserted_timestamp >= (
SELECT
MAX(_inserted_timestamp) - INTERVAL '{{ var("LOOKBACK", "4 hours") }}'
FROM
{{ this }}
)
{% endif %}
),
...
- Union all Protocol Specific CTEs together with
SELECT *andUNION ALL
...
all_dex AS (
...
UNION ALL
SELECT
*
FROM
dodo_v1
UNION ALL
SELECT
*
FROM
dodo_v2
UNION ALL
SELECT
*
FROM
fraxswap
UNION ALL
...
-
Create the
completeCTE based on the unioned data, and apply transformation logic.- Apply any and all transformation logic required to properly produce the values needed for the different protocols, using as much standardized logic as possible.
- This is where you will join
silver.contractsandprice.ez_prices_hourlyto include applicable USD swap pricing and token metadata values.
-
Add the
heal_modelCTE with theHEAL_MODELvar. For for information, please see the Incremental Heal Logic guide.- For DEX Swaps, we typically need to ensure that late-arriving decimals and USD amounts are properly handled.
-
Create the final SELECT statement.
- Add the
idcolumn usingdbt_utils.generate_surrogate_keyand include the column that represents one unique row. - Add
inserted_timestampandmodified_timestampusingSYSDATE()to represent the timestamp where new rows are inserted or modified in the model. - Add the
_invocation_idcolumn, which outputs a UUID generated for each newly inserted row. For more information, please see DBT Docs: invocation_id.
- Add the
{{ dbt_utils.generate_surrogate_key(
['tx_hash','event_index']
) }} AS complete_dex_swaps_id,
SYSDATE() AS inserted_timestamp,
SYSDATE() AS modified_timestamp,
'{{ invocation_id }}' AS _invocation_id
- Create the
.ymlfile and add table/column level tests- At minimum tests should include, but are not limited to, the following;
- Uniqueness on 1+ columns (typically
_log_idbut may vary by model) not_nulltests ontoken_in,token_out,senderandtx_to
- Uniqueness on 1+ columns (typically
- At minimum tests should include, but are not limited to, the following;
Examples, References & Sources:
Gold Layer Modeling
Please reference the Gold Layer Modeling section in How to Research and Curate EVM Protocols for more details on the modeling process. The following information is related directly to DEX Swaps only.
Summary:
This guide outlines the process of creating the Gold Layer DBT models for tracking DEX Swaps events, transfers or traces on EVM-compatible blockchains, along with the applicable table and column descriptions for transparency and documentation. The Gold model will select directly from the complete model, and will be surfaced to all users and clients.
Implementation Steps, Variables & Notes:
- Set up the model configuration
- Use
viewmaterialization. - Add persist_docs, with
relationandcolumnsset to `true. - Add
database_tagsto themetasection in the config. These should include the names of everyPROTOCOL(platform) in the model, along with one word values that define thePURPOSE(category) of the model.
- Use
{{ config(
materialized = 'view',
persist_docs ={ "relation": true,
"columns": true },
meta ={ 'database_tags':{ 'table':{ 'PROTOCOL': 'SUSHI, UNISWAP, CURVE, SYNTHETIX, BALANCER, DODO, FRAX, HASHFLOW, KYBERSWAP, MAVERICK, PANCAKESWAP, SHIBASWAP, TRADER JOE, VERSE',
'PURPOSE': 'DEX, SWAPS' } } }
) }}
-
SELECT all relevant columns to pull through from
complete_dex_swapsand surface to users in the Gold layer.- Adjust the column names so that they are intuitive for a user analyzing what the table represents, which in this case, is an EZ view on DEX Swap events and related USD values.
- Apply custom logic to NULL USD amounts in cases where the USD value on one side of the swap is NULL (e.g. missing prices) or the variance between the USD swap values is greater than 75% (e.g. low-liquidity swap). This does not apply to swaps on the wrapped version of the chain's NATIVE token.
- We do this to manage low liquidity swaps and tokens where prices may be extremely volatile, so that we don’t overstate volumes
- We exclude native tokens (WETH, WBNB, WAVAX etc.) from being NULL in cases where one of the swap doesn’t have a price, as the Wrapped Native token has more reliable pricing.
- Apply the
ROUNDfunction, and round to the nearest 2 decimals to emulate USD denominations.
-
Create the
.ymlfile and add table/column level descriptions.