General: DeFi: Dex Swaps - FlipsideCrypto/fsc-evm GitHub Wiki
Building and Deploying DEX Swaps Models
Decentralized Exchanges or DEXs, are permissionless marketplaces where tokens can be bought or sold, without a centralized intermediary. For EVMs, activity occurs on Liquidity Pools, which are smart contracts comprised of token pairs that enable automated market making and token swaps. This guide will detail how to 1) Research DEX protocols and 2) Build the Silver/Gold DBT model layers for DEX Swaps.
Research Phase
Please reference the Research Phase
section in How to Research and Curate EVM Protocols for official documentation on the research process. The following information will be relevant to DEX Swaps.
Summary:
- In order to properly build a Swap model for a DEX protocol, you'll need to identify the Liquidity Pools for the protocol. Please see the DEX Liquidity Pools guide for more information on researching and building those models.
Best Practices, Tips & Tricks:
- Swap events generally occur directly on the liquidity pool address/contract.
- In cases where a swap does not occur on a liquidity pool, we may be able to parse swap events via the router contract for the protocol.
- The majority of DEX Swap protocols are based on Uniswap deployments and share the same contract structure, emitted events and event topics. For this reason, we may use other DEX Swaps models as a starting point for researching new protocols.
- Depending on the protocol, the token/amount in/out may be labeled within the emitted events as
0
or1
. While0
typically indicatesin
and1
out
, there may be other logic that is required to determine the proper swap flow. Please see ethereum.silver_dex.fraxswap_swaps for an example (e.g. DEX protocols based on theuniswap v2
deployment).- While swap events are similarly structured between the majority of decentralized exchanges, certain protocols may refer to token and amount in/out differently (e.g. token0 vs tokenA vs fromToken).
- In rare cases, certain protocols (such as Curve) may not have the required values emitted in events. You may need to reference transfers tables or traces to successfully parse all relevant columns that make up a swap.
Silver Layer Modeling
This guide outlines the process of creating the Silver Layer DBT models for tracking DEX Swap events or traces on EVM-compatible blockchains, along with the applicable DBT tests for maintaining data integrity. This includes both Protocol Specific and Curated Complete models.
Please reference the Silver Layer Modeling
section in How to Research and Curate EVM Protocols for more details on the modeling process. The following information is related directly to DEX Swaps only.
1. Protocol Specific Models
Summary:
Built with silver.logs
and/or silver.traces
, the Silver models are designed to capture essential information about each and every token swap that occurs on the DEXs liquidity pools, including token and amount in/out, sender address, recipient address, and other relevant values. The token_in
, token_out
, amount_in
, and amount_out
are the most important columns to extract, as these will be crucial for analyzing each swap.
Best Practices, Tips & Tricks:
- The DEX models are some of the first curated models deployed by Flipside, and therefore may have non-standardized structures or naming conventions. Over time, these models may be adapted to new standards.
- Some protocols may have different deployed versions or topic_0 / events for swaps on same or different contract addresses. If the structure of the event is different, we typically deploy multiple models for each version (e.g.
silver_dex.kyberswap_v1_dynamic_swaps
,silver_dex.kyberswap_v1_static_swaps
andsilver_dex.kyberswap_v2_elastic_swaps
.
Implementation Steps, Variables & Notes:
- Set up the model configuration
- Use
incremental
materialization with adelete+insert
strategy. - Define a unique key to identify each record that the incremental strategy should apply to. The unique key defined in the config will be different than the unique column / unique row in the table. This is because the model must delete+insert on a broader key to properly handle block reorg scenarios. Please see the Block Reorg guide for more information. We typically use
block_number
in swap models because it allows us to reload blocks in their entirety, rather than reloading by transaction or event and potentially missing, loading or deleting incorrect records. - Cluster by columns with low cardinality, such as
block_timestamp::DATE
, to improve query performance. - Add relevant tags for model categorization (typically tagged as
curated
andreorg
) so that the model may be picked up by the relevant scheduled workflows.
- Use
{{ config(
materialized = 'incremental',
incremental_strategy = 'delete+insert',
unique_key = "block_number",
cluster_by = ['block_timestamp::DATE'],
tags = ['curated','reorg']
) }}
- Create a CTE for liquidity pools
- Each swap typically occurs on a liquidity pool contract, so we pull these through without incremental logic to inner join in subsequent CTEs.
- While creating a separate CTE for pools is technically unnecessary, having it stand alone generally improves model readability.
WITH pools AS (
SELECT
pool_address,
token0,
token1
FROM
{{ ref('silver_dex__fraxswap_pools') }}
),
-
Create a CTE for Swap events
- Manually parse event data to extract relevant information (token and amount in/out, sender address, recipient address, etc.).
- For incremental runs, filter data based on the latest
_inserted_timestamp
, with a12 hour
lookback.
- For incremental runs, filter data based on the latest
- Manually parse event data to extract relevant information (token and amount in/out, sender address, recipient address, etc.).
-
Add the final SELECT statement
- Select all relevant fields from the Swaps events CTE.
- If necessary, add
WHERE token_in <> token_out
to handle potential situations where a token is swapped for the same token, which may incorrectly lead to an overstatement of voluming or inconsistent reporting of true swap activity. - Adjust the token and amount 0/A/From or 1/B/To as needed to get the final token and amount in/out column values.
- Manually add
Swap
as event_name and the protocol name as theplatform
, where necessary.
-
Create the
.yml
file and add table/column level tests- At minimum tests should include, but are not limited to, the following;
- Uniqueness on 1+ columns (typically
_log_id
but may vary by model) not_null
tests ontoken_in
,token_out
,sender
andtx_to
- Uniqueness on 1+ columns (typically
- At minimum tests should include, but are not limited to, the following;
version: 2
models:
- name: silver_dex__fraxswap_swaps
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- _LOG_ID
columns:
- name: BLOCK_NUMBER
...
Examples, References & Sources:
2. Complete Models
Summary:
The complete_dex_swaps
model aggregates all protocol specific DEX swaps models together into one, allowing centralized joins on silver.contracts
and ez_prices_hourly
for token metadata, hourly price data, streamlined logic and standardized columns. This includes platform names, versions, USD amounts, symbols, decimals and other relevant columns for all swap events that occur on DEX liquidity pools.
Best Practices, Tips & Tricks:
- Protocol Versions: If there are multiple versions of a protocol, coming from separate models, each should have it's own CTE.
- The name for the
platform
column should include both theprotocol
andversion
separated by a-
(dash). You do not need to include,v1
, but instead begin incrementing atv2
and beyond. - The
version
column should represent a version specific to the CTE/model being referenced. For example, if there are multiple CTEs/models required for Pancakeswap v2, you would addpancakeswap-v2
as theplatform
andv2-amm
as theversion
in the first CTE, thenpancakeswap-v2
as theplatform
andv2-mm
as theversion
in the next CTE. This is required because thedelete+insert
incremental strategy uses bothplatform
andversion
to identify new incremental records.
- The name for the
- When adding new Protocols/CTEs/models to the
complete
model, please pay attention to the number of columns being pulled through, the order of those columns and the data type/format and names of those columns, as this will materially impact the output of the model.- More specifically, pay attention to how
pool_name
is handled in thecomplete_dex_liquidity_pools
model that gets joined in here or ifpool_name
is NULL/missing entirely. Unique logic is typically required for DEXs that are not built off liquidity pool models and instead use events emitted through swap router contracts.
- More specifically, pay attention to how
Implementation Steps, Variables & Notes:
- Set up the model configuration
- Use
incremental
materialization with adelete+insert
strategy. - Define a unique key to identify each record that the incremental strategy should apply to. The unique key defined in the config will be different than the unique column / unique row in the table. This is because the model must delete+insert on a broader key to properly handle block reorg scenarios. Please see the Block Reorg guide for more information. We typically use ['block_number','platform','version'] in the complete swaps model because it allows us to reload blocks, platforms and versions in their entirety, rather than reloading by transaction or event and potentially missing, loading or deleting incorrect records.
- Cluster by columns with low cardinality, such as
block_timestamp::DATE
andplatform
, to improve query performance. - Add Search Optimization (SO) in the post_hook with
EQUALITY
andSUBSTRING
. Please only add SO on non-number columns, where applicable. - Add relevant tags for model categorization (typically tagged as
['curated','reorg','heal']
) so that the model may be picked up by the relevant scheduled workflows.
- Use
{{ config(
materialized = 'incremental',
incremental_strategy = 'delete+insert',
unique_key = ['block_number','platform','version'],
cluster_by = ['block_timestamp::DATE','platform'],
post_hook = "ALTER TABLE {{ this }} ADD SEARCH OPTIMIZATION ON EQUALITY(tx_hash, origin_function_signature, origin_from_address, origin_to_address, contract_address, pool_name, event_name, sender, tx_to, token_in, token_out, symbol_in, symbol_out), SUBSTRING(origin_function_signature, pool_name, event_name, sender, tx_to, token_in, token_out, symbol_in, symbol_out)",
tags = ['curated','reorg','heal']
) }}
- Create CTEs for each Protocol Specific model/version, where applicable.
- Select all columns from the source table
- Use standardized column names, as each of these CTEs will be unioned together towards the end of the model.
- Add columns with
NULL
values in cases where the column exists in other Protocol Specific CTEs.- Columns for tokens and amounts may require additional minor transformations, specifically in the case of
uniswap v3
type DEXs.
- Columns for tokens and amounts may require additional minor transformations, specifically in the case of
- Add columns with
- Hardcode the
platform
andversion
, if not done already. - For incremental runs, filter data based on the latest
_inserted_timestamp
, with theLOOKBACK
variable, defaulted to4 hours
.- The name used to reference `var('HEAL_MODELS') should match the name of the CTE.
...
verse AS (
SELECT
block_number,
block_timestamp,
tx_hash,
origin_function_signature,
origin_from_address,
origin_to_address,
contract_address,
event_name,
amount_in_unadj,
amount_out_unadj,
token_in,
token_out,
sender,
tx_to,
event_index,
platform,
'v1' AS version,
_log_id,
_inserted_timestamp
FROM
{{ ref('silver_dex__verse_swaps') }}
{% if is_incremental() and 'verse' not in var('HEAL_MODELS') %}
WHERE
_inserted_timestamp >= (
SELECT
MAX(_inserted_timestamp) - INTERVAL '{{ var("LOOKBACK", "4 hours") }}'
FROM
{{ this }}
)
{% endif %}
),
univ3 AS (
SELECT
block_number,
block_timestamp,
tx_hash,
origin_function_signature,
origin_from_address,
origin_to_address,
pool_address AS contract_address,
'Swap' AS event_name,
CASE
WHEN amount0_unadj > 0 THEN ABS(amount0_unadj)
ELSE ABS(amount1_unadj)
END AS amount_in_unadj,
CASE
WHEN amount0_unadj < 0 THEN ABS(amount0_unadj)
ELSE ABS(amount1_unadj)
END AS amount_out_unadj,
CASE
WHEN amount0_unadj > 0 THEN token0_address
ELSE token1_address
END AS token_in,
CASE
WHEN amount0_unadj < 0 THEN token0_address
ELSE token1_address
END AS token_out,
sender,
recipient AS tx_to,
event_index,
'uniswap-v3' AS platform,
'v3' AS version,
_log_id,
_inserted_timestamp
FROM
{{ ref('silver__univ3_swaps') }}
{% if is_incremental() and 'univ3' not in var('HEAL_MODELS') %}
WHERE
_inserted_timestamp >= (
SELECT
MAX(_inserted_timestamp) - INTERVAL '{{ var("LOOKBACK", "4 hours") }}'
FROM
{{ this }}
)
{% endif %}
),
...
- Union all Protocol Specific CTEs together with
SELECT *
andUNION ALL
...
all_dex AS (
...
UNION ALL
SELECT
*
FROM
dodo_v1
UNION ALL
SELECT
*
FROM
dodo_v2
UNION ALL
SELECT
*
FROM
fraxswap
UNION ALL
...
-
Create the
complete
CTE based on the unioned data, and apply transformation logic.- Apply any and all transformation logic required to properly produce the values needed for the different protocols, using as much standardized logic as possible.
- This is where you will join
silver.contracts
andprice.ez_prices_hourly
to include applicable USD swap pricing and token metadata values.
-
Add the
heal_model
CTE with theHEAL_MODEL
var. For for information, please see the Incremental Heal Logic guide.- For DEX Swaps, we typically need to ensure that late-arriving decimals and USD amounts are properly handled.
-
Create the final SELECT statement.
- Add the
id
column usingdbt_utils.generate_surrogate_key
and include the column that represents one unique row. - Add
inserted_timestamp
andmodified_timestamp
usingSYSDATE()
to represent the timestamp where new rows are inserted or modified in the model. - Add the
_invocation_id
column, which outputs a UUID generated for each newly inserted row. For more information, please see DBT Docs: invocation_id.
- Add the
{{ dbt_utils.generate_surrogate_key(
['tx_hash','event_index']
) }} AS complete_dex_swaps_id,
SYSDATE() AS inserted_timestamp,
SYSDATE() AS modified_timestamp,
'{{ invocation_id }}' AS _invocation_id
- Create the
.yml
file and add table/column level tests- At minimum tests should include, but are not limited to, the following;
- Uniqueness on 1+ columns (typically
_log_id
but may vary by model) not_null
tests ontoken_in
,token_out
,sender
andtx_to
- Uniqueness on 1+ columns (typically
- At minimum tests should include, but are not limited to, the following;
Examples, References & Sources:
Gold Layer Modeling
Please reference the Gold Layer Modeling
section in How to Research and Curate EVM Protocols for more details on the modeling process. The following information is related directly to DEX Swaps only.
Summary:
This guide outlines the process of creating the Gold Layer DBT models for tracking DEX Swaps events, transfers or traces on EVM-compatible blockchains, along with the applicable table and column descriptions for transparency and documentation. The Gold model will select directly from the complete
model, and will be surfaced to all users and clients.
Implementation Steps, Variables & Notes:
- Set up the model configuration
- Use
view
materialization. - Add persist_docs, with
relation
andcolumns
set to `true. - Add
database_tags
to themeta
section in the config. These should include the names of everyPROTOCOL
(platform) in the model, along with one word values that define thePURPOSE
(category) of the model.
- Use
{{ config(
materialized = 'view',
persist_docs ={ "relation": true,
"columns": true },
meta ={ 'database_tags':{ 'table':{ 'PROTOCOL': 'SUSHI, UNISWAP, CURVE, SYNTHETIX, BALANCER, DODO, FRAX, HASHFLOW, KYBERSWAP, MAVERICK, PANCAKESWAP, SHIBASWAP, TRADER JOE, VERSE',
'PURPOSE': 'DEX, SWAPS' } } }
) }}
-
SELECT all relevant columns to pull through from
complete_dex_swaps
and surface to users in the Gold layer.- Adjust the column names so that they are intuitive for a user analyzing what the table represents, which in this case, is an EZ view on DEX Swap events and related USD values.
- Apply custom logic to NULL USD amounts in cases where the USD value on one side of the swap is NULL (e.g. missing prices) or the variance between the USD swap values is greater than 75% (e.g. low-liquidity swap). This does not apply to swaps on the wrapped version of the chain's NATIVE token.
- We do this to manage low liquidity swaps and tokens where prices may be extremely volatile, so that we don’t overstate volumes
- We exclude native tokens (WETH, WBNB, WAVAX etc.) from being NULL in cases where one of the swap doesn’t have a price, as the Wrapped Native token has more reliable pricing.
- Apply the
ROUND
function, and round to the nearest 2 decimals to emulate USD denominations.
-
Create the
.yml
file and add table/column level descriptions.