General: DeFi: DEX Liquidity Pools - FlipsideCrypto/fsc-evm GitHub Wiki
Building and Deploying DEX Liquidity Pool Models
Decentralized Exchanges or DEXs, are permissionless marketplaces where tokens can be bought or sold, without a centralized intermediary. For EVMs, activity occurs on Liquidity Pools, which are smart contracts comprised of token pairs that enable automated market making and token swaps. This guide will detail how to 1) Research DEX protocols and 2) Build the Silver/Gold DBT model layers for Liquidity Pools.
Research Phase
Please reference the Research Phase section in How to Research and Curate EVM Protocols for official documentation on the research process. The following information will be relevant to DEX Liquidity Pools.
Summary:
- In order to properly build a liquidity pool model for a DEX protocol, you need to identify the
factorycontract(s) and thepairCreated(or similar) event topic 0.- Factory Contract: Responsible for deploying new liquidity pool contracts. Typically there is only one factory contract per DEX protocol version. In some protocols, the factory contract may also be responsible for setting and updating trading fees for the pools.
- Pair Created Event: Emitted when a new liquidity pool is generated on the factory contract. This may be emitted after the
createPair()function or similar, is called on thefactorycontract.- Example Pair Created Event Parameters:
- token0: Address of the first token in the pair
- token1: Address of the second token in the pair
- pool: Address of the newly created pair contract (liquidity pool)
- Example Pair Created Event Parameters:
Best Practices, Tips & Tricks:
- Depending on the protocol, some liquidity pools may represent more than two tokens. Examples include Balancer and Curve, please reference these models to see how these are handled differently than a simple liquidity pool model with a pair of tokens.
Silver Layer Modeling
This guide outlines the process of creating the Silver Layer DBT models for tracking DEX liquidity pool creation events or traces on EVM-compatible blockchains, along with the applicable DBT tests for maintaining data integrity. This includes both Protocol Specific and Curated Complete models.
Please reference the Silver Layer Modeling section in How to Research and Curate EVM Protocols for more details on the modeling process. The following information is related directly to Liquidity Pools only.
1. Protocol Specific Models
Summary:
Built with silver.logs and/or silver.traces, the Silver models are designed to capture essential information about newly created liquidity pools, including token pairs, pool addresses, creation block_number, tx_hash, block_timestamp and other relevant values. The pool_address, token0 and token1 are the most important columns to extract, as these will be crucial for building the subsequent DEX Swaps models.
Best Practices, Tips & Tricks:
- The DEX models are some of the first curated models deployed by Flipside, and therefore may have non-standardized structures or naming conventions. Over time, these models may be adapted to new standards.
- In rare cases, models may require contract reads to collect the tokens and pool addresses, if they cannot be parsed onchain through event logs or traces. Please reference the guide on Contract Reads for more details.
- Certain liquidity pool contracts are only deployed via creation traces, rather than creation events. This is common for protocols such as Balancer and Curve.
- These can be queried by filtering for
TYPE ILIKE 'create%'onsilver.traces, where thefrom_addressis the deployer contract andto_addressis the liquidity pool. Note that it's likely there are multiplefrom_address/ deployer contracts to query for.
- These can be queried by filtering for
Implementation Steps, Variables & Notes:
- Set up the model configuration
- Use
incrementalmaterialization with adelete+insertstrategy. - Define a unique key (typically the
pool_address) to identify each record. - Add relevant tags for model categorization (typically tagged as
curated) so that the model may be picked up by the relevant scheduled workflows.
- Use
{{ config(
materialized = 'incremental',
incremental_strategy = 'delete+insert',
unique_key = "pool_address",
tags = ['curated']
) }}
- Create a CTE for pool creation events
- Filter logs for specific factory contract addresses and event topics.
- Manually parse event data to extract relevant information (tokens, pool address, etc.).
- For incremental runs, filter data based on the latest
_inserted_timestamp, with a12 hourlookback.
- For incremental runs, filter data based on the latest
WITH pool_creation AS (
SELECT
block_number,
block_timestamp,
tx_hash,
event_index,
origin_function_signature,
origin_from_address,
origin_to_address,
contract_address AS factory_address,
regexp_substr_all(SUBSTR(DATA, 3, len(DATA)), '.{64}') AS segmented_data,
CONCAT('0x', SUBSTR(segmented_data [0] :: STRING, 25, 40)) AS pool_address,
CONCAT('0x', SUBSTR(topics [1] :: STRING, 27, 40)) AS token0,
CONCAT('0x', SUBSTR(topics [2] :: STRING, 27, 40)) AS token1,
TRY_TO_NUMBER(utils.udf_hex_to_int(
segmented_data [1] :: STRING
)) AS pool_id,
_log_id,
_inserted_timestamp
FROM {{ ref ('silver__logs') }}
WHERE
contract_address IN (
'0xb076b06f669e682609fb4a8c6646d2619717be4b', --v1 factory
'0x43ec799eadd63848443e2347c49f5f52e8fe0f6f' --v2 factory
)
AND topics [0] :: STRING = '0x0d3648bd0f6ba80134a33ba9275ac585d9d315f0ad8355cddefde31afa28d0e9' --pairCreated
AND tx_status = 'SUCCESS'
{% if is_incremental() %}
AND _inserted_timestamp >= (
SELECT
MAX(_inserted_timestamp) - INTERVAL '12 hours'
FROM
{{ this }}
)
{% endif %}
)
- Add the final SELECT statement
- Select all relevant fields from the
pool_creationCTE. - If necessary, add a qualify clause on
pool_addressto handle potential duplicates, keeping the most recent entries based on_inserted_timestamp.
- Select all relevant fields from the
SELECT
block_number,
block_timestamp,
tx_hash,
event_index,
origin_function_signature,
origin_from_address,
origin_to_address,
factory_address,
token0,
token1,
pool_address,
pool_id,
_log_id,
_inserted_timestamp
FROM
pool_creation
qualify(ROW_NUMBER() over(PARTITION BY pool_address
ORDER BY
_inserted_timestamp DESC)) = 1
- Create the
.ymlfile and add table/column level tests- At minimum tests should include, but are not limited to, the following;
- Uniqueness on 1+ columns (typically
pool_addressbut may vary by model) not_nulltests ontoken0,token1,pool_address
- Uniqueness on 1+ columns (typically
- At minimum tests should include, but are not limited to, the following;
version: 2
models:
- name: silver_dex__fraxswap_pools
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- POOL_ADDRESS
columns:
- name: POOL_ADDRESS
tests:
- not_null
- name: TOKEN0
tests:
- not_null
- name: TOKEN1
tests:
- not_null
Examples, References & Sources:
- Example models
- Example using creation event logs: ethereum.silver_dex.fraxswap_pools.sql
- Example
.ymlfile for tests: ethereum.silver_dex.fraxswap_pools.yml - Example using creation traces: ethereum.silver_dex.hashflow_pools
2. Complete Models
Summary:
The complete_dex_liquidity_pools model aggregates all protocol specific liquidity pool models together into one, allowing centralized joins on silver.contracts for token metadata, streamlined logic and standardized columns. This includes platform names, versions, pool names, symbols, decimals, fees (where applicable), and columns for all individual tokens that make up each liquidity pool.
Best Practices, Tips & Tricks:
- Protocol Versions: If there are multiple versions of a protocol, coming from separate models, each should have it's own CTE.
- The name for the
platformcolumn should include both theprotocolandversionseparated by a-(dash). You do not need to include,v1, but instead begin incrementing atv2and beyond. - The
versioncolumn should represent a version specific to the CTE/model being referenced. For example, if there are multiple CTEs/models required for Pancakeswap v2, you would addpancakeswap-v2as theplatformandv2-ammas theversionin the first CTE, thenpancakeswap-v2as theplatformandv2-mmas theversionin the next CTE. This is required because thedelete+insertincremental strategy uses bothplatformandversionto identify new incremental records.
- The name for the
- When adding new Protocols/CTEs/models to the
completemodel, please pay attention to the number of columns being pulled through, the order of those columns and the data type/format and names of those columns, as this will materially impact the output of the model.- More specifically, pay attention to how
pool_nameis handled in thecomplete_lpsCTE. Unique logic is typically required for DEXs built off theuniswap v3forks, for example, as well as the liquidity pool models built using creation traces (rather than creation event logs), such asbalancerandcurve.
- More specifically, pay attention to how
Implementation Steps, Variables & Notes:
- Set up the model configuration
- Use
incrementalmaterialization with adelete+insertstrategy. - Define a unique key to identify each record that the incremental strategy should apply to. The unique key defined in the config will be different than the unique column / unique row in the table. This is because the model must delete+insert on a broader key to properly handle block reorg scenarios. Please see the Block Reorg guide for more information. We typically use ['block_number','platform','version'] in the complete liquidity pools model because it allows us to reload blocks, platforms and versions in their entirety, rather than reloading by transaction or event and potentially missing, loading or deleting incorrect records.
- Cluster by columns with low cardinality, such as
block_timestamp::DATEandplatform, to improve query performance. - Add Search Optimization (SO) in the post_hook with
EQUALITYandSUBSTRING. Please only add SO on non-number columns, where applicable. - Add relevant tags for model categorization (typically tagged as
['curated','reorg','heal']) so that the model may be picked up by the relevant scheduled workflows.
- Use
{{ config(
materialized = 'incremental',
incremental_strategy = 'delete+insert',
unique_key = ['block_number','platform','version'],
cluster_by = ['block_timestamp::DATE','platform'],
post_hook = "ALTER TABLE {{ this }} ADD SEARCH OPTIMIZATION ON EQUALITY(tx_hash, contract_address, pool_address, pool_name, tokens, symbols), SUBSTRING(pool_address, pool_name, tokens, symbols)",
tags = ['curated','reorg','heal']
) }}
-
Create a CTE for contracts
- Pull through the
contract_address,token_symbolandtoken_decimalcolumns fromsilver__contracts. These will be joined into various CTEs in the model to get the token metadata columns.
- Pull through the
-
Create CTEs for each Protocol Specific model/version, where applicable.
- Select all columns from the source table
- Use standardized column names, as each of these CTEs will be unioned together towards the end of the model.
- Add columns with
NULLvalues in cases where the column exists in other Protocol Specific CTEs.- Liquidity Pools may columns such as
fee,tick_spacingand up to eight (8) different tokens paired together.
- Liquidity Pools may columns such as
- Add columns with
- Hardcode the
platformandversion, if not done already. - For incremental runs, filter data based on the latest
_inserted_timestamp, with theLOOKBACKvariable, defaulted to4 hours.- The name used to reference `var('HEAL_MODELS') should match the name of the CTE.
...
balancer AS (
SELECT
block_number,
block_timestamp,
tx_hash,
contract_address,
pool_address,
pool_name,
NULL AS fee,
NULL AS tick_spacing,
token0,
token1,
token2,
token3,
token4,
token5,
token6,
token7,
'balancer' AS platform,
'v1' AS version,
_log_id AS _id,
_inserted_timestamp
FROM
{{ ref('silver_dex__balancer_pools') }}
{% if is_incremental() and 'balancer' not in var('HEAL_MODELS') %}
WHERE
_inserted_timestamp >= (
SELECT
MAX(_inserted_timestamp) - INTERVAL '{{ var("LOOKBACK", "4 hours") }}'
FROM
{{ this }}
)
{% endif %}
),
...
- Union all Protocol Specific CTEs together with
SELECT *andUNION ALL
...
all_pools AS (
SELECT
*
FROM
dodo_v1
UNION ALL
SELECT
*
FROM
dodo_v2
UNION ALL
SELECT
*
FROM
frax
UNION ALL
...
-
Create the
completeCTE based on the unioned data, and apply transformation logic.- Apply any and all transformation logic required to properly produce the values needed for the different protocols, using as much standardized logic as possible.
- This is where you will join the
contractsCTE to include applicable token metadata values.
-
Add the
heal_modelCTE with theHEAL_MODELvar. For for information, please see the Incremental Heal Logic guide.- For Liquidity Pools, we typically need to ensure that late-arriving decimals are properly handled.
-
Create the final SELECT statement.
- Add the
idcolumn usingdbt_utils.generate_surrogate_keyand include the column that represents one unique row. - Add
inserted_timestampandmodified_timestampusingSYSDATE()to represent the timestamp where new rows are inserted or modified in the model. - Add the
_invocation_idcolumn, which outputs a UUID generated for each newly inserted row. For more information, please see DBT Docs: invocation_id.
- Add the
{{ dbt_utils.generate_surrogate_key(
['pool_address']
) }} AS complete_dex_liquidity_pools_id,
SYSDATE() AS inserted_timestamp,
SYSDATE() AS modified_timestamp,
'{{ invocation_id }}' AS _invocation_id
- Create the
.ymlfile and add table/column level tests- At minimum tests should include, but are not limited to, the following;
- Uniqueness on 1+ columns (typically
_idbut may vary by model) not_nulltests oncontract_address,pool_addressandplatform
- Uniqueness on 1+ columns (typically
- At minimum tests should include, but are not limited to, the following;
Examples, References & Sources:
- ethereum.silver_dex.complete_dex_liquidity_pools.sql
- ethereum.silver_dex.complete_dex_liquidity_pools.yml
Gold Layer Modeling
Please reference the Gold Layer Modeling section in How to Research and Curate EVM Protocols for more details on the modeling process. The following information is related directly to Liquidity Pools only.
Summary:
This guide outlines the process of creating the Gold Layer DBT models for tracking DEX liquidity pool creation events or traces on EVM-compatible blockchains, along with the applicable table and column descriptions for transparency and documentation. The Gold model will select directly from the complete model, and will be surfaced to all users and clients.
Implementation Steps, Variables & Notes:
- Set up the model configuration
- Use
viewmaterialization. - Add persist_docs, with
relationandcolumnsset to `true. - Add
database_tagsto themetasection in the config. These should include the names of everyPROTOCOL(platform) in the model, along with one word values that define thePURPOSE(category) of the model.
- Use
{{ config(
materialized = 'view',
persist_docs ={ "relation": true,
"columns": true },
meta={
'database_tags':{
'table': {
'PROTOCOL': 'BALANCER, CURVE, DODO, FRAXSWAP, KYBERSWAP, MAVERICK, PANCAKESWAP, SHIBASWAP, SUSHISWAP, UNISWAP, TRADER JOE, VERSE',
'PURPOSE': 'DEX, LIQUIDITY, POOLS, LP, SWAPS',
}
}
}
) }}
-
SELECT all relevant columns to pull through from
complete_dex_liqudity_poolsand surface to users in the Gold layer.- Adjust the column names so that they are intuitive for a user analyzing what the table represents, which in this case, is a dimensional view on liquidity pool creations.
-
Create the
.ymlfile and add table/column level descriptions.