General: DeFi: DEX Liquidity Pools - FlipsideCrypto/fsc-evm GitHub Wiki
Building and Deploying DEX Liquidity Pool Models
Decentralized Exchanges or DEXs, are permissionless marketplaces where tokens can be bought or sold, without a centralized intermediary. For EVMs, activity occurs on Liquidity Pools, which are smart contracts comprised of token pairs that enable automated market making and token swaps. This guide will detail how to 1) Research DEX protocols and 2) Build the Silver/Gold DBT model layers for Liquidity Pools.
Research Phase
Please reference the Research Phase
section in How to Research and Curate EVM Protocols for official documentation on the research process. The following information will be relevant to DEX Liquidity Pools.
Summary:
- In order to properly build a liquidity pool model for a DEX protocol, you need to identify the
factory
contract(s) and thepairCreated
(or similar) event topic 0.- Factory Contract: Responsible for deploying new liquidity pool contracts. Typically there is only one factory contract per DEX protocol version. In some protocols, the factory contract may also be responsible for setting and updating trading fees for the pools.
- Pair Created Event: Emitted when a new liquidity pool is generated on the factory contract. This may be emitted after the
createPair()
function or similar, is called on thefactory
contract.- Example Pair Created Event Parameters:
- token0: Address of the first token in the pair
- token1: Address of the second token in the pair
- pool: Address of the newly created pair contract (liquidity pool)
- Example Pair Created Event Parameters:
Best Practices, Tips & Tricks:
- Depending on the protocol, some liquidity pools may represent more than two tokens. Examples include Balancer and Curve, please reference these models to see how these are handled differently than a simple liquidity pool model with a pair of tokens.
Silver Layer Modeling
This guide outlines the process of creating the Silver Layer DBT models for tracking DEX liquidity pool creation events or traces on EVM-compatible blockchains, along with the applicable DBT tests for maintaining data integrity. This includes both Protocol Specific and Curated Complete models.
Please reference the Silver Layer Modeling
section in How to Research and Curate EVM Protocols for more details on the modeling process. The following information is related directly to Liquidity Pools only.
1. Protocol Specific Models
Summary:
Built with silver.logs
and/or silver.traces
, the Silver models are designed to capture essential information about newly created liquidity pools, including token pairs, pool addresses, creation block_number
, tx_hash
, block_timestamp
and other relevant values. The pool_address
, token0
and token1
are the most important columns to extract, as these will be crucial for building the subsequent DEX Swaps models.
Best Practices, Tips & Tricks:
- The DEX models are some of the first curated models deployed by Flipside, and therefore may have non-standardized structures or naming conventions. Over time, these models may be adapted to new standards.
- In rare cases, models may require contract reads to collect the tokens and pool addresses, if they cannot be parsed onchain through event logs or traces. Please reference the guide on Contract Reads for more details.
- Certain liquidity pool contracts are only deployed via creation traces, rather than creation events. This is common for protocols such as Balancer and Curve.
- These can be queried by filtering for
TYPE ILIKE 'create%'
onsilver.traces
, where thefrom_address
is the deployer contract andto_address
is the liquidity pool. Note that it's likely there are multiplefrom_address
/ deployer contracts to query for.
- These can be queried by filtering for
Implementation Steps, Variables & Notes:
- Set up the model configuration
- Use
incremental
materialization with adelete+insert
strategy. - Define a unique key (typically the
pool_address
) to identify each record. - Add relevant tags for model categorization (typically tagged as
curated
) so that the model may be picked up by the relevant scheduled workflows.
- Use
{{ config(
materialized = 'incremental',
incremental_strategy = 'delete+insert',
unique_key = "pool_address",
tags = ['curated']
) }}
- Create a CTE for pool creation events
- Filter logs for specific factory contract addresses and event topics.
- Manually parse event data to extract relevant information (tokens, pool address, etc.).
- For incremental runs, filter data based on the latest
_inserted_timestamp
, with a12 hour
lookback.
- For incremental runs, filter data based on the latest
WITH pool_creation AS (
SELECT
block_number,
block_timestamp,
tx_hash,
event_index,
origin_function_signature,
origin_from_address,
origin_to_address,
contract_address AS factory_address,
regexp_substr_all(SUBSTR(DATA, 3, len(DATA)), '.{64}') AS segmented_data,
CONCAT('0x', SUBSTR(segmented_data [0] :: STRING, 25, 40)) AS pool_address,
CONCAT('0x', SUBSTR(topics [1] :: STRING, 27, 40)) AS token0,
CONCAT('0x', SUBSTR(topics [2] :: STRING, 27, 40)) AS token1,
TRY_TO_NUMBER(utils.udf_hex_to_int(
segmented_data [1] :: STRING
)) AS pool_id,
_log_id,
_inserted_timestamp
FROM {{ ref ('silver__logs') }}
WHERE
contract_address IN (
'0xb076b06f669e682609fb4a8c6646d2619717be4b', --v1 factory
'0x43ec799eadd63848443e2347c49f5f52e8fe0f6f' --v2 factory
)
AND topics [0] :: STRING = '0x0d3648bd0f6ba80134a33ba9275ac585d9d315f0ad8355cddefde31afa28d0e9' --pairCreated
AND tx_status = 'SUCCESS'
{% if is_incremental() %}
AND _inserted_timestamp >= (
SELECT
MAX(_inserted_timestamp) - INTERVAL '12 hours'
FROM
{{ this }}
)
{% endif %}
)
- Add the final SELECT statement
- Select all relevant fields from the
pool_creation
CTE. - If necessary, add a qualify clause on
pool_address
to handle potential duplicates, keeping the most recent entries based on_inserted_timestamp
.
- Select all relevant fields from the
SELECT
block_number,
block_timestamp,
tx_hash,
event_index,
origin_function_signature,
origin_from_address,
origin_to_address,
factory_address,
token0,
token1,
pool_address,
pool_id,
_log_id,
_inserted_timestamp
FROM
pool_creation
qualify(ROW_NUMBER() over(PARTITION BY pool_address
ORDER BY
_inserted_timestamp DESC)) = 1
- Create the
.yml
file and add table/column level tests- At minimum tests should include, but are not limited to, the following;
- Uniqueness on 1+ columns (typically
pool_address
but may vary by model) not_null
tests ontoken0
,token1
,pool_address
- Uniqueness on 1+ columns (typically
- At minimum tests should include, but are not limited to, the following;
version: 2
models:
- name: silver_dex__fraxswap_pools
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- POOL_ADDRESS
columns:
- name: POOL_ADDRESS
tests:
- not_null
- name: TOKEN0
tests:
- not_null
- name: TOKEN1
tests:
- not_null
Examples, References & Sources:
- Example models
- Example using creation event logs: ethereum.silver_dex.fraxswap_pools.sql
- Example
.yml
file for tests: ethereum.silver_dex.fraxswap_pools.yml - Example using creation traces: ethereum.silver_dex.hashflow_pools
2. Complete Models
Summary:
The complete_dex_liquidity_pools
model aggregates all protocol specific liquidity pool models together into one, allowing centralized joins on silver.contracts
for token metadata, streamlined logic and standardized columns. This includes platform names, versions, pool names, symbols, decimals, fees (where applicable), and columns for all individual tokens that make up each liquidity pool.
Best Practices, Tips & Tricks:
- Protocol Versions: If there are multiple versions of a protocol, coming from separate models, each should have it's own CTE.
- The name for the
platform
column should include both theprotocol
andversion
separated by a-
(dash). You do not need to include,v1
, but instead begin incrementing atv2
and beyond. - The
version
column should represent a version specific to the CTE/model being referenced. For example, if there are multiple CTEs/models required for Pancakeswap v2, you would addpancakeswap-v2
as theplatform
andv2-amm
as theversion
in the first CTE, thenpancakeswap-v2
as theplatform
andv2-mm
as theversion
in the next CTE. This is required because thedelete+insert
incremental strategy uses bothplatform
andversion
to identify new incremental records.
- The name for the
- When adding new Protocols/CTEs/models to the
complete
model, please pay attention to the number of columns being pulled through, the order of those columns and the data type/format and names of those columns, as this will materially impact the output of the model.- More specifically, pay attention to how
pool_name
is handled in thecomplete_lps
CTE. Unique logic is typically required for DEXs built off theuniswap v3
forks, for example, as well as the liquidity pool models built using creation traces (rather than creation event logs), such asbalancer
andcurve
.
- More specifically, pay attention to how
Implementation Steps, Variables & Notes:
- Set up the model configuration
- Use
incremental
materialization with adelete+insert
strategy. - Define a unique key to identify each record that the incremental strategy should apply to. The unique key defined in the config will be different than the unique column / unique row in the table. This is because the model must delete+insert on a broader key to properly handle block reorg scenarios. Please see the Block Reorg guide for more information. We typically use ['block_number','platform','version'] in the complete liquidity pools model because it allows us to reload blocks, platforms and versions in their entirety, rather than reloading by transaction or event and potentially missing, loading or deleting incorrect records.
- Cluster by columns with low cardinality, such as
block_timestamp::DATE
andplatform
, to improve query performance. - Add Search Optimization (SO) in the post_hook with
EQUALITY
andSUBSTRING
. Please only add SO on non-number columns, where applicable. - Add relevant tags for model categorization (typically tagged as
['curated','reorg','heal']
) so that the model may be picked up by the relevant scheduled workflows.
- Use
{{ config(
materialized = 'incremental',
incremental_strategy = 'delete+insert',
unique_key = ['block_number','platform','version'],
cluster_by = ['block_timestamp::DATE','platform'],
post_hook = "ALTER TABLE {{ this }} ADD SEARCH OPTIMIZATION ON EQUALITY(tx_hash, contract_address, pool_address, pool_name, tokens, symbols), SUBSTRING(pool_address, pool_name, tokens, symbols)",
tags = ['curated','reorg','heal']
) }}
-
Create a CTE for contracts
- Pull through the
contract_address
,token_symbol
andtoken_decimal
columns fromsilver__contracts
. These will be joined into various CTEs in the model to get the token metadata columns.
- Pull through the
-
Create CTEs for each Protocol Specific model/version, where applicable.
- Select all columns from the source table
- Use standardized column names, as each of these CTEs will be unioned together towards the end of the model.
- Add columns with
NULL
values in cases where the column exists in other Protocol Specific CTEs.- Liquidity Pools may columns such as
fee
,tick_spacing
and up to eight (8) different tokens paired together.
- Liquidity Pools may columns such as
- Add columns with
- Hardcode the
platform
andversion
, if not done already. - For incremental runs, filter data based on the latest
_inserted_timestamp
, with theLOOKBACK
variable, defaulted to4 hours
.- The name used to reference `var('HEAL_MODELS') should match the name of the CTE.
...
balancer AS (
SELECT
block_number,
block_timestamp,
tx_hash,
contract_address,
pool_address,
pool_name,
NULL AS fee,
NULL AS tick_spacing,
token0,
token1,
token2,
token3,
token4,
token5,
token6,
token7,
'balancer' AS platform,
'v1' AS version,
_log_id AS _id,
_inserted_timestamp
FROM
{{ ref('silver_dex__balancer_pools') }}
{% if is_incremental() and 'balancer' not in var('HEAL_MODELS') %}
WHERE
_inserted_timestamp >= (
SELECT
MAX(_inserted_timestamp) - INTERVAL '{{ var("LOOKBACK", "4 hours") }}'
FROM
{{ this }}
)
{% endif %}
),
...
- Union all Protocol Specific CTEs together with
SELECT *
andUNION ALL
...
all_pools AS (
SELECT
*
FROM
dodo_v1
UNION ALL
SELECT
*
FROM
dodo_v2
UNION ALL
SELECT
*
FROM
frax
UNION ALL
...
-
Create the
complete
CTE based on the unioned data, and apply transformation logic.- Apply any and all transformation logic required to properly produce the values needed for the different protocols, using as much standardized logic as possible.
- This is where you will join the
contracts
CTE to include applicable token metadata values.
-
Add the
heal_model
CTE with theHEAL_MODEL
var. For for information, please see the Incremental Heal Logic guide.- For Liquidity Pools, we typically need to ensure that late-arriving decimals are properly handled.
-
Create the final SELECT statement.
- Add the
id
column usingdbt_utils.generate_surrogate_key
and include the column that represents one unique row. - Add
inserted_timestamp
andmodified_timestamp
usingSYSDATE()
to represent the timestamp where new rows are inserted or modified in the model. - Add the
_invocation_id
column, which outputs a UUID generated for each newly inserted row. For more information, please see DBT Docs: invocation_id.
- Add the
{{ dbt_utils.generate_surrogate_key(
['pool_address']
) }} AS complete_dex_liquidity_pools_id,
SYSDATE() AS inserted_timestamp,
SYSDATE() AS modified_timestamp,
'{{ invocation_id }}' AS _invocation_id
- Create the
.yml
file and add table/column level tests- At minimum tests should include, but are not limited to, the following;
- Uniqueness on 1+ columns (typically
_id
but may vary by model) not_null
tests oncontract_address
,pool_address
andplatform
- Uniqueness on 1+ columns (typically
- At minimum tests should include, but are not limited to, the following;
Examples, References & Sources:
- ethereum.silver_dex.complete_dex_liquidity_pools.sql
- ethereum.silver_dex.complete_dex_liquidity_pools.yml
Gold Layer Modeling
Please reference the Gold Layer Modeling
section in How to Research and Curate EVM Protocols for more details on the modeling process. The following information is related directly to Liquidity Pools only.
Summary:
This guide outlines the process of creating the Gold Layer DBT models for tracking DEX liquidity pool creation events or traces on EVM-compatible blockchains, along with the applicable table and column descriptions for transparency and documentation. The Gold model will select directly from the complete
model, and will be surfaced to all users and clients.
Implementation Steps, Variables & Notes:
- Set up the model configuration
- Use
view
materialization. - Add persist_docs, with
relation
andcolumns
set to `true. - Add
database_tags
to themeta
section in the config. These should include the names of everyPROTOCOL
(platform) in the model, along with one word values that define thePURPOSE
(category) of the model.
- Use
{{ config(
materialized = 'view',
persist_docs ={ "relation": true,
"columns": true },
meta={
'database_tags':{
'table': {
'PROTOCOL': 'BALANCER, CURVE, DODO, FRAXSWAP, KYBERSWAP, MAVERICK, PANCAKESWAP, SHIBASWAP, SUSHISWAP, UNISWAP, TRADER JOE, VERSE',
'PURPOSE': 'DEX, LIQUIDITY, POOLS, LP, SWAPS',
}
}
}
) }}
-
SELECT all relevant columns to pull through from
complete_dex_liqudity_pools
and surface to users in the Gold layer.- Adjust the column names so that they are intuitive for a user analyzing what the table represents, which in this case, is a dimensional view on liquidity pool creations.
-
Create the
.yml
file and add table/column level descriptions.