General: Streamline: Decoder Setup - FlipsideCrypto/fsc-evm GitHub Wiki

Deploying Decoder Streamline Models

Summary:

Streamline lets us rapidly scale our data processing and ingestion using AWS Lambdas, Snowflake External Tables, and dbt models. Streamline models in each EVM repo are organized into Bronze and Silver folders, with additional nesting by category.

Please see Streamline: Macros for EVM Models and Utility: Macros for Streamline for more information on the macros used to build the models and other dependencies.

Bronze Layer

  • These models are materialized as Views and use the fsc_evm.streamline_external_table_query_decoder and fsc_evm.streamline_external_table_fr_query_decoder macros to select the raw node responses that are stored in External Tables / S3.
  • These models come in two variants: views that select only the last three (3) days of inserted rows (for faster incremental loads downstream) and views that reference the External Tables for the entirety of stored history (useful for full-refresh scenarios in the downstream models).
    • Note: The full-refresh (fr) models may need to be deployed in multiple versions that differ slightly based on the structure of the deployed External Tables. These are denoted with the v1 or v2 suffix. In that case, a comprehensive view that unions data from all full-refresh version models is required so that downstream models can access 100% of stored history.

Silver Layer

  • Request Models: Materialized as Views, these models build the decoder requests that invoke the lambdas by compiling the relevant blocks, logs/traces, and ABIs. The requests are sent when the model's dbt config uses the fsc_utils.if_data_call_function_v2 macro together with the streamline.udf_bulk_decode_logs_v2 and udf_bulk_decode_traces_v2 functions, and the model is run with the appropriate variables, e.g. --vars '{"STREAMLINE_INVOKE_STREAMS":True}'. The AWS Lambdas process the requests and return the results to AWS S3, where they can be queried from Snowflake through External Tables. These External Tables are defined and deployed via the streamline-snowflake repo.

    • Realtime: Includes the last three (3) days of blocks only

    • History: Includes all blocks prior to the last three (3) days

      • Note: Due to size limitations, decoder history files are split by block range. Each model sets its start and stop from the range encoded in its own file name. Additionally, we use the fsc_utils.if_data_call_wait() function alongside the WAIT variable defined in dbt_project.yml to apply a buffer period before the next range kicks off.
      • Example: models/streamline/silver/decoder/history/event_logs/range_1/streamline__decode_logs_history_020084004_020112004.sql
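         {# Parse the block range from the model name: the last two "_"-separated tokens are start and stop. #}
         {% set start = this.identifier.split("_")[-2] %}
         {% set stop = this.identifier.split("_")[-1] %}
         {{ fsc_evm.streamline_decoded_logs_requests(
             start,
             stop,
             model_type = 'history'
         ) }}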
      
  • Complete Models: Materialized as Tables, these models query all blocks and logs/traces (via _log_id / _call_id) that have been requested and have successfully landed in the External Tables / Bronze Views. The request models depend on the complete models, which prevent re-requesting data that has already been decoded.
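
The range parsing and the role of the complete models can be sketched outside dbt. The Python snippet below is a minimal illustration, not the actual macro logic: it mirrors the `this.identifier.split("_")` indexing from the history example and then filters a set of candidate logs against ids that were already decoded. The names `parse_block_range` and `pending_requests`, the sample data, the conversion of the tokens to integers, and the inclusive range bounds are all assumptions made for illustration.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class BlockRange(NamedTuple):
    start: int
    stop: int


def parse_block_range(identifier: str) -> BlockRange:
    """Read the block range from the last two "_"-separated tokens of a model name."""
    parts = identifier.split("_")
    # Same indexing as the Jinja example: [-2] is the start, [-1] is the stop.
    # Converting to int drops the zero padding (an assumption of this sketch).
    start, stop = int(parts[-2]), int(parts[-1])
    if start > stop:
        raise ValueError(f"start {start} is after stop {stop} in {identifier!r}")
    return BlockRange(start, stop)


def pending_requests(
    logs: Iterable[Tuple[int, str]],
    completed_ids: Set[str],
    block_range: BlockRange,
) -> List[str]:
    """Return the log ids inside the range that have not been decoded yet.

    `logs` holds hypothetical (block_number, log_id) pairs; `completed_ids`
    stands in for the ids recorded by a complete model.
    """
    return [
        log_id
        for block_number, log_id in logs
        # Inclusive bounds are an assumption, not confirmed by the macro.
        if block_range.start <= block_number <= block_range.stop
        and log_id not in completed_ids
    ]


if __name__ == "__main__":
    rng = parse_block_range("streamline__decode_logs_history_020084004_020112004")
    sample_logs = [
        (20084003, "a-0"),  # below the range
        (20084004, "b-0"),  # in range, not yet decoded
        (20100000, "c-1"),  # in range, already decoded
        (20112005, "d-2"),  # above the range
    ]
    print(pending_requests(sample_logs, {"c-1"}, rng))  # prints ['b-0']
```

Only the in-range log that is absent from the completed set is requested, which is the behavior the complete models are meant to guarantee.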

Best Practices, Tips & Tricks:

Implementation Steps, Variables & Notes:

Examples, References & Sources:

Example Code:

Example Folder Structure and Model Hierarchy:

> models/streamline/bronze/decoder
    - bronze__streamline_decoded_logs.sql
    - bronze__streamline_fr_decoded_logs.sql
    - bronze__streamline_decoded_traces.sql
    - bronze__streamline_fr_decoded_traces.sql

> models/streamline/silver/decoder
    >> Realtime
        - streamline__decode_logs_realtime.sql
        - streamline__decode_traces_realtime.sql
    >> History
        >>> event_logs
            >>>> range_0
                - ...
                - streamline__decode_logs_history_005544370_005644534.sql
                - ...
            >>>> range_1
        >>> traces
    >> Complete
        - streamline__complete_decoded_logs.sql
        - streamline__complete_decoded_traces.sql
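
When adding history models, the file name has to carry the block range that the model later parses back out. A small helper along the following lines can build such names from explicit start and stop blocks. It is a hypothetical sketch: the nine-digit zero padding is inferred from the two example file names on this page, and the function name `history_model_name` and its parameters are assumptions rather than anything provided by the repo.

```python
def history_model_name(
    start: int,
    stop: int,
    prefix: str = "streamline__decode_logs_history",
    width: int = 9,
) -> str:
    """Build a history model file name with zero-padded start and stop blocks.

    The default prefix matches the event_logs examples; pass a different
    prefix for other model types. The padding width is an assumption.
    """
    if start < 0 or stop < start:
        raise ValueError(f"invalid block range: {start}..{stop}")
    return f"{prefix}_{start:0{width}d}_{stop:0{width}d}.sql"


if __name__ == "__main__":
    print(history_model_name(5544370, 5644534))
    # prints streamline__decode_logs_history_005544370_005644534.sql
```

Keeping the padding fixed means the names sort in block order within each range folder, and the last two "_"-separated tokens remain the start and stop that the model reads from its own identifier.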