Snowflake Notebooks - GouriShankarS/Snowflake_Codebook GitHub Wiki
Core functionality
Snowflake Notebook
Simplifying CTE's with Notebooks using Python and SQL cell result referencing
Create synthetic data in Notebooks
Convert SQL results to pandas dataframe
Python variables and Jinja syntax
Streamlit example
Snowflake Cortex
Execute Notebooks from SQL and Python Cells
Getting Started with Streamlit in Snowflake: Building Interactive Data Visualizations
Introduction Data visualization is crucial for understanding trends, identifying anomalies, and making data-driven decisions. However, managing infrastructure, deploying dashboards, and ensuring data freshness often present challenges.
This article explores how Streamlit in Snowflake simplifies the process by allowing users to build and deploy interactive visualizations directly within their Snowflake environment. By leveraging this integration, users can create data applications without worrying about infrastructure, data movement, or external dependencies.
What is Streamlit in Snowflake? Streamlit is an open-source Python framework that makes it easy to build interactive data applications with minimal code. By integrating Streamlit natively within Snowflake, users can run applications directly where their data resides, eliminating the need for complex ETL pipelines, external infrastructure, or API calls to third-party dashboards.
Benefits of Using Streamlit in Snowflake No Infrastructure Management: Since Streamlit runs inside Snowflake, there’s no need to set up servers, manage scaling, or configure networking. Faster Insights: By running directly on Snowflake, users avoid the latency associated with querying external BI tools. Security & Governance: Streamlit apps inherit Snowflake’s access control and security model, ensuring data privacy and compliance. Seamless Data Access: Applications can directly query Snowflake tables without additional authentication layers. Easy to Build & Deploy: Users can write Python code in minutes to create interactive visualizations and publish them as apps for stakeholders. Purpose This Streamlit in Snowflake application provides a template your team can use to build custom visualizations.
Step 1: Create New Streamlit Application Head over to Projects: Go to your Snowflake home page and select “Projects”. Select Streamlit: Click on Streamlit on the projects page. A list of existing Streamlit applications will be visible for the selected role. Create New App: Click on “New App” in the upper right-hand corner. Provide a title, location, and warehouse for your application. (The application code will reside in the specified database/schema and run using the specified warehouse. An XS warehouse should suffice.) Click “Create”. Step 2: Copy & Paste SiS Application Code Below
How to use Set Variables
import streamlit as st import pandas as pd import numpy as np import matplotlib.pyplot as plt import seaborn as sns import plotly.express as px import altair as alt
Set the page title and layout
st.set_page_config(page_title="Data Visualization Showcase", layout="wide")
Generate sample data
np.random.seed(42)
data = pd.DataFrame({
'A': np.random.randn(100),
'B': np.random.rand(100) * 10,
'Category': np.random.choice(['X', 'Y', 'Z'], size=100),
'Latitude': np.random.uniform(29.5, 30.5, 100),
'Longitude': np.random.uniform(-98, -97, 100),
'C': np.random.randn(100) * 5
})
Header
st.title("ACME Data Visualization Examples") st.write("Explore various ways of visualizing data using interactive elements in Streamlit.")
Sidebar for navigation
st.sidebar.title("Visualization Options") visualization_options = [ "Line Chart", "Bar Chart", "Matplotlib Plot", "Seaborn Heatmap", "Plotly Scatter Plot", "Altair Chart", "Pie Chart", "Geospatial Plot", "Histogram", "Box Plot", "3D Scatter Plot" ] selected_visualization = st.sidebar.selectbox("Choose a chart type", visualization_options)
Interactive Slider for filtering dataset
st.sidebar.subheader("Filter Data") data_filter = st.sidebar.slider("Select data range for 'A'", float(data['A'].min()), float(data['A'].max()), (float(data['A'].min()), float(data['A'].max()))) filtered_data = data[(data['A'] >= data_filter[0]) & (data['A'] <= data_filter[1])]
Interactive Category Selection
selected_category = st.sidebar.multiselect("Select Category", options=data["Category"].unique(), default=data["Category"].unique()) filtered_data = filtered_data[filtered_data["Category"].isin(selected_category)]
Color Picker for Customization
chart_color = st.sidebar.color_picker("Pick a Chart Color", "#1f77b4")
1. Line Chart (Streamlit)
if selected_visualization == "Line Chart": st.subheader("Line Chart (Streamlit)") st.line_chart(filtered_data'A', 'B')
2. Bar Chart (Streamlit)
if selected_visualization == "Bar Chart": st.subheader("Bar Chart (Streamlit)") st.bar_chart(filtered_data'A', 'B')
3. Matplotlib Scatter Plot
if selected_visualization == "Matplotlib Plot": st.subheader("Matplotlib Scatter Plot") fig, ax = plt.subplots() ax.scatter(filtered_data['A'], filtered_data['B'], c=chart_color, alpha=0.5) ax.set_title("Scatter Plot using Matplotlib") ax.set_xlabel("A") ax.set_ylabel("B") st.pyplot(fig)
4. Seaborn Heatmap
if selected_visualization == "Seaborn Heatmap": st.subheader("Seaborn Heatmap") correlation_matrix = filtered_data'A', 'B', 'C'.corr() fig, ax = plt.subplots() sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', ax=ax) ax.set_title("Heatmap of Correlation Matrix") st.pyplot(fig)
5. Plotly Scatter Plot
if selected_visualization == "Plotly Scatter Plot": st.subheader("Plotly Scatter Plot") fig = px.scatter(filtered_data, x='A', y='B', color='Category', title="Scatter Plot using Plotly") st.plotly_chart(fig)
6. Altair Chart
if selected_visualization == "Altair Chart": st.subheader("Altair Chart (Line Chart)") filtered_data_altair = filtered_data.reset_index().rename(columns={'index': 'Index'})
chart = alt.Chart(filtered_data_altair).mark_line(color=chart_color).encode(
x='Index:Q',
y='A',
color='Category'
).interactive().properties(title="Line Chart using Altair")
st.altair_chart(chart, use_container_width=True)
7. Pie Chart (Matplotlib)
if selected_visualization == "Pie Chart":
st.subheader("Pie Chart (Matplotlib)")
category_counts = filtered_data['Category'].value_counts()
fig, ax = plt.subplots()
ax.pie(category_counts, labels=category_counts.index, autopct='%1.1f%%', startangle=90, colors=[chart_color])
ax.axis('equal')
ax.set_title("Pie Chart of Category Distribution")
st.pyplot(fig)
8. Geospatial Plot (Plotly Map)
if selected_visualization == "Geospatial Plot": st.subheader("Geospatial Plot (Plotly Map)") fig = px.scatter_mapbox( filtered_data, lat="Latitude", lon="Longitude", color="Category", size=np.abs(filtered_data["A"]), mapbox_style="open-street-map", zoom=8, title="Geospatial Data Visualization" ) st.plotly_chart(fig)
9. Histogram (Matplotlib)
if selected_visualization == "Histogram": st.subheader("Histogram (Matplotlib)") fig, ax = plt.subplots() ax.hist(filtered_data['A'], bins=20, color=chart_color, alpha=0.7) ax.set_title("Histogram of Variable A") ax.set_xlabel("Value of A") ax.set_ylabel("Frequency") st.pyplot(fig)
10. Box Plot (Seaborn)
if selected_visualization == "Box Plot": st.subheader("Box Plot (Seaborn)") fig, ax = plt.subplots() sns.boxplot(x="Category", y="A", data=filtered_data, ax=ax, palette=[chart_color]) ax.set_title("Box Plot of A by Category") st.pyplot(fig)
11. 3D Scatter Plot (Plotly)
if selected_visualization == "3D Scatter Plot": st.subheader("3D Scatter Plot (Plotly)") fig = px.scatter_3d(filtered_data, x='A', y='B', z='C', color='Category', title="3D Scatter Plot") st.plotly_chart(fig)
Footer
st.write("This app showcases various interactive data visualization techniques using Streamlit.")