Training plan and references
Updated on 30/10/2020
Skills for the Data Engineer
Azure for the data engineer (learning path)
https://docs.microsoft.com/en-us/learn/paths/azure-for-the-data-engineer/
Products
Azure Storage (Data Lake Gen2)
Large-scale data processing with Azure Data Lake Storage Gen2 (learning path)
https://docs.microsoft.com/en-us/learn/paths/data-processing-with-azure-adls/
Overview of Azure Data Lake Storage Gen2
Introduction to Azure Data Lake Storage Gen2
https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction
Multiprotocol Access on Azure Data Lake Storage
https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-multi-protocol-access
Open Source Platforms that support Azure Data Lake Storage Gen2
AzCopy Tool
https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10
Azure Data Lake Storage Gen2: Security recommendations
https://docs.microsoft.com/en-us/azure/storage/blobs/security-recommendations
Immutable Storage
https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-immutable-storage
Object Replication
https://docs.microsoft.com/en-us/azure/storage/blobs/object-replication-configure?tabs=portal
Azure Data Factory - ADF
Integrate Data with Data Factory (learning path)
https://docs.microsoft.com/en-us/learn/modules/data-integration-azure-data-factory/
Receive data with Azure Data Share and transform it with Azure Data Factory (learning path)
Transform data by running a Python activity (ADF) in Azure Databricks
https://docs.microsoft.com/en-us/azure/data-factory/transform-data-databricks-python
Transform data by running a JAR activity in Azure Databricks
https://docs.microsoft.com/en-us/azure/data-factory/transform-data-databricks-jar
Transform data by running a Databricks notebook (submitting jobs on Databricks)
https://docs.microsoft.com/en-us/azure/data-factory/transform-data-databricks-notebook
Databricks
Microsoft
Data Engineering with Databricks
https://docs.microsoft.com/en-us/learn/paths/data-engineer-azure-databricks/
Databricks training (virtual classes, paid)
ETL Part 1
https://academy.databricks.com/course/MID-DE-DAEX-v1-SP-C
ETL Part 2
https://academy.databricks.com/course/MID-DE-DTLO-v1-SP-C
ETL Part 3
https://academy.databricks.com/course/MID-DE-ETLP-v1-SP-C
Structured Streaming
https://academy.databricks.com/course/MID-AL-STST-v1-SP-C
Other training by role (data engineer, data scientist, or administrator)
https://academy.databricks.com/pathway/INT-AL-FREE-SP
Instructor-led training
https://academy.databricks.com/category/public-trainings
Webinar: Using SQL to query your Data Lake with Delta Lake
Free, on demand.
https://databricks.com/p/webinar/using-sql-to-query-your-data-lake-with-delta-lake
Azure Databricks Essential Training - LinkedIn Learning
https://www.linkedin.com/learning/azure-databricks-essential-training/optimize-data-pipelines?u=3322
More references for Databricks
(some might not be current, or available)
Databricks 101
Security - ISO 27001 Certified Security
Delta Lake Website (Webinars)
https://databricks.com/product/delta-lake-on-databricks
Apache Spark Documentation
Spark Docs
http://spark.apache.org/docs/latest/
PySpark documentation
https://spark.apache.org/docs/latest/api/python/index.html
Delta Lake
https://docs.azuredatabricks.net/delta/index.html
Dataframes
https://docs.databricks.com/getting-started/spark/dataframes.html
Introduction to Dataframes
Spark SQL Reference (Hive Spark)
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select
Spark API (Dataframes and Datasets)
https://docs.databricks.com/spark/latest/dataframes-datasets/index.html#dataframes
Databases and Tables
https://docs.databricks.com/data/tables.html
Data types
http://spark.apache.org/docs/latest/sql-programming-guide.html#data-types
Metastores
https://docs.databricks.com/data/metastores/index.html
Spark Clusters Configurations
https://docs.microsoft.com/en-us/azure/databricks/clusters/configure
Python Tutorial
https://www.tutorialspoint.com/python/
Mix Languages in Notebooks
https://docs.databricks.com/notebooks/notebooks-use.html#mix-languages
Databricks CLI (command-line interface)
https://pypi.org/project/databricks-cli/
Delta Lake
Delta Lake
https://docs.azuredatabricks.net/delta/index.html
Delta Lake API Reference
https://docs.azuredatabricks.net/delta/delta-apidoc.html
Delta Lake Website
Delta Lake Documentation
https://docs.delta.io/latest/index.html
Delta Lake Quick Start Python
https://docs.azuredatabricks.net/delta/delta-batch.html#write-to-a-table
Delta Lake Quick Start SQL
https://docs.azuredatabricks.net/_static/notebooks/delta/quickstart-sql.html
Streaming
Sample Notebook 1
https://docs.microsoft.com/en-us/azure/databricks/_static/notebooks/structured-streaming-python.html
Structured Streaming and Event Hubs Integration Guide
https://github.com/Azure/azure-event-hubs-spark/blob/master/docs/structured-streaming-eventhubs-integration.md
Azure Event Hubs Spark Connector
https://github.com/Azure/azure-event-hubs-spark
Other (advanced)
Libraries
https://docs.azuredatabricks.net/libraries.html
MLflow
https://docs.azuredatabricks.net/applications/mlflow/index.html
GraphFrames
https://docs.azuredatabricks.net/spark/latest/graph-analysis/graphframes/index.html
Azure Samples - Streaming at Scale
https://github.com/Azure-Samples/streaming-at-scale
Loading Avro Files into Databricks
https://docs.databricks.com/data/data-sources/read-avro.html
Databricks logs in Azure
Databricks Cloud Automation
https://github.com/databrickslabs/databricks-cloud-automation
High Performance Spark Queries with Databricks Delta
https://docs.azuredatabricks.net/_static/notebooks/delta/optimize-python.html
Azure Databricks Operator - Container image
https://hub.docker.com/_/microsoft-k8s-azure-databricks-operator
Azure Databricks API container
https://hub.docker.com/_/microsoft-azure-databricks-api
MLeap model export demo (Python)
https://docs.microsoft.com/en-us/azure/databricks/_static/notebooks/mleap-model-export-demo-python.html
Deploy models with Azure Machine Learning
https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-and-where
Power BI
Beginner
Create and use analytics reports with Power BI
https://docs.microsoft.com/en-us/learn/paths/create-use-analytics-reports-power-bi/
Intermediate
Prepare Data in Power BI
https://docs.microsoft.com/en-us/learn/paths/prepare-data-power-bi/
Visualize Data in Power BI
https://docs.microsoft.com/en-us/learn/paths/visualize-data-power-bi/
Perform analytics in Power BI
https://docs.microsoft.com/en-us/learn/paths/perform-analytics-power-bi/
Use DAX in Power BI
https://docs.microsoft.com/en-us/learn/paths/dax-power-bi/
Manage Workspaces and Datasets in Power BI
https://docs.microsoft.com/en-us/learn/paths/manage-workspaces-datasets-power-bi/