Databricks - LaunchCode-Code-Connect/CandidateResources GitHub Wiki

Introduction to Databricks

Databricks is a cloud-based platform for data processing and analysis. It lets users process large amounts of data in one unified place. One of its biggest benefits is that users can see all of their data in a single location, which tends to make querying and analyzing that data more efficient.

Why we recommend it

Databricks is an industry leader in data lakes. It provides a solid platform for data and governance. We've seen a significant uptick in requests for candidates with Databricks experience as companies shift to standardizing their data practices. It's going to be a growing need as companies improve their data storage and analysis!

To get started with Databricks, we recommend signing up for a free account with the company here: Databricks Free Edition. This will give you a place to put what you learn into practice!

Skill Update!

Have you completed one of the courses below, or a different course or certification related to Databricks? If you are an active candidate working with LaunchCode, please let us know by filling out this form. Our team reviews submissions weekly and will update your LaunchCode candidate profile with a note to let the Career Outcomes team know that you have completed a new course!

Databricks Learning Resources

| Course Name | Description | Skills Covered | Time to Complete |
| --- | --- | --- | --- |
| Databricks Essentials | This Pluralsight path will teach you the fundamentals of getting started with the Databricks platform. | Data Lakes, SQL, Automation, ETL Pipelines | 8 hours |
| Databricks Fundamentals | This short course from Databricks will introduce you to the platform and what you can accomplish with it. While it is not technical in nature, it will give you a better understanding of why companies use Databricks, which is great information to have for future interviews! | Databricks Intro | 1 hour |
| Apache Spark for Databricks | Apache Spark is an engine for large-scale data processing. It's at the heart of Databricks, so it's an important tool to understand. This course will take you through how to get started with Apache Spark on Databricks and how to start using it effectively. | Handling Batch Data, Windowing and Join Operations, Optimization | 14 hours |
| Getting Started with Data Engineering on Databricks | This official Databricks course will teach you how to use the core components of the platform, how to create and manage clusters and Delta Lake tables, and how to integrate version control using Git. | Git, Data Lake Tables | 2 hours |
| Databricks Certified Data Analyst Associate Preparation | This Pluralsight path focuses on helping you prepare for the Databricks Certified Data Analyst Associate certification exam. While LaunchCode will not currently cover the cost of this certification, we recommend building up the skills that would help you pass the exam, because they will help you become more effective at using Databricks. | Databricks SQL, Data Visualization and Dashboarding, Data Lakehouses | 7 hours |