Cas Obdamn - rvdegroen/notes GitHub Wiki

Who is Cas Obdamn?

Cas Obdamn is a data engineer at DEPT. Cas will explain why managing large datasets properly is important and how the skills we'll be learning can be used in a professional environment. He also said he works in digital marketing.

Programming languages

Cas works with different programming languages. He mostly works with:

  • Python
  • SQL (data sources)

Tasks Cas does less often:

  • Cloud provisioning
  • Building infrastructure
  • Data activation on the website

Languages and tools Cas uses less often:

  • Bash
  • Terraform
  • JavaScript

Introduction

Cas explained in his introduction that you always need to test your assumptions.

DEPT office

In the office, you have different departments, such as branding advertisement, design tech and digital marketing. As explained earlier, Cas works in the digital marketing department, if I understood him correctly.

They're doing a lot of automation, but being creative is still very important. Within the company, you could be more on the context creativity side (creative) or on the data science side (heavier in tech, which is where Cas works).

Digital marketing department

Within the Digital Marketing (DM) department, a lot of work gets automated, such as all kinds of user journeys. The marketing cloud team works with this a lot.

Personalised approach

To market your product as well as possible, you need to collect a lot of user data, and for this you'll need a personalised approach. You also need to know your audience and their "customer journeys". DEPT works a lot with personas and different kinds of customer journeys, because everyone wants to buy the product for a different reason. The whole team thinks about all of these aspects.

Law & tech are limiting data

You also have to keep in mind that you can't always acquire the data you need: tech can be limiting, and sometimes the law doesn't allow you to collect certain data for privacy reasons.

Branding advertisement & design tech

A big part of DEPT is also context creativity, although it's a smaller part of digital marketing. Context creativity has everything to do with creating visualizations and branding. Nowadays you don't stand out if you don't have a unique brand.

Teams and responsibilities

There are different teams, each with their own responsibilities, within the DEPT company:

On the CLIENT side we have:

  • product owner
  • data expert
  • analyst
  • connector
  • PROJECT MANAGER (between CLIENT & DEPT)

On the DEPT side we have:

  • PROJECT MANAGER (between CLIENT & DEPT)
  • strategist
  • technical analyst
  • data engineer (Cas)
  • data scientist

Team collaboration

Different teams are involved in every use case.

  1. Within the company: imagine being a spider in the web, because you belong to Marketing and you need to communicate with other departments.
  2. You usually need to wait for the customer to give permission.

Use case topics

The company can sell different use cases: it can carry out a use case itself or it can develop a use case. Below you can read the possible use case topics. Several themes are:

  • Data Quality
  • Audience targeting
  • Dynamic Content
  • Operational Excellence

What does Cas work on?

These are topics or use cases that Cas either sold, made or developed.

  1. supply and demand (what is) (what's in storage, buy smart, smart saving)
  2. conversion intent
  3. product scoring
  4. attribution (which sources are not biased towards Google?)

Overall roadmap

The overall roadmap displays how a sprint within the company works. How Cas works within his company is shown below.

  1. Collecting data
    • The sales team goes to work
    • Data discovery - how bad is the situation for the customer?
  2. Setting up a team
    • Project setup & roadmap
    • Avengers assemble - the team can change
  3. Project stats
    • Iterative sprints - the number of sprints can change

Components of a data architecture

There are several components of a data architecture:

  1. Customer Data
  2. Business Intelligence (BI)
  3. Machine Learning
  4. Data Activation

Cas sometimes works with different teams.

Components of a data architecture - generic structure

Every structure can differ, but to make the story more insightful, here you can read about the generic structure of the company (how Cas works).

  1. Customer Data | the "what". Several things stay the same; you just collect the data from these sources:
    • Websites (online)
    • Offline
    • CRM
    • Media
  2. Business Intelligence (BI) | Extract, load, transform (ELT). First you load the data into a staging database, then you decide what data you need and put it in a production database.
    • Retrieve data from the source (API, FTP)

For this, different integration tools (software) are used:

  • Adverity
  • Fivetran
  • Google Cloud free tier
  • Beam
  • Custom code
  • Apache Airflow (for big customers with data pipelines; transforms data from A-B to X, in Python)
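
The staging-then-production flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard-library `sqlite3` module as a stand-in for a real warehouse, and the table names (`staging_events`, `prod_events`), the columns and the sample rows are all hypothetical, not something Cas showed.

```python
import sqlite3

# Hypothetical raw rows, standing in for data retrieved from a source (API, FTP).
RAW_ROWS = [
    ("2023-01-01", "website", "120"),
    ("2023-01-01", "crm", "15"),
    ("2023-01-02", "website", "n/a"),  # invalid value that should not reach production
    ("2023-01-02", "media", "40"),
]


def load_to_staging(conn, rows):
    """Copy the raw rows into a staging table without changing them."""
    conn.execute("CREATE TABLE staging_events (day TEXT, source TEXT, visits TEXT)")
    conn.executemany("INSERT INTO staging_events VALUES (?, ?, ?)", rows)


def transform_to_production(conn):
    """Keep only rows with a purely numeric visit count and cast them to integers."""
    conn.execute("CREATE TABLE prod_events (day TEXT, source TEXT, visits INTEGER)")
    conn.execute(
        """
        INSERT INTO prod_events
        SELECT day, source, CAST(visits AS INTEGER)
        FROM staging_events
        WHERE visits <> '' AND visits NOT GLOB '*[^0-9]*'
        """
    )


def run_pipeline(rows):
    """Run both steps on an in-memory database and return the production rows."""
    conn = sqlite3.connect(":memory:")
    load_to_staging(conn, rows)
    transform_to_production(conn)
    result = conn.execute(
        "SELECT day, source, visits FROM prod_events ORDER BY day, source"
    ).fetchall()
    conn.close()
    return result


prod_rows = run_pipeline(RAW_ROWS)
```

Keeping the untouched copy in staging means a mistake in the selection step can be fixed and rerun without retrieving the data from the source again.
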
  3. Machine Learning | Data warehouse & data lake
    • Optimised for ingesting, transforming and exporting large datasets
    • For smaller customers, usually only a data warehouse is used
    • For bigger customers, sometimes a data warehouse, sometimes a data lake and sometimes both

For this, different tools (software) are used:

  • Snowflake (most "mature"/up-to-date)

Software that may be less up to date, but could still be used:

  • Google BigQuery
  • Azure Synapse
  • AWS Redshift
  4. Data Activation |
    • You send the data with the source to a business intelligence tool (machine learning).
    • With machine learning, there are different "buckets": low, medium and high. Let's say you need very few people for something; you could tell the machine to choose the lowest bucket.

For this, different tools (software) are used:

  • Power BI
  • SFMC
  • Analysis
  • APIs - How do you make it usable? For SQL models they use dbt.
  • dbt - SQL with Jinja

There are different flavours of machine learning:

  1. supervised (most used, because some data sources are labeled well)
  2. unsupervised
  3. reinforcement
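
To make "supervised" concrete: the algorithm learns from examples that already carry the right answer (a label). The sketch below is a minimal illustration, assuming a nearest-centroid classifier chosen only because it is short; the features (pages viewed, minutes on site) and the sample sessions are hypothetical and not taken from the presentation.

```python
from math import dist

# Hypothetical labeled web sessions: (pages_viewed, minutes_on_site) -> converted?
LABELED_SESSIONS = [
    ((1, 0.5), False),
    ((2, 1.0), False),
    ((3, 2.0), False),
    ((8, 6.0), True),
    ((9, 7.5), True),
    ((10, 8.0), True),
]


def fit_centroids(sessions):
    """Training: compute the average feature vector for each label."""
    sums = {}
    counts = {}
    for features, label in sessions:
        running = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            running[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(total / counts[label] for total in running)
        for label, running in sums.items()
    }


def predict(centroids, features):
    """Prediction: return the label whose average session is closest."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(LABELED_SESSIONS)
short_visit = predict(centroids, (2, 1.0))
long_visit = predict(centroids, (9, 7.0))
```

Without the `True`/`False` labels there would be nothing to average per class, which is the point where an unsupervised method would have to look for groups in the data on its own.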

Business intelligence

  1. Marketing KPIs - marketing
  2. Operational KPIs - about selling, waiting times and how things end up in the distribution centre
  3. Measurement KPIs, among others - everything that does not fall under 1 or 2

Example: components of a data architecture - generic structure

To make these components of a data architecture more insightful, Cas created a generic architecture to put a real case in perspective. He uses a possible use case as an example to show what he has worked with for the past 2 years.

Customer: Randstad

Cas and his team had Randstad as a customer. In this section I'll be talking about this specific project he worked on.

How Cas worked on this project:

  • Working for online intelligence within the Randstad group marketing department
  • Finished and/or maintaining 6 use cases, all business intelligence or machine learning related
  • Working with a team of 9 DEPT-ers and 5 Randstad marketing analysts

Randstad tech stack

  • cloud: Google Cloud Platform
  • integration tools used: Apache Airflow, Cloud Functions
  • data warehouse: BigQuery, Google Cloud Storage
  • BI software: Google Data Studio, Tableau
  • data modeling: dbt (data build tool)
  • ml:
  • project management

What do they do at Randstad?

At Randstad they work on conversion intent.

What is conversion intent? For example:

  • You search Google for laptops
  • Suddenly you see laptop advertisements

They built this with Google: they basically only needed a Google Analytics dataset from their customer.

Conversion intent example

  1. Google Analytics data is used to train an algorithm (it's nice data, but you do have to get specific data from a web session; Cas developed this)
  2. Data scientists train a supervised algorithm on web sessions with conversions.
  3. The algorithm puts users in three buckets (for example: we need 3 people; with 10 requests the jobs are filled, so ML only needs the low bucket; Cas developed this with a data scientist):
    • low
    • middle
    • high
  4. The buckets get pushed to GMP and Facebook Ads for the user to see
  5. Marketing people can use the buckets to re-target users.
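
The bucketing step could look roughly like the sketch below. It assumes the trained model has already produced a conversion score between 0 and 1 for each user, and it splits users into three equally sized groups by rank; both the scores and the equal-size split are assumptions for illustration, since the notes don't say how the real thresholds are chosen.

```python
def assign_buckets(scores):
    """Split users into low / middle / high buckets by the rank of their score."""
    ranked = sorted(scores, key=scores.get)  # user ids, lowest score first
    total = len(ranked)
    buckets = {"low": [], "middle": [], "high": []}
    for position, user in enumerate(ranked):
        if position < total / 3:
            buckets["low"].append(user)
        elif position < 2 * total / 3:
            buckets["middle"].append(user)
        else:
            buckets["high"].append(user)
    return buckets


# Hypothetical model output: user id -> predicted conversion score.
SCORES = {"u1": 0.05, "u2": 0.92, "u3": 0.40, "u4": 0.10, "u5": 0.75, "u6": 0.55}

buckets = assign_buckets(SCORES)
```

Each bucket is just a list of user ids, which is the kind of audience list that could then be handed to an advertising platform for re-targeting.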

/images/sketchnote.png

  • You have a UI in BigQuery or Snowflake to build your own datasets, instead of putting them together yourself in Excel, for example.
  • With Power BI, you can build different dashboards and visualize them in different ways.
  • ML stands for Machine Learning. It's an algorithm that can find patterns and learn them.
  • dbt is used to harmonize the data

Reflection

I thought the presentation of Cas was very interesting. At some point I was thinking what this has to do with us because it just seemed like a lot of back-end to me, but it also made sense that that's where you get your data from.

For questions regarding his presentation I can always mail him on: [email protected]