231007 - Forestreee/Data-Analytics GitHub Wiki
Google Data Analytics Professional
Foundations: Data, Data, Everywhere
WEEK1 - Introduction
Data helps us make decisions in everyday life and in business. In this first part of the course, you’ll learn how data analysts use data analytics and the tools of their trade to inform those decisions. You’ll also discover more about this course and the overall program expectations.
Learning objectives
- Define key concepts involved in data analytics including data, data analysis, and data ecosystem
- Discuss the use of data in everyday life decisions
- Identify the key features of the learning environment and their uses
- Describe principles and practices that will help to increase one's chances of success in this certificate
- Explain the use of data in organizational decision-making
- Describe the key concepts to be discussed in the program, including learning outcomes
DATA in daily life
Data is a collection of facts that can be used to draw conclusions, make predictions, and assist in decision-making.
Data applied in different ways
- Finance
- Healthcare
- Communications
- Government
- Consumer products
- And lots more!
Data analysts make data-driven decisions.
- Ask
- Prepare
- Process
- Analyze
- Share
- Act
Computer + Your Brain + Your Skills + Your Traits = Job Success
Program features
- Video vignettes
- Data journal
- Readings
- Activities
- Discussion prompts
FAQ
What tools or platforms are included in the curriculum?
Spreadsheets, SQL, presentation tools, Tableau, RStudio, and Kaggle.
Do I need to take the course in a certain order?
We highly recommend completing the courses in the order presented because the content in each course builds on information from earlier lessons.
Will you be teaching R or Python?
This program teaches the open-source programming language, R, which is great for foundational data analysis, and offers helpful packages for beginners to apply to their projects. We do not cover Python in the curriculum.
Preview (Infographic)
Your Google Data Analytics Certificate roadmap
Use this guide to review the topics covered, tools used, and skills you will use in each course.
1. Foundations
What you will learn:
- Real-life roles and responsibilities of a junior data analyst
- How businesses transform data into actionable insights
- Spreadsheet basics
- Database and query basics
- Data visualization basics
Skill sets you will build:
- Using data in everyday life
- Thinking analytically
- Applying tools from the data analytics toolkit
- Showing trends and patterns with data visualizations
- Ensuring your data analysis is fair
2. Ask
What you will learn:
- How data analysts solve problems with data
- The use of analytics for making data-driven decisions
- Spreadsheet formulas and functions
- Dashboard basics, including an introduction to Tableau
- Data reporting basics
Skill sets you will build:
- Asking SMART and effective questions
- Structuring how you think
- Summarizing data
- Putting things into context
- Managing team and stakeholder expectations
- Problem-solving and conflict-resolution
3. Prepare
What you will learn:
- How data is generated
- Features of different data types, fields, and values
- Database structures
- The function of metadata in data analytics
- Structured Query Language (SQL) functions
Skill sets you will build:
- Ensuring ethical data analysis practices
- Addressing issues of bias and credibility
- Accessing databases and importing data
- Writing simple queries
- Organizing and protecting data
- Connecting with the data community (optional)
4. Process
What you will learn:
- Data integrity and the importance of clean data
- The tools and processes used by data analysts to clean data
- Data-cleaning verification and reports
- Statistics, hypothesis testing, and margin of error
- Resume building and interpretation of job postings (optional)
Skill sets you will build:
- Connecting business objectives to data analysis
- Identifying clean and dirty data
- Cleaning small datasets using spreadsheet tools
- Cleaning large datasets by writing SQL queries
- Documenting data-cleaning processes
5. Analyze
What you will learn:
- Steps data analysts take to organize data
- How to combine data from multiple sources
- Spreadsheet calculations and pivot tables
- SQL calculations
- Temporary tables
- Data validation
Skill sets you will build:
- Sorting data in spreadsheets and by writing SQL queries
- Filtering data in spreadsheets and by writing SQL queries
- Converting data
- Formatting data
- Substantiating data analysis processes
- Seeking feedback and support from others during data analysis
6. Share
What you will learn:
- Design thinking
- How data analysts use visualizations to communicate about data
- The benefits of Tableau for presenting data analysis findings
- Data-driven storytelling
- Dashboards and dashboard filters
- Strategies for creating an effective data presentation
Skill sets you will build:
- Creating visualizations and dashboards in Tableau
- Addressing accessibility issues when communicating about data
- Understanding the purpose of different business communication tools
- Telling a data-driven story
- Presenting to others about data
- Answering questions about data
7. Act
What you will learn:
- Programming languages and environments
- R packages
- R functions, variables, data types, pipes, and vectors
- R data frames
- Bias and credibility in R
- R visualization tools
- R Markdown for documentation, creating structure, and emphasis
Skill sets you will build:
- Coding in R
- Writing functions in R
- Accessing data in R
- Cleaning data in R
- Generating data visualizations in R
- Reporting on data analysis to stakeholders
8. Capstone
What you will learn:
- How a data analytics portfolio distinguishes you from other candidates
- Practical, real-world problem-solving
- Strategies for extracting insights from data
- Clear presentation of data findings
- Motivation and ability to take initiative
Skill sets you will build:
- Building a portfolio
- Increasing your employability
- Showcasing your data analytics knowledge, skill, and technical expertise
- Sharing your work during an interview
- Communicating your unique value proposition to a potential employer
Data Analytics in everyday life
Every goal and success that my team and I have achieved couldn't have been done without data. Here at Google, all of our products are built on data and data-driven decision-making. From concept to development to launch, we're using data to figure out the best way forward. Countless other organizations also see the incredible value in data and, of course, the data analysts who help them make use of it. So we know data opens up a lot of opportunities.
All of these are great examples of real-life patterns and relationships that you can use to make predictions about the right actions to take, and that is a huge part of data analysis right there. Every minute of every hour of every day, more data is being created. Businesses need a way to control all that data so they can use it to improve processes, identify opportunities and trends, launch new products, serve customers, and make thoughtful decisions. For businesses to be on top of the competition, they need to be on top of their data. That's why these companies hire data analysts to control the waves of data they collect every day, make sense of it, and then draw conclusions or make predictions. This is the process of turning data into insights, and it's how analysts help businesses put all their data to good use. This is actually a good way to think about analysis: turning data into insights.
Data analysis
The collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making.
Case Study:
- New data perspectives (course reading)
- 4 Examples of Business Analytics in Action (additional article from Harvard Business School)
The article reveals how corporations use data insights to optimize their decision-making process.
Dimensions of data analytics
- Decision Intelligence is a combination of applied data science and the social and managerial sciences.
- "A data analyst is an explorer, a detective, and an artist all rolled into one."
- Data has expanded so much that specialization has become important. Pick a specialization based on which flavor and type of impact best suits your personality.
Data Science
- Data science, the discipline of making data useful, is an umbrella term that encompasses three disciplines: machine learning, statistics, and analytics.
Statistics: Make a few important decisions under uncertainty
Machine learning and AI: Automate, in other words, make many, many, many decisions under uncertainty
Analytics: You don't know how many decisions you want to make before you begin. What you're looking for is inspiration: you want to encounter your unknowns and understand your world. (The excellence of an analyst is speed. How quickly can you surf through vast amounts of data to explore it and discover the gems, the beautiful potential insights that are worth knowing about and bringing to your decision-makers? Are you excited by the ambiguity of exploration? Are you excited by the idea of working on a lot of different things, looking at a lot of different data sources, and thinking through vast amounts of information, while promising not to snooze past the important potential insights? Are you okay with being told, "Here is a whole lot of data. No one has looked at it before. Go find something interesting"? Do you thrive on creative, open-ended projects? If that's you, then analytics is probably the best fit for you.)
Data ecosystems
- Data ecosystems are made up of various elements that interact with one another in order to produce, manage, store, organize, analyze, and share data. These elements include hardware and software tools, as well as the people who use them.
The cloud is a place to keep data online, rather than on a computer hard drive. So instead of storing data somewhere inside your organization's network, that data is accessed over the internet. So the cloud is just a term we use to describe the virtual location. The cloud plays a big part in the data ecosystem, and as a data analyst, it's your job to harness the power of that data ecosystem, find the right information, and provide the team with analysis that helps them make smart decisions.
e.g.1) You could tap into your retail store's database, which is an ecosystem filled with customer names, addresses, previous purchases, and customer reviews. As a data analyst, you could use this information to predict what these customers will buy in the future, and make sure the store has the products and stock when they're needed.
e.g.2) Think about a data ecosystem used by a human resources department. This ecosystem would include information like postings from job websites, stats on the current labor market, employment rates, and social media data on prospective employees. A data analyst could use this information to help their team recruit new workers and improve employee engagement and retention rates.
e.g.3) Data ecosystems are used on farms, too. Agricultural companies regularly use data ecosystems that include information such as geological patterns and weather movements. Data analysts can use this data to help farmers predict crop yields. Some data analysts are even using data ecosystems to save real environmental ecosystems.
e.g.4) At the Scripps Institution of Oceanography, coral reefs all over the world are monitored digitally, so they can see how organisms change over time, track their growth, and measure any increases or declines in individual colonies. The possibilities are endless.
Data scientists vs Data analysts
Data science is defined as creating new ways of modeling and understanding the unknown by using raw data. Data scientists create new questions using data, while analysts find answers to existing questions by creating insights from data sources.
Data Analysis
The collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making
Data Analytics
The science of data
So when you think about data, data analysis, and the data ecosystem, it's important to understand that all of these things fit under the data analytics umbrella.
Data-driven decision-making
Using facts to guide business strategy (Data informs better decisions)
In our everyday lives, we use data when we wear a fitness tracker or read product reviews to make a purchase decision. And in business, we use data to learn more about our customers, improve processes, and help employees do their jobs more effectively. But this is just the tip of the iceberg. One of the most powerful ways you can put data to work is with data-driven decision-making. Data-driven decision-making is defined as using facts to guide business strategy.
The first step in data-driven decision-making is figuring out the business need. Usually, this is a problem that needs to be solved. For example, a problem could be a new company needing to establish better brand recognition, so it can compete with bigger, more well-known competitors. Or maybe an organization wants to improve a product and needs to figure out how to source parts from a more sustainable or ethically responsible supplier. Or, it could be a business trying to solve the problem of unhappy employees, and low levels of engagement, satisfaction, and retention.
Whatever the problem is, once it's defined, a data analyst finds data, analyzes it and uses it to uncover trends, patterns, and relationships. Sometimes the data-driven strategy will build on what's worked in the past. Other times, it can guide a business to branch out in a whole new direction. Let's look at a real-world example.
e.g.1) Think about a music or movie streaming service. How do these companies know what people want to watch or listen to, and how do they provide it? Using data-driven decision-making, they gather information about what their customers are currently watching or listening to, analyze it, and then use the insights they've gained to suggest things people will most likely enjoy in the future. This keeps customers happy and coming back for more, which in turn means more revenue for the company.
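The streaming example can be illustrated with a very small sketch. Everything in it is an assumption made for illustration: the user names, the listening history, the catalog, and the rule "suggest new releases from the most-played genre" are hypothetical and much simpler than what a real service would use.

```python
from collections import Counter

# Hypothetical listening history: user -> genres of recently played tracks.
listening_history = {
    "ana": ["jazz", "jazz", "rock", "jazz"],
    "ben": ["pop", "rock", "pop"],
}

# Hypothetical catalog of new releases, grouped by genre.
new_releases = {
    "jazz": ["Blue Evening"],
    "pop": ["Summer Lights"],
    "rock": ["Loud Roads"],
}

def suggest(user):
    """Suggest new releases from the genre the user plays most often."""
    counts = Counter(listening_history[user])   # gather and analyze
    top_genre, _ = counts.most_common(1)[0]     # insight: favorite genre
    return new_releases[top_genre]              # act: make a suggestion

print(suggest("ana"))
print(suggest("ben"))
```

The point is the flow rather than the rule itself: collected facts (play counts) drive the suggestion, not a guess about what the customer might like.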
e.g.2) Another example of data-driven decision-making can be seen in the rise of e-commerce. It wasn't long ago that most purchases were made in a physical store, but the data showed people's preferences were changing. So a lot of companies created entirely new business models that remove the physical store, and let people shop right from their computers or mobile phones with products delivered right to their doorstep.
Real-world examples of data-driven decision-making include suggesting new music to a customer, scheduling a certain number of restaurant employees to work, and choosing e-commerce solutions based on established facts. Another is an airline that collects, observes, and analyzes its customers' online behaviors, then uses the insights gained to choose which new products and services to offer.
By ensuring that data is built into every business strategy, data analysts play a critical role in their companies' success, but it's important to note that no matter how valuable data-driven decision-making is, data alone will never be as powerful as data combined with human experience, observation, and sometimes even intuition. To get the most out of data-driven decision-making, it's important to include insights from people who are familiar with the business problem. These people are called subject matter experts, and they have the ability to look at the results of data analysis and identify any inconsistencies, make sense of gray areas, and eventually validate the choices being made. Organizations that work this way put data at the heart of every business strategy, but also benefit from the insights of their people. It's a win-win. As a data analyst, you play a key role in empowering these organizations to make data-driven decisions, which is why it's so important for you to understand how data plays a part in the decision-making process.
Data and gut instinct
Detectives and data analysts have a lot in common. Both depend on facts and clues to make decisions. Both collect and look at the evidence. Both talk to people who know part of the story. And both might even follow some footprints to see where they lead. Whether you’re a detective or a data analyst, your job is all about following steps to collect and understand facts.
Analysts use data-driven decision-making and follow a step-by-step process. You have learned that there are six steps to this process:
- Ask questions and define the problem.
- Prepare data by collecting and storing the information.
- Process data by cleaning and checking the information.
- Analyze data to find patterns, relationships, and trends.
- Share data with your audience.
- Act on the data and use the analysis results.
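The six steps above can be sketched on a toy dataset. This is only an illustration under stated assumptions: the business question, the sales records, and the function names (`process`, `analyze`, `share`, `act`) are all made up here, and real projects use far richer tools for each phase.

```python
from statistics import mean

# Ask: define the question (hypothetical business question).
question = "Which weekday has the highest average sales?"

# Prepare: collect and store the raw records (made-up sample data).
raw_sales = [
    {"day": "Mon", "amount": "120"},
    {"day": "Tue", "amount": "95"},
    {"day": "Mon", "amount": "130"},
    {"day": "Tue", "amount": None},  # a missing value to clean up
    {"day": "Wed", "amount": "110"},
]

def process(records):
    """Process: drop rows with missing amounts and convert text to numbers."""
    return [
        {"day": r["day"], "amount": float(r["amount"])}
        for r in records
        if r["amount"] is not None
    ]

def analyze(records):
    """Analyze: compute the average amount per day."""
    by_day = {}
    for r in records:
        by_day.setdefault(r["day"], []).append(r["amount"])
    return {day: mean(amounts) for day, amounts in by_day.items()}

def share(averages):
    """Share: format the findings for an audience."""
    return [f"{day}: {avg:.1f}" for day, avg in sorted(averages.items())]

def act(averages):
    """Act: turn the result into a recommendation."""
    best_day = max(averages, key=averages.get)
    return f"Schedule extra staff on {best_day}"

clean_sales = process(raw_sales)
averages = analyze(clean_sales)
report = share(averages)
decision = act(averages)

print(question)
print(report)
print(decision)
```

Each function maps to one phase, so a change in one phase (for example, a different cleaning rule in `process`) does not require rewriting the others.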
Why gut instinct can be a problem
It's essential that data analysts focus on the data to ensure they make informed decisions. Ignoring data in favor of personal experience can bias decisions. Even worse, decisions based on gut instinct without any data to back them up can cause mistakes.
e.g.) Consider a restaurant entrepreneur partnering with a well-known chef to develop a new restaurant in a bustling part of the city’s central shopping district. The well-known chef has several restaurants across the city. Banking on the chef's reputation, the entrepreneur and chef followed gut instinct and created another uniquely themed restaurant. However, after months of planning and preparation, fundraising efforts fell short of what was needed to open the restaurant. The property will go back on the market to be sold at a loss. Had the entrepreneur done more research, they would have found data showing that prospective customers in the new location were very different from those of the chef's other restaurants.
Data + business knowledge = mystery solved
Blending data with business knowledge, plus maybe a touch of gut instinct, will be a common part of your process as a junior data analyst. The key is figuring out the exact mix for each particular project. A lot of times, it will depend on the goals of your analysis. That is why analysts often ask, “How do I define success for this project?”
In addition, try asking yourself these questions about a project to help find the perfect balance:
- What kind of results are needed?
- Who will be informed?
- Am I answering the question being asked?
- How quickly does a decision need to be made?
e.g.) If you are working on a rush project, you might need to rely on your own knowledge and experience more than usual. There just isn’t enough time to thoroughly analyze all of the available data. But if you get a project that involves plenty of time and resources, then the best strategy is to be more data-driven. It’s up to you, the data analyst, to make the best possible choice. You will probably blend data and knowledge in a million different ways over the course of your data analytics career. And the more you practice, the better you will get at finding that perfect blend.
Origins of the data analysis process
Data analysis life cycle—the process of going from data to decision. Data goes through several phases as it gets created, consumed, tested, processed, and reused. With a life cycle model, all key team members can drive success by planning work both upfront and at the end of the data analysis process.
Google Data Analytics
The process presented as part of the Google Data Analytics Certificate is one that will be valuable to you as you keep moving forward in your career:
- Ask: Business Challenge/Objective/Question
- Prepare: Data generation, collection, storage, and data management
- Process: Data cleaning/data integrity
- Analyze: Data exploration, visualization, and analysis
- Share: Communicating and interpreting results
- Act: Putting your insights to work to solve the problem
Understanding this process—and all of the iterations that helped make it popular—will be a big part of guiding your own analysis and your work in this program.
EMC's data analysis life cycle
EMC Corporation's data analytics life cycle is cyclical with six steps:
- Discovery
- Pre-processing data
- Model planning
- Model building
- Communicate results
- Operationalize
EMC Corporation is now Dell EMC. This model, created by David Dietrich, reflects the cyclical nature of real-world projects. The phases aren’t static milestones; each step connects and leads to the next, and eventually repeats. Key questions help analysts test whether they have accomplished enough to move forward and ensure that teams have spent enough time on each of the phases and don’t start modeling before the data is ready. It is a little different from the data analysis life cycle this program is based on, but it has some core ideas in common: the first phase is interested in discovering and asking questions; data has to be prepared before it can be analyzed and used; and then findings should be shared and acted on. For more information, refer to this e-book, Data Science & Big Data Analytics.
SAS's iterative life cycle
An iterative life cycle was created by a company called SAS, a leading data analytics solutions provider. It can be used to produce repeatable, reliable, and predictive results:
- Ask
- Prepare
- Explore
- Model
- Implement
- Act
- Evaluate
SAS emphasizes the cyclical nature of its model by visualizing it as an infinity symbol. The life cycle has seven steps, many of which we have seen in the other models, like Ask, Prepare, Model, and Act. But this life cycle is also a little different; it includes a step after the Act phase designed to help analysts evaluate their solutions and potentially return to the Ask phase again. For more information, refer to Managing the Analytics Life Cycle for Decisions at Scale.
Project-based data analytics life cycle
A project-based data analytics life cycle has five simple steps:
- Identifying the problem
- Designing data requirements
- Pre-processing data
- Performing data analysis
- Visualizing data
This data analytics project life cycle was developed by Vignesh Prajapati. It doesn’t include the sixth phase, or what we have been referring to as the Act phase. However, it still covers a lot of the same steps as the life cycles we have already described. It begins with identifying the problem, continues with preparing and processing data before analysis, and ends with data visualization. For more information, refer to Understanding the data analytics project life cycle.
Big data analytics life cycle
Authors Thomas Erl, Wajid Khattak, and Paul Buhler proposed a big data analytics life cycle in their book, Big Data Fundamentals: Concepts, Drivers & Techniques. Their life cycle suggests phases divided into nine steps:
- Business case evaluation
- Data identification
- Data acquisition and filtering
- Data extraction
- Data validation and cleaning
- Data aggregation and representation
- Data analysis
- Data visualization
- Utilization of analysis results
This life cycle appears to have three or four more steps than the previous life cycle models. But in reality, they have just broken down what we have been referring to as Prepare and Process into smaller steps. It emphasizes the individual tasks required for gathering, preparing, and cleaning data before the analysis phase. For more information, refer to Big Data Adoption and Planning Considerations.
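To compare the life cycle models side by side, the step lists from this section can be placed in a simple data structure. This is a small sketch: the dictionary keys are shorthand labels chosen here, while the step names are copied from the lists above.

```python
# Step lists as given in this section; the keys are shorthand labels.
life_cycles = {
    "Google Data Analytics": [
        "Ask", "Prepare", "Process", "Analyze", "Share", "Act",
    ],
    "EMC": [
        "Discovery", "Pre-processing data", "Model planning",
        "Model building", "Communicate results", "Operationalize",
    ],
    "SAS": [
        "Ask", "Prepare", "Explore", "Model", "Implement", "Act", "Evaluate",
    ],
    "Project-based": [
        "Identifying the problem", "Designing data requirements",
        "Pre-processing data", "Performing data analysis", "Visualizing data",
    ],
    "Big data": [
        "Business case evaluation", "Data identification",
        "Data acquisition and filtering", "Data extraction",
        "Data validation and cleaning",
        "Data aggregation and representation", "Data analysis",
        "Data visualization", "Utilization of analysis results",
    ],
}

# Count the steps in each model.
step_counts = {name: len(steps) for name, steps in life_cycles.items()}

# How many more steps the big data life cycle has than each other model.
extra_steps = {
    name: step_counts["Big data"] - count
    for name, count in step_counts.items()
    if name != "Big data"
}

for name, count in step_counts.items():
    print(f"{name}: {count} steps")
print(extra_steps)
```

With these lists, the big data life cycle has three more steps than the Google and EMC models, four more than the project-based model, and two more than the SAS model.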
Review Learning Objectives
- Define key concepts involved in data analytics including data, data analysis, and data ecosystem
- Explain the use of data in organizational decision-making
- Identify the key features of the learning environment and their uses
- Describe principles and practices that will help to increase one's chances of success in this certificate
- Discuss the use of data in everyday life decisions