1.3.1. Following the data life cycle

Phases of the data life cycle

The data life cycle has six phases: plan, capture, manage, analyze, archive, and destroy.

  1. Plan - This actually happens well before starting an analysis project. During planning, a business decides what kind of data it needs, how it will be managed throughout its life cycle, who will be responsible for it, and the optimal outcomes. For example, let's say an electricity provider wanted to gain insights into how to help people save energy. In the planning phase, the company might decide to capture information on how much electricity its customers use each year, what types of buildings are being powered, and what types of devices are being powered inside them. The company would also decide which team members will be responsible for collecting, storing, and sharing that data. All of this happens during planning, and it helps set up the rest of the project.

  2. Capture - This is where data is collected from a variety of different sources and brought into the organization. With so much data being created every day, there are many ways to collect it. One common method is getting data from outside resources. For example, if you were doing data analysis on weather patterns, you'd probably get data from a publicly available source like the National Climatic Data Center. Another way to get data is from a company's own documents and files, which are usually stored inside a database. While we've mentioned databases before, we haven't gone into too much detail about what they are. A database is a collection of data stored in a computer system. In the case of our electricity provider, the business would probably record electricity usage among its customers in a database that it owns. As a quick note, when you maintain a database of customer information, data integrity, credibility, and privacy are all important concerns.

  3. Manage - Here we're talking about how we care for our data, how and where it's stored, the tools used to keep it safe and secure, and the actions taken to make sure that it's maintained properly. This phase is very important to data cleansing, which we'll cover later on.

  4. Analyze - This is where data analysts really shine. In this phase, the data is used to solve problems, make great decisions, and support business goals. For example, one of our electricity company's goals might be to find ways to help customers save energy.

  5. Archive - Archiving means storing data in a place where it's still available, but may not be used again. During analysis, analysts handle huge amounts of data. Can you imagine if we had to sort through all of the available data that's out there, even if it was no longer useful and relevant to our work? It makes way more sense to archive it than to keep it around.

  6. Destroy - Let's get back to our electricity provider example. They would have data stored on multiple hard drives. To destroy it, the company would use secure data erasure software. If there were any paper files, they would be shredded too. This is important for protecting a company's private information, as well as private data about its customers.
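
The six phases above can be sketched as an ordered sequence, along with a simple rule for deciding when a dataset should move from being managed to being archived or destroyed. This is only an illustrative sketch: the names `Phase`, `next_phase`, and `retention_action`, and the one-year and five-year thresholds, are assumptions made for the example and do not come from the course.

```python
from datetime import date
from enum import Enum


class Phase(Enum):
    """The six phases of the data life cycle, in order."""
    PLAN = 1
    CAPTURE = 2
    MANAGE = 3
    ANALYZE = 4
    ARCHIVE = 5
    DESTROY = 6


def next_phase(phase):
    """Return the phase that follows `phase`, or None after DESTROY."""
    members = list(Phase)  # Enum members iterate in definition order
    index = members.index(phase)
    return members[index + 1] if index + 1 < len(members) else None


# Hypothetical retention thresholds, chosen only for illustration.
ARCHIVE_AFTER_DAYS = 365
DESTROY_AFTER_DAYS = 365 * 5


def retention_action(last_used, today):
    """Suggest a phase for a dataset based on how long ago it was last used."""
    age_days = (today - last_used).days
    if age_days >= DESTROY_AFTER_DAYS:
        return Phase.DESTROY
    if age_days >= ARCHIVE_AFTER_DAYS:
        return Phase.ARCHIVE
    return Phase.MANAGE  # still in active use, keep maintaining it


if __name__ == "__main__":
    today = date(2024, 6, 1)
    print(next_phase(Phase.CAPTURE))
    print(retention_action(date(2022, 1, 1), today))
```

In practice, a company would set its own retention periods during the plan phase, often guided by legal and privacy requirements rather than fixed numbers like these.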

Q. In the data life cycle, which phase involves using data to solve problems, make good decisions, and support business goals?

  • A. Analyze

Self-Reflection: Collecting data

In this activity, you will use what you learned about the data life cycle in a mock interview for a data analyst role at an ice cream company.

Instructions

Read the scenario below and then share your response in the reflection section.

You are interviewing for a data analyst role at a local ice cream company. They are interested in using data on customer ice cream flavor preferences to help drive important decisions.

The hiring manager asks you:

“We want to better understand our customers’ ice cream flavor preferences, but honestly we don’t even know where to start. How would you approach this if you were part of our team?”

Reflection

Before responding to the hiring manager’s question, consider each step of the data life cycle:

  • Plan - What plans and decisions do you need to make? What data do you need to answer your question?
  • Capture - Where does your data come from? How will you get it?
  • Manage - How will you store your data? What should it be used for, and how do you keep this data secure and protected?
  • Analyze - How will the company analyze the data? What tools should they use?
  • Archive - What should they do with their data when it gets old? How do they know when it's time?
  • Destroy - Should they ever dispose of any data? If so, when and how?

Write 5-10 sentences explaining your recommendations for collecting customer flavor preference data.

Here are some topics to consider for your response:

  • What kind of data should they be gathering?

  • How should they gather this data?

  • Where will the data live? How will they store the data?

  • Once they have the data, how will they use it?

  • How do they keep their data secure and protected?

  • What should they do with old data? What are their options?

  • First, we need to know which flavors of ice cream are the best sellers, which we can learn from the company's sales records. We can also survey customers about their preferences. Combining the survey results with the sales records will help us rank flavor preferences and give us a more direct answer about which flavors customers prefer. It makes sense to keep this data for about a year and then repeat the analysis, because customers' tastes can change more easily than we think.

Correct

Thank you for the submission! Understanding the life cycle of the data you’re working with is crucial to any project. Your response should focus on helping the ice cream company come up with specific answers to the questions associated with each step of the data life cycle.

To use their data successfully:

  • The ice cream company must first figure out what data they need and where they can get it.
  • Once they have the data, they have to be sure of what they will (and won’t) use it for.
  • The ice cream company also has to be mindful of how they will keep the data secure, and how to deal with old data that has outlived its usefulness.

As a data analyst, these are the types of questions you should always be seeking to answer about your data.
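
The idea in the response above, combining sales records with survey results to rank flavors, could be sketched roughly as follows. This is a minimal illustration, not a recommended method: the function name `rank_flavors`, the equal weighting of the two sources, and the sample numbers are all assumptions made for the example.

```python
from collections import Counter


def rank_flavors(units_sold, survey_votes, sales_weight=0.5):
    """Rank flavors by a weighted mix of sales share and survey vote share.

    units_sold:   dict mapping flavor -> units sold (from sales records)
    survey_votes: dict mapping flavor -> number of survey votes
    sales_weight: hypothetical weight given to sales (the rest goes to votes)
    """
    flavors = set(units_sold) | set(survey_votes)
    total_sold = sum(units_sold.values()) or 1    # avoid dividing by zero
    total_votes = sum(survey_votes.values()) or 1

    scores = {}
    for flavor in flavors:
        sales_share = units_sold.get(flavor, 0) / total_sold
        vote_share = survey_votes.get(flavor, 0) / total_votes
        scores[flavor] = sales_weight * sales_share + (1 - sales_weight) * vote_share

    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda flavor: (-scores[flavor], flavor))


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sales = {"vanilla": 50, "chocolate": 30, "strawberry": 20}
    votes = Counter(["chocolate", "chocolate", "vanilla", "mint chip"])
    print(rank_flavors(sales, votes))
```

Using shares rather than raw counts keeps a large sales volume from drowning out a small survey. How much weight each source deserves is a decision for the plan phase.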

Test your knowledge on the data life cycle

Question 1

Fill in the blank: During the _____ phase of the data life cycle, a business decides what kind of data it needs, how it will be managed, who will be responsible for it, and the optimal outcomes.

  • planning

  • archive

  • manage

  • capture

Correct. During the planning phase of the data life cycle, a business decides what kind of data it needs, how it will be managed, who will be responsible for it, and the optimal outcomes.

Question 2

In the data life cycle, which phase involves gathering data from various sources and bringing it into the organization?

  • Capture

  • Analyze

  • Archive

  • Manage

Correct. The capture phase involves gathering data from various sources and bringing it into the organization.

Question 3

A data analyst finishes using a dataset, so they erase or shred the files in order to protect private information. This is called archiving.

  • True

  • False

Correct. Erasing or shredding files describes the destroy phase of the data life cycle. Archiving involves storing files in a place where they're still available.

Question 4

A dairy farmer decides to open an ice cream shop on her farm. After surveying the local community about people’s favorite flavors, she takes the data they provided and stores it in a secure hard drive so it can be maintained safely on her computer. This is part of which phase of the data life cycle?

  • Archive

  • Manage

  • Plan

  • Analyze

This is the manage phase of the data life cycle. It deals with how data is cared for, how and where it’s stored, the tools used to keep it safe and secure, and the actions taken to make sure it’s maintained properly.

Question 5

After opening the ice cream shop on her farm, the same dairy farmer then surveys the local community about people’s favorite flavors. She uses the data she collected to determine that the top five flavors are strawberry, vanilla, chocolate, mint chip, and peanut butter. She feels confident in her decision to sell these flavors. This is part of which phase of the data life cycle?

  • Capture

  • Plan

  • Analyze

  • Archive

Correct. This is part of the analyze phase. This phase involves using data to make smart decisions and support business goals.