1.3.1. Follow the data life cycle

Variations of the data life cycle

You learned that there are six stages to the data life cycle. Here is a recap:

  1. Plan: Decide what kind of data is needed, how it will be managed, and who will be responsible for it.
  2. Capture: Collect or bring in data from a variety of different sources.
  3. Manage: Care for and maintain the data. This includes determining how and where it is stored and the tools used to do so.
  4. Analyze: Use the data to solve problems, make decisions, and support business goals.
  5. Archive: Keep relevant data stored for long-term and future reference.
  6. Destroy: Remove data from storage and delete any shared copies of the data.

Warning: Be careful not to mix up or confuse the six stages of the data life cycle (Plan, Capture, Manage, Analyze, Archive, and Destroy) with the six phases of the data analysis life cycle (Ask, Prepare, Process, Analyze, Share, and Act). They shouldn't be used or referred to interchangeably.
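One way to keep the two frameworks apart is to write them down as separate types. The sketch below is only an illustration: the class names `DataLifeCycle` and `DataAnalysisLifeCycle` are made up for this example, and the stage names come from the lists above. It shows that "Analyze" is the only name the two life cycles share.

```python
from enum import Enum


class DataLifeCycle(Enum):
    """The six stages of the data life cycle, in order."""
    PLAN = "Plan"
    CAPTURE = "Capture"
    MANAGE = "Manage"
    ANALYZE = "Analyze"
    ARCHIVE = "Archive"
    DESTROY = "Destroy"


class DataAnalysisLifeCycle(Enum):
    """The six phases of the data analysis life cycle, in order."""
    ASK = "Ask"
    PREPARE = "Prepare"
    PROCESS = "Process"
    ANALYZE = "Analyze"
    SHARE = "Share"
    ACT = "Act"


# Collect the names used by each framework and find the overlap.
life_cycle_names = {stage.value for stage in DataLifeCycle}
analysis_names = {phase.value for phase in DataAnalysisLifeCycle}
shared_names = life_cycle_names & analysis_names

print(sorted(shared_names))  # ['Analyze']
```

Because the two enums are distinct types, `DataLifeCycle.ANALYZE` and `DataAnalysisLifeCycle.ANALYZE` do not compare equal, which mirrors the warning that the two cycles are not interchangeable.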

The data life cycle provides a generic or common framework for how data is managed. You may recall that variations of the data analysis life cycle were described in Origins of the data analysis process. The same can be done for the data life cycle. The rest of this reading provides a glimpse of how government, finance, and education institutions can view data life cycles a little differently.

U.S. Fish and Wildlife Service

The U.S. Fish and Wildlife Service uses the following data life cycle:

  1. Plan
  2. Acquire
  3. Maintain
  4. Access
  5. Evaluate
  6. Archive

For more information, refer to U.S. Fish and Wildlife's Data Management Life Cycle page.

The U.S. Geological Survey (USGS)

The USGS uses the data life cycle below:

  1. Plan
  2. Acquire
  3. Process
  4. Analyze
  5. Preserve
  6. Publish/Share

Several cross-cutting or overarching activities are also performed during each stage of their life cycle:

  • Describe (metadata and documentation)
  • Manage Quality
  • Backup and Secure

For more information, refer to the USGS Data Lifecycle page.

Financial institutions

Financial institutions may take a slightly different approach to the data life cycle as described in The Data Life Cycle, an article in Strategic Finance magazine:

  1. Capture
  2. Qualify
  3. Transform
  4. Utilize
  5. Report
  6. Archive
  7. Purge

Harvard Business School (HBS)

One final data life cycle informed by Harvard University research has eight stages:

  1. Generation
  2. Collection
  3. Processing
  4. Storage
  5. Management
  6. Analysis
  7. Visualization
  8. Interpretation

For more information, refer to 8 Steps in the Data Life Cycle.

Key takeaway

Understanding the importance of the data life cycle will set you up for success as a data analyst. Individual stages in the data life cycle vary from company to company and by industry or sector. Historical data is important to both the U.S. Fish and Wildlife Service and the USGS, so their data life cycles focus on archiving and backing up data. Harvard's interests are in research and teaching, so its data life cycle includes visualization and interpretation, even though these are more often associated with a data analysis life cycle. The HBS data life cycle also doesn't call out a stage for purging or destroying data. In contrast, the data life cycle for finance clearly identifies archive and purge stages. To sum up, although data life cycles vary, one data management principle is universal: govern how data is handled so that it is accurate, secure, and available to meet your organization's needs.
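The comparison in this takeaway can be checked with a small sketch. The stage lists below are copied from this reading. The dictionary name `LIFE_CYCLES`, the helper `has_disposal_stage`, and the choice to treat only "Destroy" and "Purge" as disposal stages are assumptions made for illustration.

```python
# Stage lists as given in this reading, keyed by organization.
LIFE_CYCLES = {
    "Generic": ["Plan", "Capture", "Manage", "Analyze", "Archive", "Destroy"],
    "U.S. Fish and Wildlife Service": [
        "Plan", "Acquire", "Maintain", "Access", "Evaluate", "Archive",
    ],
    "USGS": ["Plan", "Acquire", "Process", "Analyze", "Preserve", "Publish/Share"],
    "Financial institutions": [
        "Capture", "Qualify", "Transform", "Utilize", "Report", "Archive", "Purge",
    ],
    "HBS": [
        "Generation", "Collection", "Processing", "Storage",
        "Management", "Analysis", "Visualization", "Interpretation",
    ],
}

# Assumption: these are the stage names that mean "get rid of the data".
DISPOSAL_STAGES = {"Destroy", "Purge"}


def has_disposal_stage(stages):
    """Return True if any stage in the list is a disposal stage."""
    return any(stage in DISPOSAL_STAGES for stage in stages)


# Map each life cycle to (number of stages, whether it names a disposal stage).
summary = {
    name: (len(stages), has_disposal_stage(stages))
    for name, stages in LIFE_CYCLES.items()
}

for name, (count, disposes) in summary.items():
    print(f"{name}: {count} stages, disposal stage: {disposes}")
```

Under these assumptions, only the generic and finance life cycles name a disposal stage, which matches the observation that the HBS life cycle does not call one out.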

Self-Reflection: Collecting data

Overview

Now that you are familiar with the phases of the data life cycle, you can take a moment to think about your learning. In this self-reflection, you will consider your thoughts about collecting data and how data collection fits into the data life cycle.

To start, you will consider a simple scenario: discussing the data life cycle in a mock interview for a data analyst role. Then, you will respond to three brief questions. You’ve done the hard work to learn the basics of the data life cycle, so get the most out of it: This reflection will help your knowledge stick!

The scenario: interview for a data analyst position

Imagine that you interview for a data analyst role at a local ice cream company. The hiring manager explains that the company needs a data analyst because they want to learn more about their customers. First, they want to understand their customers’ ice cream flavor preferences. Then, they will use this customer data to help make important decisions.

The hiring manager explains that they do not collect any customer data, and they don’t know where to begin. The hiring manager asks you: Can you please explain how you would approach this task?

Before responding to the question, you consider each step of the data life cycle.

Recap: The data life cycle

The steps of the data life cycle are:

  • Plan: What plans and decisions do you need to make? What data do you need to answer your question?
  • Capture: Where does your data come from? How will you get it?
  • Manage: How will you store your data? What should it be used for? How do you keep this data secure and protected?
  • Analyze: How will the company analyze the data? What tools should they use?
  • Archive: What should they do with their data when it gets old? How do they know when it's time?
  • Destroy: Should they ever dispose of any data? If so, when and how?

Reflection:

Consider your learning about the steps of the data life cycle and reflect on the hiring manager’s request. Review the following questions to help guide your thinking:

  • What kind of data should they gather?
  • How should they gather this data?
  • Where will the data live? How will they store the data?
  • Once they have the data, how will they use it?
  • How do they keep their data secure and protected?
  • What should they do with old data? What are their options?

Write 2-3 sentences (40-60 words) that explain your recommendation for how the ice cream company should collect customer flavor preference data.

Test your knowledge on the data life cycle

Question 1

Fill in the blank: During the _____ phase of the data life cycle, a business decides what kind of data it needs, how it will be managed, who will be responsible for it, and the optimal outcomes.

A. archive

B. capture

C. manage

D. planning

The correct answer is D. planning. Explanation: During the planning phase of the data life cycle, a business decides what kind of data it needs, how it will be managed, who will be responsible for it, and the optimal outcomes.

Question 2

In the data life cycle, which phase involves gathering data from various sources and bringing it into the organization?

A. Manage

B. Analyze

C. Archive

D. Capture

The correct answer is D. Capture. Explanation: The capture phase involves gathering data from various sources and bringing it into the organization.

Question 3

A data analyst finishes using a dataset, so they erase or shred the files in order to protect private information. This is called archiving. True or False?

A. True

B. False

The correct answer is B. False. Explanation: Erasing or shredding files describes the destroy phase of the data life cycle. Archiving involves storing files in a place where they are still available.

Question 4

A dairy farmer decides to open an ice cream shop on her farm. After surveying the local community about people’s favorite flavors, she takes the data they provided and stores it in a secure hard drive so it can be maintained safely on her computer. This is part of which phase of the data life cycle?

A. Manage

B. Plan

C. Archive

D. Analyze

The correct answer is A. Manage. Explanation: This is the manage phase of the data life cycle. It deals with how data is cared for, how and where it’s stored, the tools used to keep it safe and secure, and the actions taken to make sure it’s maintained properly.

Question 5

After opening the ice cream shop on her farm, the same dairy farmer then surveys the local community about people’s favorite flavors. She uses the data she collected to determine that the top five flavors are strawberry, vanilla, chocolate, mint chip, and peanut butter. She feels confident in her decision to sell these flavors. This is part of which phase of the data life cycle?

A. Plan

B. Archive

C. Analyze

D. Capture

The correct answer is C. Analyze. Explanation: This is part of the analyze phase. This phase involves using data to make smart decisions and support business goals.
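As a rough illustration of what the analyze phase in Question 5 could look like, the sketch below tallies survey responses and picks the five most common flavors. The response counts are invented for this example and are not real survey data. The function name `top_flavors` is also hypothetical.

```python
from collections import Counter

# Hypothetical survey responses: one entry per person surveyed.
# The counts are made up for illustration only.
responses = (
    ["strawberry"] * 9
    + ["vanilla"] * 8
    + ["chocolate"] * 7
    + ["mint chip"] * 6
    + ["peanut butter"] * 5
    + ["pistachio"] * 2
    + ["coffee"] * 1
)


def top_flavors(survey_responses, n=5):
    """Return the n most frequently named flavors, most popular first."""
    counts = Counter(survey_responses)
    return [flavor for flavor, _ in counts.most_common(n)]


top_five = top_flavors(responses)
print(top_five)
# ['strawberry', 'vanilla', 'chocolate', 'mint chip', 'peanut butter']
```

Counting responses is only one possible analysis. A real survey might also need cleaning first, such as merging different spellings of the same flavor, before the counts can be trusted.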