1.3.3. The data analysis toolbox

Exploring data analyst tools

The most common tools you'll see analysts use are:

  • Spreadsheets

    • The usefulness of your data depends on how well it's structured. When you put your data into a spreadsheet, you can see patterns, group information and easily find the information you need.
    • Formula: a set of instructions that performs a specific calculation using the data in a spreadsheet.
    • Function: a preset command that automatically performs a specific process or task using the data in a spreadsheet.
  • Query languages for databases

    • Query language: a computer programming language that allows you to retrieve and manipulate data from a database. A common example is Structured Query Language (SQL).
  • Visualization tools

    • Data visualization: the graphical representation of information. This makes it easier for stakeholders to draw conclusions, make decisions, and come up with strategies. Some popular visualization tools are Tableau and Looker.
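The difference between a formula and a function, defined in the spreadsheet bullets above, can be illustrated outside a spreadsheet as well. The following is a minimal Python sketch rather than spreadsheet syntax. The `sales` values are made up for illustration: the hand-written expression plays the role of a formula, and the built-in `sum` plays the role of a preset function such as a spreadsheet's SUM.

```python
# Hypothetical column of values, standing in for spreadsheet cells A1:A4.
sales = [120, 80, 100, 100]

# "Formula": instructions we write ourselves to perform a specific calculation.
formula_total = sales[0] + sales[1] + sales[2] + sales[3]

# "Function": a preset command that performs the same task for us.
function_total = sum(sales)
function_average = sum(sales) / len(sales)

print(formula_total, function_total, function_average)
```

Both approaches give the same total. The function version keeps working unchanged if more values are added to the column, which is one reason preset functions are convenient.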

Q. Fill in the blank: A query language is a computer programming language that enables data analysts to retrieve and manipulate data from a _____.

  • A. database

Key data analyst tools

As you are learning, the most common programs and solutions used by data analysts include spreadsheets, query languages, and visualization tools. In this reading, you will learn more about each one. You will cover when to use them, and why they are so important in data analytics.

Spreadsheets

Data analysts rely on spreadsheets to collect and organize data. Two popular spreadsheet applications you will probably use a lot in your future role as a data analyst are Microsoft Excel and Google Sheets.

Digital worksheets structure data in a meaningful way by letting you

  • Collect, store, organize, and sort information
  • Identify patterns and piece the data together in a way that works for each specific data project
  • Create excellent data visualizations, like graphs and charts.
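As a rough illustration of the sorting and pattern-finding described above, here is a small Python sketch that mimics what a spreadsheet's sort and group features do. The rows, the column names `region` and `sales`, and the numbers are all hypothetical. In Excel or Google Sheets you would normally do this through the built-in sort and pivot table features instead.

```python
from collections import defaultdict

# Hypothetical rows, each one a spreadsheet row with "region" and "sales" columns.
rows = [
    {"region": "West", "sales": 300},
    {"region": "East", "sales": 150},
    {"region": "West", "sales": 200},
    {"region": "East", "sales": 250},
]

# Sort: order the rows by the sales column, largest first.
sorted_rows = sorted(rows, key=lambda row: row["sales"], reverse=True)

# Group: total the sales for each region to reveal a pattern.
totals = defaultdict(int)
for row in rows:
    totals[row["region"]] += row["sales"]

print([row["sales"] for row in sorted_rows])
print(dict(totals))
```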

Databases and query languages

A database is a collection of structured data stored in a computer system. Some popular Structured Query Language (SQL) programs include MySQL, Microsoft SQL Server, and BigQuery.

Query languages

  • Allow analysts to isolate specific information from one or more databases
  • Make it easier for you to learn and understand the requests made to databases
  • Allow analysts to select, create, add, or download data from a database for analysis
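To give a feel for what creating, adding, and selecting data looks like in practice, here is a minimal sketch that runs SQL from Python. It assumes SQLite (through Python's built-in `sqlite3` module) purely because it needs no setup; it is not one of the programs named above, and the `orders` table and its rows are invented for the example. The same basic SQL statements would look very similar in MySQL, Microsoft SQL Server, or BigQuery.

```python
import sqlite3

# An in-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Create a table and add a few made-up rows.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("Ana", 25.0), ("Ben", 40.0), ("Ana", 15.0)],
)

# Select: isolate only the rows for one customer.
ana_orders = conn.execute(
    "SELECT id, amount FROM orders WHERE customer = ? ORDER BY id", ("Ana",)
).fetchall()

# Aggregate: total amount per customer.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

print(ana_orders)
print(totals)
conn.close()
```

The `WHERE` clause is what lets an analyst isolate specific information, and `GROUP BY` summarizes it, without ever loading the whole table by hand.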

Visualization tools

Data analysts use a number of visualization tools, like graphs, maps, tables, charts, and more. Two popular visualization tools are Tableau and Looker.

These tools

  • Turn complex numbers into a story that people can understand

  • Help stakeholders come up with conclusions that lead to informed decisions and effective business strategies

  • Have multiple features

    • Tableau's simple drag-and-drop feature lets users create interactive graphs in dashboards and worksheets

    • Looker communicates directly with a database, allowing you to connect your data right to the visual tool you choose

A career as a data analyst also involves using programming languages, like R and Python, which are used a lot for statistical analysis, visualization, and other data analysis.

As you will continue to learn, data analysts have a lot of tools to choose from. This is a first look at some of the possibilities, and you will explore all of these tools in-depth throughout this program.

Choosing the right tool for the job

As a data analyst, you will usually have to decide which program or solution is right for the particular project you are working on. In this reading, you will learn more about how to choose which tool you need and when.

Depending on which phase of the data analysis process you're in, you will need to use different tools. For example, if you are focusing on creating complex and eye-catching visualizations, then the visualization tools we discussed earlier are the best choice. But if you are focusing on organizing, cleaning, and analyzing data, then you will probably be choosing between spreadsheets and databases using queries. Spreadsheets and databases both offer ways to store, manage, and use data. The basic content for both tools is sets of values. Yet, there are some key differences, too.

You don't have to choose one or the other because each serves its own purpose. Generally, data analysts work with a combination of the two, as both tools are very useful in data analytics. For example, you can store data in a database, then export it to a spreadsheet for analysis. Or, if you are collecting information in a spreadsheet, and it becomes too much for that particular platform, you can import it into a database. And, later in this course, you will learn about programming languages like R that give you even greater control of your data, its analysis, and the visualizations you create.
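The round trip described above, from database to spreadsheet and back, can be sketched in a few lines of Python. This is only an illustration under some assumptions: it uses SQLite and CSV text (a format that spreadsheet applications can open and export), keeps everything in memory instead of writing files, and the `survey` table and its values are made up.

```python
import csv
import io
import sqlite3

# A throwaway in-memory database with a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (respondent TEXT, score INTEGER)")
conn.executemany("INSERT INTO survey VALUES (?, ?)", [("r1", 4), ("r2", 5)])

# Export: write query results as CSV text, a format spreadsheets can open.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["respondent", "score"])
writer.writerows(
    conn.execute("SELECT respondent, score FROM survey ORDER BY respondent")
)
csv_text = buffer.getvalue()

# Import: read CSV rows (as a spreadsheet might export them) into a new table.
conn.execute("CREATE TABLE survey_copy (respondent TEXT, score INTEGER)")
reader = csv.DictReader(io.StringIO(csv_text))
conn.executemany(
    "INSERT INTO survey_copy VALUES (?, ?)",
    [(row["respondent"], int(row["score"])) for row in reader],
)
copied = conn.execute("SELECT COUNT(*) FROM survey_copy").fetchone()[0]

print(csv_text)
print(copied)
conn.close()
```

In a real project the CSV would be saved to a file and opened in Excel or Google Sheets, or the spreadsheet would be exported as CSV and loaded with the database's own import tool.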

As you continue learning about these important tools, you will gain the knowledge to choose the right tool for any data job.

Self-Reflection: Reviewing past concepts

So far we've learned about the data life cycle and the data analysis process. They cover the following steps:

1. Data life cycle:

  • Plan
  • Capture
  • Manage
  • Analyze
  • Archive
  • Destroy

2. Data analysis process:

  • Ask
  • Prepare
  • Process
  • Analyze
  • Share
  • Act

Reflection

Take a moment to consider what you've learned about these processes. In the text box below, write three to five sentences (60 - 100 words) explaining the relationship between the data life cycle and the data analysis process. How are they similar? How are they different?

While the data analysis process drives our projects and helps us reach our business goals, we have to understand the life cycle of the data we'll be working with in order to use that process. We can't analyze our data well if we don't have a thorough understanding of our data. Similarly, we can collect all the data we want, but the data becomes worthless if we don't have a good plan for analyzing the data.

Question 2

Next, in 3-5 sentences (60 - 100 words), explain the relationship between the ask phase of the data analysis process and the plan phase of the data life cycle. How are they similar? How are they different?

Both phases involve planning and asking questions. They are different in that the "Ask" phase of the data analysis process focuses on "big picture" strategic thinking about business goals, while the "Plan" phase of the data life cycle focuses on the "nuts and bolts" of the project, like what data you have access to, what data you need, and where you're going to get it.

Test your knowledge on the data analysis toolbox

Question 1

Based on what you have learned in this course, spreadsheets are digital worksheets that enable data analysts to do which of the following tasks? Select all that apply.

  • Store data

  • Sort and filter data

  • Choose a topic for data analysis

  • Organize data in columns and rows

Correct. Spreadsheets enable data analysts to store, organize, sort, and filter data. This helps them see patterns, group information, and easily find the information they need.

Question 2

Fill in the blank: A set of instructions that performs a specific calculation using spreadsheet data is called _____.

  • an operation

  • a report

  • a formula

  • a program

Correct. A set of instructions that performs a specific calculation using spreadsheet data is called a formula.

Question 3

A database is a collection of data stored in a computer system.

  • True

  • False

Correct. A database is a collection of data stored in a computer system.

Question 4

In data analytics, SQL is an acronym meaning _____ query language.

  • statistical

  • syntax

  • structured

  • software

Correct. SQL stands for structured query language. It enables data analysts to communicate with a database.

Question 5

What is the term for the graphical representation of data?

  • Data collection

  • Data language

  • Data visualization

  • Data summary

Correct. Data visualization is the graphical representation of data.