1.5.2. The importance of fair business decisions

The power of data in business

  • Issue : a topic or subject to investigate

  • Question : designed to discover information

  • Problem : an obstacle or complication that needs to be worked out

  • Business task : the question or problem data analysis answers for business

  • Data-driven decision-making : using facts to guide business strategy

Data helps us see the whole picture. With data, we have a more complete view of a problem and its causes, which lets us find new and surprising solutions we would not otherwise have seen. Data analytics helps businesses make better decisions. It all starts with a business task and the question it's trying to answer.

"I think one of the most important things to remember about data analytics is that data is data. (...) I found that data acts like a living and breathing thing." - Rachel, Business systems and analytics lead at Verily

Understanding data and fairness

Fairness : ensuring that your analysis doesn't create or reinforce bias.

Question

Fill in the blank: In data analytics, fairness means ensuring that your analysis does not create or reinforce bias. This requires using processes and systems that are fair and _____.

  • favorable

  • inclusive

  • restrictive

  • partial

Correct. Ensuring that analysis does not create or reinforce bias requires using processes and systems that are fair and inclusive to everyone.

As a data analyst, it's your responsibility to make sure your analysis is fair and factors in the complicated social context that could create bias in your conclusions. It's important to think about fairness from the moment you start collecting data for a business task to the time you present your conclusions to your stakeholders.

Self-Reflection: Business cases

In this activity, you’ll have the opportunity to review three case studies and reflect on fairness practices.

Case Study #1

In an effort to improve the teaching quality of its staff, the administration of a high school offered the chance for all teachers to participate in a workshop, though they were not required to attend. Instead, they were encouraged to sign up on a first-come, first-served basis. Of the 43 teachers on staff, 19 chose to take the workshop.

At the end of the academic year, the administration collected data on all teachers’ performance. Then they compared the data on the teachers who attended the workshop to the data on the teachers who did not attend. The data was collected via student surveys that rated a teacher's effectiveness on a scale of 1 (very poor) to 6 (outstanding). The data revealed that those who attended the workshop had an average score of 4.95, while teachers who did not attend the workshop had an average score of 4.22. The administration concluded that the workshop was a success.

Reflection

Are there examples of fair or unfair practices in the above case? If there are unfair practices, how could a data analyst correct them?

In the text box below, write 3-5 sentences (60-100 words) answering these questions.

This is an example of unfair practice. It is tempting to conclude, as the administration did, that the workshop was a success. However, because attendance was voluntary rather than randomly assigned, the data cannot establish that the workshop caused the higher ratings.

It is possible that the workshop was effective, but other explanations for the difference in ratings cannot be ruled out. For example, the teachers who volunteered for the workshop may already have been the better, more motivated teachers. That group would be rated higher whether or not the workshop was effective.

It’s also worth noting that student survey ratings measure a teacher's overall effectiveness, not anything specific to the workshop, so they are a weak measure of its impact. The data analyst could correct this by asking for teachers to be selected randomly to participate in the workshop, and by collecting data that is more directly related to workshop attendance, such as the success of a technique taught in that workshop.
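The random selection suggested in this reflection could look something like the sketch below. It is only an illustration: the teacher IDs, the function name `assign_workshop`, and the fixed seed are assumptions made for the example, not part of the case study. The group sizes (19 of 43) are taken from the case.

```python
import random

def assign_workshop(teachers, n_workshop, seed=None):
    """Randomly split teachers into a workshop group and a comparison group."""
    rng = random.Random(seed)   # a seed makes the assignment reproducible
    shuffled = list(teachers)   # copy so the original list is left untouched
    rng.shuffle(shuffled)
    return shuffled[:n_workshop], shuffled[n_workshop:]

# Hypothetical teacher IDs; the case study only gives the head count (43).
teachers = [f"teacher_{i:02d}" for i in range(1, 44)]

workshop, comparison = assign_workshop(teachers, n_workshop=19, seed=42)
print(len(workshop), len(comparison))  # 19 24
```

Because chance alone decides who attends, motivated and less motivated teachers should end up spread across both groups, so a difference in later ratings is easier to attribute to the workshop itself.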

Case Study #2

A self-driving car prototype is going to be tested on its driving abilities. The test is carried out on various types of roadways — specifically a race track, trail track, and dirt road.

The prototype is only being tested during the daytime. The data collected includes sensor data from the car during the drives, as well as video of the drive from cameras on the car.

The results of the initial tests illustrate that the new self-driving car met the performance standards across each of the different tracks and will progress to the next phase of testing, which will include driving in different weather conditions.

Reflection

Are there examples of fair or unfair practices in the above case? If there are unfair practices, how could a data analyst correct them?

In the text box below, write 3-5 sentences (60-100 words) answering these questions.

This case study shows an unfair practice. Although the prototype is being tested on three different tracks, it is only being tested during the day. Conditions on each track may be very different at night, and this could change the results significantly. The data analyst should correct this by asking the test team to add nighttime testing to get a full view of how the prototype performs on the tracks at any time of day.

Case Study #3

An amusement park is trying to determine what kinds of new rides visitors would be most excited for the park to build. In order to understand their visitors’ interests, the park develops a survey. They decide to distribute the survey by the roller coasters because the lines are long enough that visitors will have time to fully answer all of the questions. After collecting this survey data, they find that most visitors apparently want more roller coasters at the park.

Reflection

Are there examples of fair or unfair practices in the above case? If there are unfair practices, how could a data analyst correct them?

In the text box below, write 3-5 sentences (60-100 words) answering these questions.

This case study contains an unfair practice. While the decision to distribute surveys in places where visitors would have time to respond makes sense, it accidentally introduces sampling bias. Because the only respondents to the survey are people waiting in line for the roller coasters, the results are unfairly biased towards roller coasters. A data analyst could reduce sampling bias by distributing the survey at the entrance and exit of the amusement park to avoid targeting roller coaster fans.
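One way to survey at the entrance without favoring any particular kind of visitor is to pick every k-th person who walks through the gate. The sketch below is a rough illustration of that idea (systematic sampling). The visitor list, the function name `systematic_sample`, and the value of `k` are all assumptions made for the example.

```python
import random

def systematic_sample(visitors, k, seed=None):
    """Pick every k-th visitor, starting from a random offset below k."""
    rng = random.Random(seed)
    start = rng.randrange(k)    # random starting point in the first k visitors
    return visitors[start::k]

# Hypothetical list of visitors in the order they entered the park.
visitors = [f"visitor_{i}" for i in range(100)]

surveyed = systematic_sample(visitors, k=10, seed=1)
print(len(surveyed))  # 10
```

Who gets a survey depends only on arrival order, not on which ride someone happens to be queuing for.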


"How do we actually improve the lives of people by using data? (...) I think aspiring data analysts need to keep in mind that a lot of the data that you're going to encounter is data that comes from people so at the end of the day, data are people." - Alex, Research scientist at Google

Test your knowledge on making fair business decisions

TOTAL POINTS 4

Question 1

What steps do data analysts take to ensure fairness when collecting data? Select all that apply.

  • Understand the social context

  • Use an inclusive sample population

  • Clean the data provided

  • Include data self-reported by individuals

Correct. Considering inclusive sample populations, social context, and self-reported data enables fairness in data collection.

Question 2

Avens Engineering needs more engineers, so they purchase ads on a job search website. The website’s data reveals that 86% of engineers are men. Based on that number, an analyst decides that men are more likely to be successful applicants, so they target the ads to male job seekers. What should the analyst have done instead?

  • Let Avens Engineering decide which type of applicants to target ads to.

  • Only show ads for the engineering jobs to women.

  • Decline to accept ads from Avens Engineering because of fairness concerns.

  • Make sure their recommendation doesn’t create or reinforce bias.

Correct. They should make sure their recommendation doesn't create or reinforce bias. As a data analyst, it’s important to help create systems that are fair and inclusive to everyone.

Question 3

On a railway line, peak ridership occurs between 7:00 AM and 5:00 PM. The fairness of a passenger survey could be improved by over-sampling data from which group?

  • Male passengers

  • Daytime riders

  • Nighttime riders

  • Female passengers

Correct. Over-sampling the data from nighttime riders, an under-represented group of passengers, could improve the fairness of the survey.
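As a rough sketch of what over-sampling can mean in practice, the example below draws the same number of survey respondents from each time period, even though one period has far fewer riders. The rider counts (900 daytime, 100 nighttime), the field name `period`, and the function name `oversample` are made up for illustration and do not come from the question.

```python
import random

def oversample(riders, group_key, target_per_group, seed=None):
    """Draw up to target_per_group riders from each group so small groups are heard."""
    rng = random.Random(seed)
    by_group = {}
    for rider in riders:
        by_group.setdefault(rider[group_key], []).append(rider)
    sample = []
    for members in by_group.values():
        k = min(target_per_group, len(members))  # cannot draw more than exist
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical ridership: nighttime riders are 10% of all riders.
riders = [{"id": i, "period": "day"} for i in range(900)]
riders += [{"id": 900 + i, "period": "night"} for i in range(100)]

sample = oversample(riders, "period", target_per_group=50, seed=7)
night_share = sum(r["period"] == "night" for r in sample) / len(sample)
print(night_share)  # 0.5
```

In this made-up example, nighttime riders are 10% of riders but half of the sample, which gives enough responses to understand their experience. When reporting figures for all riders combined, an analyst would usually weight the responses back to the real proportions.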

Question 4

A real estate company needs to hire a human resources assistant. The owner asks a data analyst to help them decide where to advertise the job opening. The analyst learns that the majority of human resources professionals are women, validates this finding with research, and targets ads to a women's community college. This is fair because the analyst conducted research to make sure the information about gender breakdown of human resources professionals was accurate.

  • True

  • False

Correct. This is not fair. Fairness means ensuring that analysis doesn't create or reinforce bias. As a data analyst, it’s important to help create systems that are fair and inclusive to everyone.