Week 1 Test Prep 5.3 5.4 - LindaLiu1202/lindaliu GitHub Wiki

5.3 Computing Bias

  • Computing innovations can reflect existing human biases because of biases written into the algorithms or biases in the data used by the innovation
  • Programmers should take action to reduce bias in algorithms used for computing innovations as a way of combating existing human biases
  • Biases can be embedded at all levels of software development
    • Questions to ask about bias
      • Enhancing or intentionally excluding?
      • Intentionally or Unintentionally?
      • Receiving feedback from a wide variety of people?
  • Netflix
    • Explicit data: thumbs, name, address, etc.
    • Implicit data: when you watch, what you binge, which show styles are most popular with you
    • Bias: Netflix exclusives are featured ahead of other content, which promotes subscriptions
  • Hypothetical Loan Company
    • Creates software to assist loan officers: it finds trends in successful loans and rejects applicants who don’t fit those trends (age, gender, race, ethnicity)
  • All software can be biased (unintentional or intentional): casual vs sweaty, YouTube Kids, Facebook vs Instagram vs Snapchat/TikTok, WeChat, Kakao Talk
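
A minimal sketch of how the hypothetical loan company's "follow the trend" software could inherit bias from its data. The records, group names, and the 0.6 threshold are invented for illustration and are not from the course material.

```python
from collections import defaultdict

# Hypothetical historical loan records: (age_group, repaid_successfully).
# The data is invented and deliberately small for the younger group,
# which is one way bias in the data enters the algorithm.
HISTORY = [
    ("over_40", True), ("over_40", True), ("over_40", True), ("over_40", False),
    ("under_25", True), ("under_25", False),
]

def success_rates(history):
    """Return the fraction of successful loans for each group."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for group, repaid in history:
        totals[group] += 1
        if repaid:
            successes[group] += 1
    return {group: successes[group] / totals[group] for group in totals}

def naive_decision(group, rates, threshold=0.6):
    """Approve only applicants whose group fits the past 'successful' trend."""
    return rates.get(group, 0.0) >= threshold

rates = success_rates(HISTORY)
approved_older = naive_decision("over_40", rates)
approved_younger = naive_decision("under_25", rates)
```

Here every applicant under 25 is rejected because of the group's history, not because of anything about the individual. That is the kind of bias the questions above are meant to catch.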

Intentional or Purposeful bias

  • Facebook vs TikTok, is there purposeful exclusion? Is it harmful? Should it be corrected? Is it good business?
    • I don't think Facebook or TikTok purposefully excludes anyone, but both favor creators who attract more viewers or bring benefits to the company. Their algorithms also work out what kind of videos a user likes and recommend more of the same (for example, on TikTok's For You Page), so the viewer keeps watching and keeps using the app. That is how the company makes money. It is not harmful to the company, although it can harm a user's health if the user spends far too much time on the app. I don't think it should be corrected, since it is simply a business strategy and the way apps like Facebook and TikTok work. It is good business.
    • TikTok targets teenagers like us and Facebook targets older groups, which is not a bad thing.
  • Amazon Alexa, Google, and Apple Siri originally had flaws in detecting accents or young voices. Was this purposeful? Is it harmful? Should it be corrected? Is it good business?
    • I think this was not purposeful and not harmful. They just needed more time to develop and improve the detection of accents and young voices. No algorithm is perfect; every algorithm has flaws or areas that can be improved. Still, detecting more accents would definitely be beneficial, since it helps the user and expands the customer base.
  • Netflix: does its algorithm purposely bias certain content toward certain watchers? Is it harmful? Is it good business?
    • For kids or younger children, Netflix purposefully recommends more movies that are appropriate for the kid's age, which is not harmful but better for the kids.

GitHub Pages Actions. Break out into Study Groups of 2 or 3

  • Watch the video... HP computers are racist
  • Come up with some thoughts on the video and be ready to discuss them as I call on you. Here are some ideas...
    • Does the owner of the computer think this was intentional?
      • No, I don't think the owner of the computer believes this was intentional.
    • If yes or no, justify your conclusion.
      • He seems to be joking when he says the computer is racist: in the description box he uses the word "hilarious," which suggests he discovered that the computer cannot recognize or track Black faces but knew it was not doing so on purpose. If he really thought the computer was intentionally racist, he would be angrier in the video.
    • How do you think this happened?
      • I guess it's because of the difference in skin color: lighter skin creates more contrast between the facial features and the rest of the face, which makes it easier for the computer to recognize the face.
    • Is this harmful? Was it intended to be harmful or exclude?
      • It is not as bad as "harmful," but the customer's experience would not be good, since no customer wants to spend money on a computer with a flaw as big as not recognizing their face. It was not intended to harm or exclude, but it is a huge error that the company has to fix as soon as possible.
    • Should it be corrected?
      • Yes, it should definitely be corrected as soon as possible so that all customers, no matter their skin color, can have a better user experience.
    • What would you or should you do to produce a better outcome?
      • I would improve the algorithm used for recognizing different faces and have more testers of different ethnicities test the computer.
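
One way to act on the "more testers" idea is to record results per group and compare detection rates. The sketch below assumes a hypothetical test log of (group, detected) pairs; the group names, counts, and the 0.9 minimum rate are made up for illustration.

```python
def detection_rate_by_group(results):
    """results: list of (group, detected) pairs from tester sessions."""
    totals = {}
    hits = {}
    for group, detected in results:
        totals[group] = totals.get(group, 0) + 1
        hits[group] = hits.get(group, 0) + (1 if detected else 0)
    return {group: hits[group] / totals[group] for group in totals}

def groups_below(rates, minimum=0.9):
    """Return the groups whose detection rate falls under the minimum."""
    return sorted(group for group, rate in rates.items() if rate < minimum)

# Hypothetical test log: 10 sessions per group.
test_log = (
    [("group_a", True)] * 10
    + [("group_b", True)] * 4
    + [("group_b", False)] * 6
)

rates = detection_rate_by_group(test_log)
flagged = groups_below(rates)
```

A large gap between groups, as in this invented log, is a signal to fix the algorithm or its training data before the product ships.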

5.4 Crowdsourcing

  • Use the public to gather data

    • Example: COVID statistics API
  • Crowdsourcing provides the ability to obtain shared information, share information, and participate in distributed computing.

  • The more you crowdsource and the further you reach beyond your own community, the more you reduce computing bias.

  • All information is biased, even with crowdsourcing. You still have to do some analytics to check whether the data is accurate.

  • Wikipedia: covers almost anything we can think of, but it is not comprehensive and is not intended for deep study

    • Britannica vs. Wikipedia: which is more biased? Britannica, because fewer people work on it.
  • Widespread access to information and public data facilitates the identification of problems, development of solutions, and dissemination of results.
  • Science has been affected by using distributed and “citizen science” to solve scientific problems.

  • Citizen science is scientific research conducted in whole or part by distributed individuals, many of whom may not be scientists, who contribute relevant data to research using their own computing devices.

  • Crowdsourcing is the practice of obtaining input or information from a large number of people via the internet.

  • Human capabilities can be enhanced by collaboration via computing.

  • Crowdsourcing offers new models for collaboration, such as connecting businesses or social causes with funding.
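
As a rough illustration of the "do some analytics" point above, one simple check on crowdsourced numbers is to compare each report against the median and flag the ones that are far off. The report values and the 50% tolerance below are invented for the example.

```python
from statistics import median

def flag_outliers(reports, tolerance=0.5):
    """Return reports that differ from the median by more than `tolerance`
    (as a fraction of the median). A crude check, not a full analysis."""
    middle = median(reports)
    return [value for value in reports if abs(value - middle) > tolerance * middle]

# Hypothetical case counts for the same area from five contributors.
reports = [101, 98, 100, 102, 990]

typical = median(reports)
suspect = flag_outliers(reports)
```

The median is used instead of the mean so that a single bad report (990 here) does not drag the reference value toward itself.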

Evidence of Crowdsourcing

  • Wikipedia has a ton of information from crowdsourcing; see Wikipedia's definition of crowdsourcing. It can have inaccuracies, but they are often corrected by a self-policing community. Reviews and many authors have made it, according to many, better than "official" information.
  • Cryptocurrency and the associated blockchain. All exchanges of money are validated at least 3 times by independent miners. If there is a flaw in the independent calculations, the process is checked and performed again. The innovation of crypto crowdsourcing has an impact on how governments think about currency. Additionally, blockchain algorithms are being considered for many other crowdsourcing uses, including highly private data (e.g., medical records).
  • COVID data: it is easy to recognize areas that are contributing and not contributing. This data has impacted all our lives and the decisions we make on attending public events, flying on planes, or wearing masks. The community of data and analysts will spawn many new ways of thinking about data that impacts lives.
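
The "validated at least 3 times, redo on disagreement" idea from the cryptocurrency note can be sketched as below. This is a simplified model of the description in these notes, not how any real blockchain works; the miner functions, the transaction string, and `max_rounds` are hypothetical.

```python
import hashlib

def honest_miner(transaction: str) -> str:
    """Independently compute a SHA-256 digest of the transaction."""
    return hashlib.sha256(transaction.encode("utf-8")).hexdigest()

def faulty_miner(transaction: str) -> str:
    """A miner whose calculation is wrong (always returns the same value)."""
    return "0" * 64

def validate(transaction, miners, max_rounds=2):
    """Accept only if at least three miners all agree on the result.
    On disagreement the calculation is performed again, up to max_rounds."""
    if len(miners) < 3:
        raise ValueError("need at least three independent miners")
    for _ in range(max_rounds):
        results = {miner(transaction) for miner in miners}
        if len(results) == 1:  # every miner produced the same digest
            return True
    return False

payment = "alice pays bob 5"
accepted = validate(payment, [honest_miner, honest_miner, honest_miner])
rejected = validate(payment, [honest_miner, honest_miner, faulty_miner])
```

Requiring agreement between independent checkers is what makes a single faulty or dishonest participant visible.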

Obtaining Data via Crowdsourcing

  • We have all experienced crowdsourcing by using external data through APIs, namely RapidAPI. This data has influenced how we code and shown possibilities in obtaining and analyzing data.
  • Additionally, we have all participated in code crowdsourcing by using GitHub. Many of you have forked from the Teacher repository or exchanged code with fellow students. Not only can we analyze GitHub code, but we can also obtain profiles and history about a person's coding. The Teacher presenting these videos recommends Kaggle for code and science exploration. Then, of course, once we have an idea of what to "Learn", crowdsourced YouTube videos and Google-searched articles make the world our Teacher.

GitHub pages actions

  • CompSci has 150 Principles students. Describe a crowdsource idea and how you might initiate it in our environment.
    • We could have a place for reviews. What do students think about this class? How would they rate it? Where can the class improve? Any advice? etc.
  • What about Del Norte crowdsourcing? Could your final project be better with crowdsourcing?
    • Yes. Since our final project is to make a new version of the Career Technical Education website, we need to include a lot of information and statistics from many people.