
AI Self-Hosting, Spring 2024

Homework 04

Reading and Questions

Backpropagation Overview

3Blue1Brown Chapter 3: Backpropagation

After reading the chapter, attempt an answer to these questions in a dev diary entry.

Question 0. In this equation, match the symbols with their meaning:

*(image: the single-neuron activation equation, $a = \sigma(w_1 a_1 + w_2 a_2 + \cdots + w_n a_n + b)$)*
  • Symbols

    A. $\sigma$

    B. $w_i$

    C. $a_i$

    D. $b$

  • Meanings

i. Activations from previous layer

ii. Bias of this neuron, or threshold for firing

iii. Sigmoid, or squishing function, to smooth outputs to the 0.0 to 1.0 range

iv. Weights from the previous layer to this neuron
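
As a quick refresher, here is a minimal sketch (not from the chapter; all values are made up) of how these four symbols combine into a single neuron's activation:

```python
import numpy as np

def sigmoid(z):
    """Squish any real number into the range (0.0, 1.0)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical values for one neuron with three inputs.
activations = np.array([0.2, 0.9, 0.4])  # a_i: activations from the previous layer
weights = np.array([0.5, -1.2, 0.8])     # w_i: weights into this neuron
bias = -0.3                              # b: this neuron's bias, or firing threshold

# a = sigma(w_1*a_1 + w_2*a_2 + w_3*a_3 + b)
a = sigmoid(np.dot(weights, activations) + bias)
print(a)  # a single activation between 0.0 and 1.0
```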

Question 1.

Calculate the cost of this "trash" output of a neural network, given our desired label of the digit "3".

*(image: network output activations next to the desired output for the digit "3")*
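
If you want to check your arithmetic, here is a hedged sketch of the chapter's quadratic cost; the output activations below are made up, so substitute the values from the image:

```python
import numpy as np

# Made-up "trash" output activations for digits 0..9 (use the image's values instead).
output = np.array([0.43, 0.28, 0.19, 0.88, 0.01, 0.64, 0.99, 0.52, 0.35, 0.07])

# Desired output for the label "3": 1.0 at index 3, 0.0 everywhere else.
desired = np.zeros(10)
desired[3] = 1.0

# Quadratic cost from the chapter: sum of squared differences.
cost = np.sum((output - desired) ** 2)
print(cost)
```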

Question 2.

Suppose we are feeding this image forward through our neural network and want to increase its classification score for the digit "2".

*(image: a handwritten digit "2" being fed through the network)*

Answer this question about how to decrease the cost by making the "output-2" node fire more strongly:

*(image: the question to answer, about nudging the "output-2" activation)*

Question 3.

What does the phrase "neurons that fire together wire together" mean in the context of increasing the weights into layer $L$ in proportion to the previous layer's activations $a^{(L-1)}$?

*(image: weight nudges proportional to the previous layer's activations)*
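
One way to see this in code: in the chain rule for a single connection, the gradient with respect to a weight carries a factor of the previous layer's activation, so weights fed by strongly firing neurons receive the largest nudges. This is a sketch with assumed scalar values, not the chapter's own example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

# Hypothetical scalars for one connection into one output neuron.
a_prev = 0.95            # a^(L-1): a strongly firing neuron in the previous layer
w, b, y = 0.1, 0.0, 1.0  # weight, bias, and desired output

z = w * a_prev + b
a = sigmoid(z)

# Chain rule: dC/dw = a_prev * sigma'(z) * 2*(a - y).
# The a_prev factor is the "fire together, wire together" part:
# the larger the previous activation, the larger this weight's nudge.
dC_dw = a_prev * sigmoid_prime(z) * 2.0 * (a - y)
print(dC_dw)
```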

Question 4.

Which of the following does this image show?

*(image: a grid of nudges, one column per training example)*
  • changes to the weights of all the neurons "requested" by each training example
  • changes to the biases of all the neurons "requested" by each training example
  • changes to the activations of the previous layer
  • changes to the activation of the output layer

In addition to the answer you chose above, which other choices are changes that backpropagation can actually make to the neural network?
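
A way to check your answer: in a gradient-descent update step, only the weights and biases are parameters the algorithm can write to; activations change only indirectly on the next forward pass. A minimal sketch with made-up shapes and a hypothetical learning rate:

```python
import numpy as np

rng = np.random.default_rng(0)
lr = 3.0  # hypothetical learning rate

# Made-up parameters and mini-batch-averaged gradients for one layer.
W = rng.normal(size=(10, 100))
b = rng.normal(size=10)
grad_W = rng.normal(size=(10, 100))
grad_b = rng.normal(size=10)

# Backpropagation's "requests" are applied to weights and biases only;
# the activations of every layer then shift on the next forward pass.
W -= lr * grad_W
b -= lr * grad_b
```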

Question 5.

In the reading, calculating the cost function gradient $\nabla C$ by mini-batches to find the direction of steepest descent is compared to:

  • a cautious person calculating how to get down a hill
  • a drunk stumbling quickly down a hill
  • a cat leaping gracefully down a hill
  • a bunch of rocks tumbling down a hill

What is the closest analogy to calculating the update $\nabla C$ by mini-batches?

  • passing laws by electing a new president and waiting for an entire election's paper ballots to be completely counted
  • asking a single pundit on a television show what laws should be changed
  • asking a random-sized group of people to make a small change to any law in the country, repeated $n$ times, allowing a person the possibility of being chosen multiple times
  • making a small change to one law at a time chosen by random groups of $n$ people, until everyone in the country has been asked at least once

Question 6.

If each row in this image is a mini-batch, what is the mini-batch size?

*(image: rows of training examples, one mini-batch per row)*

Recall that in our MNIST train.py from last week's lab, the mini-batch size was 10.
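
For comparison, here is a minimal sketch of how a training set can be sliced into mini-batches of 10; the names are illustrative and not necessarily those used in train.py:

```python
import random

mini_batch_size = 10
training_data = list(range(100))  # stand-in for 100 (image, label) pairs

random.shuffle(training_data)
mini_batches = [
    training_data[k:k + mini_batch_size]
    for k in range(0, len(training_data), mini_batch_size)
]
print(len(mini_batches))  # 10 mini-batches of 10 examples each
```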

Backpropagation Calculus

3Blue1Brown Chapter 4: Backpropagation Calculus

Question #1

For our neural network with layers of [784, 100, 10], what is the size (number of elements) of the gradient vector $\nabla C$ (the cost function changes) shown below:

*(image: $\nabla C$ as a column of partial derivatives, one per weight and bias)*

Answer the question again for this smaller neural network

*(image: a smaller neural network)*
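
To sanity-check your counts, here is a sketch that tallies one gradient entry per weight and per bias, on the assumption that $\nabla C$ has exactly one component per trainable parameter:

```python
def num_parameters(layer_sizes):
    """Count weights and biases: one entry in the gradient per parameter."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # weights between consecutive layers
        total += fan_out           # one bias per neuron past the input layer
    return total

print(num_parameters([784, 100, 10]))
```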

Question #2

*(image: the layer activation equation, $a^{(L)} = \sigma(w^{(L)} a^{(L-1)} + b^{(L)})$)*
  • Symbols

    A. $a^{(L-1)}$

    B. $\sigma$

    C. $b^{L}$

    D. $w^{L}$

    E. $a^{(L)}$

  • Meanings

i. Activations from the previous layer

ii. Bias of the current layer

iii. Activations of the current layer

iv. Sigmoid, or squishing function, to smooth outputs to the 0.0 to 1.0 range

v. Weights from the previous layer to this neuron
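
In vector form, these five symbols fit into a single line of numpy. This is a sketch with made-up layer sizes, not code from the chapter:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
a_prev = rng.random(100)        # a^(L-1): previous-layer activations
W = rng.normal(size=(10, 100))  # w^(L): weights into the current layer
b = rng.normal(size=10)         # b^(L): biases of the current layer

a_L = sigmoid(W @ a_prev + b)   # a^(L) = sigma(w^(L) a^(L-1) + b^(L))
print(a_L.shape)                # (10,)
```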

Question #3

In this tree diagram, we see how the cost $C_0$ for the first training image depends on the activation of the output layer $a^{(L)}$. In turn, $a^{(L)}$ depends on the weighted input (before the sigmoid function) $z^{(L)}$, which itself depends on the incoming weights $w^{(L)}$, the activations $a^{(L-1)}$ from the previous layer, and the bias of the current layer $b^{(L)}$.

*(image: dependency tree from $w^{(L)}$, $a^{(L-1)}$, and $b^{(L)}$ through $z^{(L)}$ and $a^{(L)}$ to $C_0$)*
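
The diagram encodes the chain rule from the chapter: to see how a weight affects the cost, multiply the partial derivatives along the path from $w^{(L)}$ up to $C_0$:

$$
\frac{\partial C_0}{\partial w^{(L)}} = \frac{\partial z^{(L)}}{\partial w^{(L)}} \, \frac{\partial a^{(L)}}{\partial z^{(L)}} \, \frac{\partial C_0}{\partial a^{(L)}}
$$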

What is the relationship of this second, extended diagram to the first one?

*(image: the extended dependency tree, reaching back to layer $L-2$)*

Choices (choose all that apply)

  1. There is no relationship
  2. The bottom half of the second diagram is the same as the first diagram
  3. The second diagram extends backward into the neural network, showing a previous layer $L-2$ whose outputs the layer $L-1$ depends on.
  4. The second diagram can be extended further back to layer $L-3$, and so on all the way back to the first layer.
  5. The first diagram is an improved version of the second diagram with fewer dependencies
  6. Drawing a path between two quantities in either diagram will show you which partial derivatives to "chain together" in calculating $\nabla C$

Human Writing

Make progress on AI Lab 04 on Voice Cloning before attempting this writing exercise. (You don't have to complete the lab first.)

Use the questions below to choose a topic about AI ethics.

  • Consider the Warren Buffett voice cloning demonstration.
  • What thoughts or feelings do you have right now, before hearing your synthesized voice?
    • Should you have the right to synthesize your own voice? What are possible advantages or misuses of it?
    • Photographs, video, and audio recordings are technologies that have changed the way we remember loved ones who have died.
      • Should family members have the right to create AI simulated personalities of the deceased?
      • If generative AI allows us another way to interact with the personality of a person from the past, how does this compare with historical re-enactors, or with movies depicting fictional events using real people?

Write your response as a dev diary entry wiki page, linked to from your personal dev diary.
