AI‐Homework‐03 - TheEvergreenStateCollege/upper-division-cs-23-24 GitHub Wiki

This homework has three parts

AI Technical Reading

List any teammates you work with.

Attempt and discuss the following questions in a dev diary entry. Be sure to include screenshots and code blocks to illustrate your point.

Read pages 12 through 26 of the Deep Learning book

What is connectionism and the distributed representation approach? How does it relate to the MNIST classification of learning the idea of a circle, regardless of whether it is the top of the digit 9 or the top / bottom of the digit 8?
What are some factors that has led to recent progress in the ability of deep learning to mimic intelligent human tasks?
How many neurons are in the average human brain, versus the number of simulated neurons in the biggest AI supercomputer described in the book chapter? Now in the year 2024, how many neurons can the biggest supercomputer simulate? (You may use a search engine or an AI chat itself to speculate).

Read Chapter 2: Gradient Descent from 3Blue1Brown and respond to the questions below in your dev diary entry.

Let's say you are training your neural network on pairs of $(x,y)$, where $x$ is a training datapoint (an image in MNIST) and $y$ is the correct label for $x$.

Why does the neural network, before you've trained it on the first input $x$, output "trash", or something that is very far from the corresponding $y$?

Review this shared Google Colab notebook.

If you have a Numpy array that looks like the following, give its shape as a tuple of maximum dimensions along each axis.

For example (p,q,r) is a tensor of third rank, with p along the first dimension (which "2D layer"), q "rows" in each "2D layer", and r "columns" in each "row".

[[[1,2,3,4],
  [5,6,7,8],
  [9,10,11,12]]
]

Assume your neural network in network.py is created with the following layers

net = Network([7,13,2])

What is the list of shapes of the self.weights and self.biases members in the constructor of Network? Because these are lists of Numpy matrices, the different elements can have different shapes, so your answer will have a slightly different form.

Your answer should look like a list of tuples, such as

[ (1,2), (3,4), (5,6) ]

which means a list of three shapes, the first has 1 row of 2 columns each, the second has 3 rows of 4 columns each, and the last has 5 rows of 6 columns each.

From the notes, answer this question ordering the given outputs in terms of the cost function they give out.

What is the effect of changing the learning rate (the Greek letter $\eta$ "eta") in training with SGD?

Read the hiker analogy by Sebastian Raschka, related to our outdoor hill-climbing activity

Why is the word "stochastic" in the name "stochastic gradient descent", and how is it different than normal gradient descent?

Programming MNIST Classifier in Python

This is the same as our lab activity to be completed in class on Thursdays.

Follow the instructions there.

Human Writing: Paul McMillin Guest Speaker, ChatGPT and Hallucination

Pre-Class Reading Slides

Human Prompt:

AI chat models are primarily trained from web crawls (automated mass retrievals of data) from websites that may include Wikipedia, social media platforms that don't require logins like X (formerly known as Twitter), and paywall protected sites like the New York Times and academic journals. Correlations that appear between words, including emotional tone as interpreted by humans, from these training sources is more likely to be included in the chat model and to appear in generated text.

This chat model is similar to the weights and biases (the parameters) that we are training in our MNIST handwritten digit classifier. In fact, GPTs like ChatGPT include multiple neural networks in their architecture that work very similar in principle.

In a Dev Diary entry, write a response addressing the following questions and tying them together with thoughts on how you currently view AI chat alongside other tools for learning.

(These are related but different from the discussion questions brought up during the guest seminar).

Respond to the prompt in the slides.

How do the essays hang together? Is
4’s essay, which responds to exactly the
same prompt, an improvement on that
of 3.5? What do you think of ChatGPT
as a student writer? Would you want
to use ChatGPT (or other AI) for an
assignment like this? If you did, how
would you use it? For a first draft? To
help edit a first draft you wrote
yourself? Would you just submit
ChatGPT’s version as is, maybe ‘making
it your own’ a bit by changing a few
words or adding a few things of your
own? If you would use ChatGPT in any
way, would you do that merely for
convenience, or do you think it would
contribute to your development as a
thinker and academic writer?”

What is framing and the "standard story" in terms of journalism?

System prompts are a technical feature that allow you to steer or pre-condition your interaction with an AI chat.

How does journalistic framing relate to system prompts?
How does the "standard story" relate to the kinds of information that is likely to be expressed by AI chat systems like ChatGPT?
How can research assistance for a critical essay be connected to using AI chat to learn technical topics or for assistance with programming languages?