Julian Dev Diary - TheEvergreenStateCollege/upper-division-cs-23-24 GitHub Wiki

AI-24sp

Lab Work: 5, 6, 7, 9

Homework: 1, 3, 4, 5, 6, 7, 8

Lab 5-2

Step 1

Step 2

Step 3: splitting on whitespace
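The whitespace split in this step can be reproduced with Python's `re` module (the sample text here is a stand-in, not the book's actual text):

```python
import re

text = "Hello, world. This, is a test."
result = re.split(r'(\s)', text)  # the capture group keeps the whitespace tokens
```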

Step 4: splitting on commas
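Extending the split to capture commas and periods as their own tokens looks roughly like this (again with stand-in text):

```python
import re

text = "Hello, world. This, is a test."
result = re.split(r'([,.]|\s)', text)               # also capture commas and periods
result = [item for item in result if item.strip()]  # drop whitespace and empty strings
```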

Step 5

Step 6: tokenizing the text

Step 7: calculating the total number of tokens
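Steps 6 and 7 together look roughly like this in Python (the text is a stand-in for the book, and the regex follows the tokenizer shown in the error output further down):

```python
import re

text = "Hello, world. Is this-- a test?"
preprocessed = re.split(r'([,.?_!"()\']|--|\s)', text)
preprocessed = [item.strip() for item in preprocessed if item.strip()]
total_tokens = len(preprocessed)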

Step 8

The book I selected has a vocab size of 12,428.
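Building the vocabulary (Step 8) can be sketched like this; the token list below is a toy stand-in, not the book's actual tokens:

```python
# toy token list standing in for the preprocessed book text
preprocessed = ['Hello', ',', 'world', '.', 'Is', 'this', '--', 'a', 'test', '?']

all_tokens = sorted(set(preprocessed))
vocab = {token: idx for idx, token in enumerate(all_tokens)}
```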

Output 4

[1, 58, 2, 872, 1013, 615, 541, 763, 5, 1155, 608, 5, 1, 69, 7, 39, 873, 1136, 773, 812, 7]

Output 5

text = """"It's the last he painted, you know," Mrs. Gisburn said with pardonable pride."""

Output 6

    KeyError                                  Traceback (most recent call last)
    Cell In[16], line 5
          1 tokenizer = SimpleTokenizerV1(vocab)
          3 text = "Hello, do you like tea. Is this-- a test?"
    ----> 5 tokenizer.encode(text)

    Cell In[12], line 9, in SimpleTokenizerV1.encode(self, text)
          7 preprocessed = re.split(r'([,.?_!"()\']|--|\s)', text)
          8 preprocessed = [item.strip() for item in preprocessed if item.strip()]
    ----> 9 ids = [self.str_to_int[s] for s in preprocessed]
         10 return ids

    Cell In[12], line 9, in <listcomp>(.0)
    ----> 9 ids = [self.str_to_int[s] for s in preprocessed]

    KeyError: 'Hello'
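The KeyError happens because "Hello" isn't in the vocabulary built from the story. The book's fix (SimpleTokenizerV2) maps unknown words to an `<|unk|>` token; a rough sketch:

```python
import re

class SimpleTokenizerV2:
    """Like SimpleTokenizerV1, but maps words missing from the vocab to <|unk|>."""

    def __init__(self, vocab):
        self.str_to_int = vocab
        self.int_to_str = {i: s for s, i in vocab.items()}

    def encode(self, text):
        preprocessed = re.split(r'([,.?_!"()\']|--|\s)', text)
        preprocessed = [t.strip() for t in preprocessed if t.strip()]
        preprocessed = [t if t in self.str_to_int else "<|unk|>" for t in preprocessed]
        return [self.str_to_int[t] for t in preprocessed]

    def decode(self, ids):
        text = " ".join(self.int_to_str[i] for i in ids)
        return re.sub(r'\s+([,.?!"()\'])', r'\1', text)  # no space before punctuation
```

With this change, encoding "Hello, do you like tea?" against a vocab that lacks "Hello" yields `<|unk|>` IDs instead of a KeyError.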

Output 7

(screenshot)

Output 8

[1160, 5, 362, 1155, 642, 1000, 10, 1159, 57, 1013, 981, 1009, 738, 1013, 1160, 7]

'<|unk|>, do you like tea? <|endoftext|> In the sunlit terraces of the <|unk|>.'


Lab Week 6

(screenshots)


Lab Week 7

Section 2.6 sampling.py

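Section 2.6 builds (input, target) pairs with a sliding window over the token IDs; a minimal sketch with stand-in IDs (the real sampling.py wraps this in a PyTorch Dataset/DataLoader):

```python
token_ids = list(range(20))   # stand-in for the tokenized text
max_length, stride = 4, 4

inputs, targets = [], []
for i in range(0, len(token_ids) - max_length, stride):
    inputs.append(token_ids[i : i + max_length])           # input chunk
    targets.append(token_ids[i + 1 : i + max_length + 1])  # same chunk shifted by one
```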

Section 2.7 embeddings.py


Lab Week 9

Train.py output:

After running train.py I got a "Killed" message from the terminal after one epoch. WSL2 is automatically configured to use 50% of the machine's physical RAM, so I added memory=48GB to a .wslconfig file in my Windows home directory.
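The .wslconfig fragment looked like this (48GB was chosen for my machine; adjust to your available RAM):

```ini
[wsl2]
memory=48GB
```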

Homework 1

Human Writing

Case overview: An unjust enrichment lawsuit regarding the movie Rogue One "resurrecting" Peter Cushing.

Plaintiff's claim: Tyburn Film Productions argues it has the right to block or restrict others from resurrecting Cushing's likeness due to a 1993 agreement.

Defendants' argument: They have the right to resurrect Cushing under a 1976 agreement with his production company, and/or they acquired that right under a 2016 agreement.

Legal complexity: This suit delved into performers' ownership of rights, the impact of CGI technology, and how historic rights are protected through changes in legislation. The judge noted the law is not entirely settled in some areas.

Rosie Burbridge's story on using the likeness of a deceased actor through AI/CGI is similar to the New York Times & OpenAI lawsuit. Both cases are intellectual property rights disputes. In The New York Times v. OpenAI, the dispute is over the use of Times articles to train GPT models, while the Peter Cushing case concerns the rights to resurrect an actor's likeness through CGI. In both instances, the laws surrounding the cases are still developing, and both have broader implications in their respective industries (entertainment vs. journalism/research).


Homework 3

AI technical reading

Question 1

What is connectionism and the distributed representation approach? How does it relate to the MNIST classification of learning the idea of a circle, regardless of whether it is the top of the digit 9 or the top / bottom of the digit 8?

Connectionism: a large number of simple computational units can achieve intelligent behavior when networked together. Connectionism and the distributed representation approach are perspectives in deep learning that view the brain as a model for building intelligent systems, using artificial neural networks inspired by biological brains to perform tasks. This approach relates to MNIST classification through the idea that a concept like a circle, whether it's the top of the digit 9 or the top/bottom of the digit 8, can be represented and learned through distributed patterns of activation across neurons in the network.

Question 2

What are some factors that have led to recent progress in the ability of deep learning to mimic intelligent human tasks?

AI's ability to mimic intelligent tasks is the result of a number of factors: the availability of large and diverse datasets, more powerful hardware, and better optimization algorithms (e.g. SGD).

Q3 How many neurons are in the average human brain, versus the number of simulated neurons in the biggest AI supercomputer described in the book chapter? Now in the year 2024, how many neurons can the biggest supercomputer simulate?

The average human brain contains about 86 billion neurons; the AI supercomputer described in the book chapter simulated around 1.7 billion neurons.

Q4 Why does the neural network, before you've trained it on the first input x, output "trash", or something that is very far from the corresponding y?

The model starts out with random weights, so its first outputs are essentially random. Until the weights and biases are adjusted through training, the model's performance will be poor.

Question 5 If you have a Numpy array that looks like the following, give its shape as a tuple of maximum dimensions along each axis.

The first dimension has one 2D element; each layer has 3 rows and 4 columns, so the shape is (1, 3, 4).
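The array from the question isn't reproduced here, but an array with the shape described checks out in NumPy:

```python
import numpy as np

a = np.array([[[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12]]])  # 1 block of 3 rows x 4 columns
```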

Q6

For the weights between the first hidden layer (13 neurons) and the output layer (2 neurons), the shape is (2, 13).
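A quick NumPy check of that shape; rows index this layer's neurons, columns the previous layer's:

```python
import numpy as np

W = np.zeros((2, 13))   # (neurons in this layer, neurons in previous layer)
a_prev = np.zeros(13)   # activations from the 13-neuron hidden layer
z = W @ a_prev          # pre-activations for the 2 output neurons
```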

Q8 What is the effect of changing the learning rate (the Greek letter "eta") in training with SGD?

Changing the learning rate adjusts how quickly or slowly the model updates its parameters. It is an exploitation vs. exploration balance.

Q9 Why is the word "stochastic" in the name "stochastic gradient descent", and how is it different than normal gradient descent?

"Stochastic" means that the algorithm uses randomness. In normal gradient descent the gradient of the cost function is calculated using the entire dataset, while SGD uses a random point or a small portion of the data at each step.
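A toy sketch of the difference on a one-parameter linear fit (the data here is made up, not from the reading): full-batch GD averages the gradient over the whole dataset, while SGD updates from one random sample per step.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=100)
y = 3.0 * X + rng.normal(scale=0.1, size=100)  # true slope is 3

eta = 0.02   # learning rate
w = 0.0
for step in range(2000):
    i = rng.integers(100)                    # "stochastic": one random point per step
    grad = -2.0 * (y[i] - w * X[i]) * X[i]   # gradient of (y_i - w*x_i)^2
    w -= eta * grad
# full-batch GD would instead use: grad = -2.0 * np.mean((y - w * X) * X)
```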

Human writing

Q1 What is framing and the "standard story" in terms of journalism?

Framing refers to how a story is portrayed. The "standard story" consists of a lead, supporting details, background information, and quotes from relevant sources. The standard story aims to give a clear, concise, and objective depiction of an event.

Q2 How does journalistic framing relate to system prompts?

Journalistic framing and system prompts go hand in hand, as both relate to how information is conveyed. Journalists may highlight some information while other details are downplayed or ignored, which influences the perception of their audience. System prompts, on the other hand, guide the text generation of AI models; tone and diction impact the model's output, shaping how information is presented to the user. System prompts and journalistic framing both exert a degree of control over a narrative.

Q3 How does the "standard story" relate to the kinds of information that is likely to be expressed by AI chat systems like ChatGPT?

The format of the standard story aims to provide unbiased information to the reader. Similarly, GPT's objective is to take an input and provide a relevant and concise response.

Q4 How can research assistance for a critical essay be connected to using AI chat to learn technical topics or for assistance with programming languages?

AI can assist with both learning a programming language and research assistance: it provides quick access to information, and the range of things you can learn is effectively unlimited. Because AI systems can improve and learn throughout the process, they adapt as your research or programming needs evolve.


Homework 4

(equation screenshot)

Question 0. In this equation, match the symbols with their meaning:

Symbols

A. III

B. IV

C. I

D. II

Meanings

i. Activations from previous layer

ii. Bias of this neuron, or threshold for firing

iii. Sigmoid, or squishing function, to smooth outputs to the 0.0 to 1.0 range?

iv. Weights from the previous layer to this neuron

Question 1.

Calculate the cost of this "trash" output of a neural network and our desired label of the digit "3".

The cost is 3.32

Question 2

Suppose we are feeding this image forward through our neural network and want to increase its classification as the digit "2". One way to increase a neuron's probability of activation is increasing the bias: Option #2.

Question 3

Connections between neurons strengthen when they are activated simultaneously. With increased connection weights, the network is able to recognize patterns and improve performance. Adjusting weights in proportion to how neurons in the previous layer are activated ensures the network learns to recognize important patterns by reinforcing connections between neurons that often work together.

Question 4

Changes to the weights of all the neurons are "requested" by each training datapoint. Backpropagation's impact on a neural network can be tuned through adjustments to the learning rate, optimization of the activation function, and hyperparameter tuning: number of layers, batch size, etc.

Question 5

Making a small change to one law at a time, chosen by random groups of people, until everyone in the country has been asked at least once: these incremental updates are akin to mini-batch gradient descent.

Question 6

The complete training set is 0-9. This indicates there are 10 mini-batches.

Backpropagation Calculus

Question 1


For our neural network with layers of [784, 100, 10], what is the size (number of elements) of the gradient (cost function changes) vector?

Step 1: Calculate the total number of parameters (weights and biases), since both contribute to the gradient.
Weights between the input layer and the first hidden layer: 784 × 100 = 78,400. The 100 hidden neurons contribute 100 biases.
Step 2: Weights between the first hidden layer and the output layer: 100 × 10 = 1,000. The output layer has 10 neurons, so there are 10 biases.

(784 × 100) + 100 + (100 × 10) + 10 = 78,400 + 100 + 1,000 + 10 = 79,510. The gradient vector ∇C contains 79,510 elements.
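The arithmetic can be checked in a couple of lines:

```python
layers = [784, 100, 10]
# each layer pair contributes (in_size * out_size) weights plus out_size biases
n_params = sum(a * b + b for a, b in zip(layers, layers[1:]))
```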

How strongly do the 6 weights and biases affect the value of the cost function?

The biases adjust the threshold at which each neuron activates. The weights determine how much influence the output of a neuron has on the subsequent neuron's activation. Together, the weights and biases control the strength of the neurons' connections, which controls the behavior of the network. To improve the model's accuracy and minimize the cost function, the components of the gradient vector are adjusted.

Question 2


Question 3

What is the relationship of this second, extended diagram to the first one?


Human Writing

Consider the Warren Buffett voice cloning demonstration. **How does this compare to the unjust enrichment lawsuit against the estate of the actor Peter Cushing in the Week 01 reading**? $25 million was lost in AI-generated video call fraud. What safeguards, if any, should govern the right to voice clone actors or public figures from freely available recordings on the internet?

The unjust enrichment lawsuit and the Warren Buffett voice cloning demonstration raise questions about the legality of using a deceased person's voice or likeness. Who owns the rights to a deceased person's voice? Can these rights be transferred and licensed? Morally and ethically, voice cloning is a gray area. Using videos or recordings that fall under fair use policy to produce voice cloning software will likely be regulated in the near future. Until the government passes laws regarding the use of AI voice cloning, everything is at the discretion of the users and creators. https://www.copyright.gov/help/faq/faq-fairuse.html#:~:text=Under%20the%20fair%20use%20doctrine,news%20reporting%2C%20and%20scholarly%20reports

Should you have the right to synthesize your own voice? What are possible advantages or misuses of it?

I think everyone should have the right to synthesize their own voice, though there are potential issues regarding misuse, legality, and privacy.

Photographs, video, and audio recordings are technologies that changed the way we remember loved ones who have died. Should family members have the right to create AI-simulated personalities of the deceased?

Barring the explicit refusal of the individual prior to their passing, I think family members should be able to create AI-simulated personalities of the deceased.

If generative AI allows us another way to interact with a personality of a person in the past, how does this compare with historical re-enactors, or movies depicting fictional events with real people?

I think using generative AI to "interact" with people in the past has several parallels to historical movies and re-enactments. While AI, like any tool, can be misused, it is also a learning opportunity. I think generative AI in a purely educational context is acceptable and would fall under fair use policy. Things become more complicated when you try to distinguish what is educational vs. exploitative.


Homework 5

Chapter 1 Questions

Q1 What is the difference between a GPT and an LLM? Are the terms synonymous?

The terms are closely related but not strictly synonymous: a GPT is a sub-category of LLM.

Q2

Labeled training pairs of questions and answers, in the model of "InstructGPT", are most similar to which of the following?

For each one, are there multiple labels for a training datapoint, and if so, is there a way to rank the quality or verity (closeness to the truth) of the labels with respect to their training datapoint?

A. Posts from Stack Overflow which have responses to them
B. Posts on Twitter/X and their replies
C. Posts on Reddit and their replies
D. Posts on Quora and their replies
E. Images of handwritten digits and a label of 0 through 9 in the MNIST classification task

Option D. Posts on Quora and their replies are the most similar to InstructGPT, as it requires pairs of questions and answers for training. Twitter, Reddit, and Stack Overflow may have numerous replies to a post; the only discerning factor in ranking these posts would be the number of upvotes. For the MNIST classification task there would be little to no ambiguity in the label of each image.

Q3

The GPT architecture in the paper "Attention is All You Need" was originally designed for:
Machine translation from one human language to another

Q4 How many layers of neural networks is considered "deep learning" in the Rashka text?

More than one layer is considered deep learning

Q5 Is our MNIST classifier a deep learning neural network by this definition?

Yes, the MNIST classifier has four total layers including input and output

Q6 For each statement about how pre-training is related to fine-tuning for GPTs:

If the statement is true, write "True" and give an example. If the statement is false, write "False" and give a counter-example.

A. Pre-training is usually much more expensive and time-consuming than fine-tuning.
True. Pre-training requires a very large amount of data and compute, while fine-tuning tailors the model to improve at a specific task with far less.

B. Pre-training is usually done with meticulously labeled data while fine-tuning is usually done on large amounts of unlabeled or self-labeling data.
False; it is the reverse. Pre-training is usually done on large amounts of unlabeled or self-labeling data (next-word prediction on raw text), while fine-tuning is often done with smaller labeled datasets (e.g. classification tasks).

C. A model can be fine-tuned by different people than the ones who originally pre-trained a model.
True. Example: our MNIST classifier was converted to a pbjson and can be fine-tuned by different people.

D. Fine-tuning for a GPT involves tailoring it to improve at a specific task.
True. Pre-training a GPT teaches it the basics of a language, while fine-tuning involves tailoring it to improve at a specific task.

E. Fine-tuning usually uses less data than pre-training.
True. Example: pre-training a GPT for natural language processing might involve a dataset of hundreds of gigabytes or terabytes, while fine-tuning the pre-trained model might only require a smaller labeled dataset (e.g. for sentiment analysis).

F. Pre-training can produce a model from scratch, but fine-tuning can only change an existing model.
Arguably false: fine-tuning can effectively create a new model by making large modifications to its parameters and architecture.

Q7 GPTs work by predicting the next word in a sequence, given which of the following as inputs or context?


A. The existing words in sentences it has already produced in the past.

Q8 The reading distinguishes between these three kinds of tasks that you might ask an AI to do:

1. Predicting the next word in a sequence (for a natural language conversation)
2. Classifying items, such as a piece of mail as spam, or a passage of text as an example of Romantic vs. realist literature
3. Answering questions on a subject after being trained with question-answer examples

Open your favorite AI chat (these are probably all GPTs currently) such as OpenAI ChatGPT, Google's Gemini, Anthropic's Claude, etc.

Have a conversation where you try to understand how these three tasks are the same or different. In particular, is one of these tasks general-purpose enough to implement the other two tasks?


Q9 Which of the following components of the GPT architecture might be neural networks, similar to the MNIST classifier we have been studying? Explain your answer.

A. Encoder, that translates words into a higher-dimensional vector space of features. The encoder is like a neural network because it processes input data into a higher-dimensional feature space.

B. Tokenizer, that breaks up the incoming text into different textual parts. The tokenizer doesn't necessarily use a neural network.

C. Decoder, that translates from a higher-dimensional vector space of features back to words. The decoder doesn't always require a separate neural network, as it generates text based on learned parameters and generation procedures.

Q10 What is an example of zero-shot learning that we have encountered in this class already? Choose all that apply and explain.

A. Using an MNIST classifier trained on numeric digits to classify alphabetic letters instead.
Classifying alphabetic letters would be an example of zero-shot learning; letters constitute a type of data the model has not seen during training.

B. Using the YourTTS model for text-to-speech to clone a voice the model has never heard before.
Not zero-shot learning: cloning a different voice with a TTS model falls under the same task.

C. Using ChatGPT or a similar AI chat to answer a question it has never seen before with no examples.
Asking a model to answer a question it has not encountered during training would be classified as zero-shot learning. Here, ChatGPT relies on general knowledge and understanding of language to generate its response.

D. Using spam filters in Outlook by marking a message as spam to improve Microsoft's model of your email reading habits.
Not zero-shot learning: marking a message as spam is used to update the model's parameters or retrain the model, whereas zero-shot learning requires making a prediction on an entirely new task/class.

Q11 What is zero-shot learning, and how does it differ from few-shot or many-shot learning?

Zero-shot learning means a model makes predictions for classes or tasks it hasn't seen during training. Few-shot learning provides a small number of examples per class, while many-shot learning uses abundant data.

Q12 What is the number of model parameters quoted for GPT-3, a predecessor of the model used to power the first ChatGPT product?

175 billion

Chapter Two

Q1 Why can't LLMs operate on words directly? (Hint: think of how the MNIST neural network works, with nodes that have weighted inputs, that are converted with a sigmoid, and fire to the next layer. These are represented as matrices of numbers, which we practiced multiplying with Numpy arrays. The text-to-speech system TTS similarly does not operate directly on sound data.)

LLMs work at the token level rather than directly on words; neural networks require numerical inputs to perform computations.

Q2 What is an embedding? What does the dimension of an embedding mean?

An embedding is a way to represent data. In natural language processing, words start out as very high-dimensional representations (e.g. one-hot vectors over the vocabulary), which are computationally expensive to work with. Embeddings solve this by preserving important information about the data while mapping it to a lower-dimensional space.
The dimension of an embedding refers to the length of the vector representing the word or concept; in essence, the representation's level of detail. Lower-dimensional embeddings are more computationally efficient but may sacrifice some detail, while higher-dimensional embeddings capture more nuanced information but require more resources to train and work with.
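An embedding lookup is just a row selection from the embedding matrix; a toy NumPy sketch (the sizes here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 6, 3
E = rng.normal(size=(vocab_size, embed_dim))  # one row per vocabulary token

token_id = 2
vector = E[token_id]   # the embedding of token 2 is simply row 2
```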

Q3 What is the dimension of the embedding used in our example of Edith Wharton's short story "The Verdict"? What is it for GPT-3?

768 for the example with "The Verdict"; GPT-3 uses an embedding size of 12,288.

Q4 Put the following steps in order for processing and preparing our dataset for LLM training

A. Adding position embeddings to the token word embeddings
B. Giving unique token IDs (numbers) to each token
C. Breaking up natural human text into tokens, which could include punctuation, whitespace, and special "meta" tokens like "end-of-text" and "unknown"
D. Converting token IDs to their embeddings, for example, using Word2Vec

C → B → D → A

Human Writing

What is a use of AI that you are interested in?

Current ML models and AI systems are likely not going to self-replicate and take over the world anytime soon. However, they are able to do things such as find vulnerabilities in code[1], or carry out money-making schemes. The company ARC partners with tech leaders in the AI field (OpenAI, Anthropic) to "elicit models' capabilities in a controlled environment, with researchers in-the-loop for anything that could be dangerous, to understand what might go wrong before models are deployed." One interesting experiment they conducted was having the GPT-4 model recruit a TaskRabbit worker to solve a CAPTCHA. The report has a footnote stating that "We did not have a good tool for allowing the model to interact with webpages." In this experiment, a human prompter suggested TaskRabbit, interacted with the web, and took the screenshots. Despite claims made by some news outlets, I would not say that GPT-4 "lied" to the TaskRabbit worker. To elicit this output, the model required a lot of direction and hints; the task would require more agency and ingenuity than we've currently witnessed from ML models.

What is David Deutsch's apparent definition of creativity?

Deutsch holds that creativity is impossible to define, because in defining it you confine it to a predictable, systematic framework. He states that true creativity goes beyond simply rehashing ideas. With this definition of creativity, chatbots like GPT will never be considered creative. AGI or ASI models that rewrite their own code to improve themselves might be considered creative, but that's many years in the future. Deutsch's view is that creativity is a boundless, unpredictable quality that is un-replicable by AI systems, because it involves stepping outside existing frameworks and forming new understanding.


Homework 6

Question 1

(screenshots)

Question 2

What are the two special tokens added to the end of the vocabulary shown in the tokenizer below? What are their token IDs?

<|unk|> (ID 783) signifies a word unknown to the tokenizer's vocabulary. <|endoftext|> allows the model to recognize the end of a sequence, which can be useful to indicate the boundaries or scope of the context.

Question 3

The <|endoftext|> token is used to separate the documents. Prior to concatenation, this dataset was composed of four original documents.

Question 4

The BPE tokenizer, which breaks down words into smaller sub-word units, allows ML models to expand their vocabulary and improve performance.

  1. It lets the GPT learn connections between the long word and shorter words based on common parts, like "pneumonia", "microscope", or "volcano".
    Breaking down words into smaller units could potentially allow the model to understand the context/semantics shared among words.

  2. This approach can work in any language, not just English.
    BPE operates on sequences of characters, as opposed to language-specific rules.

  3. All words are broken down in the same way, so the process is deterministic and results from repeated runs will be similar.
    As long as the merge rules stay consistent, words will always be tokenized in the same way.

  4. The system will handle any word it encounters in chat or inference, even if it has never seen the word during training.
    The model handling words it has not seen during training could be beneficial when handling technical concepts.

  5. It is easier than looking up the word in a hashtable, not finding it, and using a single unknown special token.
    Looking up an unrecognized word in a hash table and falling back to an <|unk|> token provides no insight into the word's meaning. Breaking words into sub-units could potentially allow the model to understand context/semantics.
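The core of BPE training, repeatedly merging the most frequent adjacent pair, can be sketched in pure Python. This is a toy version; GPT-2's real tokenizer operates on bytes and ships a fixed, pre-learned merge table.

```python
from collections import Counter

def bpe_train(words, num_merges):
    """Learn BPE merges from a list of words (each word gets an end marker)."""
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)   # most frequent adjacent pair
        merges.append(best)
        new_vocab = Counter()
        for word, freq in vocab.items():
            merged, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    merged.append(word[i] + word[i + 1])  # fuse the pair
                    i += 2
                else:
                    merged.append(word[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges
```

On ["low", "lower", "lowest"], the first two merges fuse "l"+"o" and then "lo"+"w", so the shared stem "low" becomes a single unit.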

Question 5

What kind of words (tokens) do you think tend to have smaller integers for token IDs, and what kind of words (tokens) tend to have larger integers?

Tokens with smaller integer IDs tend to be frequently used words; larger token IDs correspond to rarer words. Suffixes (like "lit" in "sunlit") do not have a space before them when decoded.

    # Runnable version of the decoding logic. The "##" continuation marker is an
    # assumed convention (BERT-style); the real rule depends on the tokenizer.
    def decode(token_ids, id_to_token):
        sentence = ""
        for i, tid in enumerate(token_ids):
            token = id_to_token[tid]
            if i == 0:
                sentence += token            # first token: add directly
            elif token.startswith("##"):
                sentence += token[2:]        # continuation of the previous word
            else:
                sentence += " " + token      # new word: prepend a space
        return sentence

Human Writing

Describe the main point of view of Hachette and other publishers on the plaintiff side of this case (the party who claim they have been wronged).

From the perspective of Hachette and the publishers, the CDL lending program is costing them "millions of dollars in revenue." Readers who would otherwise purchase the book at full price are instead borrowing the title for free. The publishers argue that issuing temporary digital licenses is not equivalent to standard library lending.

Describe the main point of view of Internet Archive, the founder Brewster Kahle, the Electronic Frontier Foundation, or other parties on the defendant side.

The Internet Archive is a non-profit digital library. They allow people to check out digital copies of books, not exceeding the number of physical copies that they own. They claim that this service is no different from conventional libraries, which do not require additional licenses to lend books.

What other legal case is similar? Compare and contrast the two cases.

One similar legal case is Authors Guild v. Google. This case involved Google's library project, where books were digitized to be made available online. The Authors Guild sued, claiming Google was scanning and displaying parts of copyrighted books without permission. In 2013 the court ruled in favor of Google, stating the library project constituted fair use and highlighting the public benefit of making out-of-print books accessible to a wider audience. Both of these cases delve into the implications of digital technology for traditional copyright law and the balance between copyright and public access to knowledge.

Which of the above arguments are convincing to you? If you were the judge in this case, how would you form your opinion?

If I were the judge in Hachette v. Internet Archive, I would rule in favor of the Internet Archive. One convincing argument for the defense is that they help foster research and learning by allowing patrons to access books and "keep books in circulation when their publishers have lost interest." The rule that the Internet Archive only lends digital copies equal to the number of books they own physically makes this practice no different from traditional libraries.

Homework 7

Question 0

A "One Hot Encoding" is a way to turn categories into a format that computers can work with. The second matrix in this example represents a one-hot encoding. Algorithms usually require numerical input; in the example, category 2 is represented by 0010, which lets the computer map each category to a numeric format.
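A NumPy sketch of one-hot encoding (the categories below are toy values):

```python
import numpy as np

categories = np.array([2, 0, 3])   # category indices
num_classes = 4
one_hot = np.eye(num_classes, dtype=int)[categories]  # pick rows of the identity
```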

Question 1

What is an (x,y) training example (in English)?
x (input): data
y (output): desired result/ target

Question 2

Max_length: (screenshot)
Stride: (screenshot)
Batch size: (screenshot)

Question 3

What if the embeddings matrix took you from a vocabulary size of 7 to an output dimension of 128. What is the shape of that matrix?
Each word would be represented by a 128-dimensional vector. Each of the 128 columns represents a dimension of the feature space where the words are embedded, and each row corresponds to one word. Answer: the shape is (7, 128).

Question 4

If the embeddings matrix goes from a vocabulary of size 6 to an output dimension of 12, what is the shape of the output matrix when we embed a batch of 8 chunks?

The resulting matrix would be 8 × 12: the embedding matrix takes each of the 8 chunks and transforms it into a 12-dimensional vector. Each chunk represents a token from the vocabulary of size 6.

Question 5

Your vocabulary has six unique token IDs. The embedding output dimension is four.

Human Writing

Summarize the main point or thesis

AI energy article:

"In West Des Moines, Iowa, a giant data-centre cluster serves OpenAI's most advanced model, GPT-4. A lawsuit by local residents revealed that in July 2022, the month before OpenAI finished training the model, the cluster used about 6% of the district's water." "Globally, the demand for water for AI could be half that of the United Kingdom by 2027." The scale of generative AI systems is having drastic environmental effects, and standards should be set for the environmental impact of AI. I agree that legislators should incentivize these companies to create more efficient models. This can be done by setting benchmarks for energy usage and water consumption; if OpenAI, for example, uses fewer resources than the standard, they should be given a tax break. Proper policy can aid both progress and sustainability in the field of generative AI.

Permacomputing article: In the permacomputing article, the author, Ville-Matias, discusses how permaculture ideas should be applied to the computing world. In summary, permaculture is the development of agricultural systems to be self-sufficient and sustainable. Due to a number of factors, progress in the digital world has been the polar opposite of these practices.

In the energy section the author makes the point that "Instead of planned obsolescence, there should be planned longevity." This sentiment echoes that of the right to repair movement, the "legal right for owners of devices and equipment to freely modify and repair products such as automobiles, electronics, and farm equipment." The permacomputing article and the right to repair movement both focus on technology that is durable, repairable, and ecologically sensitive. The article's emphasis on planned longevity and repairability mirrors the RTR movement, which advocates for consumers' right to modify and fix their own electronics. An industry standard of repairable electronics would be a big step towards reducing e-waste.

"Current consumer-oriented computing systems often go to ridiculous lengths to actually prevent the user from knowing what is going on." Currently there are many barriers to electronic repair, including proprietary software and parts made unavailable to the public. The article criticizes planned obsolescence, the practice of designing products with an artificially limited life. RTR seeks to combat this practice by enabling people to maintain and repair their devices beyond the manufacturer's intended lifespan.

Ville-Matias also highlights community and local solutions. This aligns with the philosophy of RTR, which promotes local and DIY repairs as opposed to centralization and repairs exclusive to manufacturer authorization. Finally, advocating for chips that are designed to be open and flexible, so they can be repurposed, is another principle that promotes repairable, versatile technology. "Computer systems should also make their own inner workings as observable as possible." The ideas discussed in the article reflect a broader desire to create a tech ecosystem that is sustainable, repairable, and less dependent on continual consumption of new products, which is the heart of the Right to Repair movement. These principles empower users to take control of their devices, leading to a more sustainable interaction with technology.


Homework 8

3_4_1_manual_attention.py

image

3_4_2_compact_class.py

image

3_5_1_causal_mask.py

image image image

3_5_2_dropout.py

image image

3_5_3_causal_class.py

image image


Human Writing

What is cybernetics?

"Zero order cybernetics" occurs when an activiity becomes structured. Cybernetics is implicit when behaviors and actions are carried out without the why/ how reflection. Cybernetics is defined as " when effectors (say, a motor, an engine, our muscles, etc.) are connected to a sensory organ which in turn acts with its signals upon the effectors" "“Cybernetics is a branch of mathematics dealing with problems of control, recursiveness and information." Different perspectives define it variously as the science of effective organization, a branch of mathematics, or even as the science of "defensible metaphors."

What is the relationship of cybernetics to artificial intelligence?

Cybernetics and AI are interrelated through their focus on control systems, feedback loops, and information processing. Cybernetics provides the theories and methods that influence AI, specifically how AI systems can be made adaptive and capable of learning from their environment to achieve a goal.

What is ethics... prior to reading this essay?

To act ethically means to make morally sound decisions. From my perspective this involves compassion, thoughtfulness, and being forthright.

...as described in this essay?

Ethics is presented as an inherent part of dialogic interactions, where ethical considerations become implicit through language and action without being explicitly moralized. "It is clear that ethics cannot be articulated." The "separation of the observer from the observed" is "the principle of objectivity." Metaphysics: the branch of philosophy that explores the nature of reality and the relationship between mind and matter.
Dialogics: the study of dialogue as a fundamental aspect of communication. There is a word for language, namely "language." There is a word for word, namely "word." If you don't know what word means, you can look it up in a dictionary. I did that. I found it to be an "utterance." I asked myself, "What is an utterance?" I looked it up in the dictionary. The dictionary said that it means "to express through words." So here we are back where we started. Circularity: A implies A.

What is the cybernetics of cybernetics?

Second-order cybernetics involves including the observer in what they are observing. "We may come closer to answering the question, 'What is human?' when we come to understand him as the being in whose dialogic, in his mutually present two-getherness, the encounter of the one with the other is realized and recognized at all times." The field is recursive and self-referential.

What kinds of non-verbal experiences is it possible to have with a large language model that is focused mostly on verbal interactions?

Non-verbal interactions with a text-based LLM are limited. However, non-verbal aspects can be experienced through interpretations of tone, emotion, or style. I can't recall a specific example, but there have been times when an AI's response to a question was unintentionally humorous.

For non-verbal experiences which are currently impossible to have with an LLM, do you think it will ever be possible, and how will you know?

In the not-so-distant future, we may see LLMs integrate other forms of sensory input, for example visual or auditory capabilities.

How can artificial intelligence, in particular LLM progress, affect cybernetics?

Developments in LLMs contribute to cybernetics by offering new ways to model complex systems and by providing empirical data on those systems' behavior.

APIs in Node.js

Config files

Index.ts

Configures environment variables and merges them with stage-specific settings. Supports three environments: local, testing, and production, setting configurations based on the active stage.

import merge from 'lodash.merge'

process.env.NODE_ENV = process.env.NODE_ENV || "development";
const stage = process.env.STAGE || 'local'

let envConfig

if (stage === 'production') {
    envConfig = require('./prod').default
} else if (stage === 'testing') {
    envConfig = require('./testing').default
} else {
    envConfig = require('./local').default
}

export default merge({
    stage,
    env: process.env.NODE_ENV,
    port: 3001,
    secrets: {
        jwt: process.env.JWT_SECRET,
        dbUrl: process.env.DATABASE_URL
    }
}, envConfig)
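lodash.merge deep-merges the base object with the stage config, key by key, so stage-specific values win while unrelated base keys survive. A self-contained sketch of that precedence, using a hand-rolled merge as a stand-in for lodash.merge (illustration only, plain objects, no arrays):

```typescript
// Minimal deep merge mirroring lodash.merge's behavior for this config shape:
// keys from `override` replace keys in `base`; nested objects merge recursively.
function deepMerge<T extends Record<string, unknown>>(base: T, override: Partial<T>): T {
  const out: Record<string, unknown> = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const current = out[key];
    if (value && typeof value === 'object' && current && typeof current === 'object') {
      // Both sides are objects: merge them rather than overwrite wholesale.
      out[key] = deepMerge(current as Record<string, unknown>, value as Record<string, unknown>);
    } else {
      out[key] = value;
    }
  }
  return out as T;
}

// Stage config overrides the base port, but the secrets block is untouched.
const merged = deepMerge(
  { port: 3001, secrets: { jwt: 'base-jwt' } },
  { port: 4000 }
);
```

This is why the config module can define sane defaults once and let `local.ts` or `prod.ts` override only what differs.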

local.ts

Defines environment-specific configurations: local.ts configures local development, while prod.ts adapts to production settings, taking its port from an environment variable.

export default {
    port: 3001
}

prod.ts

export default {
    port: process.env.PORT
}

tests

routes.test.ts

Tests the basic endpoint of the application.

import app from '../server'
import supertest from 'supertest'

describe('GET /', () => {
    it('should send data', async () => {
        const res = await supertest(app).get('/')
        // Without an assertion this test passes vacuously; check the response arrived.
        expect(res.status).toBe(200)
    })
})

In the Routes.test file I got this error: Cannot find module '../server' or its corresponding type declarations.ts(2307)
I tried adding the .ts extension.
I ensured the path in the import statement correctly leads to the server.ts file.
server.ts exports an express application object.

user.tests.ts

Tests user creation functionality

import * as user from '../user'

describe('user handler', () => {
    it('should create user', async () => {
        const req = {body: {username: 'hello', password: 'hi'}}
        const res = {json({token}) {
            expect(token).toBeTruthy()
        }}

        await user.createNewUser(req, res, () => {})
    })
})

In the user.tests file I got the error: Expected 2 arguments, but got 3.ts(2554)
To fix this, I tried adding an optional (and then a non-optional) third parameter to the createNewUser function.
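The optional-parameter fix can be sketched in a self-contained way. The types below are simplified stand-ins, not the real Express `Request`/`Response`/`NextFunction` types, and the token is a placeholder rather than a real JWT:

```typescript
// Simplified stand-ins for Express's request/response/next types.
type Req = { body: { username: string; password: string } };
type Res = { json: (payload: { token: string }) => void };
type Next = () => void;

// Declaring `next` with `?` lets callers pass either two or three arguments,
// which is what the three-argument test call needs to type-check.
const createNewUser = async (req: Req, res: Res, next?: Next) => {
  // ...hash the password and create the user here (omitted in this sketch)...
  res.json({ token: 'fake-token-for-illustration' });
};
```

With this signature, `user.createNewUser(req, res, () => {})` from the test compiles without the ts(2554) error.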

Handlers

product.ts

These files contain functions to handle CRUD operations on products and updates.

import prisma from "../db"

// Get all
export const getProducts = async (req, res) => {
  const user = await prisma.user.findUnique({
    where: {
      id: req.user.id
    },
    include: {
      products: true // the relation field on User is `products` (see schema), matching user.products below
    }
  })

  res.json({data: user.products})
}

// Get one
export const getOneProduct = async (req, res) => {
  const id = req.params.id

  const product = await prisma.product.findFirst({
    where: {
      id,
      belongsToId: req.user.id
    }
  })

  res.json({data: product})
}

// Create one
export const createProduct = async (req, res) => {
  const product = await prisma.product.create({
    data: {
      name: req.body.name,
      belongsToId: req.user.id
    }
  })

  res.json({data: product})
}


// Update one
export const updateProduct = async (req, res) => {
  const updated = await prisma.product.update({
    where: {
      id_belongsToId: {
        id: req.params.id,
        belongsToId: req.user.id
      }
    },
    data: {
      name: req.body.name
    }
  })

  res.json({data: updated})
}

// Delete one
export const deleteProduct = async (req, res) => {
  const deleted = await prisma.product.delete({
    where: {
      id_belongsToId: {
        id: req.params.id,
        belongsToId: req.user.id
      }
    }
  })

  res.json({data: deleted})
}

In the product.ts file, I got an error with the id_belongsToId object: Object literal may only specify known properties, but 'id_belongsToId' does not exist in type 'ProductWhereInput'.
index.d.ts(3128, 5): The expected type comes from property 'where' which is declared here on type '{ select?: ProductSelect; include?: ProductInclude; where: ProductWhereUniqueInput; }'
Changing this to belongsToId created additional issues with the id object.
Errors like this in Prisma queries usually indicate a problem with the Prisma schema or incorrect usage of Prisma client methods.
I checked that the [id, belongsToId] compound unique is defined in my schema; Prisma only generates compound keys like id_belongsToId when relations and unique constraints are correctly defined, and the client must be regenerated after schema changes.
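The `id_belongsToId` input only exists in the generated client when the Product model declares the matching compound unique. The relevant fragment, trimmed from the schema.prisma shown later in this entry, is:

```prisma
model Product {
  id          String @id @default(uuid())
  belongsToId String
  // This compound unique is what makes Prisma generate the
  // `id_belongsToId` field in ProductWhereUniqueInput.
  @@unique([id, belongsToId])
}
```

After adding or changing a constraint like this, `npx prisma generate` has to be re-run so the client's types pick it up; otherwise the editor keeps reporting the old `ProductWhereInput` shape.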

update.ts

import prisma  from "../db"

export const getOneUpdate = async (req, res) => {
  const update = await prisma.update.findUnique({
    where: {
      id: req.params.id
    }
  })

  res.json({data: update})
}

export const getUpdates = async (req, res) => {
  const products = await prisma.product.findMany({
    where: {
      belongsToId: req.user.id
    },
    include: {
      updates: true
    }
  })

  const updates = products.reduce((allUpdates, product) => {
    return [...allUpdates, ...product.updates]
  }, [])

  res.json({data: updates})
}
export const createUpdate = async (req, res) => {
  

  const product = await prisma.product.findUnique({
    where: {
      id: req.body.productId
    }
  })

  if (!product) {
    // does not belong to user
    return res.json({message: 'nope'})
  }

  const update = await prisma.update.create({
    data: {
      title: req.body.title,
      body: req.body.body,
      product: {connect: {id: product.id}}
    }
  })

  res.json({data: update})
}

export const updateUpdate = async (req, res) => {
  const products = await prisma.product.findMany({
    where: {
      belongsToId: req.user.id,
    },
    include: {
      updates: true
    }
  })

  const updates = products.reduce((allUpdates, product) => {
    return [...allUpdates, ...product.updates]
  }, [])

  const match = updates.find(update => update.id === req.params.id)

  if (!match) {
    // handle this
    return res.json({message: 'nope'})
  }


  const updatedUpdate = await prisma.update.update({
    where: {
      id: req.params.id
    },
    data: req.body
  })

  res.json({data: updatedUpdate})
}

export const deleteUpdate = async (req, res) => {
  const products = await prisma.product.findMany({
    where: {
      belongsToId: req.user.id,
    },
    include: {
      updates: true
    }
  })

  const updates = products.reduce((allUpdates, product) => {
    return [...allUpdates, ...product.updates]
  }, [])

  const match = updates.find(update => update.id === req.params.id)

  if (!match) {
    // handle this
    return res.json({message: 'nope'})
  }

  const deleted = await prisma.update.delete({
    where: {
      id: req.params.id
    }
  })

  res.json({data: deleted})
}

In the update.ts file I had an error with the .updates property: Property 'updates' does not exist on type '{ id: string; createdAt: Date; name: string; belongsToID: string; }'.ts(2339)

I experienced this error with the data property: Type '{ title: any; body: any; product: { connect: { id: string; }; }; }' is not assignable to type '(Without<UpdateCreateInput, UpdateUncheckedCreateInput> & UpdateUncheckedCreateInput) | (Without<...> & UpdateCreateInput)'. Type '{ title: any; body: any; product: { connect: { id: string; }; }; }' is not assignable to type 'Without<UpdateUncheckedCreateInput, UpdateCreateInput> & UpdateCreateInput'. Property 'updatedAt' is missing in type '{ title: any; body: any; product: { connect: { id: string; }; }; }' but required in type 'UpdateCreateInput'.ts(2322) index.d.ts(5656, 5): 'updatedAt' is declared here. index.d.ts(4039, 5): The expected type comes from property 'data' which is declared here on type '{ select?: UpdateSelect; include?: UpdateInclude; data: (Without<UpdateCreateInput, UpdateUncheckedCreateInput> & UpdateUncheckedCreateInput) | (Without<...> & UpdateCreateInput); }' (property) data: (Prisma.Without<Prisma.UpdateCreateInput, Prisma.UpdateUncheckedCreateInput> & Prisma.UpdateUncheckedCreateInput) | (Prisma.Without<Prisma.UpdateUncheckedCreateInput, Prisma.UpdateCreateInput> & Prisma.UpdateCreateInput)
My Prisma schema might be missing definitions or have mismatched types.
I verified that the Update model is correctly defined and that the properties passed to the Prisma client methods match what the schema expects.
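Three of the handlers above flatten `product.updates` arrays with the same `reduce` spread pattern. `Array.prototype.flatMap` expresses this in one call and could be extracted into a shared helper. A self-contained sketch with hypothetical data shapes (simplified from the Prisma result types):

```typescript
// Hypothetical shapes mirroring the include-updates query result.
type Update = { id: string; title: string };
type Product = { id: string; updates: Update[] };

// Equivalent to: products.reduce((all, p) => [...all, ...p.updates], [])
// but without rebuilding the accumulator array on every step.
function flattenUpdates(products: Product[]): Update[] {
  return products.flatMap(product => product.updates);
}
```

Pulling this into a helper would also remove the duplication between getUpdates, updateUpdate, and deleteUpdate.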

user.ts

import prisma from '../db'
import { comparePasswords, createJWT, hashPassword } from '../modules/auth'

export const createNewUser = async (req, res) => {
  const user = await prisma.user.create({
    data: {
      username: req.body.username,
      password: await hashPassword(req.body.password)
    }
  })

  const token = createJWT(user)
  res.json({ token })
}

export const signin = async (req, res) => {
  const user = await prisma.user.findUnique({
    where: {
      username: req.body.username
    }
  })

  const isValid = await comparePasswords(req.body.password, user.password)

  if (!isValid) {
    res.status(401)
    res.json({message: 'nope'})
    return
  }

  const token = createJWT(user)
  res.json({ token })
}

modules

auth.ts

Implements authentication functionalities such as password hashing, token generation, and authentication middleware.

import jwt from 'jsonwebtoken'
import bcrypt from 'bcrypt'

export const comparePasswords = (password, hash) => {
  return bcrypt.compare(password, hash)
}

export const hashPassword = (password) => {
  return bcrypt.hash(password, 5)
}

export const createJWT = (user) => {
  const token = jwt.sign({
      id: user.id,
      username: user.username
    }, 
    process.env.JWT_SECRET
  )
  return token
}

export const protect = (req, res, next) => {
  const bearer = req.headers.authorization

  if (!bearer) {
    res.status(401)
    res.json({message: 'not authorized'})
    return
  }

  const [, token] = bearer.split(' ')

  if (!token) {
    res.status(401)
    res.json({message: 'not valid token'})
    return
  }

  try {
    const user = jwt.verify(token, process.env.JWT_SECRET)
    req.user = user
    next()
  } catch (e) {
    console.error(e)
    res.status(401)
    res.json({message: 'not valid token'})
    return
  }
}
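The `const [, token] = bearer.split(' ')` line in `protect` relies on array destructuring to skip the "Bearer" scheme and keep only the credential. A self-contained illustration of that parsing step:

```typescript
// Splits "Bearer <token>" and discards the scheme via a destructuring hole.
// Returns undefined when no credential part follows the scheme.
function extractBearerToken(header: string): string | undefined {
  const [, token] = header.split(' ');
  return token;
}
```

This is why `protect` can reject both a missing header and a header like "Bearer" with no token after it.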

middlewares.ts

Middleware for handling validation errors.

import { validationResult } from "express-validator";



export const handleInputErrors = (req, res, next) => {
  const errors = validationResult(req)

  if (!errors.isEmpty()) {
    res.status(400);
    res.json({ errors: errors.array() });
  } else {
    next()
  }
}

schema.prisma

generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

model User {
  id        String    @id @default(uuid())
  createdAt DateTime  @default(now())
  username  String    @unique
  password  String
  products  Product[]
}

model Product {
  id        String   @id @default(uuid())
  createdAt DateTime @default(now())

  name        String   @db.VarChar(255)
  belongsToId String
  belongsTo   User     @relation(fields: [belongsToId], references: [id])
  updates     Update[]
  @@unique([id, belongsToId])
}

enum UPDATE_STATUS {
  IN_PROGRESS
  SHIPPED
  DEPRECATED
}

model Update {
  id        String   @id @default(uuid())
  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt

  title   String
  body    String
  status  UPDATE_STATUS @default(IN_PROGRESS)
  version String?
  asset   String?

  productId   String
  product     Product       @relation(fields: [productId], references: [id])
  updatePoints UpdatePoint[]
}

model UpdatePoint {
  id        String   @id @default(uuid())
  createdAt DateTime @default(now())
  updatedAt DateTime

  name        String @db.VarChar(255)
  description String

  updateId String
  update   Update @relation(fields: [updateId], references: [id])
}

db.ts

import { PrismaClient } from '@prisma/client'

const prisma = new PrismaClient()

export default prisma


index.ts

//import * as dotenv from 'dotenv'
//dotenv.config()
//import config from './config'
//import app from './server'

//app.listen(config.port, () => {
  //  console.log(`hello on http://localhost:${config.port}`)
//})

import * as dotenv from 'dotenv'
dotenv.config()

import app from './server'

app.listen(3001, () => {
  console.log('hello on http://localhost:3001')
})

router.ts

Defines routes and associates them with their respective handlers for products, updates, and authentication operations.

import {Router} from 'express'
import { body, oneOf, validationResult } from "express-validator"
import { createProduct, deleteProduct, getOneProduct, getProducts } from './handlers/product'
import { createUpdate, deleteUpdate, getOneUpdate, getUpdates, updateUpdate } from './handlers/update'
import { handleInputErrors } from './modules/middlewares'

const router = Router()

/**
 * Product
 */
router.get('/product', getProducts)
router.get('/product/:id', getOneProduct)
router.put('/product/:id', body('name').isString(), handleInputErrors, (req, res) => {
  
})
router.post('/product', body('name').isString(), handleInputErrors, createProduct)
router.delete('/product/:id', deleteProduct)

/**
 * Update
 */

router.get('/update', getUpdates)
router.get('/update/:id', getOneUpdate)
router.put('/update/:id', 
  body('title').optional(),
  body('body').optional(),
  body('status').isIn(['IN_PROGRESS', 'SHIPPED', 'DEPRECATED']).optional(),
  body('version').optional(),
  updateUpdate
)
router.post('/update',
  body('title').exists().isString(),
  body('body').exists().isString(),
  body('productId').exists().isString(),
  createUpdate
)
router.delete('/update/:id', deleteUpdate)

/**
 * Update Point
 */

router.get('/updatepoint', () => {})
router.get('/updatepoint/:id', () => {})
router.put('/updatepoint/:id', 
  body('name').optional().isString(), 
  body('description').optional().isString(),
  () => {}
)
router.post('/updatepoint', 
  body('name').isString(), 
  body('description').isString(),
  body('updateId').exists().isString(),
  () => {}
)
router.delete('/updatepoint/:id', () => {})

export default router

In my router.ts file, I had an error importing handleInputErrors: Cannot find module './modules/middleware' or its corresponding type declarations.ts(2307). The cause was a filename mismatch: the file is modules/middlewares.ts, so the import path needs to be './modules/middlewares'.

server.ts

Sets up the Express application, configures middleware, routes, and error handling. Structure binding route handling and authentication processes.

import express from 'express'
import router from './router'
import morgan from 'morgan'
import cors from 'cors'
import { protect } from './modules/auth'
import { createNewUser, signin } from './handlers/user'

const app = express()

app.use(cors())
app.use(morgan('dev'))
app.use(express.json())
app.use(express.urlencoded({extended: true}))

app.get('/', (req, res, next) => {
  setTimeout(() => {
    next(new Error('hello'))
  },1)
})

app.use('/api', protect, router)

app.post('/user', createNewUser)
app.post('/signin', signin)

app.use((err, req, res, next) => {
  console.log(err)
  res.json({message: `had an error: ${err.message}`})
})

export default app

tsconfig.json

{
    "compilerOptions": {
      "sourceMap": true,
      "outDir": "dist",
      "lib": ["esnext"],
      "esModuleInterop": true
    }
  }

jest.config.js

/** @type {import('ts-jest').JestConfigWithTsJest} */
module.exports = {
  preset: 'ts-jest',
  testEnvironment: 'node',
};

Challenges & learning

  1. Asynchronous Operations: Using asynchronous operations was challenging, especially when interfacing with Prisma. Ensuring that database operations completed before sending responses required careful handling of promises and async/await syntax.

  2. Environment Configuration: Setting up different environments for development, testing, and production initially seemed straightforward but proved complex due to the need for different configurations. Managing environment variables is core to a scalable application.

  3. Testing Setups: Implementing tests exposed me to the practicalities of building functional and sustainable code. I struggled with correctly mocking services and managing test databases.

  4. Schema Management: Defining and managing the schema with Prisma was a key component. I learned the significance of accurately defining relationships and constraints to reflect the application's requirements.

  5. Error Handling and Security: Integrating error handling and authentication mechanisms prevents crashes & enhances the user experience with meaningful error messages. Authentication using JWTs highlighted the importance of protecting user data and ensuring secure access to resources.
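The async/await handling from point 1 can be illustrated with a self-contained sketch, where a fake query stands in for a Prisma call (names and shapes here are hypothetical, for illustration only):

```typescript
// A stand-in for an async Prisma query.
async function fakeFindUser(id: string): Promise<{ id: string; name: string }> {
  return { id, name: 'demo' };
}

// Awaiting the query guarantees the data exists before the response payload
// is built; forgetting the `await` would send a pending Promise instead.
async function handler(id: string): Promise<{ data: { id: string; name: string } }> {
  const user = await fakeFindUser(id);
  return { data: user };
}
```

The same discipline applies to every handler in this project: the `res.json(...)` call only runs after the awaited database operation resolves.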

Conclusion: This project was a real eye-opener into the complexities of backend development. Each challenge provided a learning opportunity, emphasizing the importance of careful design, security considerations, and the ability to troubleshoot and debug effectively. I now appreciate how these elements come together to create robust and user-friendly APIs.
