Generative AI Book Notes - doraithodla/notes GitHub Wiki

Generative AI -

  • Creating new, unique data or content using ML algorithms
  • Improving communications and automating processes
  • Prompt design as a technique to improve a model's accuracy
  • Developing the future with ChatGPT - code review and optimization, document generation, code generation
  • Mastering marketing with ChatGPT (use of prompts)
    • A/B Testing
    • Keyword targeting suggestions
    • Social media sentiment analysis
  • Research with ChatGPT
    • literature reviews
    • experiment design
    • bibliography generation
  • Trending use cases
    • Intelligent Search Engines
    • AI Assistants
    • Report Generator
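Prompt design, listed above as a technique for improving a model's accuracy, often amounts to assembling an instruction, a few worked examples, and the user's input into a single string. A minimal few-shot template sketch (the helper name and the example texts are hypothetical, not from the book):

```python
def build_prompt(task, examples, query):
    """Assemble a few-shot prompt: instruction, worked examples, then the query."""
    lines = [task, ""]
    for inp, out in examples:
        lines.append(f"Input: {inp}")
        lines.append(f"Output: {out}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")          # the model completes from here
    return "\n".join(lines)

prompt = build_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [("Great product, works perfectly.", "positive"),
     ("Broke after two days.", "negative")],
    "Shipping was fast and the quality is excellent.",
)
print(prompt)
```

The worked examples anchor the expected output format, which is the core idea behind few-shot prompting for tasks like the A/B testing and sentiment-analysis use cases above.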

"Generative AI is a subfield of AI and DL that focuses on generating new content, such as images, text, music, and video, by using algorithms and models that have been trained on existing data using ML techniques."

Algorithmic composition

  • WaveNet architecture by Google DeepMind
  • Magenta project by Google
  • Jukebox by OpenAI
  • AI Composer assistants
  • Flow Machines by Sony CSL Research
  • Music Transformer

GenAI for Videos

  • GAN development

  • DeepMind's motion-to-video generation

  • NVIDIA's Vid2Vid

  • The Vid2Vid system can generate temporally consistent videos, meaning that they maintain smooth and realistic motion over time. The technology can be used to perform a variety of video synthesis tasks, such as the following:
    • Converting videos from one domain into another (for example, converting a daytime video into a nighttime video or a sketch into a realistic image)
    • Modifying existing videos (for example, changing the style or appearance of objects in a video)
    • Creating new videos from static images (for example, animating a sequence of still images)

  • Meta's Make-A-Video (https://makeavideo.studio/), a new AI system that allows users to convert their natural language prompts into video clips

The key innovation of VAEs is the introduction of a probabilistic interpretation of the latent space. Instead of learning a deterministic mapping of the input to the latent space, the encoder maps the input to a probability distribution over the latent space. This allows VAEs to generate new samples by sampling from the latent space and decoding the samples into the input space.

The VAE first takes in a picture of a cat or a dog and compresses it down into a smaller set of numbers in the latent space, which represent the most important features of the picture. These numbers are called latent variables.
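The encode-sample-decode cycle described above can be sketched in a few lines. This is a toy illustration of the reparameterization idea (the random weights stand in for a trained encoder and decoder; nothing here is learned):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "encoder": maps a 4-pixel input to the parameters of a
# Gaussian over a 2-D latent space. The weights are random
# stand-ins for a trained network.
W_mu = rng.normal(size=(2, 4))
W_logvar = rng.normal(size=(2, 4))

def encode(x):
    """Return mean and log-variance of the distribution q(z|x)."""
    return W_mu @ x, W_logvar @ x

def sample_latent(mu, logvar):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

W_dec = rng.normal(size=(4, 2))

def decode(z):
    """Toy decoder mapping a latent vector back to input space."""
    return W_dec @ z

x = np.array([0.2, 0.7, 0.1, 0.9])   # a tiny stand-in "image"
mu, logvar = encode(x)               # distribution over the latent space
z = sample_latent(mu, logvar)        # the latent variables
x_new = decode(z)                    # a new sample in input space
```

The key point is that `encode` returns a distribution rather than a single point, so drawing different `eps` values yields different latent codes and hence different decoded samples.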

Both models – VAEs and GANs – are meant to generate brand new data that is indistinguishable from original samples, and their architecture has improved since their conception, side by side with the development of new models such as PixelCNN, proposed by Van den Oord and his team, and WaveNet, developed by Google DeepMind.
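The adversarial setup behind GANs can be sketched with two competing models: a generator tries to produce samples the discriminator scores as real, while the discriminator learns to tell real from generated. A toy 1-D version with single linear units and hand-derived gradients (illustrative only, not a practical GAN and not from the book):

```python
import numpy as np

rng = np.random.default_rng(1)

# Real data comes from N(4, 1); the generator shifts/scales
# standard Gaussian noise to try to mimic it.
g_w, g_b = 1.0, 0.0          # generator parameters
d_w, d_b = 0.1, 0.0          # discriminator parameters
lr = 0.01

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

for step in range(2000):
    real = rng.normal(4.0, 1.0)          # one real sample
    z = rng.normal()                     # noise input
    fake = g_w * z + g_b                 # generated sample

    # Discriminator update: push D(real) up and D(fake) down.
    p_real = sigmoid(d_w * real + d_b)
    p_fake = sigmoid(d_w * fake + d_b)
    # Gradients of -log D(real) - log(1 - D(fake)) w.r.t. pre-activations:
    da_real = p_real - 1.0
    da_fake = p_fake
    d_w -= lr * (da_real * real + da_fake * fake)
    d_b -= lr * (da_real + da_fake)

    # Generator update: push D(fake) up (minimize -log D(fake)).
    p_fake = sigmoid(d_w * fake + d_b)
    dfake = (p_fake - 1.0) * d_w         # gradient w.r.t. the fake sample
    g_w -= lr * dfake * z
    g_b -= lr * dfake

# After training, draw new samples from the generator alone.
samples = g_w * rng.normal(size=1000) + g_b
```

Once the loop finishes, the discriminator is discarded and only the generator is kept for sampling, which mirrors how GANs are used in practice.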

Products

  • Tome AI