AI - henk52/knowledgesharing GitHub Wiki
AI
Introduction
Vocabulary
- AGI eval
- BBH
- Code
- Comprehension
- Fine-tuning - a process of retraining a base, foundational model on new data.
- GGUF - a file format for storing models for inference with GGML.
- Knowledge
- LLM - Large Language Model
- Math
- ML - Machine Learning
- MMLU
- Modality
- Pretrained -
- Finetuned -
- MT-Bench -
- Parameters - the number of learnable and adjustable parameters that the model contains.
- Parameters are the internal variables that the model adjusts during the training process to learn patterns and relationships in the data.
- In the context of neural networks, which are the architecture underlying LLMs, parameters include weights and biases.
- Q-learning - a form of temporal difference learning
- Q* - the optimal Q, which is approached when you are able to sample from all possible outcomes forever over time
- Quantized - refers to the process of reducing the precision of numerical representations. It involves representing numerical values with fewer bits, typically by converting them to a lower bit-width format, such as using 8-bit integers instead of 32-bit floating-point numbers.
- The goal is to make models more efficient in terms of memory usage and computation.
- Reasoning
- Tree-of-thought prompting - forcing the LLM to consider the subject from different perspectives via agents.
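The "Quantized" entry above can be illustrated with a minimal sketch. It assumes the simplest possible scheme, a single symmetric scale that maps floats onto signed 8-bit integers; real quantization formats are more elaborate, and the function names here are made up for illustration.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale (illustrative scheme)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the 8-bit integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Each restored value differs from the original by at most half a quantization step.
```

Each weight now needs 8 bits instead of 32, at the cost of a small rounding error per value.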
Models
- Mistral 7B
- Bloom
Comparison points
- MMLU
- Knowledge
- Reasoning
- Comprehension
- AGI eval
- Math
- BBH
- Code
Prompting
-
Master the Perfect ChatGPT Prompt Formula (in just 8 minutes)!
-
task (mandatory) -
- Always start the task sentence with an action verb,
- Generate
- Give
- Write
- Analyze
- etc.
- Include what the end goal is
- simple task
- multi-task
-
context (important) -
- you need to limit the endless possibilities for context
- what is the subject's background
- what does success look like
- what environment are they in
-
exemplar (important) - including a relevant example will greatly improve the quality of your output
-
persona (nice to have) - who you want ChatGPT to be
- think of someone you wish you had instant access to with the task you're facing
- you are an experienced physical therapist with over 20 years of
-
format (nice to have) - visualize the exact format you want the end result to be in
- e.g.
- output in table format with column headers: feedback, team and priority
- e-mails
- bullet points
- code blocks
- paragraphs
- markdown
-
tone (nice to have) -
- use a casual tone of voice
- use a formal tone of voice
- give me a witty output
- show enthusiasm
- sound pessimistic
-
e.g.
- context task
- I'm a 70kg male, give me a 3-month training program
-
examples
- proofread this email below and correct all typos and grammar mistakes. Bold all changes you make: xxx
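The six prompt components above (task, context, exemplar, persona, format, tone) can be assembled programmatically. The sketch below is only one possible way to do it: the function name `build_prompt` and the order in which the parts are joined are assumptions, not part of the formula itself. Only the task is enforced, since it is the one mandatory component.

```python
def build_prompt(task, context="", exemplar="", persona="", output_format="", tone=""):
    """Join the prompt components into one string; only the task is mandatory."""
    if not task.strip():
        raise ValueError("task is mandatory")
    # Assumed ordering: persona and context first, then the task, then the refinements.
    parts = [persona, context, task, exemplar, output_format, tone]
    return "\n".join(p.strip() for p in parts if p.strip())


prompt = build_prompt(
    task="Give me a 3-month training program.",
    context="I'm a 70kg male.",
    tone="Use a casual tone of voice.",
)
```

Leaving a component empty simply drops it from the final prompt, which matches the "nice to have" status of persona, format and tone.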
Tools
LM Studio
Ollama
- Importing Open Source Models to Ollama
- Ollama
- running ollama with Radeon GPU
- How to install the Radeon drivers on linux
- OLLAMA: How to Run Local Language Models Like a Pro
- Run ollama with an AMD GPU on Arch
Installing Ollama
- OLLAMA_HOST=0.0.0.0:11434 ollama serve
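Once `ollama serve` is listening on port 11434, the server can be queried over HTTP. The sketch below assumes the Ollama REST endpoint `/api/generate` with a JSON body containing `model`, `prompt` and `stream`, and assumes a model named `mistral` has already been pulled; check the Ollama API documentation for the version you run. The helper names are hypothetical.

```python
import json
import urllib.request


def build_generate_request(prompt, model="mistral", host="http://localhost:11434"):
    """Build (but do not send) a POST request for the assumed /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read())["response"]


# Example, with the server running:
#   print(generate("Why is the sky blue?"))
```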
fine tuning
-
What If Your LLM Could Become an Expert on Anything You Want?
-
not adding new data
-
restructuring its existing knowledge
-
how to behave in specific manner
-
advantages
- improved UX
- higher quality output
- fewer hallucinations
- shorter prompts
- excellent accuracy
-
fine tune on
- ?
- custom tone, to be used as a company support ai, to give brand consistency.
- language translation -
- data extraction -
-
datasets for fine-tuning
- CodeAlpaca-20k
- Evol-Instruct-Code-80k-v1
- codeparrot/github-code
-
terms/concepts
- Transformers - Transformers provides APIs and tools to easily download and train state-of-the-art pretrained models
- Datasets - Datasets is a library for easily accessing and sharing datasets for Audio, Computer Vision, and Natural Language Processing (NLP) tasks
- PEFT - Parameter-Efficient Fine-Tuning
- methods enable efficient adaptation of pre-trained language models (PLMs) to various downstream applications without fine-tuning all the model's parameters
- trl - Transformer Reinforcement Learning
- a set of tools to train transformer language models. In this case it is used for the Supervised Fine-tuning (SFT) step
- QLoRA
- Quantized model
- SFT - Supervised Fine-tuning step
-
You need a dataset for fine-tuning
- google?
- Kaggle
- create your own
-
follow instructions on Fine_tuned_Llama_PEFT_QLora.ipynb
-
fine tuning with Fine Tune LLaMA 2 In FIVE MINUTES! - "Perform 10x Better For My Use Case"
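To see why PEFT methods avoid fine-tuning all of a model's parameters, it helps to count them. The sketch below assumes the low-rank adapter idea behind LoRA/QLoRA: instead of updating a full `d_out x d_in` weight matrix, two small matrices of rank `r` are trained. The layer size 4096 and rank 8 are illustrative values, not taken from any particular model.

```python
def full_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is fine-tuned."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters for a low-rank adapter: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


d = 4096                          # illustrative layer width
full = full_params(d, d)          # 16_777_216
lora = lora_params(d, d, rank=8)  # 65_536
fraction = lora / full            # 0.00390625, i.e. about 0.4% of the full matrix
```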
Scratchpad
TODO
look at
- WizardCoder-python-34b
- huggingfaceh4/open_llm_leaderboard
LLM vs ML
"LLM" (large language model) and "ML" (machine learning) are related concepts within the broader field of artificial intelligence.
-
Large Language Model (LLM): LLM refers to a type of machine learning model, specifically designed for natural language processing tasks. Examples of large language models include GPT-3 (Generative Pre-trained Transformer 3) and similar architectures. LLMs are trained on massive amounts of text data to understand and generate human-like language. They are used in various natural language understanding and generation tasks, such as language translation, text completion, and question answering.
-
Machine Learning (ML): Machine learning is a broader concept that encompasses various algorithms and techniques designed to enable computers to learn from data. It involves developing models that can recognize patterns, make predictions, or perform tasks without being explicitly programmed for those tasks. Machine learning can be categorized into different types, including supervised learning, unsupervised learning, and reinforcement learning.
In summary, LLMs are a specific application of machine learning, focusing on language-related tasks. The development and success of large language models have been driven by advancements in machine learning techniques, particularly in the realm of deep learning and neural networks.
Math behind LLM
The math behind Large Language Models (LLMs) involves the use of neural network architectures, particularly deep learning techniques. LLMs, such as GPT (Generative Pre-trained Transformer) models, are built upon transformer architectures. Here are key mathematical concepts involved:
Linear Algebra:
Matrices and Vectors: LLMs process input data, such as sequences of words or tokens, using matrices and vectors. The weights connecting the neurons in the neural network are represented by matrices.
Matrix Multiplication: The core operation in neural networks is matrix multiplication. It is used to transform input vectors through the layers of the network.
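As a small illustration of the matrix operations described above, the sketch below multiplies a weight matrix by an input vector in plain Python. The values are made up, and a real model would use an optimized library rather than nested lists.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


W = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]  # 3x2 weight matrix (illustrative)
x = [2.0, 1.0]                             # input vector of length 2
y = matvec(W, x)                           # [4.0, -1.0, 7.0]
```

A layer with a 3x2 weight matrix thus maps a 2-dimensional input to a 3-dimensional output.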
Calculus:
Derivatives: Training a neural network involves minimizing a cost function. Calculus, specifically derivatives, is used to find the gradient of the cost function with respect to the model parameters. This gradient is used in optimization algorithms like stochastic gradient descent to update the model parameters during training.
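The role of the derivative can be shown on a toy cost function. The sketch below assumes a one-parameter model `w * x` with squared-error cost, and compares the hand-derived gradient with a finite-difference estimate; the numbers are illustrative.

```python
def cost(w, x, y):
    """Squared error of the one-parameter model w * x against target y."""
    return (w * x - y) ** 2


def analytic_gradient(w, x, y):
    """d(cost)/dw obtained with the chain rule."""
    return 2 * x * (w * x - y)


def numeric_gradient(w, x, y, eps=1e-6):
    """Central finite-difference approximation of the same derivative."""
    return (cost(w + eps, x, y) - cost(w - eps, x, y)) / (2 * eps)


w, x, y = 0.5, 3.0, 6.0
grad = analytic_gradient(w, x, y)  # -27.0: increasing w would reduce the cost
```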
Probability and Statistics:
Softmax Activation: The softmax function is often used in the output layer of a language model to convert the raw model outputs into probabilities. This is crucial for generating coherent and meaningful sequences of words.
Cross-Entropy Loss: Cross-entropy is a common loss function used in language models. It measures the difference between the predicted probability distribution (model output) and the true distribution (ground truth).
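A minimal sketch of both functions, assuming the true distribution is a single correct token (so the cross-entropy reduces to the negative log of the probability given to that token). The logits are made-up values.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Loss when the ground truth is the single token at target_index."""
    return -math.log(probs[target_index])


probs = softmax([2.0, 1.0, 0.1])
loss_good = cross_entropy(probs, 0)  # correct token had the highest score: low loss
loss_bad = cross_entropy(probs, 2)   # correct token had the lowest score: high loss
```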
Attention Mechanism:
Attention Weights: Transformers, the architecture underlying many LLMs, use attention mechanisms. Attention involves assigning weights to different parts of the input sequence when processing a particular token. The attention weights are computed using softmax and are influenced by the similarity between tokens.
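The sketch below assumes the common scaled dot-product form of attention for a single query: similarity is the dot product between the query and each key, divided by the square root of the dimension, and softmax turns the scores into weights. The vectors are illustrative.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Return (weights, output) for one query over a sequence of key/value vectors."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
    return weights, output


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]      # the first key is more similar to the query
values = [[10.0, 0.0], [0.0, 10.0]]
weights, output = attention(query, keys, values)
```

The token whose key is most similar to the query receives the largest weight and therefore contributes most to the output.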
Optimization Algorithms:
Stochastic Gradient Descent (SGD) and Variants: These algorithms are used to update the model parameters during training. They involve adjusting weights in the direction that minimizes the loss function.
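A minimal sketch of plain SGD, assuming a one-parameter model `w * x` with squared-error loss and one update per training sample. The data, learning rate and epoch count are illustrative.

```python
import random


def sgd_fit(data, lr=0.05, epochs=200, seed=0):
    """Fit y = w * x by updating w after every single sample."""
    rng = random.Random(seed)
    w = 0.0
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)              # "stochastic": visit samples in random order
        for x, y in samples:
            grad = 2 * x * (w * x - y)    # gradient of (w*x - y)^2 with respect to w
            w -= lr * grad                # step against the gradient
    return w


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated by y = 2x
w = sgd_fit(data)                            # converges towards 2.0
```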
Activation Functions:
ReLU (Rectified Linear Unit), Sigmoid, Tanh: These are activation functions that introduce non-linearity into the model, enabling it to learn complex patterns.
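The three activation functions named above are short enough to write out directly; this is a plain-Python sketch for scalar inputs.

```python
import math


def relu(x):
    """Pass positive values through, clamp negatives to zero."""
    return max(0.0, x)


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Squash any real number into the range (-1, 1)."""
    return math.tanh(x)
```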
In summary, the math behind LLMs involves a combination of linear algebra, calculus, probability, and statistics, with specific attention to neural network architectures and optimization algorithms. The use of deep learning techniques allows these models to capture intricate patterns and relationships in large amounts of language data.
Further training
Further training a Large Language Model (LLM) involves fine-tuning or continuing the pre-training process on specific datasets or tasks. The process may vary depending on the architecture of the LLM, but here are general steps for fine-tuning:
Data Preparation:
Assemble a dataset that is relevant to the specific task you want the LLM to perform better on. This could be a domain-specific corpus or a dataset tailored to a particular language task.
Model Selection:
Choose the pre-trained LLM that best aligns with your task. For example, GPT models pre-trained on a diverse range of data can be fine-tuned for various tasks.
Fine-tuning Architecture:
Modify the architecture if necessary. Some LLMs come with specific fine-tuning mechanisms, allowing you to adjust hyperparameters or add task-specific layers.
Loss Function:
Define a task-specific loss function. This guides the model towards optimal performance for the specific task you're focusing on.
Optimization:
Use an optimization algorithm, such as stochastic gradient descent (SGD) or its variants, to minimize the defined loss function. Adjust learning rates and other hyperparameters based on the specifics of your task.
Training:
Train the model on your task-specific dataset. Depending on the scale of the fine-tuning task, you may need fewer training steps compared to the initial pre-training.
Validation and Evaluation:
Monitor the model's performance on a validation set throughout training. Evaluate its performance on a separate test set to ensure generalization to new data.
Iterative Process:
Fine-tuning is often an iterative process. You may need to experiment with different hyperparameters, adjust the dataset, or fine-tune for different durations to achieve the desired performance.
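The training, validation and evaluation steps above can be sketched on a toy problem. This is not how an LLM is actually fine-tuned: it assumes a one-parameter model `w * x`, a hypothetical "pretrained" weight of 1.0, and a small task dataset generated by y = 1.5x. It only shows the shape of the loop: start from the pretrained weight, train on task data, watch the validation loss, stop when it no longer improves, then check a held-out test set.

```python
def mse(w, data):
    """Mean squared error of the model w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def fine_tune(w, train, val, lr=0.05, max_steps=100, patience=3):
    """Continue training from w; stop early when validation loss stops improving."""
    best_w, best_val = w, mse(w, val)
    bad_steps = 0
    for _ in range(max_steps):
        grad = sum(2 * x * (w * x - y) for x, y in train) / len(train)
        w -= lr * grad
        val_loss = mse(w, val)
        if val_loss < best_val:
            best_w, best_val, bad_steps = w, val_loss, 0
        else:
            bad_steps += 1
            if bad_steps >= patience:
                break
    return best_w, best_val


pretrained_w = 1.0                            # hypothetical starting point
train = [(1.0, 1.5), (2.0, 3.0), (3.0, 4.5)]  # task-specific data
val = [(4.0, 6.0)]                            # monitored during training
test = [(5.0, 7.5)]                           # used only for the final evaluation
w, val_loss = fine_tune(pretrained_w, train, val)
test_loss = mse(w, test)
```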
It's crucial to note that fine-tuning should be done responsibly, considering ethical implications and potential biases in the data. Also, be aware of potential overfitting to the specific fine-tuning dataset, which might affect the model's generalization to new, unseen data.
Specific tools and libraries like Hugging Face's Transformers or TensorFlow's Keras API often provide convenient interfaces for fine-tuning popular LLMs. Always refer to the documentation provided by the model's developers for guidelines specific to the model you are using.