AI - henk52/knowledgesharing GitHub Wiki
AI
Introduction
Vocabulary
- ABI - Always Be Iterating
- Agentic workflow - happens within the same session? where the llm switches between roles?
- deterministic outcomes
- well-defined tasks
- Agentic frameworks
- Not frameworks
- No framework - no abstractions, you connect to LLMs directly.
- MCP - Model Context Protocol.
- Not a framework, it is a protocol.
- Three components
- Host - is an LLM app like Claude or our Agent architecture.
- MCP Client - lives inside the host and connects 1:1 to the MCP server.
- MCP server - provides tools, context and prompts.
- Midlayer
- OpenAI Agents SDK -
- CrewAI - For multi agent systems, handled through configurations.
- Top layer
- LangGraph -
- AutoGen -
- AGI eval
- AI Agent - is like an expert designed to help with tasks and answer questions.
- Autonomous problem solving
- Complex task automation
- Unpredictable environments.
- BBH
- Chain of thought prompting - ask the AI to explain its reasoning as a step-by-step process.
- throughout the prompting sequence, tag on to the prompt: explain your thought process.
- Code
- Coder agent - seems to be an agent that uses code(python) to come up with the answer.
- Comprehension
- Fine-tuning - a process of retraining a base, foundational model on new data.
- GGUF - a file format for storing models for inference with GGML.
- Knowledge
- LLM - Large Language Model
- Math
- MCP - Model Context Protocol. For connecting an AI agent to any API you configure the MCP to.
- Meta prompting - use AI to help you come up with a prompt.
- ML - Machine Learning
- MMLU
- Modality
- Pretrained -
- Finetuned -
- MT-Bench -
- Multimodal prompting
- Audio
- Code
- Pictures
- Video
- Parameters - the number of learnable and adjustable parameters that the model contains.
- Parameters are the internal variables that the model adjusts during the training process to learn patterns and relationships in the data.
- In the context of neural networks, which are the architecture underlying LLMs, parameters include weights and biases.
- Prompt chaining - guides a generative AI tool through a series of interconnected prompts, adding new layers of complexity along the way.
- QLearning - temporal difference learning
- Q* - the optimal Q, when you are able to sample from all possible outcomes forever over time
- Quantized - refers to the process of reducing the precision of numerical representations. It involves representing numerical values with fewer bits, typically by converting them to a lower bit-width format, such as using 8-bit integers instead of 32-bit floating-point numbers.
- to make models more efficient in terms of memory usage and computation.
- RAG - Retrieval Augmented Generation. What is Agentic RAG?
- Reasoning
- Tree-of-thought prompting - forcing the LLM to consider the subject from different perspectives via agents.
- e.g
- for developing novel plots with new characters
- creating outline for drafting sections in lengthy documents
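To make the Quantized entry above concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python. The function names `quantize_int8` and `dequantize` and the sample weights are illustrative assumptions; real model formats typically use more elaborate block-wise schemes.

```python
def quantize_int8(values):
    """Symmetric 8-bit quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One shared scale factor for the whole list (an assumption of this sketch).
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(v / scale) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.12, -0.5, 0.33, 1.27, -1.0]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each restored value differs from the original by at most half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The integers need 8 bits each instead of 32, at the price of the small rounding error measured by `max_error`.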
Models
- Mistral 7B
- Bloom
Comparison points
- MMLU
- Knowledge
- Reasoning
- Comprehension
- AGI eval
- Math
- BBH
- Code
Challenges with AI
- Hallucination - the model invents things that don't exist, or states falsehoods.
- Biases -
Workflow overview
Prompt chaining
building effective agents
graph LR;
in((in))-->llm1;
llm1-->gate(gate)
gate-->llm2;
llm2-->llm3;
llm3-->out((out));
- angular box - LLM
- round corner box - gate - code
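The prompt-chaining graph above could be sketched in Python roughly as follows. This is only an illustration under assumptions: `run_chain`, the prompts, the outline-length gate and the `fake_llm` stand-in are all hypothetical, and a real setup would replace `fake_llm` with an actual model call.

```python
from typing import Callable

LLM = Callable[[str], str]  # any function that maps a prompt to a reply


def run_chain(text: str, llm: LLM, min_outline_lines: int = 3) -> str:
    # llm1: first step of the chain.
    outline = llm(f"Write an outline for: {text}")
    # gate: plain code (not an LLM) that checks the intermediate result.
    if len(outline.splitlines()) < min_outline_lines:
        raise ValueError("gate rejected the outline: too few sections")
    # llm2 and llm3: each step builds on the previous output.
    draft = llm(f"Expand this outline into a draft:\n{outline}")
    return llm(f"Polish this draft:\n{draft}")


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model, so the sketch runs offline."""
    if prompt.startswith("Write an outline"):
        return "1. Intro\n2. Body\n3. Conclusion"
    if prompt.startswith("Expand"):
        return "draft text"
    return "polished text"


result = run_chain("a blog post about MCP", fake_llm)
```

Because the gate is ordinary code, a failed check stops the chain before any further LLM calls are spent.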
Routing
Direct an input into a specialized sub-task, ensuring separation of concerns.
graph LR;
in((in))-->router[LLM Router];
router-->llm1;
router-->llm2;
router-->llm3;
llm1-->out((out));
llm2-->out;
llm3-->out((out));
- router - triage and route to the best-suited LLM.
- e.g. medical diagnostics.
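A minimal routing sketch, assuming the router LLM answers with a single category label. The names `route`, `fake_router` and the `specialists` table are hypothetical, and the keyword-based router only stands in for a real model.

```python
from typing import Callable, Dict

LLM = Callable[[str], str]


def route(question: str, router_llm: LLM,
          specialists: Dict[str, LLM], default: str) -> str:
    labels = ", ".join(sorted(specialists))
    # The router only triages; it does not answer the question itself.
    label = router_llm(
        f"Pick the best category. Answer with one of: {labels}.\n{question}"
    ).strip().lower()
    # Unknown labels fall back to the default specialist.
    handler = specialists.get(label, specialists[default])
    return handler(question)


def fake_router(prompt: str) -> str:
    """Keyword stand-in for an LLM router (looks at the question line only)."""
    question = prompt.splitlines()[-1].lower()
    return "cardiology" if "heart" in question else "general"


specialists = {
    "cardiology": lambda q: f"[cardiology] {q}",
    "general": lambda q: f"[general] {q}",
}

answer = route("My heart races at night", fake_router, specialists, "general")
```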
Parallelization
Breaking down tasks and running multiple subtasks concurrently.
graph LR;
in((in))-->router(coordinator);
router-->llm1;
router-->llm2;
router-->llm3;
llm1-->aggregator(Aggregator);
llm2-->aggregator;
llm3-->aggregator;
aggregator-->out((out));
- coordinator - .
- aggregator - e.g. llm1-3 generates different parts of a report and the aggregator puts them all together.
- or all LLMs can do the same tasks and aggregator then creates the "mean"
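Both aggregator variants described above can be sketched with a thread pool. Everything here is an assumption for illustration: the workers are plain functions standing in for LLM calls, and majority voting is just one possible reading of creating the "mean".

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_parallel(task, workers, aggregate):
    """Run every worker on the task concurrently, then combine the results."""
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = [pool.submit(worker, task) for worker in workers]
        # Collect in submission order so the aggregator sees a stable order.
        results = [future.result() for future in futures]
    return aggregate(results)


def join_sections(results):
    """Aggregator for the case where each worker writes a different part."""
    return "\n".join(results)


def majority_vote(results):
    """Aggregator for the case where all workers do the same task."""
    return Counter(results).most_common(1)[0][0]


report = run_parallel(
    "Q3 sales",
    [lambda t: f"Summary of {t}", lambda t: f"Risks for {t}", lambda t: f"Outlook for {t}"],
    join_sections,
)
verdict = run_parallel(
    "is this safe?",
    [lambda t: "yes", lambda t: "yes", lambda t: "no"],
    majority_vote,
)
```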
Orchestrator-worker
Complex tasks are broken down dynamically and combined
graph LR;
in((in))-->router[Orchestrator];
router-->llm1;
router-->llm2;
router-->llm3;
llm1-->aggregator[Synthesizer];
llm2-->aggregator;
llm3-->aggregator;
aggregator-->out((out));
- Orchestrator - .
- Synthesizer - e.g. llm1-3 generate different parts of a report and the synthesizer puts them all together.
- or all LLMs can do the same task and the synthesizer then creates the "mean"
Evaluator-optimizer
LLM output is validated by another LLM.
graph LR;
in((in))-->generator;
generator -->|solution|evaluator;
evaluator -->|reject w feedback|generator;
evaluator-->|accepted|out((out));
e.g. code generator/security evaluator.
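The generator/evaluator loop above might look like this in Python. It is a sketch under assumptions: `evaluator_optimizer`, the `max_rounds` limit and both fake components are hypothetical stand-ins for real LLM calls.

```python
def evaluator_optimizer(task, generate, evaluate, max_rounds=3):
    """Loop until the evaluator accepts a solution or the rounds run out."""
    feedback = None
    for _ in range(max_rounds):
        solution = generate(task, feedback)
        accepted, feedback = evaluate(solution)
        if accepted:
            return solution
        # Rejected: the feedback is passed to the generator on the next round.
    raise RuntimeError(f"no accepted solution after {max_rounds} rounds")


def fake_generator(task, feedback):
    """Stand-in code generator: fixes its answer once it receives feedback."""
    return "int(user_input)" if feedback else "eval(user_input)"


def fake_security_evaluator(solution):
    """Stand-in security evaluator: rejects any use of eval()."""
    if "eval(" in solution:
        return False, "do not call eval on user input"
    return True, None


accepted_solution = evaluator_optimizer(
    "parse a number", fake_generator, fake_security_evaluator
)
```

The round limit keeps a generator that never satisfies the evaluator from looping forever.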
By contrast, agents
- Open-ended
- feedback loops
- No fixed path
graph LR;
in([human])-->llm[LLM call];
llm -->|Action|environment([environment]);
environment-->|feedback|llm;
llm-->stop(stop);
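In contrast to the fixed workflows, the agent graph above is a loop whose path is chosen by the model at run time. A minimal sketch, assuming a hypothetical `llm_decide` function that returns either an action or `"stop"`, and a toy counter as the environment:

```python
def run_agent(goal, llm_decide, environment, max_steps=10):
    """LLM picks an action, the environment answers, repeat until 'stop'."""
    history = []
    for _ in range(max_steps):
        action = llm_decide(goal, history)
        if action == "stop":
            break
        feedback = environment(action)
        history.append((action, feedback))
    return history  # also returned when the step limit is hit


class CounterEnvironment:
    """Toy environment: the only supported action increments a counter."""

    def __init__(self):
        self.value = 0

    def __call__(self, action):
        if action == "increment":
            self.value += 1
        return f"value={self.value}"


def fake_decide(goal, history):
    """Stand-in for the LLM: stop once the environment reports value=3."""
    if history and history[-1][1] == "value=3":
        return "stop"
    return "increment"


trace = run_agent("count to three", fake_decide, CounterEnvironment())
```

The `max_steps` cap is a simple safeguard against an agent that never decides to stop.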
Agents
Defining agents (it's ambiguous)
Agentic AI Engineering: Complete 4-Hour Workshop feat. MCP, CrewAI and OpenAI Agents SDK
-
AI Agents are programs where LLM outputs control the workflow.
- In practice, describes an AI solution that involves any or all of these:
- Multiple LLM calls
- LLMs with ability to use tools
- An environment where LLMs interact
- A planner to coordinate activities
- Autonomy
-
Workflows - systems where LLMs and tools are orchestrated through predefined code paths.
-
Agents - Systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.
-
Tools - give LLMs autonomy
- give an LLM the power to carry out actions like query a database or message other LLMs
-
Risks of agent systems
- Unpredictable path
- Unpredictable output
- Unpredictable costs
- Addressing risks
- Monitor
- Guardrails ensure your agents behave safely, consistently, and within your intended boundaries.
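One way to picture monitoring and guardrails is a thin wrapper around the model call. This is a sketch only: `GuardedLLM`, the call budget and the banned-phrase check are hypothetical simplifications, not the guardrail mechanism of any particular framework.

```python
class BudgetExceeded(RuntimeError):
    """Raised when the agent tries to exceed its call budget."""


class GuardedLLM:
    def __init__(self, llm, max_calls, banned_phrases=()):
        self.llm = llm
        self.max_calls = max_calls
        self.banned_phrases = [p.lower() for p in banned_phrases]
        self.log = []  # monitoring: every call is recorded here

    def __call__(self, prompt):
        # Cost guardrail: refuse to go past the budgeted number of calls.
        if len(self.log) >= self.max_calls:
            raise BudgetExceeded(f"more than {self.max_calls} LLM calls")
        reply = self.llm(prompt)
        # Output guardrail: block replies that contain a banned phrase.
        blocked = any(p in reply.lower() for p in self.banned_phrases)
        self.log.append({"prompt": prompt, "reply": reply, "blocked": blocked})
        return "[blocked by guardrail]" if blocked else reply


# A fake model that just upper-cases the prompt keeps the sketch offline.
guarded = GuardedLLM(lambda p: p.upper(), max_calls=2, banned_phrases=["password"])
first = guarded("hello")
second = guarded("my password")
```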
Prompting
Google prompt engineering
-
Persona -
- e.g. act as an anime expert
-
Task - what you want the AI to do.
- e.g. suggest an anime gift for my friend's birthday
-
output?
- e.g organize that data as a table
-
Context
- e.g. turning 29 years old and their top anime characters are xxx
-
References - provide examples to the AI
- e.g. past presents the person has enjoyed.
-
Constraints -
-
Evaluate - you evaluate the output, to decide if you like it or not.
-
Iterate - refine the prompt
- Iteration methods
- revisit the prompting framework.
- separate your prompts into shorter sentences.
- task1, task2, task3 etc.
- e.g: Summarize the key data points and information in this report. Then create visual graphs from the data and shorten the key information into bullets.
- summarize the key data points and information in this report.
- create visual graphs with the data you summarized
- Shorten the key information you summarized into bullets.
- ask it to tell it like a story ???
- Introduce constraints
- e.g. for Generate a playlist for a road trip
- only use brazilian music style
- only use chilled and adventurous tempo
- only songs about heart break
Example
-
I'm a gym manager and we have a new gym schedule. Write an email informing our staff of the new schedule. Highlight the fact that the M/W/F Cardio Blast class changed from 7:00am to 6:00am. Make the email professional and friendly, and short so that the readers can skim it quickly
- Persona: I'm a gym manager
- Context: we have a new gym schedule.
- Task: Write an email informing our staff of the new schedule. Highlight the fact that the M/W/F Cardio Blast class changed from 7:00am to 6:00am.
- Output: Make the email professional and friendly, and short so that the readers can skim it quickly
-
Attached is a google sheet of store data. How can I create a new column in sheets that calculates the average sales per customer for each store.
- context: Attached is a google sheet of store data.
- task: How can I create a new column in sheets that calculates the average sales per customer for each store.
-
prompt chaining
- Generate three options for a one-sentence summary of this novel manuscript. The summary should be similar in voice and tone to the manuscript but more catchy and engaging.
- Task: Generate three options for a one-sentence summary of this novel manuscript.
- Output: The summary should be similar in voice and tone to the manuscript but more catchy and engaging.
- Create a tagline that is a combination of the previous three options, with a special focus on the exciting plot twist and mystery of the book. Find the catchiest and most impactful combination. The tagline should be concise and leave the reader hooked and wanting to read more.
- Task: Create a tagline that is a combination of the previous three options
- Context: with a special focus on the exciting plot twist and mystery of the book
- Output: Find the catchiest and most impactful combination. The tagline should be concise and leave the reader hooked and wanting to read more.
- Generate a six-week promotional plan for a book tour, including what locations I should visit and what channels I should utilize to promote each stop on the tour.
-
Tree-of-thought prompting
- Imagine three different designers are pitching their design to me. All designers will write down one step of their thinking, then share it with the group. Then all experts will go on to the next step, etc. If any expert realizes they're wrong at any point they will leave. The question is: Generate an image that is visually energetic, and features images of art supplies and computers. Show me three suggestions in very different styles from simple to detailed and complex.
- prompt after looking at the generated images:
- I like the first one, and I would like to expand the idea a little bit more and perhaps generate three different color schemes for that concept.
-
You can combine tree of thought and chain of thought by asking the AI to explain its reasoning at each iteration, so you can provide feedback.
-
write a casual summary
- write a summary in a friendly, easy-to-understand tone, like explaining to a curious friend
- you can also reference other e-mails that you have written in the past and tell the AI to match the tone
ChatGPT prompt formula
-
Master the Perfect ChatGPT Prompt Formula (in just 8 minutes)!
-
task (mandatory) -
- Always start the task sentence with an action verb,
- Generate
- Give
- Write
- Analyze
- etc.
- Include what the endgoal is
- simple task
- multi-task
-
context (important) -
- you need to limit the endless possibilities for context
- what is the subject's background
- what does success look like
- what environment are they in
-
exemplar (important) - including a relevant example will greatly improve the quality of your output
-
persona (nice to have) - who you want chatgpt to be
- think of someone you wish you had instant access to with the task you're facing
- you are an experienced physical therapist with over 20 years of
-
format (nice to have) - visualize the exact format you want the end result to be in
- e.g.
- output in table format with column headers: feedback, team and priority
- e-mails
- bullet points
- code blocks
- paragraphs
- markdown
-
tone (nice to have) -
- use a casual tone of voice
- use a formal tone of voice
- give me a witty output
- show enthusiasm
- sound pessimistic
-
e.g.
- context task
- I'm a 70kg male, give me a 3-month training program
-
examples
- proofread this email below and correct all typos and grammar mistakes. Bold all changes you make: xxx
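The six components of the formula can be assembled mechanically. The helper below is a hypothetical sketch: `build_prompt` and the ordering of the parts (persona, context, task, exemplar, format, tone) are assumptions, not something the video prescribes.

```python
def build_prompt(task, context="", exemplar="", persona="",
                 output_format="", tone=""):
    """Join the prompt components, skipping the ones that were left empty."""
    if not task.strip():
        # The task is the only mandatory component.
        raise ValueError("task is mandatory")
    parts = [persona, context, task, exemplar, output_format, tone]
    return "\n".join(part.strip() for part in parts if part.strip())


prompt = build_prompt(
    task="Give me a 3-month training program.",
    context="I'm a 70kg male.",
    output_format="Output in table format.",
    tone="Use a casual tone of voice.",
)
```

Keeping the components as separate arguments makes it easy to iterate on one of them (for example the tone) while leaving the rest untouched.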
AI Agent prompting
-
Act as a career development training simulator. Your task is to help interns master interview skills and conduct conversations with potential managers. You need to support the following types of conversations: Articulating strengths and skills. Communicating professionally and confidently. Discussing future career development goals. Once an intern has picked a conversation topic, provide details about the situation and the interviewer's role. Then act as the interviewer and allow the intern to participate as the employee. Make sure to guide the conversation in a way that will allow the intern to exercise their interview skills. Continue the role play until the intern replies with "JAZZ HANDS". After the intern gives the stop rule "JAZZ HANDS", provide them with key takeaways from the simulation and skills they can work on.
-
Always ask it to explain its reasoning; this gives better outcomes. Source: Agentic AI Engineering: Complete 4-Hour Workshop feat. MCP, CrewAI and OpenAI Agents SDK
- it is more likely to output tokens consistent with its reasoning, if you have asked for the reasoning. (local bias)
Tools
LM Studio
Ollama
- Importing Open Source Models to Ollama
- Ollama
- running ollama with Radeon GPU
- How to install the Radeon drivers on linux
- OLLAMA: How to Run Local Language Models Like a Pro
- Run ollama with an AMD GPU on Arch
Installing Ollama
- OLLAMA_HOST=0.0.0.0:11434 ollama serve
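Once `ollama serve` is running, it can be queried over HTTP. The sketch below uses only the standard library and the `/api/generate` endpoint; the model name `mistral` is an assumption (it has to be pulled first), and the helper names are hypothetical.

```python
import json
import urllib.request


def build_generate_request(prompt, model="mistral", host="http://localhost:11434"):
    """Build a POST request for Ollama's /api/generate endpoint."""
    # stream=False asks for a single JSON object instead of a token stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read())["response"]


# Example call, commented out because it needs a live Ollama server:
# print(generate("Explain MCP in one sentence."))
```

When the server runs on another machine, pass its address through the `host` argument.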
Agentic frameworks
CrewAI core concepts
- Agent: an autonomous unit, with
- an LLM
- a role
- a goal
- a backstory
- memory
- tools
- Task: a specific assignment to be carried out, with
- a description
- expected output
- agent
- crew: a team of agents and tasks; either:
- Sequential: run tasks in the order they are defined
- Hierarchical: use a manager LLM to assign tasks
Lightweight, but somewhat more opinionated than the OpenAI Agents SDK - more terminology, more prescriptive.
Five steps to a CrewAI project
- Create the project
crewai create crew my_project
- Fill in the config YAML files to define the Agents and Tasks
- agents.yaml
- tasks.yaml
- there are other ways but this is the easiest.
- complete the crew.py module to create the Agents, Tasks and Crew, referencing the config.
- stitches together your crew.
- update main.py to set any inputs.
- run with:
crewai run
fine tuning
-
What If Your LLM Could Become an Expert on Anything You Want?
-
not adding new data
-
restructuring its existing knowledge
-
how to behave in specific manner
-
advantages
- improved UX
- higher quality output
- less hallucinations
- shorter prompts
- excellent accuracy
-
fine tune on
- ?
- custom tone, to be used as a company support ai, to give brand consistency.
- language translation -
- data extraction -
-
tools for finetuning
- CodeAlpaca-20k
- Evol-Instruct-Code-80k-v1
- codeparrot/github-code
-
terms/concepts
- Transformers - Transformers provides APIs and tools to easily download and train state-of-the-art pretrained models
- Datasets - Datasets is a library for easily accessing and sharing datasets for Audio, Computer Vision, and Natural Language Processing (NLP) tasks
- PEFT - Parameter-Efficient Fine-Tuning
- methods enable efficient adaptation of pre-trained language models (PLMs) to various downstream applications without fine-tuning all the model's parameters
- trl - Transformer Reinforcement Learning
- a set of tools to train transformer language models. In this case the Supervised Fine-tuning step (SFT)
- QLora
- Quantized model
- SFT - Supervised Fine-tuning step
-
You need a dataset for fine-tuning
- google?
- Kaggle
- create your own
-
follow instructions on Fine_tuned_Llama_PEFT_QLora.ipynb
-
fine tuning with Fine Tune LLaMA 2 In FIVE MINUTES! - "Perform 10x Better For My Use Case"
Scratchpad
TODO
look at
- WizardCoder-python-34b
- huggingfaceh4/open_llm_leaderboard
LLM vs ML
"LLM" (large language model) and "ML" (machine learning) are related concepts within the broader field of artificial intelligence.
In summary, LLMs are a specific application of machine learning, focusing on language-related tasks. The development and success of large language models have been driven by advancements in machine learning techniques, particularly in the realm of deep learning and neural networks.
Large Language Model (LLM)
LLM refers to a type of machine learning model, specifically designed for natural language processing tasks. Examples of large language models include GPT-3 (Generative Pre-trained Transformer 3) and similar architectures. LLMs are trained on massive amounts of text data to understand and generate human-like language. They are used in various natural language understanding and generation tasks, such as language translation, text completion, and question answering.
Machine Learning (ML)
Machine learning is a broader concept that encompasses various algorithms and techniques designed to enable computers to learn from data. It involves developing models that can recognize patterns, make predictions, or perform tasks without being explicitly programmed for those tasks. Machine learning can be categorized into different types, including supervised learning, unsupervised learning, and reinforcement learning.
Math behind LLM
The math behind Large Language Models (LLMs) involves the use of neural network architectures, particularly deep learning techniques. LLMs, such as GPT (Generative Pre-trained Transformer) models, are built upon transformer architectures. Here are key mathematical concepts involved:
- Linear Algebra:
- Matrices and Vectors: LLMs process input data, such as sequences of words or tokens, using matrices and vectors. The weights connecting the neurons in the neural network are represented by matrices.
- Matrix Multiplication: The core operation in neural networks is matrix multiplication. It is used to transform input vectors through the layers of the network.
- Calculus:
- Derivatives: Training a neural network involves minimizing a cost function. Calculus, specifically derivatives, is used to find the gradient of the cost function with respect to the model parameters. This gradient is used in optimization algorithms like stochastic gradient descent to update the model parameters during training.
- Probability and Statistics:
- Softmax Activation: The softmax function is often used in the output layer of a language model to convert the raw model outputs into probabilities. This is crucial for generating coherent and meaningful sequences of words.
- Cross-Entropy Loss: Cross-entropy is a common loss function used in language models. It measures the difference between the predicted probability distribution (model output) and the true distribution (ground truth).
- Attention Mechanism:
- Attention Weights: Transformers, the architecture underlying many LLMs, use attention mechanisms. Attention involves assigning weights to different parts of the input sequence when processing a particular token. The attention weights are computed using softmax and are influenced by the similarity between tokens.
- Optimization Algorithms:
- Stochastic Gradient Descent (SGD) and Variants: These algorithms are used to update the model parameters during training. They involve adjusting weights in the direction that minimizes the loss function.
- Activation Functions:
- ReLU (Rectified Linear Unit), Sigmoid, Tanh: These are activation functions that introduce non-linearity into the model, enabling it to learn complex patterns.
In summary, the math behind LLMs involves a combination of linear algebra, calculus, probability, and statistics, with specific attention to neural network architectures and optimization algorithms. The use of deep learning techniques allows these models to capture intricate patterns and relationships in large amounts of language data.
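Three of the building blocks named above (softmax, cross-entropy loss and attention weights) fit in a few lines of plain Python. This is a toy sketch with made-up vectors and a single query; real models do the same arithmetic on large matrices.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Loss for one prediction: negative log-probability of the true token."""
    return -math.log(probs[target_index])


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    # Similarity between the query and every key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Output is the attention-weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights


probs = softmax([1.0, 2.0, 3.0])
loss = cross_entropy(softmax([0.0, 0.0]), 0)
output, weights = attention([1.0, 0.0], [[1.0, 0.0], [1.0, 0.0]], [[0.0, 2.0], [4.0, 6.0]])
```

With two identical keys the attention weights are equal, so `output` is simply the average of the two value vectors.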
Further training
Further training a Large Language Model (LLM) involves fine-tuning or continuing the pre-training process on specific datasets or tasks. The process may vary depending on the architecture of the LLM, but here are general steps for fine-tuning:
- Data Preparation:
- Assemble a dataset that is relevant to the specific task you want the LLM to perform better on. This could be a domain-specific corpus or a dataset tailored to a particular language task.
- Model Selection:
- Choose the pre-trained LLM that best aligns with your task. For example, GPT models pre-trained on a diverse range of data can be fine-tuned for various tasks.
- Fine-tuning Architecture:
- Modify the architecture if necessary. Some LLMs come with specific fine-tuning mechanisms, allowing you to adjust hyperparameters or add task-specific layers.
- Loss Function:
- Define a task-specific loss function. This guides the model towards optimal performance for the specific task you're focusing on.
- Optimization:
- Use an optimization algorithm, such as stochastic gradient descent (SGD) or its variants, to minimize the defined loss function. Adjust learning rates and other hyperparameters based on the specifics of your task.
- Training:
- Train the model on your task-specific dataset. Depending on the scale of the fine-tuning task, you may need fewer training steps compared to the initial pre-training.
- Validation and Evaluation:
- Monitor the model's performance on a validation set throughout training. Evaluate its performance on a separate test set to ensure generalization to new data.
- Iterative Process:
- Fine-tuning is often an iterative process. You may need to experiment with different hyperparameters, adjust the dataset, or fine-tune for different durations to achieve the desired performance.
It's crucial to note that fine-tuning should be done responsibly, considering ethical implications and potential biases in the data. Also, be aware of potential overfitting to the specific fine-tuning dataset, which might affect the model's generalization to new, unseen data.
Specific tools and libraries like Hugging Face's Transformers or TensorFlow's Keras API often provide convenient interfaces for fine-tuning popular LLMs. Always refer to the documentation provided by the model's developers for guidelines specific to the model you are using.
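The steps listed above (loss function, optimization, training, validation) can be shown on a deliberately tiny stand-in: a two-parameter linear model instead of an LLM. Everything in this sketch is an assumption for illustration, including the "pretrained" starting values, the synthetic data and the hyperparameters.

```python
import random


def mse(params, data):
    """Task-specific loss: mean squared error of y = w*x + b."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def fine_tune(params, train, val, lr=0.05, epochs=100, seed=0):
    """SGD from pretrained params; keep the best params seen on validation."""
    rng = random.Random(seed)
    w, b = params
    best_loss, best_params = mse((w, b), val), (w, b)
    for _ in range(epochs):
        batch = train[:]
        rng.shuffle(batch)
        for x, y in batch:
            err = w * x + b - y
            # Gradient of the squared error with respect to w and b.
            w -= lr * 2 * err * x
            b -= lr * 2 * err
        val_loss = mse((w, b), val)
        if val_loss < best_loss:  # monitor validation loss every epoch
            best_loss, best_params = val_loss, (w, b)
    return best_params


# Synthetic task data following y = 3x + 1, split into train and validation.
data = [(i * 0.25, 3 * (i * 0.25) + 1) for i in range(9)]
train, val = data[::2], data[1::2]

pretrained = (2.0, 0.0)  # hypothetical starting point from "pre-training"
tuned = fine_tune(pretrained, train, val)
```

Returning the parameters with the lowest validation loss, rather than the last ones, is a simple guard against overfitting to the fine-tuning set.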