Developing Apps with OpenAI

Prompts

Prompts are not specific to the OpenAI API but are the entry point for all LLMs. Prompts are the input text that you send to the model, and they are used to instruct the model on the specific task you want it to perform. For the ChatGPT and GPT-4 models, prompts have a chat format, with the input and output messages stored in a list.

Tokens

Tokens are words or parts of words. A rough estimate is that 100 tokens equal approximately 75 words for an English text. Requests to the OpenAI models are priced based on the number of tokens used: that is, the cost of a call to the API depends on the length of both the input text and the output text.
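Because billing depends on token counts, it can be useful to count tokens locally before sending a request. Below is a minimal sketch using OpenAI's tiktoken library (an assumption here: tiktoken is not used elsewhere on this page and must be installed separately with pip install tiktoken):

# Sketch: estimating token counts locally with OpenAI's tiktoken library.
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
text = "Tokens are words or parts of words."
token_ids = encoding.encode(text)
print(len(token_ids))              # number of tokens this text would consume
print(encoding.decode(token_ids))  # round-trips back to the original text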

Parameters

Temperature

Controls randomness. Lowering it results in less random completions. As the temperature approaches zero, the model will become deterministic and repetitive.

Explanations: ["temperature" is a parameter that controls the randomness (the balance between creativity and predictability) of the generated output.

  1. Higher values of temperature allow for more randomness, resulting in diverse and creative completions.
  2. Lower values reduce randomness, make the output more deterministic and focused.
  3. Approaching Zero Makes the Model Deterministic ([dɪˌtɜːrmɪˈnɪstɪk], 不可逆转,不可抗拒的) and Repetitive ([rɪˈpetətɪv], 重复的): As you decrease the temperature to near zero, the model becomes more deterministic. It tends to choose the most likely option consistently, leading to more repetitive and predictable output.]
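As an illustration, here is a minimal sketch that sends the same prompt at a low and a high temperature; it assumes a client configured as in the chat example later on this page:

# Sketch: the same prompt at two temperatures.
# Assumes `client` is an OpenAI client set up as in the chat example below.
for temperature in (0.0, 1.2):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Suggest a name for a coffee shop."}],
        temperature=temperature,  # 0.0 -> near-deterministic, 1.2 -> more varied
    )
    print(temperature, response.choices[0].message.content)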

Maximum length

The maximum number of tokens to generate, shared between the prompt and the completion. One token is roughly 4 characters for standard English text.

Explanations: [This limit is shared between the initial prompt and the subsequent completion. Both the provided prompt and the generated text together contribute to this token limit.]

Stop sequences

Up to four sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

Explanations: [Up to Four Sequences: The API allows you to specify up to four sequences of text that, when encountered during text generation, will signal the model to stop generating further tokens.

API Stops Generating Further Tokens: When the language model encounters any of the specified sequences during text generation, it will halt the generation process. This is useful for controlling the length of the generated text.

Returned Text Excludes the Stop Sequence: The text returned by the API will not include the specified stop sequence. The stopping mechanism prevents the model from including the sequence itself in the output.

For example, if you set the stop sequences to ["STOP"], and the model encounters that sequence during generation, it will stop, and the returned text will not include "STOP". This helps in creating more controlled and specific outputs.]
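A minimal sketch of the stop parameter, again assuming the client from the chat example below:

# Sketch: stop generation as soon as the model emits "STOP".
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Count from 1 to 5, then write STOP."}],
    stop=["STOP"],  # up to four stop sequences may be given
)
# The returned text ends before the stop sequence; "STOP" itself is excluded.
print(response.choices[0].message.content)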

Top P - [0,1]

Controls diversity via nucleus sampling: 0.5 means half of all likelihood-weighted options are considered.

Explanations: [Controls Diversity via Nucleus Sampling: Nucleus sampling is a method for generating diverse and varied outputs from a language model. It achieves this by considering only a subset (the nucleus) of the most likely options at each step of text generation.

0.5 Means Half of All Likelihood-Weighted Options Are Considered: A value of 0.5 sets the probability cutoff for selecting options during nucleus sampling. In this case, only the smallest set of tokens whose cumulative likelihood reaches 50% of the probability mass will be considered at each step of the generation process.]

Frequency penalty

How much to penalize new tokens based on their existing frequency in the text so far. Decreases the model's likelihood to repeat the same line verbatim.

Explanations: [Penalize New Tokens Based on Existing Frequency: The model considers the frequency of tokens that have already been generated in the text up to a given point. New tokens are penalized based on how frequently they have appeared in the text so far.

Decreases the Model's Likelihood to Repeat the Same Line Verbatim: By penalizing new tokens based on their existing frequency, the model is discouraged from generating the same words or phrases repeatedly. This helps reduce verbatim repetition in the generated text, making the output more diverse and coherent.

Higher penalization results in less repetition, while lower penalization allows for more repetition in the generated output.]

Presence penalty

How much to penalize new tokens based on whether they appear in the text so far. Increases the model's likelihood to talk about new topics.

Explanations: [Adjusting this parameter allows users to control how much the model penalizes the repetition of tokens, thereby influencing the model's tendency to explore new topics and generate varied content. A higher penalty steers the model toward generating more diverse text, fostering the exploration of different ideas or subjects.]
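Maximum length, top_p, and the two penalties are all optional arguments of the same chat completion call. A combined sketch follows, assuming the same client as in the chat example below; the values are illustrative, not recommendations:

# Sketch: passing the sampling parameters discussed above in one call.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a short product description for a pen."}],
    max_tokens=150,         # cap on the completion; prompt + completion must fit the context window
    top_p=0.5,              # nucleus sampling: only the top 50% of probability mass
    frequency_penalty=0.5,  # discourage tokens that already appear often
    presence_penalty=0.5,   # encourage moving on to new topics
)
print(response.choices[0].message.content)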

OpenAI API and Apps

https://github.com/openai/openai-python
https://platform.openai.com/docs/api-reference/

  • %pip install openai
  • %pip install openai --upgrade

Using ChatGPT and GPT-4

Both ChatGPT and GPT-4 use the same chat completions endpoint: client.chat.completions.create in current versions of the openai Python library (openai.ChatCompletion in versions before 1.0). The GPT-3.5 Turbo and GPT-4 models are optimized for chat sessions.


import os
from openai import OpenAI

# Read the API key from the environment instead of hardcoding it.
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful teacher",
        },
        {
            "role": "user",
            "content": "Are there any other measurements than time complexity for an algorithm?"
        },
        {
            "role": "assistant",
            "content": "Yeah, measurements like space complexity"
        },
        {
            "role": "user",
            "content": "what is it"
        }
    ],
    model="gpt-3.5-turbo",
)

print(chat_completion.choices[0].message.content)

# Space complexity measures the amount of memory or space required by an algorithm to run. It is an important consideration when analyzing the efficiency and scalability of an algorithm. The space complexity of an algorithm is typically expressed in terms of the amount of additional space or memory the algorithm needs in relation to the size of the input. It helps in determining how much memory resources an algorithm consumes and how this usage may grow as the input size increases. Similar to time complexity, space complexity is often expressed using Big O notation.

The conversation format of the input messages allows multiple exchanges to be sent to the model. Note that the API does not store previous messages between calls. The question "what is it" refers to the previous answer and only makes sense if the model has knowledge of that answer. The entire conversation must therefore be sent on each call to simulate a chat session.

Explanations: When using the OpenAI API for chat-based interactions, you need to provide the complete conversation history in each request so that the model has the necessary context for generating meaningful responses within a chat session. The model doesn't inherently retain information about past responses.
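A minimal sketch of a multi-turn loop that resends the full history on every call, using the client configured above:

# Sketch: simulate a chat session by resending the whole history each turn.
messages = [{"role": "system", "content": "You are a helpful teacher"}]
for user_input in ["What is time complexity?", "And space complexity?"]:
    messages.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        model="gpt-3.5-turbo", messages=messages
    )
    answer = response.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})  # keep the context
    print(answer)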


Input Options for the Chat Completion Endpoint

Output Result Format for the Chat Completion Endpoint


From Text Completions to Functions

Function
  • name: String (required)
  • description: String
  • parameters: Object - described in a JSON schema
Code example

Instead of crafting an elaborate prompt to ensure that the model answers in a specific format, you can use a function definition to convert natural language into API calls or SQL queries. The example below walks through three steps:

  1. NLP query -> SQL query
  2. Use SQL to query a database -> results
  3. Generate an NLP answer based on the above results

# NLP query -> SQL query
functions = [
    {
        "name": "find_product",
        "description": "Returns the top products from a SQL query",
        "parameters": {
            "type": "object",
            "properties": {
                "sql_query": {
                    "type": "string",
                    "description": "A SQL query",
                }
            },
            "required": ["sql_query"],
        }
    }
]

user_question = "I need the top 3 products where the price is less than 2.00"
messages = [{"role": "user", "content": user_question}]
response = client.chat.completions.create(
    model="gpt-3.5-turbo-0613", messages=messages, functions=functions
)
response_message = response.choices[0].message
print(response_message)
messages.append(response_message) # response_message is a ChatCompletionMessage object, append it to messages.
# ChatCompletionMessage(content=None, role='assistant', function_call=FunctionCall(arguments='{\n  "sql_query": "SELECT * FROM products WHERE price < 2.00 ORDER BY price ASC LIMIT 3"\n}', name='find_product'), tool_calls=None)

# Use SQL to query a database -> results
def find_product(sql_query):
    # Execute sql_query against the database; the results below are fake data.
    results = [
        {"name":"pen", "color":"blue", "price":1.99},
        {"name":"pen", "color":"red", "price":1.78},
        {"name":"pen", "color":"orange", "price":1.78},
    ]
    return results

import json
function_args = json.loads(response_message.function_call.arguments)
sql_query = function_args["sql_query"]  # extract the SQL string from the parsed arguments
products = find_product(sql_query)
print(products)
# [{'name': 'pen', 'color': 'blue', 'price': 1.99}, {'name': 'pen', 'color': 'red', 'price': 1.78}, {'name': 'pen', 'color': 'orange', 'price': 1.78}]

# Generate NLP results based on above results
messages.append(
    {"role":"function", "name":"find_product", "content":json.dumps(products)}
)
response = client.chat.completions.create(
    model="gpt-3.5-turbo-0613", messages=messages, functions=functions
)
print(response.choices[0].message)
# ChatCompletionMessage(content='The top 3 products where the price is less than $2.00 are:\n\n1. Pen (Blue) - $1.99\n2. Pen (Red) - $1.78\n3. Pen (Orange) - $1.78', role='assistant', function_call=None, tool_calls=None)
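Note that the model is not guaranteed to request a function call. A defensive sketch that checks the first response before executing anything, reusing the response_message variable from the first call above:

# Sketch: only execute the SQL helper if the model actually requested it.
# response_message is the message returned by the first call above.
if response_message.function_call is not None:
    args = json.loads(response_message.function_call.arguments)
    products = find_product(args["sql_query"])
else:
    # The model answered directly in natural language instead.
    print(response_message.content)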