# Section 2: Parameters and Getting Data - calisley/dpi-681 GitHub Wiki

## Simple Text Analysis with OpenAI's API
The goal of this activity is to get you comfortable with using prompt engineering in your code and working with JSON output. If you get stuck, try asking ChatGPT first! It will likely give you a faster response than I can.
## Getting New Code

At the start of every section, we need to update our code base with the new files I have created. To do so, open your dpi-681 folder in a terminal and run:

```shell
git pull
```
## 1. Setting Up Your API Key
Before you can run the script, you need to set your OpenAI API key:
- Locate the API Key: Log into your OpenAI account and navigate to your API keys section.
- Insert the API Key: Replace the empty string in the script with your API key.
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY_HERE"  # Insert your API key here
)
```
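Hard-coding your key works for this exercise, but a common alternative is to read it from an environment variable so the key never ends up committed to your repo. A minimal sketch, assuming you have set `OPENAI_API_KEY` in your shell:

```python
import os

# Read the key from the environment; falls back to an empty string if unset.
# You would then pass it to the client: client = OpenAI(api_key=api_key)
api_key = os.environ.get("OPENAI_API_KEY", "")

if not api_key:
    print("Warning: OPENAI_API_KEY is not set")
```

This way the same script runs on any machine without editing the source.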
## 2. Understanding the Script Structure

### Script Overview
- Reading the Article: The script reads an article from a file. The 'reading' is done here:

  ```python
  # Read the article content from the file
  with open('./section-2/article.txt', 'r', encoding='utf-8') as file:
      article_content = file.read()
  ```
- Chat Completion Request: It sends the article to the OpenAI model to generate a JSON representation. You should be familiar with what this looks like.
### System vs. User Prompts

- System Prompt: Defines the behavior of the LLM prior to generating the response. A good place to tell the model (for example) "You always return JSON objects."
- User Prompt: The actual message that includes the article content along with a command telling the bot what to do.
Hint: You might want to use both the system prompt (to define the expected JSON structure) and the user prompt (to include the article content).
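Following that hint, the JSON structure can be pinned down in the system prompt and the model's reply parsed with Python's `json` module. A sketch of the idea; the exact keys and the sample reply below are illustrative, not part of the assignment:

```python
import json

# System prompt that fixes the JSON schema the model should return.
system_prompt = """
You are an expert content analyzer. Return ONLY a JSON object with exactly
these keys: "slant" (one of "left", "neutral", "right") and "summary" (a string).
"""

# A reply shaped like what the model might send back.
sample_reply = '{"slant": "neutral", "summary": "A report on local elections."}'

data = json.loads(sample_reply)  # raises an error if the reply is not valid JSON
print(data["slant"])  # -> neutral
```

Spelling out the exact keys in the system prompt makes the output far easier to parse programmatically.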
## 3. Prompt Engineering
Leverage the skills we learned in class this week to customize the system prompt to clearly define what you expect in the output. For instance:
```python
system_prompt = """
You are an expert content analyzer. Your task is to read the given article and
return a classification of the article's political slant from the choices
"left", "neutral", and "right".
"""
```
### Modifying the Message Array
Ensure that both the system and user messages are included in your request. For example:
```python
completion = client.chat.completions.create(
    model="gpt-4o",  # or the appropriate model
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Analyze the article as instructed:\n" + article_content}
    ],
    # You can also adjust additional parameters here if needed
)
```
Tip: The clearer your instructions in the system prompt, the better your data will be!
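Once the request returns, the model's text lives in `completion.choices[0].message.content`. Models sometimes wrap JSON output in Markdown code fences, so it can be worth stripping those before parsing. A sketch using a stand-in string in place of a live API response:

```python
import json

# Stand-in for completion.choices[0].message.content from a real request.
raw = '```json\n{"slant": "left"}\n```'

# Strip optional Markdown code fences before parsing.
cleaned = raw.strip()
if cleaned.startswith("```"):
    cleaned = cleaned.split("\n", 1)[1]    # drop the opening fence line
    cleaned = cleaned.rsplit("```", 1)[0]  # drop the closing fence
result = json.loads(cleaned)
print(result["slant"])  # -> left
```

If you prefer to avoid this cleanup step entirely, telling the model "return raw JSON with no code fences" in the system prompt often helps.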
## 4. Parameter Tuning

### Model and Parameters
- Model Selection: Ensure you're using a model that supports your required functionality (e.g., gpt-4o, or any other version you have access to).
- Temperature and Max Tokens: If your output needs to be more precise, consider lowering the temperature to make the output more deterministic. You might also want to adjust the maximum tokens if your JSON output is large.
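One way to keep these settings visible in one place is to collect them in a dict and unpack it into the request with `client.chat.completions.create(**request_kwargs, messages=...)`. A sketch; the specific values below are illustrative, not required by the assignment:

```python
# Keyword arguments you might unpack into the chat completion request.
request_kwargs = {
    "model": "gpt-4o",   # a model you have access to
    "temperature": 0,    # 0 makes output as deterministic as the API allows
    "max_tokens": 500,   # raise this if your JSON output gets cut off
}
```

Grouping the parameters like this also makes it easy to log exactly which settings produced which output while you experiment.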
### Testing and Iteration
- Test the Script: Run your script and examine the JSON output. If it's not as expected, adjust the system prompt or parameters.
- Iterate: Experiment with different prompts and settings until the JSON structure aligns with your needs.
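The test-and-adjust loop above can also be automated for malformed replies: if `json.loads` fails, simply request another completion. A sketch with a hypothetical `get_reply()` standing in for one API round trip:

```python
import json

def get_reply(attempt):
    # Hypothetical stand-in for one API call; a real version would call
    # client.chat.completions.create(...) and return the message content.
    replies = ["not json at all", '{"slant": "right"}']
    return replies[min(attempt, len(replies) - 1)]

result = None
for attempt in range(3):  # retry a few times before giving up
    try:
        result = json.loads(get_reply(attempt))
        break
    except json.JSONDecodeError:
        continue  # in a real script: re-request, or tighten the prompt

print(result)  # -> {'slant': 'right'}
```

A retry cap matters here: without it, a persistently chatty model would loop forever.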
**Note!** Small tweaks in the prompt wording can make a **big** difference in the output.