# Section 2: Parameters and Getting Data - calisley/dpi-681 GitHub Wiki

## Simple Text Analysis with OpenAI's API
The goal of this activity is to get you comfortable with using prompt engineering in your code and working with JSON output. If you get stuck, try asking ChatGPT first! It will likely give you a faster response than I can.
## Getting New Code

At the start of every section, we need to update our code base with the new files I have created. To do so, open your dpi-681 folder in a terminal and run:

```shell
git pull
```
## 1. Setting Up Your API Key
Before you can run the script, you need to set your OpenAI API key:
- Locate the API Key: Log into your OpenAI account and navigate to your API keys section.
- Insert the API Key: Replace the empty string in the script with your API key.
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY_HERE"  # Insert your API key here
)
```
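Hard-coding your key works for this exercise, but a common alternative is to read it from an environment variable so the key never ends up committed to your repo. A minimal sketch, assuming you have set `OPENAI_API_KEY` in your shell:

```python
import os

# Read the key from the environment; falls back to an empty string if unset.
# You would then pass it to the client: client = OpenAI(api_key=api_key)
api_key = os.environ.get("OPENAI_API_KEY", "")

if not api_key:
    print("Warning: OPENAI_API_KEY is not set")
```

This way the same script runs on any machine without editing the source.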
## 2. Understanding the Script Structure

### Script Overview
- Reading the Article: The script reads an article from a file. The 'reading' is done here:

  ```python
  # Read the article content from the file
  with open('./section-2/article.txt', 'r', encoding='utf-8') as file:
      article_content = file.read()
  ```
- Chat Completion Request: It sends the article to the OpenAI model to generate a JSON representation. You should be familiar with what this looks like.
### System vs. User Prompts

- System Prompt: Defines the behavior of the LLM prior to generating the response. A good place to tell the model (for example) "You always return JSON objects."
- User Prompt: The actual message that includes the article content along with a command telling the bot what to do.
Hint: You might want to use both the system prompt (to define the expected JSON structure) and the user prompt (to include the article content).
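Following that hint, the JSON structure can be pinned down in the system prompt and the model's reply parsed with Python's `json` module. A sketch of the idea; the exact keys and the sample reply below are illustrative, not part of the assignment:

```python
import json

# System prompt that fixes the JSON schema the model should return.
system_prompt = """
You are an expert content analyzer. Return ONLY a JSON object with exactly
these keys: "slant" (one of "left", "neutral", "right") and "summary" (a string).
"""

# A reply shaped like what the model might send back.
sample_reply = '{"slant": "neutral", "summary": "A report on local elections."}'

data = json.loads(sample_reply)  # raises an error if the reply is not valid JSON
print(data["slant"])  # -> neutral
```

Spelling out the exact keys in the system prompt makes the output far easier to parse programmatically.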
## 3. Prompt Engineering
Leverage the skills we learned in class this week to customize the system prompt to clearly define what you expect in the output. For instance:
```python
system_prompt = """
You are an expert content analyzer. Your task is to read the given article and
return a classification of the article's political slant from the choices
"left", "neutral", and "right".
"""
```
### Modifying the Message Array
Ensure that both the system and user messages are included in your request. For example:
```python
completion = client.chat.completions.create(
    model="gpt-4o",  # or the appropriate model
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Analyze the article as instructed:\n" + article_content}
    ],
    # You can also adjust additional parameters here if needed
)
```
Tip: The clearer your instructions in the system prompt, the better your data will be!
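Once the request returns, the model's text lives in `completion.choices[0].message.content`. Models sometimes wrap JSON output in Markdown code fences, so it can be worth stripping those before parsing. A sketch using a stand-in string in place of a live API response:

```python
import json

# Stand-in for completion.choices[0].message.content from a real request.
raw = '```json\n{"slant": "left"}\n```'

# Strip optional Markdown code fences before parsing.
cleaned = raw.strip()
if cleaned.startswith("```"):
    cleaned = cleaned.split("\n", 1)[1]    # drop the opening fence line
    cleaned = cleaned.rsplit("```", 1)[0]  # drop the closing fence
result = json.loads(cleaned)
print(result["slant"])  # -> left
```

If you prefer to avoid this cleanup step entirely, telling the model "return raw JSON with no code fences" in the system prompt often helps.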
## 4. Parameter Tuning

### Model and Parameters
- Model Selection: Ensure you're using a model that supports your required functionality (e.g., gpt-4o, or any other version you have access to).
- Temperature and Max Tokens: If your output needs to be more precise, consider lowering the temperature to make the output more deterministic. You might also want to adjust the maximum tokens if your JSON output is large.
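One way to keep these settings visible in one place is to collect them in a dict and unpack it into the request with `client.chat.completions.create(**request_kwargs, messages=...)`. A sketch; the specific values below are illustrative, not required by the assignment:

```python
# Keyword arguments you might unpack into the chat completion request.
request_kwargs = {
    "model": "gpt-4o",   # a model you have access to
    "temperature": 0,    # 0 makes output as deterministic as the API allows
    "max_tokens": 500,   # raise this if your JSON output gets cut off
}
```

Grouping the parameters like this also makes it easy to log exactly which settings produced which output while you experiment.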
### Testing and Iteration
- Test the Script: Run your script and examine the JSON output. If it's not as expected, adjust the system prompt or parameters.
- Iterate: Experiment with different prompts and settings until the JSON structure aligns with your needs.
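The test-and-adjust loop above can also be automated for malformed replies: if `json.loads` fails, simply request another completion. A sketch with a hypothetical `get_reply()` standing in for one API round trip:

```python
import json

def get_reply(attempt):
    # Hypothetical stand-in for one API call; a real version would call
    # client.chat.completions.create(...) and return the message content.
    replies = ["not json at all", '{"slant": "right"}']
    return replies[min(attempt, len(replies) - 1)]

result = None
for attempt in range(3):  # retry a few times before giving up
    try:
        result = json.loads(get_reply(attempt))
        break
    except json.JSONDecodeError:
        continue  # in a real script: re-request, or tighten the prompt

print(result)  # -> {'slant': 'right'}
```

A retry cap matters here: without it, a persistently chatty model would loop forever.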
**Note!** Small tweaks in the prompt wording can make a **big** difference in the output.