How to create config with GenAI - grishasen/proof_of_value GitHub Wiki
GenAI Application Config Generator
Overview
The GenAI config page provides an interactive interface to:
- Upload an Interaction History (IH) sample (Parquet/JSON/ZIP/GZIP)
- Inspect its schema and data
- Auto-generate a tailored configuration file via an LLM
- Download the generated config for use in the Value Dashboard pipeline
Outcomes
- Interactive Exploration: Quickly understand your IH dataset’s schema & quality.
- Automated Configs: Eliminate manual editing of complex config templates; the LLM crafts a tailored configuration.
- Iterative Refinement: Tweak the prompt or rerun with new data samples to refine pipeline settings.
- Pipeline-Ready Output: Generated configs slot directly into the Value Dashboard pipeline, ready for reporting.
Workflow
- Template Load & API Key Setup
  - Reads a TOML template (`config_template.toml`) from disk.
  - Sidebar: prompts for an OpenAI API key (or falls back to the `OPENAI_API_KEY` env var).
  - Lets the user select a supported chat model (e.g. `gpt-4o-mini`).
- Dataset Upload & Pre-Processing
  - User uploads a data file (ZIP/Parquet/JSON/GZIP).
  - Columns are renamed to “Title Case” for consistency.
  - Missing “extension” columns from the template are added or filled with defaults.
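As a rough illustration of the pre-processing step, the sketch below works on plain Python dictionaries rather than the Polars DataFrame the app uses. The exact renaming rule is not spelled out on this page, so `to_title_case` assumes one plausible reading (capitalize the first letter of each word and join the words); the column names and defaults are also hypothetical.

```python
import re


def to_title_case(name: str) -> str:
    """Capitalize the first letter of each word and join the words (assumed rule)."""
    parts = [p for p in re.split(r"[_\s\-]+", name) if p]
    return "".join(p[0].upper() + p[1:] for p in parts)


def normalize_rows(rows: list, extension_defaults: dict) -> list:
    """Rename columns, then add or fill every extension column with its default."""
    normalized = []
    for row in rows:
        renamed = {to_title_case(k): v for k, v in row.items()}
        for column, default in extension_defaults.items():
            # Covers both a missing column and a present-but-null value.
            if renamed.get(column) is None:
                renamed[column] = default
        normalized.append(renamed)
    return normalized


# Hypothetical sample row and extension defaults.
rows = [{"outcome_time": "20240105T101500.000 GMT", "channel": "Web", "placement": None}]
print(normalize_rows(rows, {"Placement": "N/A", "Revenue": 0.0}))
```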
- Feature Engineering & Cleanup
  - Parses timestamp columns (`OutcomeTime`, `DecisionTime`) into Polars `Datetime`.
  - Derives new fields:
    - Day, Month, Year, Quarter from `OutcomeDateTime`
    - ResponseTime = time delta between decision and outcome
  - Drops irrelevant or redundant columns (IDs, labels, metadata).
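The derived fields above can be illustrated with the standard library alone. This sketch is not the app's Polars implementation, and it makes several assumptions: timestamps follow a Pega-style layout such as `20240105T101500.000 GMT`, the date parts are stored as integers, ResponseTime is expressed in seconds, and the dropped column name is hypothetical.

```python
from datetime import datetime, timezone
from typing import Iterable

# Assumed timestamp layout, e.g. "20240105T101500.000 GMT".
IH_TIME_FORMAT = "%Y%m%dT%H%M%S.%f GMT"


def parse_ih_time(value: str) -> datetime:
    """Parse an IH timestamp string into a timezone-aware UTC datetime."""
    return datetime.strptime(value, IH_TIME_FORMAT).replace(tzinfo=timezone.utc)


def derive_features(row: dict, drop: Iterable[str] = ()) -> dict:
    """Add date parts and ResponseTime, then remove unwanted columns."""
    outcome = parse_ih_time(row["OutcomeTime"])
    decision = parse_ih_time(row["DecisionTime"])
    enriched = {
        **row,
        "OutcomeDateTime": outcome,
        "Day": outcome.day,
        "Month": outcome.month,
        "Year": outcome.year,
        "Quarter": (outcome.month - 1) // 3 + 1,
        # Time elapsed between the decision and its outcome, in seconds.
        "ResponseTime": (outcome - decision).total_seconds(),
    }
    dropped = set(drop)
    return {k: v for k, v in enriched.items() if k not in dropped}
```

A call such as `derive_features(row, drop={"InteractionID"})` would return the enriched row without the (hypothetical) identifier column.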
- Schema & Data Summary
  - Builds and displays a schema table with unique counts.
  - Shows overall DataFrame summary statistics.
  - Offers an expandable sample-data view.
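A schema table with unique counts can be approximated in a few lines of plain Python. The sketch below is an illustration only: `build_schema_table` and its output keys (`Column`, `Type`, `Unique`) are hypothetical names, and the app itself builds this view from a Polars DataFrame.

```python
def build_schema_table(rows: list) -> list:
    """Return one entry per column with its observed type(s) and unique count."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            info = columns.setdefault(name, {"types": set(), "values": set()})
            if value is not None:  # nulls do not count towards type or uniqueness
                info["types"].add(type(value).__name__)
                info["values"].add(value)  # values are assumed to be hashable scalars
    return [
        {
            "Column": name,
            "Type": "/".join(sorted(info["types"])) or "null",
            "Unique": len(info["values"]),
        }
        for name, info in columns.items()
    ]
```

The unique counts matter beyond display: the grouping rule in the next step depends on them.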
- LLM-Driven Config Generation
  - Constructs a detailed prompt that:
    - Ingests the dataset schema, template config, and file name/type.
    - Instructs the LLM to map template reports, metrics, filters, and grouping keys to actual columns.
    - Specifies rules for grouping on categorical/string columns (unique values between 2 and 99), plus time dimensions.
  - On “Generate config” click:
    - Calls `OpenAI.chat_completion()` to produce a new config file.
    - Writes to a new file under `temp_configs/` with a UUID name.
    - Reloads the app configuration (`set_config`) to clear caches and apply the new file.
    - Presents a Download button for the user’s generated `config.toml`.
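The grouping rule, prompt assembly, and UUID-named output file can be sketched as below. This is a hedged illustration rather than the app's code: all function names are hypothetical, the 2 to 99 range is assumed to be inclusive, the prompt wording is invented for the example, and the LLM call itself is left out so the sketch runs without credentials.

```python
import uuid
from pathlib import Path

TIME_DIMENSIONS = ["Day", "Month", "Year", "Quarter"]


def grouping_candidates(schema: list, low: int = 2, high: int = 99) -> list:
    """String columns whose unique count lies in [low, high], plus time dimensions."""
    names = [
        c["Column"] for c in schema
        if c["Type"] == "str" and low <= c["Unique"] <= high
    ]
    present = {c["Column"] for c in schema}
    return names + [t for t in TIME_DIMENSIONS if t in present and t not in names]


def build_prompt(schema: list, template_text: str, file_name: str) -> str:
    """Combine file name, schema, grouping rule, and template into one prompt."""
    schema_lines = "\n".join(
        f"- {c['Column']} ({c['Type']}, {c['Unique']} unique)" for c in schema
    )
    groups = ", ".join(grouping_candidates(schema))
    return (
        f"Dataset file: {file_name}\n"
        f"Schema:\n{schema_lines}\n"
        f"Allowed grouping columns: {groups}\n"
        "Map the reports, metrics, filters, and grouping keys in the template "
        "below to the columns above and return a complete TOML config.\n"
        f"Template:\n{template_text}"
    )


def save_generated_config(config_text: str, out_dir: Path = Path("temp_configs")) -> Path:
    """Write the generated config to a new UUID-named file and return its path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{uuid.uuid4()}.toml"
    path.write_text(config_text, encoding="utf-8")
    return path
```

Using a fresh UUID per run means repeated generations never overwrite each other, which fits the tweak-and-rerun workflow described under Outcomes.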
Requirements
- Valid OpenAI API credentials (environment or input).
- A template config at `value_dashboard/config/config_template.toml`.
- Uploaded dataset must be a valid CDH IH export.
Summary
This one-page tool streamlines the end-to-end process of:
- Loading & inspecting IH datasets
- Deriving date/time features & metrics
- Auto-generating matching pipeline configs via GenAI
- Downloading and applying configurations for value reporting
It reduces the manual effort of maintaining report definitions and helps keep the dashboard pipeline aligned with the latest data schema.