Cx Experience Analysis Tables (05) - Govarthan-Boopalan/Customer_Behaviour_Analysis GitHub Wiki

This script generates a customer experience analysis PDF report by querying a MySQL database and formatting the results using ReportLab. Below is a detailed breakdown of its components.


Table of Contents

  1. Script Overview
  2. Flowchart
  3. Code Explanation
  4. Usage
  5. Improvement Suggestions

Script Overview

Purpose

  • Connects to a MySQL database using SQLAlchemy.
  • Executes predefined SQL queries to analyze customer experience metrics.
  • Generates a multi-page PDF report with tables and contextual explanations using ReportLab.

Features

  • Multi-page PDF: Includes a title page, product/review analysis, retention metrics, and common complaints.
  • Dynamic Data Fetching: Uses pandas.read_sql to execute queries and convert results to DataFrames.
  • Table Styling: Applies consistent formatting (colors, alignment, headers) to all tables.
  • Error Handling: Gracefully disposes database connections on failure.

Flowchart

Start
  │
  ├── Connect to MySQL via SQLAlchemy
  │
  ├── Create PDF Document Structure
  │   ├── Title Page (ShopEasy Branding, Date, Footnote)
  │   └── Page Breaks for Multi-Section Layout
  │
  ├── Execute SQL Queries
  │   ├── Lowest-Rated Products
  │   ├── Negative Reviews
  │   ├── Customer Retention Metrics
  │   └── Common Complaints
  │
  ├── Convert Data to Styled Tables
  │   ├── Add Headers & Descriptions
  │   └── Apply TableStyle (Colors, Grids, Fonts)
  │
  ├── Assemble PDF Elements
  │   ├── KeepTogether (Prevent Content Splitting)
  │   └── Add Final Footnote
  │
  ├── Build and Save PDF
  │
  └── End

Code Explanation

Imports

import pandas as pd                           # Data manipulation
from sqlalchemy import create_engine         # Database connection
from reportlab.lib.pagesizes import letter   # PDF page size
from reportlab.platypus import (             # PDF components
    SimpleDocTemplate, Table, TableStyle,
    Paragraph, Spacer, PageBreak, KeepTogether, HRFlowable
)
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib import colors             # Color definitions

Key Functions

create_pdf_report()

The core function driving the script:

  1. Database Connection:

    DB_URI = "mysql+mysqlconnector://root:456123@localhost/ShopEasy"
    engine = create_engine(DB_URI)
    
    • Uses SQLAlchemy to connect to MySQL with credentials.
  2. PDF Setup:

    doc = SimpleDocTemplate(output_path, pagesize=letter)
    elements = []  # Holds all PDF content
    
    • Initializes the PDF document and an empty list to store elements (tables, text).
  3. Title Page:

    • Adds a stylized title, subtitle, horizontal rule (HRFlowable), and footer using Paragraph and custom styles.
  4. Query Execution:

    with engine.connect() as connection:
        df = pd.read_sql(queries[key], connection)
    
    • Executes SQL queries and loads results into DataFrames.
  5. Table Creation:

    data = [df.columns.tolist()] + df.values.tolist()
    table = Table(data)
    table.setStyle(...)  # Applies styling rules
    
    • Converts DataFrame to a ReportLab-compatible table format.
  6. Content Assembly:

    elements.append(KeepTogether([Paragraph(...), Spacer(...), table]))
    
    • KeepTogether ensures headings, descriptions, and tables stay on the same page.

SQL Queries & Table Descriptions

Queries Dictionary

queries = {
    "negative_reviews": "SELECT ... WHERE cr.Rating <= 2;",
    "lowest_rated_products": "SELECT ... ORDER BY AvgRating ASC LIMIT 5;",
    # Additional queries omitted for brevity
}
  • Key Queries:
    • Negative Reviews: Fetches reviews with ratings ≤ 2.
    • Low Retention Rate: Uses a CTE to calculate customer retention metrics.
    • Common Complaints: Aggregates frequent low-rated review texts.

Table Descriptions

table_descriptions = {
    "lowest_rated_products": "This table lists the products that consistently receive...",
    # Additional descriptions omitted
}
  • Provides context for each table in the PDF report.

PDF Styling

Custom Paragraph Styles

styles.add(ParagraphStyle(
    name="ReportTitle",
    fontSize=36,
    alignment=1,  # Centered
    fontName="Helvetica-Bold"
))
  • Defines styles for titles, subtitles, footnotes, and body text.

Table Styling

table.setStyle(TableStyle([
    ('BACKGROUND', (0, 0), (-1, 0), colors.HexColor('#4F81BD')),  # Header color
    ('TEXTCOLOR', (0, 0), (-1, 0), colors.whitesmoke),           # Header text
    ('GRID', (0, 0), (-1, -1), 1, colors.black)                  # Grid lines
]))
  • Applies alternating background colors and grid lines for readability.

Usage

  1. Prerequisites:
    pip install pandas sqlalchemy mysql-connector-python reportlab
    
  2. Run the Script:
    python 05_Cx_experience_Analysis_tables.py
    
  3. Output:
    • A PDF named Cx_Experience_Analysis_SQL_Tables.pdf is saved to ../results.

Improvement Suggestions

  1. Parameterization:
    • Allow database credentials and output path to be configured via environment variables.
  2. Dynamic Queries:
    • Add command-line arguments to customize query limits (e.g., LIMIT 5LIMIT 10).
  3. Enhanced Error Logging:
    • Log errors to a file instead of printing to the console.
  4. Additional Visuals:
    • Include charts (e.g., retention rate pie charts) using ReportLab's drawing tools.
  5. Connection Pooling:
    • Use SQLAlchemy's connection pooling for scalability.

Conclusion

This script automates the generation of customer experience reports by combining SQL data extraction with PDF formatting. Its modular structure allows for easy expansion, such as adding new queries or modifying table styles. By leveraging pandas and ReportLab, it efficiently bridges database analysis and stakeholder-friendly reporting.