Cx Experience Analysis Tables (05) - Govarthan-Boopalan/Customer_Behaviour_Analysis GitHub Wiki
This script generates a customer experience analysis PDF report by querying a MySQL database and formatting the results using ReportLab. Below is a detailed breakdown of its components.
Table of Contents
Script Overview
Purpose
- Connects to a MySQL database using SQLAlchemy.
- Executes predefined SQL queries to analyze customer experience metrics.
- Generates a multi-page PDF report with tables and contextual explanations using ReportLab.
Features
- Multi-page PDF: Includes a title page, product/review analysis, retention metrics, and common complaints.
- Dynamic Data Fetching: Uses
pandas.read_sql
to execute queries and convert results to DataFrames. - Table Styling: Applies consistent formatting (colors, alignment, headers) to all tables.
- Error Handling: Gracefully disposes database connections on failure.
Flowchart
Start
│
├── Connect to MySQL via SQLAlchemy
│
├── Create PDF Document Structure
│ ├── Title Page (ShopEasy Branding, Date, Footnote)
│ └── Page Breaks for Multi-Section Layout
│
├── Execute SQL Queries
│ ├── Lowest-Rated Products
│ ├── Negative Reviews
│ ├── Customer Retention Metrics
│ └── Common Complaints
│
├── Convert Data to Styled Tables
│ ├── Add Headers & Descriptions
│ └── Apply TableStyle (Colors, Grids, Fonts)
│
├── Assemble PDF Elements
│ ├── KeepTogether (Prevent Content Splitting)
│ └── Add Final Footnote
│
├── Build and Save PDF
│
└── End
Code Explanation
Imports
import pandas as pd # Data manipulation
from sqlalchemy import create_engine # Database connection
from reportlab.lib.pagesizes import letter # PDF page size
from reportlab.platypus import ( # PDF components
SimpleDocTemplate, Table, TableStyle,
Paragraph, Spacer, PageBreak, KeepTogether, HRFlowable
)
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib import colors # Color definitions
Key Functions
create_pdf_report()
The core function driving the script:
-
Database Connection:
DB_URI = "mysql+mysqlconnector://root:456123@localhost/ShopEasy" engine = create_engine(DB_URI)
- Uses SQLAlchemy to connect to MySQL with credentials.
-
PDF Setup:
doc = SimpleDocTemplate(output_path, pagesize=letter) elements = [] # Holds all PDF content
- Initializes the PDF document and an empty list to store elements (tables, text).
-
Title Page:
- Adds a stylized title, subtitle, horizontal rule (
HRFlowable
), and footer usingParagraph
and custom styles.
- Adds a stylized title, subtitle, horizontal rule (
-
Query Execution:
with engine.connect() as connection: df = pd.read_sql(queries[key], connection)
- Executes SQL queries and loads results into DataFrames.
-
Table Creation:
data = [df.columns.tolist()] + df.values.tolist() table = Table(data) table.setStyle(...) # Applies styling rules
- Converts DataFrame to a ReportLab-compatible table format.
-
Content Assembly:
elements.append(KeepTogether([Paragraph(...), Spacer(...), table]))
KeepTogether
ensures headings, descriptions, and tables stay on the same page.
SQL Queries & Table Descriptions
Queries Dictionary
queries = {
"negative_reviews": "SELECT ... WHERE cr.Rating <= 2;",
"lowest_rated_products": "SELECT ... ORDER BY AvgRating ASC LIMIT 5;",
# Additional queries omitted for brevity
}
- Key Queries:
- Negative Reviews: Fetches reviews with ratings ≤ 2.
- Low Retention Rate: Uses a CTE to calculate customer retention metrics.
- Common Complaints: Aggregates frequent low-rated review texts.
Table Descriptions
table_descriptions = {
"lowest_rated_products": "This table lists the products that consistently receive...",
# Additional descriptions omitted
}
- Provides context for each table in the PDF report.
PDF Styling
Custom Paragraph Styles
styles.add(ParagraphStyle(
name="ReportTitle",
fontSize=36,
alignment=1, # Centered
fontName="Helvetica-Bold"
))
- Defines styles for titles, subtitles, footnotes, and body text.
Table Styling
table.setStyle(TableStyle([
('BACKGROUND', (0, 0), (-1, 0), colors.HexColor('#4F81BD')), # Header color
('TEXTCOLOR', (0, 0), (-1, 0), colors.whitesmoke), # Header text
('GRID', (0, 0), (-1, -1), 1, colors.black) # Grid lines
]))
- Applies alternating background colors and grid lines for readability.
Usage
- Prerequisites:
pip install pandas sqlalchemy mysql-connector-python reportlab
- Run the Script:
python 05_Cx_experience_Analysis_tables.py
- Output:
- A PDF named
Cx_Experience_Analysis_SQL_Tables.pdf
is saved to../results
.
- A PDF named
Improvement Suggestions
- Parameterization:
- Allow database credentials and output path to be configured via environment variables.
- Dynamic Queries:
- Add command-line arguments to customize query limits (e.g.,
LIMIT 5
→LIMIT 10
).
- Add command-line arguments to customize query limits (e.g.,
- Enhanced Error Logging:
- Log errors to a file instead of printing to the console.
- Additional Visuals:
- Include charts (e.g., retention rate pie charts) using ReportLab's drawing tools.
- Connection Pooling:
- Use SQLAlchemy's connection pooling for scalability.
Conclusion
This script automates the generation of customer experience reports by combining SQL data extraction with PDF formatting. Its modular structure allows for easy expansion, such as adding new queries or modifying table styles. By leveraging pandas
and ReportLab
, it efficiently bridges database analysis and stakeholder-friendly reporting.