Analysing Customer Sentiment with Microsoft Fabric AI Functions - ivinnyaraujo/dataengineer-datascience-python GitHub Wiki

In the fast-paced world of data science and engineering, integrating AI into data workflows for data enrichment has become a necessity rather than merely a strategic advantage. Microsoft Fabric's recently introduced suite of AI functions supports this by letting data professionals enrich and transform data using large language models (LLMs) with just a few lines of code. Whether classifying text or detecting sentiment, these functions simplify data workflows and help teams extract deeper insights from their data.

One particularly relevant use case for AI functions is the analysis of text data. Text often contains a wealth of information, from emotions and opinions to facts and context, but its unstructured nature makes it challenging to analyse at scale. AI functions help address this by extracting and categorising key elements, making the data more structured, interpretable, and actionable. For instance, ai.classify can be configured to assign custom labels to text inputs. Although it returns categorical outputs, these can be mapped to numerical values so that they feed into dashboards, KPIs, or statistical models for quantitative analysis.
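
As a rough illustration of mapping categorical outputs to numbers, the sketch below converts sentiment labels into scores and averages them per month. It is plain Python rather than Fabric code; the `LABEL_SCORES` scale, the `monthly_mean_score` helper, and the sample rows are all assumptions made up for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical label-to-score scale; the numeric values are illustrative only.
LABEL_SCORES = {"positive": 1.0, "neutral": 0.0, "mixed": 0.0, "negative": -1.0}


def monthly_mean_score(records):
    """Average the numeric score of each (month, label) record per month."""
    scores_by_month = defaultdict(list)
    for month, label in records:
        # Unknown labels are skipped rather than guessed at.
        if label in LABEL_SCORES:
            scores_by_month[month].append(LABEL_SCORES[label])
    return {month: mean(scores) for month, scores in scores_by_month.items()}


# Made-up sample rows, not taken from the article's dataset.
sample = [
    ("January", "positive"),
    ("January", "negative"),
    ("January", "positive"),
    ("February", "neutral"),
    ("February", "negative"),
]
monthly = monthly_mean_score(sample)
print(monthly)
```

A single score per month like this could sit next to other KPIs, although the choice of scale is a business decision rather than something the AI function provides.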

Customer Sentiment Analysis with ai.analyze_sentiment

To demonstrate the practical use of these functions, this article showcases the ai.analyze_sentiment function within a Microsoft Fabric notebook to evaluate customer satisfaction for a clothing store, based on text feedback collected throughout 2024.

Mock sample data enriched with categorical fields that AI functions derive from customer feedback. custom_satisfaction_comment is the original text; sentiment is the output of the AI sentiment analysis function; sentiment_category maps that output to custom categories that are more business-friendly.

By analysing sentiment across different months, I was able to group customer feedback into business-friendly categories ranging from 'Very Satisfied' to 'Not Satisfied'. The enriched data was written to a Lakehouse table and can be visualised in Power BI, enabling the business to monitor satisfaction trends and drive continuous improvement. The code is shown below, and the notebook can be accessed here.

### Customer Sentiment Analysis using AI Functions
### Author: Ivy Araujo (21/06/2025)
### Fabric Notebook (PySpark (Python))


# Import Required Libraries
from pyspark.sql.functions import when, col
import pandas as pd
import matplotlib.pyplot as plt

# AI Function Extensions for Spark DataFrames
from synapse.ml.spark.aifunc.DataFrameExtensions import AIFunctions
from synapse.ml.services.openai import OpenAIDefaults

# Set Default Configuration for OpenAI
defaults = OpenAIDefaults()
defaults.set_deployment_name("gpt-35-turbo-0125")

# Load Input Data (from Lakehouse)
df = spark.sql("SELECT * FROM Sandbox_DataScience_Test.custom_satisfaction_sentiment_analysis")
display(df)

# Sentiment Analysis using ai.analyze_sentiment
sentiment = df.ai.analyze_sentiment(
    input_col="custom_satisfaction_comment",
    output_col="sentiment"
)
display(sentiment)

# Map Sentiment Labels to Business-Friendly Categories
mapped_sentiment = sentiment.withColumn(
    "sentiment_category",
    when(col("sentiment") == "positive", "Very Satisfied")
    .when(col("sentiment") == "neutral", "Satisfied")
    .when(col("sentiment") == "mixed", "Neutral")
    .when(col("sentiment") == "negative", "Not Satisfied")
    .otherwise("N/A")
)

# Display Final Result
display(mapped_sentiment.select(
    "date",
    "custom_satisfaction_comment",
    "sentiment",
    "sentiment_category"
))

# Write Output to Lakehouse
mapped_sentiment.write \
    .format("delta") \
    .mode("overwrite") \
    .saveAsTable("Sandbox_DataScience_Test.processed_custom_satisfaction_sentiment_analysis")

# Preview Output Table
display(spark.sql("SELECT * FROM Sandbox_DataScience_Test.processed_custom_satisfaction_sentiment_analysis LIMIT 5"))

# [Optional] - Visualise Monthly Sentiment Distribution
# Convert Spark DataFrame to Pandas
pdf = mapped_sentiment.select("date", "sentiment_category").toPandas()

# Convert 'date' to datetime and extract month name
pdf['date'] = pd.to_datetime(pdf['date'], format='%d/%m/%Y')
pdf['Month'] = pdf['date'].dt.strftime('%B')

# Group by Month and sentiment_category and get counts
sentiment_counts = pdf.groupby(['Month', 'sentiment_category']).size().unstack(fill_value=0)

# Reorder months
month_order = ['January', 'February', 'March', 'April', 'May', 'June',
               'July', 'August', 'September', 'October', 'November', 'December']
sentiment_counts = sentiment_counts.reindex(month_order).dropna(how='all')

# Convert to percentage (i.e., row-wise normalization)
sentiment_percent = sentiment_counts.div(sentiment_counts.sum(axis=1), axis=0) * 100

# Plot normalised stacked bar chart (percentage per month)
sentiment_percent.plot(kind='bar', stacked=True, figsize=(12, 6), colormap='tab20')

plt.title('Sentiment Distribution per Month (as %)')
plt.xlabel('Month')
plt.ylabel('Percentage of Responses')
plt.xticks(rotation=45)
plt.legend(title='Sentiment')
plt.tight_layout()
plt.show()

By visualising sentiment over time, the business can identify which months show higher or lower customer satisfaction and investigate the underlying causes. For instance, lower satisfaction during peak seasons may indicate a need to hire additional staff or improve service processes. These insights support data-driven decisions that enhance both customer experience and operational efficiency. Moreover, this process can be fully automated within a data pipeline, allowing new feedback records to be continuously analysed for sentiment as they are added to the dataset, ensuring insights remain current, relevant, and actionable at all times.
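
One way to picture the automated pipeline is as an incremental step that scores only records it has not seen before. The sketch below shows that idea in plain Python. The `analyse_new_records` helper, the `id` and `comment` keys, and the `fake_analyse` stand-in for the model call are all hypothetical; a real Fabric pipeline would call the AI function on a Spark DataFrame instead.

```python
def analyse_new_records(records, processed_ids, analyse):
    """Score only records whose id has not been processed yet.

    records: iterable of dicts with hypothetical keys "id" and "comment".
    processed_ids: set of ids already scored; updated in place.
    analyse: callable returning a sentiment label for a comment.
    """
    results = []
    for record in records:
        if record["id"] in processed_ids:
            continue  # already scored in an earlier run
        results.append({**record, "sentiment": analyse(record["comment"])})
        processed_ids.add(record["id"])
    return results


def fake_analyse(comment):
    """Stand-in for the real model call, used only to keep the sketch runnable."""
    return "positive" if "love" in comment.lower() else "negative"


# Made-up feedback rows; record 1 is assumed to have been scored previously.
feedback = [
    {"id": 1, "comment": "Love the new jackets"},
    {"id": 2, "comment": "Queue was too long"},
]
seen = {1}
new_rows = analyse_new_records(feedback, seen, fake_analyse)
print(new_rows)
```

Tracking processed ids avoids paying for repeated model calls on feedback that already has a sentiment value.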

This bar chart shows the percentage distribution of customer sentiment across each month of 2024, based on customer text feedback collected from a clothing store. After classifying the text data into four sentiment categories, the results are visualised to better analyse and understand trends over time.
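
The percentages behind a chart like this come from a row-wise normalisation: each month's category counts are divided by that month's total. The arithmetic can be sketched on toy numbers in plain Python; the `to_row_percentages` helper and the counts are invented for illustration and are not the article's data.

```python
def to_row_percentages(counts):
    """Convert {month: {category: count}} into percentages per month."""
    percentages = {}
    for month, by_category in counts.items():
        total = sum(by_category.values())
        if total == 0:
            continue  # no responses that month, so there is nothing to normalise
        percentages[month] = {
            category: 100 * count / total
            for category, count in by_category.items()
        }
    return percentages


# Made-up counts, not taken from the article's dataset.
counts = {
    "January": {"Very Satisfied": 6, "Satisfied": 2, "Not Satisfied": 2},
    "February": {"Very Satisfied": 1, "Satisfied": 1, "Not Satisfied": 2},
}
percent = to_row_percentages(counts)
print(percent["January"])
```

Normalising per month makes months with very different response volumes comparable, at the cost of hiding how many responses each bar represents.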

Enriching text data with AI functions in Microsoft Fabric isn’t just about automation; it unlocks a new level of analytics. By embedding capabilities like sentiment analysis directly into data workflows, teams can transform complex, unstructured feedback into structured, analysable fields. This makes it easier to categorise information, uncover trends, and generate insights that are ready to support data-driven decisions across the business.