Sunday, April 20 2025

Sentiment Analysis of Customer Reviews Using Snowflake Cortex & Coalesce.io

In the era of data-driven decision-making, understanding customer sentiment is crucial for businesses aiming to enhance products and services. Snowflake Cortex, with its powerful AI and machine learning capabilities, enables organizations to analyze text data like customer reviews directly within the Snowflake Data Cloud. When paired with Coalesce.io’s low-code data transformation platform, this process becomes accessible to data teams of all skill levels, streamlining the creation of AI-powered pipelines. In this blog, we explore a practice use case for training: performing sentiment analysis on customer reviews using Snowflake Cortex and Coalesce.io, leveraging a dataset of 1,020 reviews.

Use Case Overview

This practice use case simulates an e-commerce company analyzing customer feedback for five products: Wireless Headphones, Smartphone Case, Laptop Stand, Bluetooth Speaker, and USB Charger. The goal is to determine the sentiment (positive, negative, or neutral) of each review using Snowflake Cortex’s SENTIMENT function and build a scalable pipeline in Coalesce.io to automate the process. With a dataset of 1,020 reviews, this exercise is ideal for data analysts, engineers, or business intelligence professionals looking to practice AI-driven analytics.

Objectives

  • Use Snowflake Cortex to analyze the sentiment of customer reviews.
  • Build a low-code data pipeline in Coalesce.io to process and store results.
  • Generate actionable insights, such as identifying products with negative feedback.
  • Provide a hands-on training scenario for data teams.

Why Use Snowflake Cortex and Coalesce.io?

Snowflake Cortex offers fully managed AI functions, including SENTIMENT, which scores text from -1 (negative) to 1 (positive). By keeping data and computation within Snowflake, Cortex ensures security, governance, and performance without external dependencies.
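You can try the function directly in a worksheet before building anything; a minimal sanity check might look like this (the sample text is illustrative, and the exact score will vary):

SELECT SNOWFLAKE.CORTEX.SENTIMENT(
    'Amazing sound quality, best headphones I have ever owned!'
) AS SENTIMENT_SCORE;
-- Returns a value between -1 (negative) and 1 (positive); strongly positive text scores close to 1.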

Coalesce.io simplifies pipeline development with its drag-and-drop interface and pre-built nodes. The Cortex package in the Coalesce Marketplace provides ready-to-use nodes for functions like SENTIMENT, enabling rapid deployment without complex SQL coding.

Together, they empower teams to:

  • Build AI-driven pipelines quickly.
  • Democratize access to advanced analytics for non-technical users.
  • Scale workflows seamlessly with Snowflake’s compute power.

Setting Up the Environment

Prerequisites

  • A Snowflake account with Cortex functions enabled.
  • A Coalesce.io account connected to your Snowflake environment.
  • Permissions to create tables, load data, and use Cortex functions.
  • A dataset of 1,020 customer reviews (CUSTOMER_REVIEWS).

Dataset

The dataset contains 1,020 reviews across five products, stored in the TRAINING_DB.CUSTOMER_ANALYTICS.CUSTOMER_REVIEWS table with columns:

  • REVIEW_ID (1 to 1,020)
  • PRODUCT_NAME (e.g., Wireless Headphones)
  • REVIEW_TEXT (e.g., “Amazing sound quality!”)
  • REVIEW_DATE (April 1–30, 2025)

The reviews include a balanced mix of sentiments (~40% positive, 30% negative, 30% neutral).

Step-by-Step Guide: Building the Pipeline in Coalesce.io

Step 1: Prepare the Data

  • Create the Table in Snowflake (if not already done):

CREATE DATABASE IF NOT EXISTS TRAINING_DB;
CREATE SCHEMA IF NOT EXISTS CUSTOMER_ANALYTICS;
CREATE TABLE TRAINING_DB.CUSTOMER_ANALYTICS.CUSTOMER_REVIEWS (
    REVIEW_ID INT,
    PRODUCT_NAME STRING,
    REVIEW_TEXT STRING,
    REVIEW_DATE DATE
); 
  • Load the Dataset:
  1. Use the SQL INSERT statements generated by the Python script (see original use case) to populate the table with 1,020 records.
  2. Alternatively, upload a CSV file to a Snowflake stage and load it using COPY INTO.
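
If you take the CSV route, the load could look roughly like this (the stage name, file name, and file format options below are placeholders; adjust them to match your export):

CREATE OR REPLACE STAGE TRAINING_DB.CUSTOMER_ANALYTICS.REVIEWS_STAGE;

-- Upload the file from your machine first, e.g. via SnowSQL:
-- PUT file://customer_reviews.csv @TRAINING_DB.CUSTOMER_ANALYTICS.REVIEWS_STAGE;

COPY INTO TRAINING_DB.CUSTOMER_ANALYTICS.CUSTOMER_REVIEWS
FROM @TRAINING_DB.CUSTOMER_ANALYTICS.REVIEWS_STAGE/customer_reviews.csv
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '"');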

Step 2: Set Up Coalesce.io

  • Connect to Snowflake: In Coalesce.io, configure your Snowflake connection with the appropriate credentials.
1.png, Apr 2025

  • Install the Cortex Package: Navigate to the Coalesce Marketplace. Install the “Snowflake Cortex” package, which adds nodes for functions like SENTIMENT.

2.png, Apr 2025

 

  • Open a Workspace: Click the Launch button to open the development environment.

 

3.png, Apr 2025

 

Step 3: Build the Pipeline

  • Add a Source Node: Add a Source node to the pipeline canvas and connect it to the CUSTOMER_REVIEWS table (TRAINING_DB.CUSTOMER_ANALYTICS.CUSTOMER_REVIEWS).

 

4.png, Apr 2025

 

  • Add the Cortex Function Node: Right-click the Source node, then select Add Node. From the Cortex menu, choose the Cortex Function node.

 

5.png, Apr 2025

 

  • Edit the Cortex Function Node: Double-click the Cortex Function node, rename it to "CUSTOMER_REVIEWS", and choose the desired target storage location.

 

6.png, Apr 2025

 

  • Apply Sentiment Analysis: In the Transformation pane, add a new attribute named SENTIMENT_SCORE and set its data type to NUMBER(5,2). Then, from the Cortex Packages section on the right, choose Sentiment and select SENTIMENT_SCORE as the column name from the available options.

 

7.png, Apr 2025

  • Categorize Sentiments: Add another attribute named SENTIMENT_CATEGORY to classify sentiments based on their scores, and use the following CASE statement in the Transform pane (a SQL sketch of the full transformation appears after this step):

CASE
    WHEN SENTIMENT_SCORE > 0.5 THEN 'Positive'
    WHEN SENTIMENT_SCORE < -0.5 THEN 'Negative'
    ELSE 'Neutral'
END

 

8.png, Apr 2025
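
Under the hood, the work these two attributes do is roughly equivalent to the following query (a sketch only; the SQL Coalesce actually generates and the target object name depend on your node configuration):

WITH SCORED AS (
    SELECT
        REVIEW_ID,
        PRODUCT_NAME,
        REVIEW_TEXT,
        REVIEW_DATE,
        SNOWFLAKE.CORTEX.SENTIMENT(REVIEW_TEXT)::NUMBER(5,2) AS SENTIMENT_SCORE
    FROM TRAINING_DB.CUSTOMER_ANALYTICS.CUSTOMER_REVIEWS
)
SELECT
    *,
    CASE
        WHEN SENTIMENT_SCORE > 0.5 THEN 'Positive'
        WHEN SENTIMENT_SCORE < -0.5 THEN 'Negative'
        ELSE 'Neutral'
    END AS SENTIMENT_CATEGORY
FROM SCORED;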

  • Create a Target Node: Click the Create button to create the table in the specified target schema.

 

9.png, Apr 2025

 

  • Validate and Deploy: In Coalesce.io, validate the pipeline to check for errors. Once validation is complete, click the Run button to execute it. Coalesce then runs the node and populates the target table with a sentiment score and sentiment category for each review.

 

10.png, Apr 2025

Step 4: Analyze the Results

 

  • View Sentiment Scores: Query the REVIEW_SENTIMENT table in Snowflake:

 

11.png, Apr 2025
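
For example, a summary query along these lines shows how each product's reviews break down by sentiment (assuming the pipeline wrote its output to a REVIEW_SENTIMENT table; substitute your own target name):

SELECT
    PRODUCT_NAME,
    SENTIMENT_CATEGORY,
    COUNT(*) AS REVIEW_COUNT,
    ROUND(AVG(SENTIMENT_SCORE), 2) AS AVG_SCORE
FROM TRAINING_DB.CUSTOMER_ANALYTICS.REVIEW_SENTIMENT
GROUP BY PRODUCT_NAME, SENTIMENT_CATEGORY
ORDER BY PRODUCT_NAME, SENTIMENT_CATEGORY;

Products whose reviews skew toward the Negative category, or show a low average score, are the ones to flag for follow-up.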

Benefits of Using Coalesce.io

 

  • Minimal Coding Required: The drag-and-drop interface removes most manual SQL, making the pipeline accessible to non-technical users.
  • Rapid Development: Build and deploy the pipeline in under 30 minutes.
  • Scalability: Handle large datasets with Snowflake’s compute power.

Learning Outcomes

By completing this use case, you’ll:

 

  • Master the use of Snowflake Cortex’s SENTIMENT function for text analysis.
  • Gain hands-on experience building low-code pipelines in Coalesce.io.
  • Learn to interpret sentiment scores and translate them into business insights.
  • Practice creating scalable workflows for real-world applications.

Best Practices

 

  • Test Incrementally: Start with a subset of reviews (e.g., 100) to validate the pipeline before processing all 1,020 records.
  • Monitor Costs: Cortex functions use Snowflake credits. Use Snowflake cost tracking to optimize compute usage.
  • Version Control: Use Coalesce’s Git integration to manage pipeline versions.
  • Data Quality: Ensure REVIEW_TEXT is non-null to avoid errors in Cortex functions (a couple of example checks follow below).
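
The Test Incrementally and Data Quality points can both be handled with quick queries before a full run (sketches, using the table and column names from this use case):

-- Flag reviews that would trip up the Cortex call
SELECT COUNT(*) AS NULL_OR_EMPTY_REVIEWS
FROM TRAINING_DB.CUSTOMER_ANALYTICS.CUSTOMER_REVIEWS
WHERE REVIEW_TEXT IS NULL OR TRIM(REVIEW_TEXT) = '';

-- Score a small sample first to validate the logic and gauge credit usage
SELECT REVIEW_ID, SNOWFLAKE.CORTEX.SENTIMENT(REVIEW_TEXT) AS SENTIMENT_SCORE
FROM TRAINING_DB.CUSTOMER_ANALYTICS.CUSTOMER_REVIEWS
SAMPLE (100 ROWS);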

Next Steps

 

  • Expand the Dataset: Add more reviews or include additional products to test scalability.
  • Automate with Schedules: Set up Coalesce jobs to run the pipeline daily for new reviews.
  • Visualize Insights: Connect the REVIEW_SENTIMENT table to a BI tool like Tableau for interactive dashboards.

Conclusion

This practice use case demonstrates how Snowflake Cortex and Coalesce.io empower data teams to unlock customer insights through sentiment analysis. By leveraging Cortex’s AI capabilities and Coalesce’s low-code platform, you can build robust, scalable pipelines with minimal effort. Whether you’re a data engineer or a business analyst, this workflow offers a practical way to practice AI-driven analytics and drive business value.

Exploring Dynamic Metadata with Bind Variables in Snowflake SHOW Commands

Snowflake, a leading cloud data platform, continues to enhance its AI Data Cloud with powerful features for developers and data professionals. One of the latest additions, introduced as of April 2025, is Dynamic Metadata with Bind Variables in SHOW Commands. This feature revolutionizes how users  […]

Continue reading

Friday, April 18 2025

Exploring Snowflake Cortex COMPLETE Multimodal: Unlocking Insights from Text and Images

science-employment-slide.jpeg, Apr 2025

Introduction Snowflake Cortex, a fully managed AI and machine learning service within the Snowflake Data Cloud, has revolutionized how businesses analyze and derive insights from their data. With the introduction of the Cortex COMPLETE Multimodal function, now in public preview as of April 2025,  […]

Continue reading

Wednesday, March 26 2025

Snowpipe Streaming API: Real-Time Data Ingestion Made Simple in Snowflake

In today’s fast-moving world, waiting hours—or even minutes—for data to land in your warehouse can feel like an eternity. Whether you’re tracking customer  […]

Continue reading

Snowflake Data Classification: Making Sense of Your Data the Easy Way

What is Snowflake Data Classification? Picture this: You’ve got a giant filing cabinet stuffed with papers—some are receipts, some are letters, and some are secret plans. Sorting through it all by hand would take forever. Snowflake Data Classification is like a magic scanner that looks at every  […]

Continue reading

Understanding Secure Views in Snowflake: Protecting Your Data Made Simple

What Are Secure Views? Imagine you’re running a library. You’ve got shelves full of books—some are public novels anyone can read, but others are private journals only certain people should see. Now, suppose you want to let visitors browse a catalog of book titles without revealing the private stuff  […]

Continue reading

Friday, March 21 2025

Real-Time Error Alerts from Snowflake Stored Procedures to Slack

1.png, Mar 2025

Introduction In today’s data-driven world, ensuring the reliability of your data pipelines is critical. Snowflake, a powerful cloud data platform, allows you to automate complex workflows using stored procedures. But what happens when a procedure fails? Manually checking logs or waiting for someone  […]

Continue reading

Saturday, September 14 2024

Anomaly Detection using LAG function

STORE_SALE.png, Sep 2024

This article provides an in-depth look at the LAG function, covering its syntax, practical use cases, and how it can enhance your data analysis skills. Understanding the LAG Function The LAG function is an SQL window function that lets you retrieve data from a previous row in the same result set,  […]

Continue reading
