Introduction

Snowflake Cortex, a fully managed AI and machine learning service within the Snowflake Data Cloud, has revolutionized how businesses analyze and derive insights from their data. With the introduction of the Cortex COMPLETE Multimodal function, now in public preview as of April 2025, Snowflake empowers users to process both text and images seamlessly using a single SQL command. This blog explores the capabilities, use cases, and benefits of the COMPLETE Multimodal function, showcasing how it simplifies multimodal data analysis and drives faster, deeper insights.

What is Snowflake Cortex COMPLETE Multimodal?

The Cortex COMPLETE Multimodal function is a powerful, instruction-following feature that leverages advanced multimodal large language models (LLMs) to process text, images, and unstructured data within Snowflake. Unlike traditional Cortex LLM functions focused solely on text, the multimodal capability allows users to analyze visual content, such as charts, diagrams, or documents, alongside textual prompts. This eliminates the need to move data to external platforms, ensuring security, governance, and efficiency.

Key features include:

  • Support for Multiple Models: Models such as Anthropic’s Claude 3.5 Sonnet offer varying capabilities, latency, and cost, allowing users to choose based on task complexity.
  • Image and Text Processing: Process single or multiple images (up to 100 with the PROMPT helper function) alongside text prompts.
  • SQL-Based Interface: Execute complex AI tasks using simple SQL queries, making it accessible to data analysts and engineers.
  • Secure Data Handling: Data remains within Snowflake’s secure perimeter, adhering to enterprise-grade governance and privacy standards.

Why Multimodal Matters

In today’s data-driven world, businesses generate vast amounts of unstructured data, including images, PDFs, and text. Traditionally, analyzing such data required multiple tools, complex integrations, and data movement, leading to inefficiencies and security risks. Cortex COMPLETE Multimodal addresses these challenges by:

  • Simplifying Workflows: No need to hop between platforms for text and image analysis.
  • Enhancing Insights: Combine visual and textual data to uncover richer, context-aware insights.
  • Reducing Costs: Pay-per-use pricing and serverless architecture make advanced AI affordable.

For example, a retailer can analyze customer feedback (text) alongside product images to assess sentiment and quality, all within Snowflake.
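
As a rough sketch of that pattern, a single query could send both the review text and the product photo to the model. The product_reviews table, its review_id, review_text, and image_filename columns, and the @product_images stage below are hypothetical, and this assumes the PROMPT helper accepts text expressions alongside TO_FILE references:

-- Hypothetical table, columns, and stage names; adjust to your own schema.
SELECT
    review_id,
    SNOWFLAKE.CORTEX.COMPLETE(
        'claude-3-5-sonnet',
        PROMPT(
            'Customer review: {0}. Based on this review and the product photo {1}, summarize the sentiment and note any visible quality issues in one sentence.',
            review_text,
            TO_FILE('@product_images', image_filename)
        )
    ) AS assessment
FROM product_reviews;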

How It Works

The COMPLETE Multimodal function processes inputs through a straightforward SQL syntax. Users specify:

  • Model: Choose a multimodal model (e.g., claude-3-5-sonnet).
  • Prompt: Define the task, such as summarizing a chart or comparing images.
  • Image Path: Reference image files stored in a Snowflake stage (internal or external, e.g., Amazon S3).

Example: Summarizing a Pie Chart

Consider a pie chart (science-employment-slide.jpeg) showing the 2023 distribution of occupations in which mathematics is critical. The following SQL query summarizes the chart’s insights:

SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'claude-3-5-sonnet',
    'Summarize the insights from this pie chart in 100 words',
    TO_FILE('@myimages', 'science-employment-slide.jpeg')
);

[Image: science-employment-slide.jpeg, a pie chart of the 2023 distribution of occupations in which mathematics is critical]

Response:

[Screenshot: model response summarizing the pie chart]

This example demonstrates how Cortex COMPLETE Multimodal extracts meaningful insights from visual data with minimal effort.

Example: Comparing Ad Creatives

To compare two ad images (adcreative_1.png and adcreative_2.png) and identify their ideal audiences:

[Image: adcreative_1.png]

[Image: adcreative_2.png]

SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'claude-3-5-sonnet',
    PROMPT(
        'Compare this image {0} to this image {1} and describe the ideal audience for each in two concise bullets no longer than 10 words',
        TO_FILE('@myimages', 'adcreative_1.png'),
        TO_FILE('@myimages', 'adcreative_2.png')
    )
);

Response:

[Screenshot: model response comparing the two ad creatives]

This showcases the function’s ability to handle multiple images and deliver targeted insights.

Setting Up for Success

To use Cortex COMPLETE Multimodal, ensure the following:

  1. Stage Creation: Store images in a Snowflake stage with server-side encryption and a directory table enabled; the examples above reference a stage named myimages. Example:

    CREATE OR REPLACE STAGE myimages
        DIRECTORY = ( ENABLE = true )
        ENCRYPTION = ( TYPE = 'SNOWFLAKE_SSE' );
    
  2. Permissions: Grant the SNOWFLAKE.CORTEX_USER database role to users so they can call Cortex LLM functions (see the sketch after this list).

  3. Region Availability: Check Snowflake documentation for supported regions, as multimodal features are region-specific.
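
As a minimal sketch of steps 1 and 2, the commands below upload a local image into the stage (PUT is run from a client such as SnowSQL) and grant Cortex access to a placeholder role named data_analyst. By default, SNOWFLAKE.CORTEX_USER is already granted to PUBLIC, so the explicit grant is only needed if that default has been revoked:

-- Step 1 (continued): upload a local image into the stage; the local path is illustrative.
PUT file:///tmp/science-employment-slide.jpeg @myimages AUTO_COMPRESS = FALSE;

-- Step 2: grant Cortex LLM function access; data_analyst is a placeholder role name.
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE data_analyst;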

Use Cases

Cortex COMPLETE Multimodal unlocks a wide range of applications:

  • Business Intelligence: Summarize charts or dashboards for quick decision-making.
  • Marketing: Analyze ad creatives or social media visuals to optimize campaigns.
  • Document Analysis: Extract insights from scanned documents or invoices (see the sketch after this list).
  • E-Commerce: Combine product images and reviews for quality assurance.
  • Healthcare: Analyze medical imaging alongside patient notes for diagnostics.
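
The document-analysis use case, for instance, follows the same COMPLETE pattern shown earlier. The sketch below asks for structured fields from a scanned invoice; the @invoice_stage stage and invoice_0042.png file are hypothetical names:

SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'claude-3-5-sonnet',
    'Extract the vendor name, invoice number, invoice date, and total amount from this invoice and return them as JSON',
    TO_FILE('@invoice_stage', 'invoice_0042.png')
);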

Benefits

  • Efficiency: Process multimodal data in seconds without external tools.
  • Scalability: Leverage Snowflake’s architecture for large-scale workloads.
  • Security: Keep data within Snowflake’s governance boundary.
  • Accessibility: SQL-based interface democratizes AI for non-experts.

Cost Considerations

Cortex COMPLETE Multimodal incurs compute costs based on tokens processed (input and output). Costs vary by model, with detailed pricing in Snowflake’s Service Consumption Table. Use the COUNT_TOKENS function to estimate token usage and manage budgets effectively.
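
For example, a quick pre-flight check of the text portion of a prompt might look like the sketch below. COUNT_TOKENS measures text input only, so consult the documentation for how image inputs are metered and to confirm the model name is supported:

-- Estimate input tokens for the text prompt (does not include image tokens).
SELECT SNOWFLAKE.CORTEX.COUNT_TOKENS(
    'claude-3-5-sonnet',
    'Summarize the insights from this pie chart in 100 words'
);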

Conclusion

Snowflake Cortex COMPLETE Multimodal is a game-changer for businesses seeking to harness the power of text and image data. By integrating advanced multimodal LLMs into Snowflake’s secure, scalable platform, it simplifies complex AI tasks and delivers actionable insights with ease. Whether you’re summarizing charts, comparing visuals, or analyzing documents, this function empowers data teams to work smarter, not harder.