Democratizing AI: A Practical Guide to Snowflake Cortex

Introduction

Let’s face it: in today’s data-driven world, AI and ML aren’t just buzzwords anymore; they’re essential for staying competitive. But the ‘elephant in the room’ has always been complexity and cost. Setting up and managing AI/ML initiatives used to require a team of specialists and a hefty investment in infrastructure. That’s where Snowflake Cortex changes the game: it provides a smooth glide path for companies to finally get serious about AI/ML, right within the Snowflake Data Cloud. This isn’t about replacing data scientists; it’s about empowering everyone, from business analysts to SQL developers, to put AI/ML to work.

Let’s take a closer look at what it can do, walk through real-world use cases across various industries, and, most importantly, work through some practical, hands-on code examples. My goal is to show you how Cortex makes AI/ML accessible to everyone, not just data scientists.

What is Snowflake Cortex?

Snowflake Cortex is basically a built-in AI/ML toolkit within Snowflake. It gets rid of the hassle of integrating with external AI/ML platforms, letting you use AI/ML directly on your data in the familiar Snowflake environment. This tight integration is a huge advantage: it eliminates data movement (it brings compute to the data), simplifies data governance, and improves performance.

Here’s the gist of what it offers:

  • Pre-built ML functions: Think of these as ready-to-use tools for common tasks like forecasting, spotting anomalies, and classification. You can use them with simple SQL queries. I recently used Cortex ML Classification, without much coding, with great results.
  • Large Language Models (LLMs): Cortex gives you access to powerful LLMs for things like summarizing text, translating languages, and even answering questions based on text data. For example, you can run sentiment analysis on a long customer grievance email, a piece of feedback, or a survey response in a jiffy.
  • AI/ML workflows: Cortex streamlines the whole AI/ML process, from prepping your data to deploying and monitoring your models, taking away the complexity involved in managing data pipelines and model infrastructure.
  • Snowflake ecosystem integration: It plays nicely with other Snowflake services like Snowpark and the Data Marketplace, so you can build complete data and AI/ML solutions. This allows for seamless data sharing and collaboration, further enhancing the power of the Snowflake platform.


Use Cases

Cortex is surprisingly versatile. There are numerous approaches to solving business problems with AI/ML, and Cortex provides several paths. Let’s explore some real-world examples across different industry sectors.

The retail sector relies heavily on predicting demand for specific products based on past sales, promotions, and even external factors like weather. Imagine being able to fine-tune your inventory based on accurate forecasts, minimizing waste and maximizing profits.

Likewise, in financial services, detecting fraudulent transactions in real time is a major risk-management concern. Cortex gives you options here, from pre-built anomaly detection to custom models for more complex pattern recognition. It can also help with risk assessment, such as determining the creditworthiness of loan applicants more accurately, and with predicting customer churn, allowing banks to proactively retain valuable customers.

Similarly, healthcare depends on patient risk stratification, that is, identifying patients at high risk of developing certain diseases based on their medical history, which is an ideal use case for Cortex. Cortex can also accelerate drug discovery by analyzing vast datasets of biological and chemical data, potentially leading to faster development of new treatments.

Example 1

If you know SQL, you’re already halfway there, as Cortex provides a natural segue into AI/ML. Let’s start with a practical example, forecasting, with a focus on the importance of time series data. Assume that you have the sales data below in a Snowflake table named daily_sales:

sale_date | product_id | sales_amount
2023-01-01 | 1 | 150.00
2023-01-02 | 1 | 175.00
2023-01-03 | 1 | 200.00
2023-01-04 | 1 | 180.00
2023-01-05 | 1 | 220.00
2023-01-06 | 1 | 250.00
2023-01-07 | 1 | 230.00
2023-01-08 | 1 | 260.00
2023-01-09 | 1 | 240.00
2023-01-10 | 1 | 270.00
2023-01-24 | 1 | 280.00
2023-01-25 | 1 | 300.00
2023-01-26 | 1 | 320.00
2023-01-27 | 1 | 350.00
2023-01-28 | 1 | 290.00
2023-01-29 | 1 | 310.00
2023-01-30 | 1 | 280.00
2023-01-31 | 1 | 320.00

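Now, let’s use the Cortex ML forecasting function (SNOWFLAKE.ML.FORECAST) to predict sales for the next 7 days for product_id = 1. You train a model on the history first, then call it to get the forecast:

-- Inspect the training data for product 1
SELECT sale_date, sales_amount FROM daily_sales WHERE product_id = 1 ORDER BY sale_date;

-- Train a forecast model on that series (the date is cast to TIMESTAMP_NTZ)
CREATE OR REPLACE SNOWFLAKE.ML.FORECAST sales_forecast(
  INPUT_DATA => SYSTEM$QUERY_REFERENCE('SELECT sale_date::TIMESTAMP_NTZ AS sale_ts, sales_amount FROM daily_sales WHERE product_id = 1'),
  TIMESTAMP_COLNAME => 'sale_ts',
  TARGET_COLNAME => 'sales_amount');

-- Predict the next 7 days
CALL sales_forecast!FORECAST(FORECASTING_PERIODS => 7);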

This will give you the forecasted sales for the next 7 days. Cortex supports a fail-fast strategy, letting you experiment and iterate quickly. If your initial models aren’t perfect (and they rarely are!), you can pivot to other methods available through Snowpark and integrate them with Cortex. You might, for example, incorporate external data like weather patterns or promotional campaigns to improve forecast accuracy. You can use more advanced techniques like time series decomposition to identify trends, seasonality, and residual noise in your data, which can further improve forecast accuracy.
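To make the external-data idea concrete, here is a minimal sketch of how a promotion flag could be fed to the forecast as an extra feature. It assumes a hypothetical promo_calendar table (promo_date, on_promotion) that covers both the historical dates and the 7 future dates; the view and model names are also made up for illustration. With SNOWFLAKE.ML.FORECAST, extra columns in the training data are treated as features, and future values for those features have to be supplied at forecast time.

```sql
-- Training view: history for product 1 plus a promotion flag.
-- promo_calendar(promo_date DATE, on_promotion NUMBER) is a hypothetical table.
CREATE OR REPLACE VIEW sales_with_promo AS
SELECT s.sale_date::TIMESTAMP_NTZ AS sale_ts,
       s.sales_amount,
       p.on_promotion
FROM daily_sales s
JOIN promo_calendar p ON p.promo_date = s.sale_date
WHERE s.product_id = 1;

-- Any column other than the timestamp and the target is used as a feature.
CREATE OR REPLACE SNOWFLAKE.ML.FORECAST sales_forecast_promo(
  INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'sales_with_promo'),
  TIMESTAMP_COLNAME => 'sale_ts',
  TARGET_COLNAME => 'sales_amount'
);

-- Future feature values for the 7 days to predict (no target column here).
CREATE OR REPLACE VIEW future_promo AS
SELECT promo_date::TIMESTAMP_NTZ AS sale_ts,
       on_promotion
FROM promo_calendar
WHERE promo_date BETWEEN '2023-02-01' AND '2023-02-07';

-- With features, the forecast horizon comes from the rows in INPUT_DATA.
CALL sales_forecast_promo!FORECAST(
  INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'future_promo'),
  TIMESTAMP_COLNAME => 'sale_ts'
);
```

If the flag turns out not to help, calling sales_forecast_promo!EXPLAIN_FEATURE_IMPORTANCE() is one way to check how much weight the model gave it before you pivot to something else.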

Example 2

Let’s look at another example: Sentiment Analysis. This time, we’ll use the SENTIMENT function to analyze customer reviews, but we’ll also explore how to aggregate, contextualize, and visualize the results in order to understand customer perception for your products or services.

Below is a tabular view of some unstructured review text, stored in a customer_reviews table:

review_id | review_text | product_category
1 | This phone has an amazing camera! I love it! | Phone
2 | The tablet is okay, nothing special, the battery life is average. | Tablet
3 | I am extremely disappointed. The laptop was broken on arrival and the customer service was terrible. | Laptop
4 | This is the best purchase I have made in a long time! The headphones are great. | Headphones
5 | The streaming device is good, but the shipping was slow. | Streaming Device
6 | I’m not impressed. The smartwatch didn’t meet my expectations. The battery life is very short. | Smartwatch

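-- Score each review; SENTIMENT returns a number from -1 (negative) to 1 (positive)
SELECT review_text, SNOWFLAKE.CORTEX.SENTIMENT(review_text) AS review_sentiment, product_category
FROM customer_reviews;

-- Bucket the scores (the 0.3 thresholds are an assumption; tune them for your data),
-- then count reviews per category and bucket
SELECT product_category, review_sentiment, COUNT(*) AS review_count
FROM (SELECT product_category,
             CASE WHEN score > 0.3 THEN 'positive' WHEN score < -0.3 THEN 'negative' ELSE 'neutral' END AS review_sentiment
      FROM (SELECT product_category, SNOWFLAKE.CORTEX.SENTIMENT(review_text) AS score FROM customer_reviews))
GROUP BY product_category, review_sentiment
ORDER BY product_category, review_sentiment;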


This gives you a count of positive, negative, and neutral reviews for each product category, which you can then visualize in a dashboard or report. Here, a simple Cortex LLM function lets you identify areas for improvement within specific product lines.
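Since SNOWFLAKE.CORTEX.SENTIMENT returns a numeric score between -1 and 1, another way to put the results in context is to average the score per category and pull out the lowest-scoring reviews. A minimal sketch, reusing the customer_reviews table above (the -0.5 cutoff is an arbitrary assumption, not a Cortex default):

```sql
-- Average sentiment per product category, lowest first
SELECT product_category,
       ROUND(AVG(SNOWFLAKE.CORTEX.SENTIMENT(review_text)), 2) AS avg_sentiment,
       COUNT(*) AS review_count
FROM customer_reviews
GROUP BY product_category
ORDER BY avg_sentiment;

-- Reviews that look strongly negative (cutoff chosen for illustration only)
SELECT review_id, product_category, review_text
FROM customer_reviews
WHERE SNOWFLAKE.CORTEX.SENTIMENT(review_text) < -0.5;
```

The first query is the kind of result that drops straight into a bar chart; the second gives you the actual reviews to read when a category scores poorly.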

Things to Consider

Even though Cortex simplifies things, there are still some important things to keep in mind:

  • Data Quality Is Key: You’ve got to have ‘skin in the game’. Bad data gives bad models. Garbage in, garbage out, as they say. Spend time cleaning, validating, and preparing your data. This includes handling missing values and outliers and checking data consistency.
  • Choose The Right Tools for The Job: Cortex offers a range of functions and LLMs, so pick the ones that fit your specific needs. Don’t use a ‘hammer to screw in a lightbulb’, right?
  • Don’t Build & Forget: Don’t just build a model and walk away; evaluate its performance. The key is ‘progress, not perfection’. Regularly monitor your models, track their accuracy using appropriate metrics, and retrain them as needed with new data.
  • Think About Explainability: You need to understand why your model is making certain predictions. This is important for building trust and ensuring fairness. ‘Black-box’ models are problematic, so strive for transparency.
  • Be Responsible: ‘Nip any potential biases in the bud’ during data preparation. Biased data leads to unfair outcomes, so it’s essential to address this early on. For example, if the training data for a loan application model overrepresents a certain demographic, the model might unfairly discriminate against other groups.
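As a rough illustration of the data-quality point, a couple of plain SQL checks can be run against the daily_sales table from Example 1 before any model is trained. This is only a sketch: it assumes sale_date is a DATE column and that the series is meant to have one row per day.

```sql
-- 1. Missing or non-positive values in the target column
SELECT COUNT(*) AS total_rows,
       COUNT_IF(sales_amount IS NULL) AS null_amounts,
       COUNT_IF(sales_amount <= 0) AS non_positive_amounts
FROM daily_sales
WHERE product_id = 1;

-- 2. Gaps in the daily series: consecutive rows more than one day apart
SELECT prev_date,
       sale_date,
       DATEDIFF(day, prev_date, sale_date) - 1 AS missing_days
FROM (
  SELECT sale_date,
         LAG(sale_date) OVER (ORDER BY sale_date) AS prev_date
  FROM daily_sales
  WHERE product_id = 1
)
WHERE DATEDIFF(day, prev_date, sale_date) > 1;
```

On the sample data from Example 1, the second query flags the jump from 2023-01-10 to 2023-01-24, which is 13 missing days and worth understanding before you trust a forecast. For the monitoring point, models created with SNOWFLAKE.ML.FORECAST can report their own evaluation metrics through a SHOW_EVALUATION_METRICS() call on the model.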

Example 3

Let’s see one last example of a Cortex LLM function for working with text data. This is where you can truly unlock the value hidden within unstructured data like customer reviews, social media posts, and product descriptions. Imagine you have a product_descriptions table with fairly varied descriptions:

product_id | description
1 | This high-performance laptop boasts a powerful processor, stunning display, and long battery life. Perfect for professionals and gamers alike. It also features a backlit keyboard and a fingerprint reader for added security.
2 | A comfortable and stylish office chair with adjustable lumbar support, armrests, and headrest. Ideal for long workdays and promoting good posture. Available in black and gray.
3 | A lightweight and durable hiking backpack with multiple compartments, a hydration system, and a rain cover. Perfect for outdoor adventures and multi-day trips. Made from recycled materials.

You can use the SUMMARIZE function to get a concise summary of these descriptions:

SELECT description, SNOWFLAKE.CORTEX.SUMMARIZE(description) AS summary FROM product_descriptions;

This gives you a shorter version of each description, which is useful for displaying information in limited spaces or quickly grasping the key features of a product. You can also use LLMs for other tasks, like translating text or answering questions based on it.
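Two of the other LLM tasks mentioned earlier, translation and question answering, can be sketched against the same product_descriptions table. The target language and the question below are arbitrary choices for illustration, and which LLM functions are available can depend on your Snowflake region.

```sql
-- Translate each description from English to German
SELECT product_id,
       SNOWFLAKE.CORTEX.TRANSLATE(description, 'en', 'de') AS description_de
FROM product_descriptions;

-- Ask a question of each description; the result is an array of
-- candidate answers, each with a confidence score
SELECT product_id,
       SNOWFLAKE.CORTEX.EXTRACT_ANSWER(description, 'What colors is it available in?') AS answer
FROM product_descriptions;
```

Only the office chair description mentions colors, so low scores on the other rows are a useful reminder to check the confidence value before using an extracted answer.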

This is just the beginning. The ‘north star’ for Cortex is to make AI/ML accessible to everyone working with data, democratizing this powerful technology and putting it in the hands of more users. I’m excited to see what the future holds, with things like automated ML (AutoML), even more advanced LLMs with improved reasoning and contextual understanding, deep learning support for tackling more complex AI problems, and perhaps even bringing AI to the edge for real-time processing of data closer to the source.

Conclusion

Cortex enables a ‘fail-fast strategy’ for exploring the possibilities of AI/ML. You can quickly experiment with different functions and models, learn what works best for your data and your specific business needs, and pivot as needed based on the results you observe. And remember, it’s about ‘progress, not perfection’. The key is to start experimenting, learn from your results, and continuously improve your models over time. 

Snowflake Cortex is more than just a set of tools; it’s a platform for democratizing AI/ML and empowering businesses to unlock the full potential of their data. It removes many barriers to entry, making it easier than ever to integrate AI/ML into your workflows. So, if you’re looking to take your data analysis to the next level and explore the exciting world of AI/ML, I highly recommend exploring Snowflake Cortex. It’s a game-changer, and it’s here to stay.

Ready to unlock the power of Snowflake with Hoonartek? As a Premier Snowflake Service Partner with a proven track record of successful implementations, Hoonartek brings a unique blend of expertise and innovation to your data journey. Our team of 50+ certified Snowflake consultants possesses in-depth knowledge of the platform and best practices. Contact Hoonartek today to schedule a free consultation and discover how we can help you achieve your data-driven goals. Visit: https://hoonartek.com/partners/snowflake/


Shaurya Agrawal

Known for his leadership in executing large-scale, data-driven initiatives, Shaurya has successfully delivered projects worth over $100 million for global clients, including renowned organizations like Bank of America, HP, and Toyota, through top consulting firms like KPMG and Capgemini. Based in Austin, Texas, Shaurya holds a Mechanical Engineering degree and an MBA from The University of Texas at Austin, where he is also pursuing a Master’s in Data Science.
