Exploring Snowflake’s Search Optimization Service

Snowflake initially made a name for itself as the easiest data warehouse to use back in 2014. Since then it has transformed itself and its core technology into the full Snowflake Data Cloud. While a Snowflake Data Cloud account comes with many amazing features by default, there are many areas where you can optimize Snowflake for your specific needs and use cases. As Snowflake has grown over the years, it has added a ton of functionality, including paid services such as Snowpipe, Materialized Views, Automatic Clustering, the Search Optimization Service, and others.


Today, let’s cover the Search Optimization Service. This service can improve the performance of certain queries, most notably point lookups that return only a small number of rows from a large table. Remember that it is only available on Enterprise Edition or higher (so Standard Edition users, you are out of luck: you will need to upgrade your account edition to use it). This service is best for business users who rely on quick access to data to make critical business decisions. Alternatively, it can be useful for data scientists who want to continuously explore specific subsets of data. Essentially, it is a maintenance service that runs in the background of Snowflake and builds search access paths. These paths let Snowflake locate the matching rows quickly, and the service keeps them up to date as the data in the table changes.


To turn on this feature, you must first make sure you are using a role that is allowed to add it to a table. That means holding the following privileges: OWNERSHIP on the table and ADD SEARCH OPTIMIZATION on the schema that contains the table. Once that requirement is met, it’s as simple as typing the following into your console:

ALTER TABLE [IF EXISTS] <table_name> ADD SEARCH OPTIMIZATION;


To confirm it is turned on, run SHOW TABLES and check that the SEARCH_OPTIMIZATION column says ON. Note that you WILL see an increase in credit consumption while the service runs and builds the search access paths. You can get an estimate of the cost for a specific table before committing by running the following command:


SELECT SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS('<table_name>');
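As a rough sketch, the steps so far could look like this for one table. The table name orders, the order_id column, and the lookup value are hypothetical placeholders for illustration, and the sketch runs the cost estimate first so you can decide before enabling anything:

```sql
-- Hypothetical table and column names; substitute your own.

-- 1. Estimate the build and maintenance cost before committing.
SELECT SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS('orders');

-- 2. Enable search optimization on the table.
ALTER TABLE IF EXISTS orders ADD SEARCH OPTIMIZATION;

-- 3. Check that the SEARCH_OPTIMIZATION column in the output reads ON.
SHOW TABLES LIKE 'orders';

-- 4. A point lookup of the kind the service is meant to speed up.
SELECT * FROM orders WHERE order_id = 12345;
```

The search access path is built in the background, so the lookup in step 4 may not get faster immediately after step 2.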


Being strategic about which tables you add to the search optimization service will go a long way toward reducing those costs. The service fits best for tables that aren’t clustered, or for tables that are frequently queried on columns other than their primary cluster key.


If you add the service and decide to remove it later on, you can easily do so with the correct privileges by running the following command:


ALTER TABLE [IF EXISTS] <table_name> DROP SEARCH OPTIMIZATION;


This is just one solution to make your life easier and your queries faster. However, there are many more out there that are more cost-friendly and do not require you to look thoroughly through your tables. One of the prime examples is Snoptimizer™, our service that scans for all the Snowflake anti-patterns and optimizes your account to help you run cost-effectively. It checks your resource monitors, auto-suspend settings, cloud services consumption, and warehouse compute, among other things, to fix your account and ensure you are fully optimized. If you are interested in getting a trial, you can sign up and explore more here.


Too Busy for the Snowflake Summit? We Feel You.

This Snowflake Summit was a roller coaster of emotions, but more often than not, we were thrilled with all the new announcements. With 61+ sessions, we got to see some of Snowflake’s amazing new features, tons of use cases, and firsthand looks at how to use their new tools with step-by-step labs. Most of us are too busy to watch two days’ worth of webinars, but that’s where we come in, providing you with your weekly dose of Snowflake Solutions! We decided to help out by highlighting the most important announcements, as well as the sessions we thought were really worth the watch!

This time around Snowflake announced that they have five main areas of innovation: data programmability, global data governance, platform optimization, connected industries, and powered by Snowflake. While magical upgrades and new tools mean more flexibility for users, the reality is that most of these new features are still in private preview, so we (the public) won’t see them in action for some time. Regardless, we’ll still go through the top areas of innovation:

 

Platform optimization

Perhaps one of the most important improvements made this year is the improved storage economics. With reduced storage costs as a result of improved data compression, many will start to see savings on storage for new data. Snowflake has also developed new usage dashboards which will allow users to better track and understand their usage and costs across the platform. Cost optimization on Snowflake has thus far been a tricky subject, and while it seems Snowflake is making progress in that direction, there aren’t enough guardrails to prevent warehouse sizes (and bills) from skyrocketing. If you want to learn about the 1000 ways your company can accidentally lose money on Snowflake (and ways to prevent it), join us to learn more about Cost Optimization here!

 

Global Data Governance

Next up on the list are the six new data governance capabilities now introduced to the Snowflake platform. We’ll deep dive into the coolest three!

  1. Classification: automatically detects personally identifiable information.

    1. Why is this cool? Once we know where the sensitive data lives, we can apply specific security controls to protect it!

  2. Row access policies: dynamically restrict the rows of data in the query based on the username, role, or other custom attributes.

    1. Why is this cool? We no longer need multiple secure views and can eliminate the need for maintaining data silos. That’s a win in our book.

  3. Access History: A new view that shows used and unused tables to produce reports.

    1. Why is this cool? You can see what’s actually bringing value and optimize storage costs based on which data is frequently accessed and which is completely abandoned. Who doesn’t love to save money?
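To make the row access policy idea above a little more concrete, here is a minimal sketch. The sales table, its region column, the region_role_map mapping table, and the SALES_ADMIN role are all hypothetical names made up for illustration, and the exact syntax available in your account may depend on the release you are on:

```sql
-- Hypothetical objects: a SALES table with a REGION column, and a
-- REGION_ROLE_MAP table listing which role may see which region.
CREATE OR REPLACE ROW ACCESS POLICY sales_region_policy
  AS (sales_region VARCHAR) RETURNS BOOLEAN ->
    CURRENT_ROLE() = 'SALES_ADMIN'
    OR EXISTS (
      SELECT 1
      FROM region_role_map m
      WHERE m.role_name = CURRENT_ROLE()
        AND m.region = sales_region
    );

-- Attach the policy to the table, binding it to the REGION column.
ALTER TABLE sales ADD ROW ACCESS POLICY sales_region_policy ON (region);
```

With a policy like this attached, the same SELECT against sales returns different rows depending on the role running it, which is what removes the need for one secure view per audience.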

 

Connected Industries

Next, we have two upcoming features that we thought were worth mentioning since they will definitely be game changers! These two features are Discover & Transact and Try Before You Buy, both of which will ease collaboration and data procurement between connected industries. While they are pretty self-explanatory, it’s been a long week, so let’s go over them in quick detail.

  1. Discover and Transact: Directly within the Snowflake Data Marketplace, a consumer can now discover data and purchase with a usage-based pricing model.

    1. Why is this cool? Self-service! Duh! This will definitely reduce the cost of selling and delivering data to clients.

  2. Try Before You Buy: Now consumers can access sample data to make sure they’re getting all they need before signing that check.

    1. Why is this cool? Who doesn’t like a free sample?

 

Data programmability

Probably the most important updates are under the data programmability umbrella, so if you’re still with me, hang on a little longer, this is about to get interesting!

There are some innovations that are ready to be used now in public preview, so let’s check them out:

  1. SQL API: This new API enables customers to automate administrative tasks without having to manage infrastructure. There’s no need to maintain an external API management hub!

  2. Schema Detection: Now supports Parquet, ORC, Avro, and hopefully more file formats in the future.
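As a small illustration of schema detection, the sketch below asks Snowflake to infer column definitions from staged Parquet files using the INFER_SCHEMA table function. The stage name my_stage, the events/ path, and the file format name my_parquet_format are hypothetical placeholders:

```sql
-- Hypothetical stage, path, and file format names; substitute your own.
CREATE OR REPLACE FILE FORMAT my_parquet_format TYPE = PARQUET;

-- Returns one row per detected column, including its name and data type.
SELECT *
FROM TABLE(
  INFER_SCHEMA(
    LOCATION => '@my_stage/events/',
    FILE_FORMAT => 'my_parquet_format'
  )
);
```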

The good stuff that’s coming soon!

  1. Serverless Tasks: Snowflake will determine and schedule the right amount of compute resources needed for your tasks.

  2. Snowpark and Java UDFs: Snowpark is going to be the Snowflake developer’s new playground, allowing developers to bring their preferred languages directly into the platform. Java UDFs will also enable data engineers and developers to bring their own custom code to Snowflake, enabling better performance on both sides.

  3. Unstructured Data Support: Soon, we will be able to treat unstructured data the same as structured data, with the ability to store, govern, process, and share.

  4. Machine Learning with Amazon SageMaker: A tool that will automatically build and insert the best machine learning models into Snowflake!

 

Of course, the Snowflake Conference held various webinars on each of these innovations, so if you’d like to learn more, head over to those respective recordings. Hot topics this time around were definitely data governance and ML, so here are our top videos worth watching!

Conclusion: Again, while we were slightly disappointed to see that most of Snowflake’s new features were still in private preview, it makes us all the more excited for what’s to come! As always, IT Strategists will continue to guide you with these upcoming tools, so stay tuned for more Snowflake Solutions!

Snowflake Data Marketplace Introduction

Introduction

Long gone are the days when consumers had to copy data, use APIs, or wait days, weeks, and sometimes even months to gain access to datasets. With Snowflake Data Marketplace, analysts around the world are getting the information they need to make important decisions for their businesses in the blink of an eye and in the palm of their hands.

So what is it and how does it work?

The Snowflake Data Marketplace is essentially a home to a variety of live, ready-to-query data. It utilizes Snowflake Secure Data Sharing to connect providers of data with consumers, as of now providing access to 229 datasets. As a consumer, you can discover and access a variety of third-party data and have those datasets available directly in your Snowflake account to query. There is no need for transformation, and joining the shared data with your own takes only a few minutes. If you need to use several different vendors for data sourcing, the Data Marketplace gives you a single location from which to get the data.
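As a rough sketch of what “joining it with your own data” can look like, assume a Marketplace listing has already been added to your account as a database. Every name below (weather_share, daily_weather, my_db.public.store_sales, and the columns) is a hypothetical placeholder, not a real listing:

```sql
-- Hypothetical names: WEATHER_SHARE is a database created from a
-- Marketplace listing; MY_DB.PUBLIC.STORE_SALES is your own table.
SELECT s.store_id,
       s.sale_date,
       s.total_sales,
       w.avg_temperature
FROM my_db.public.store_sales AS s
JOIN weather_share.public.daily_weather AS w
  ON w.postal_code = s.postal_code
 AND w.observation_date = s.sale_date;
```

The shared data is queried in place, so there is no copy step or load job before the join.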

Why is this so amazing?

Companies can finally securely provide and consume live, governed data in real time without having to copy and move data. In the past, access to such information could take days, weeks, months, or even years. With the Data Marketplace, gaining access only takes a couple of minutes. Already over 2000 businesses have requested access to essential data sets available free of charge in the marketplace. This is a gold mine for anyone who wants to make data-driven decisions using live, ready-to-query data, and the best part is that it is globally available, across clouds.

There are many benefits for providers and consumers alike. There are three main points, however, that allow companies to unlock their true potential when using the Data Marketplace.

Source Data Faster and More Easily

  • As we said above, using Snowflake Data Marketplace as a consumer allows users to avoid the risk and hassle of having to copy and move stale data. Instead, you can securely access live and governed shared data sets and receive automatic updates in real time.

Monetize Your Own Data

  • As a provider, you can create new revenue streams by joining Snowflake Data Marketplace to market your own governed data assets to potentially thousands of Snowflake data consumers.

Reduce Analytics Costs

  • Using this service, both consumers and providers can virtually eliminate the costs and effort associated with the traditional ETL processes of data ingestion, data pipelines and transformation thanks to direct, secure, and governed access from your Snowflake account to live and ready-to-query shared data.

For more information, watch the video below or visit https://www.snowflake.com/data-marketplace/