A Deep Dive into Data Sharing

Introduction:

Big data refers to extremely large datasets that can be analyzed to identify patterns, trends, and associations. The analysis of big data provides insights into various fields, including business, science, and government. However, the challenge with big data is not just analyzing it, but also storing, managing, and sharing it. This is where technologies like Snowflake come into play, as they offer a secure platform for storing and sharing large amounts of data.

 

Part 1: What is Data Sharing?

Let’s begin with data itself. Data often comes from the software enterprises use to run their business, for example, how many people are viewing a website, or what kinds of people are most interested in a certain brand. At its simplest, data sharing means making data resources available to many users or applications while ensuring data fidelity for everyone participating.

Why is this relevant today? Data sources now generate data continuously, which means data volumes keep growing. The main focus of data sharing has become how to move these increasing volumes of data while keeping the data accurate and secure. The cloud expands what data sharing is capable of: with the modern cloud, organizations can share live data inside and outside the business, eliminate data silos, grant access to specific data sets, and more. Doing this well requires a platform that can put data sharing into motion and ensure it works to its full potential, and this is where Snowflake comes into the picture.

 

Snowflake and Data Sharing

Snowflake enables data collaboration while lowering costs. It gives organizations the ability to securely share data and access live data. Access to shared data is secured and governed, and you can also publish data sets of your own. The possibilities seem endless, but that’s only a brief preview of what data sharing in Snowflake can do, so let’s take a deeper look at the parts that play a role in Snowflake data sharing and how they come together.

 

Part 2: What are Data Providers and Consumers?

A data provider is an account in Snowflake that creates shares that can be accessed by other accounts in Snowflake. When a database is shared, Snowflake supports it through grants that allow access control to objects within the database. There are no restrictions on the number of shares that can be created or accounts that can be added to a share.

A data consumer is an account that creates a database from a Share that is made accessible by another data provider. When you add a shared database to your account, you can access and query the objects within it. There are no limitations on how many Shares you can consume from data providers, but you can only create one database for each Share.

 

What is a Share?

In Snowflake, Shares are objects that contain all the information necessary to share a database. A Share includes the privileges that grant access to the database and the schema containing the objects to be shared, the privileges that grant access to the specific objects themselves, and the consumer accounts with which the database and its objects are shared.

When a database is created from a Share, the objects shared within it become available to any users within the consumer account. These Shares can be customized, are secure, and are fully controlled by the provider account. This allows objects added to a Share to be accessed in real-time by consumers, and the provider account can also rescind access to a Share or any of its objects.

 

Part 3: How does Secure Data Sharing work in Snowflake?

When securely sharing data, the data is not copied or transferred between accounts, as one might assume. Rather, sharing is accomplished through Snowflake’s services layer and metadata store. As a result, shared data does not occupy storage space within a consumer account, and therefore does not contribute to monthly data storage costs. However, charges will be incurred for the compute resources required to query the shared data.

Going back to what was previously mentioned, because the data itself is not copied or exchanged it makes secure data sharing an easy and fast setup for providers and it also makes shared data quickly available to consumers. But let’s take a closer look at how data sharing works for both the provider and the consumer:

 

Provider:

As a provider, you create a share of a database within your account and then grant access to specific objects within that database. You can share data from multiple databases, as long as those databases belong to the same account. Finally, you add one or more accounts to the share, including any other accounts you may have within Snowflake.
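To make this concrete, here is a minimal provider-side sketch, assuming a hypothetical database SALES_DB, table PUBLIC.ORDERS, and consumer account XY12345 (all names are illustrative):

CREATE SHARE sales_share;
GRANT USAGE ON DATABASE sales_db TO SHARE sales_share;
GRANT USAGE ON SCHEMA sales_db.public TO SHARE sales_share;
GRANT SELECT ON TABLE sales_db.public.orders TO SHARE sales_share;
ALTER SHARE sales_share ADD ACCOUNTS = xy12345;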

 

Consumer:

As a consumer, you create a read-only database from the Share. You can then control access to that database using the same role-based access control you use for other objects.
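A minimal consumer-side sketch, assuming the provider account identifier and share name from the example above (again, all names are illustrative):

SHOW SHARES;
CREATE DATABASE shared_sales FROM SHARE provider_account.sales_share;
GRANT IMPORTED PRIVILEGES ON DATABASE shared_sales TO ROLE analyst_role;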

The structure of Snowflake allows providers to share data with many consumers, even those within their organization. Consumers can access shared data from many providers.

 

What Information is shared with Providers?

Snowflake providers have access to certain information about consumers who access their data. This includes the consumer's Snowflake account and organization names. Providers can also view statistical data about data consumption, such as the date of consumption and the number of queries generated by the consumer account on a provider's Share.

In addition, providers can see any information that a consumer provides at the time of data request submissions, such as the consumer's business email and company name.

 

Can I share with Third Parties?

Sharing data is only possible between Snowflake accounts. However, if you're a provider within Snowflake, you may want to share data with a consumer outside of Snowflake. Luckily, Snowflake has created reader accounts to facilitate this process.

Reader accounts enable data to be shared with consumers who are not Snowflake customers without the need for them to become one. These accounts are owned by the provider account that created them. While the provider account uses Shares to share databases with reader accounts, the reader account can only receive data from the provider account that created it.

Users with a reader account can query shared data, but they are unable to perform DML tasks that are available in a full account.
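Reader accounts are created and managed by the provider. A minimal sketch of creating one, with a hypothetical account name and admin credentials:

CREATE MANAGED ACCOUNT reader_acct1
  ADMIN_NAME = reader_admin,
  ADMIN_PASSWORD = 'ChangeMe123!',
  TYPE = READER;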

Having introduced data sharing and its workings within Snowflake, let's explore other features that come with Snowflake's data sharing.

 

Part 4: Products that use Secure Data Sharing in Snowflake

Snowflake offers additional products that enable data sharing between providers and consumers. These products include Direct Share, Snowflake Data Marketplace, and Data Exchange.

 

Direct Share:

Direct Share is a simple method of sharing data that enables account-to-account data sharing while utilizing Snowflake's Secure Data Sharing. As the provider (account on Snowflake), you can grant access to your data to other companies, allowing them to view your data within their Snowflake account without the need to move or copy any data.

 

Snowflake Data Marketplace:

All accounts in Snowflake can access the Snowflake Data Marketplace, provided they are in non-VPS regions on supported cloud platforms. The Data Marketplace uses Snowflake's Secure Data Sharing to facilitate connections between providers and consumers, similar to the Direct Share product.

You have the option to access third-party data and import the datasets into your Snowflake account without the need for transformation. This allows you to easily combine it with your existing data. The Data Marketplace provides a central location to obtain data from multiple sellers, simplifying the process of data sourcing.

Additionally, becoming a provider and publishing data within the Data Marketplace is a great way to monetize your data and reach a wider audience.

 

Data Exchange:

Data Exchange enables secure collaboration around data between invited groups, allowing providers to share data with consumers across your entire organization, including customers and partners, or just within your own unit. It also gives you control over who has access to your data and who can publish, consume, or simply view it. Specifically, you can invite others and determine whether they are authorized to provide or consume data. Data Exchange is available for all Snowflake accounts hosted in non-VPS regions on supported cloud platforms.

These three products in Snowflake that use secure data sharing are useful for both provider and consumer accounts (and more) within Snowflake. Now that we have seen how data sharing works and what other features use data sharing in Snowflake, let's take a look at how to use the data that was shared with you or your data that is shared with others and more.

 

Working with Shared Data:

Once you have a grasp of the fundamentals of direct share, Snowflake Marketplace, and data exchange, there are additional concepts and tools available for you to explore.

Within Snowflake, those with an ACCOUNTADMIN role can utilize the Shared Data page on the new web interface to manage and create shares. As we delve further, please note that "inbound" refers to data that has been shared with you, while "outbound" refers to data shared from your account.

 

Data Shared with You:

Provider accounts can share inbound shares with your account using Direct Share, Data Exchange, or the Snowflake Marketplace. Inbound shares allow you to view data shared by providers, including who provided the share and how the data was shared. You can also create a database from a share.

To access your inbound shares, go to the "Share With Me" tab within the Snowflake web interface. Here you will find:

  • Direct shares that are shared with you. These shares are placed into two groups: 1. direct shares that are ready to be used, and 2. direct shares that have already been imported into a database and can be queried.
  • Listings for data exchange that you have access to. The data is shown under the name of the initial data exchange. If you have more than one data exchange, each data exchange will be shown within separate sections.
  • Listings for the Snowflake Marketplace data that have been moved into a database and can be queried. However, it does not show shares that are ready to be used. You can find the data listing in the Marketplace menu.
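If you prefer working from a worksheet instead of the web interface, the same inbound shares can be inspected with SQL; a minimal sketch (the provider account and share names are hypothetical):

SHOW SHARES;
DESC SHARE provider_account.sales_share;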

 

Data You Shared:

Your account allows you to share data with consumers through outbound shares. You can share data directly, through data exchange, or via the Snowflake Marketplace (as previously mentioned for inbound shares).

With outbound shares, you can:

  • View the shares you have created or have access to, including information such as the database for the share, consumer accounts that can access the share, the date when the share was created, and the objects that are being shared.
  • Create and edit both a share and its data listing.
  • Remove access to the share for individual consumer accounts.

Returning to the web interface, the "Shared by My Account" tab displays outbound shares from Snowflake Marketplace, data exchange, and direct shares.

When considering shares, icons are located beside each share to indicate the sharing mechanisms like direct sharing, data exchange, or Snowflake Marketplace.

Lastly, there are filters available when viewing your shared data:

  • Type: This is presented as the "All Types" drop-down and allows you to differentiate direct shares from listings.
  • Consumer: This is presented as the "Shared With" drop-down and allows you to select a specific consumer or data exchange (where the data has been shared).

 

Data that is Shared

When sharing data, there are many ways you can do this:

  1. Use direct share to directly share data with consumers
  2. In the Snowflake Marketplace, post a listing
  3. In data exchange, post a listing

Furthermore, when you are in the web interface and you want to share data, you will use the “Share Data” drop-down and choose from the list that provides all the platforms where you can share data.

 

Requesting Data

In the web interface, you can view inbound and outbound requests in the "Requests" tab. However, this tab does not display data requests from the Snowflake Data Marketplace.

Let's take a moment to review what inbound and outbound requests mean.

Inbound requests are made by consumers who are seeking access to your data. You can organize these requests by status and review them accordingly. Outbound requests, on the other hand, are requests made by you to obtain data listings from other providers. Just like inbound requests, you can sort them by status. Keep in mind that the requests you make may be rejected, but you can always resubmit them.

 

Managing Exchanges

In certain roles, such as the Data Exchange Admin role or if you have Provider Profile Level Privileges, you can create and organize provider profiles within the “Manage Exchanges” tab. However, if your organization does not have a data exchange, the “Manage Exchanges” tab will not be visible.

Regarding the provider profile, with this role, you can perform the following tasks within a data exchange:

  • Create, update, and delete a profile
  • Update contact email
  • Manage profile editors

Now that we have reviewed data sharing, you should be able to understand all its components and the different functions it offers!

To keep up with new features, regularly visit our website for more information and tips.

 

Conclusion:

This article provides a deep dive into data sharing and how it works within the Snowflake ecosystem. It covers the basics of data sharing, the role of data providers and consumers, and how Secure Data Sharing works. Additionally, it explores Snowflake's products that use secure data sharing, such as Direct Share, Snowflake Data Marketplace, and Data Exchange. The article also explains how to work with shared data, including managing inbound and outbound requests and managing exchanges.

Exploring Snowflake’s Search Optimization Service

Introduction:

 

In today’s article, we’ll explore Snowflake's Search Optimization Service, a feature that can improve the performance of selective point lookup queries by creating search access paths. The service is available on Snowflake Enterprise Edition or higher and is best suited for business users who rely on quick access to data for critical business decisions.

The article also covers how to turn on the service, its benefits, and its cost. We also introduce Snoptimizer™, our service that scans for all the Snowflake anti-patterns and optimizes your account to help you run cost-effectively.

 

History of Snowflake & Search Optimization:

 

Snowflake initially made a name for itself as the easiest data warehouse to use back in 2014. Since then it has transformed itself and its core technology into a full Snowflake Data Cloud. While the Snowflake Data Cloud Account at first comes with many amazing features by default, there are many areas where you can optimize Snowflake for your specific needs and use cases. As Snowflake has grown over the years, it has added a ton of functionality including paid services such as SnowPipe, Materialized Views, Auto Clustering, Search Optimization Service, and others.

The service can also be useful for data scientists who want to repeatedly explore specific subsets of data. Essentially, it is a maintenance service that runs in the background of Snowflake and builds search access paths. These paths let selective lookup queries find the relevant rows quickly, and the background service keeps them populated and up to date as the underlying table data changes.

 

Turning on the Feature:

 

To turn the feature on, you must first ensure you are using a role that has the required privileges: OWNERSHIP on the table and the ADD SEARCH OPTIMIZATION privilege on its schema. Once that requirement is met, it’s as simple as running the following in your console:

 

ALTER TABLE [IF EXISTS] <table_name> ADD SEARCH OPTIMIZATION;

To confirm it is turned on, run SHOW TABLES and check that the SEARCH_OPTIMIZATION column says ON. Note that you will see an increase in credit consumption while the service runs and builds the search access paths. You can estimate the cost for a specific table before committing by running the following command:

 

SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS('<table_name>')
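This function is called through a SELECT statement; for example, a minimal call against a hypothetical table might look like this:

SELECT SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS('MY_TABLE');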

Being strategic about which tables you introduce to the Search Optimization Service will go a long way toward reducing those costs. The service fits best for tables that are frequently queried with highly selective lookups on columns other than the clustering key, and for tables that aren't already clustered to serve those queries.

If you add the service and decide to remove it later, you can easily do so with the correct privileges by running the following command:

 

ALTER TABLE [IF EXISTS] <table_name> DROP SEARCH OPTIMIZATION;

This is just one solution for making your life easier and your queries faster; however, there are others that are more cost-friendly and do not require you to comb through your tables. One prime example is Snoptimizer™, our service that scans for Snowflake anti-patterns and optimizes your account to help you run cost-effectively. It checks your resource monitors, auto-suspend settings, cloud services consumption, and warehouse compute, among other things, to fix your account and ensure you are fully optimized.

 

Conclusion:

 

Snowflake's Search Optimization Service is a powerful feature that can significantly improve the speed and efficiency of certain queries. While it comes with a cost, it can be a valuable investment for business users who rely on quick access to data for critical decision-making. However, it's important to be strategic about which tables you introduce to the service to minimize costs. Additionally, there are alternative solutions available, such as Snoptimizer™, that can optimize your account and help you run cost-effectively. With the right approach, Snowflake's Search Optimization Service can be a powerful tool in your data optimization arsenal.

Automated Modern Data Stack

Welcome! This article describes what a modern data stack is and how companies can leverage it to gain business insights. Building a proper modern data stack has become essential for companies seeking to thrive in today’s fast-paced, data-driven digital world.

 

What is a Modern Data Stack?

The term ‘modern data stack’ was first coined in the mid-2010s to refer to a collection of cloud-based tools for managing and analyzing data.

A modern data stack essentially allows companies to gain valuable business insights by efficiently storing, integrating, and analyzing huge volumes of data from diverse sources. As data volumes grew exponentially, traditional data warehouses and business intelligence tools were no longer sufficient. The modern data stack emerged as a new approach to data management that could handle large, diverse datasets and support data-driven decision-making at scale.

 

What does a Modern Data Stack include?

A modern data stack includes many different components because each component serves a specific purpose in enabling companies to manage and gain insights from their data. The components work together in an integrated fashion to provide a full solution for data management and analytics. However, the components often vary based on a company’s specific needs and priorities.

 

Some of the most common components include:

  • Cloud data warehouses for scalable storage and computing.
  • Data integration platforms to ingest data from various sources.
  • Data transformation tools to prepare and model the data.
  • Business intelligence tools for analysis and visualization.
  • Data quality and governance tools to ensure data accuracy, security, and compliance.

Now that we’ve gone through a high-level overview of what a modern data stack is and some of its most common components, we are proud to present our robust modern data stack solution.

After thoroughly analyzing the leading options, we have assembled a set of technologies that we believe deliver an unparalleled experience.

 

Our ITS Automated Modern Data Stack:

 

The benefit of having your modern data stack automated is that it reduces the need for manual data engineering and integration. Automated tools handle the heavy lifting of data ingestion, transformation, and integration so that your data analysts and scientists can focus on deriving insights and business value from the data. Automation also increases speed and scalability while reducing costs.

The layers of our automated modern data stack consist of companies with whom we have official partnerships (Snowflake, Fivetran, Coalesce, Hightouch, and Sigma).

 

Base Layer 0 – Snowflake

 

Firstly, a modern data stack needs a base layer like Snowflake because it provides the foundational data storage and computing infrastructure upon which the rest of the stack is built. We choose Snowflake’s cloud data warehouse because it can efficiently store huge volumes of data from diverse sources and run complex queries across all of it.

Our website name, ‘Snowflake Solutions,’ reflects our sole focus on Snowflake as our foundational technology. We have unparalleled expertise and a proven track record of delivering cutting-edge, customized Snowflake solutions for all of our clients. Our Founder, Frank Bell, is considered the top Snowflake optimization expert in the world and has been a leading pioneer of Snowflake’s infrastructure and optimization solutions.

 

What does Snowflake provide?

  • A scalable cloud data warehouse.
  • Separation of computing and storage.
  • A multi-cluster, shared data architecture.
  • Automated data loading and unloading.
  • Time travel for data correction.
  • Data sharing across accounts and organizations.

 

Base Layer 1 – Fivetran

 

Fivetran is a cloud-based data integration platform that helps organizations centralize data from various sources into a unified view. It automates the process of data integration, making it easier for businesses to access and analyze their data in real-time.

 

What does Fivetran provide?

 

  • It uses an ELT approach to quickly load your data into your warehouse prior to transforming it.
  • Fivetran normalized schemas replicate the data from your sources into the familiar relational database format, so analysts can immediately run queries on it.
  • Fivetran offers 300+ pre-configured connectors for various data sources, and they take only about five minutes to set up.
  • Automated schema drift handling, updates, data normalization, and more.
  • Built-in automated governance and security features.
  • Real-time data movement with low impact on the source system.
  • Automated data entry and extraction across systems.
  • It avoids the high engineering costs typically associated with data integration.

 

Base Layer 2 – Coalesce

 

When we came across the full demonstration of Coalesce, we were blown away. In our view, they are one of the largest game-changers in recent years.

Coalesce is a data transformation tool specifically built for Snowflake that leverages a column-aware architecture. It provides a code-first, GUI driven experience for managing and building those transformations. Coalesce provides the first automated transformation data pipeline tool we have tested that scales with Snowflake and makes the transformation of data pipelines more automated.

 

What does Coalesce provide?

  • It’s easy to use when creating patterned transformations.
  • Extreme transformation flexibility at both object and pipeline levels. Combined code and GUI editing.
  • Automation templates that can be shared with your data engineering team.
  • Coalesce separates the build and the deployment of data pipelines, providing flexibility in testing your data pipeline.
  • You are able to use column-aware metadata for the automated creation of database objects including dimensions.
  • You can easily build data pipelines with Snowflake Streams and Tasks via Coalesce.
  • You can quickly implement patterned transformations like Deferred Merge across hundreds or thousands of tables.
  • Built for true cloud scale as a cloud-first tool to operate on top of Snowflake.

 

Base Layer 3 – Hightouch

 

Hightouch can be a really powerful tool. Hightouch is the leading Data Activation platform that syncs data from your warehouse into 125+ SaaS tools with no engineering effort needed.

 

What does Hightouch provide?

 

  • Ease of use in inputting data and extracting value.
  • Amazing loading flexibility that allows you to sync your warehouse data to any SaaS tool with integrations with 100+ destinations.
  • Automation that does not require you to input custom code or use CSVs.
  • Security with cloud computing that never stores your data, plus several certifications that ensure security compliance and data governance: SOC 2 Type 2, GDPR, HIPAA, and CCPA compliant.
  • Easy control over who has access & authorization to make changes.

 

Base Layer 4 – Sigma Computing

Sigma Computing is a cloud-based Business Intelligence (BI) platform used for data exploration and visualization. It speeds up time to insight by combining Snowflake’s lightning-fast computing power with the familiarity of spreadsheets. It operates as a calculation engine on top of Snowflake, and the best part is that Sigma (when used correctly) never creates extracts.

 

What does Sigma provide?

  • Offers code-free and code-friendly data manipulation & visualization.
  • Ease of use with a familiar interface that is designed to look like a spreadsheet.
  • Drag & drop functionalities that improve user interactions and do not require any additional technical know-how or skill.
  • The only BI tool that was built for the Snowflake Cloud Platform. Therefore, it has speedy connections that reduce latency as users run queries.

 

Conclusion

Are you unsure about the best automated modern data stack for your business? Given how important choosing the right solution is, schedule a free call with us. We can walk you through the options to find what suits your needs best.

The powerful automated modern technology stack we outlined in this article is the one we employ for the vast majority of our projects. We wholeheartedly endorse all these partners and their solutions to our clients who are primarily transitioning to Snowflake.

Do you have any data automation needs we have not already addressed?

We hope this article proved useful in considering what a truly automated, modern data infrastructure should look like.

Be sure to check out our blog for more information regarding Snoptimizer or Snowflake.

To Snowsight or Not to Snowsight

To Snowsight or Not To Snowsight:

Back in June 2021, I was writing a chapter for our new Snowflake Essentials Book on the Snowflake Web User Interface. I intentionally delayed writing that chapter, expecting that Snowsight would become the primary Snowflake Web User Interface as part of Snowflake Summit 2021. Although that didn't happen, I quickly realized that, given the circumstances and the timelines for the Snowflake Essentials book, I needed to write two separate chapters to cover both the Snowflake Classic Console UI and the Preview App (Snowsight).

As I was writing Chapter 5 – Snowsight, I wondered if Snowsight was now ready to be used as the primary interface with the new changes. The UI changes that came in June/July 2021 finally made Snowsight better than ever before. Before June 2021, it was missing just too many features and the navigation was incredibly inefficient.

The new functionality in Snowsight makes it much better to use now versus the Classic Console (AutoSuggest/AutoComplete, versioning, and sharing of worksheets and dashboards). These features enhance Snowflake's ease of use in query and code collaboration, increasing efficiency. While many of us who have worked with the Classic Console for years may be accustomed to it, it's time for us to learn the new UI because the Classic Console has its limitations, particularly around sharing code and versioning. However, as of October 2021, Snowsight still has a few clunky problems you need to be aware of when switching over. We are working with the Snowsight Product Management team to get these issues fixed, so hopefully it will happen sooner rather than later.

Let’s cover the amazing new features in Snowsight first and then cover both the Classic Console and Snowsight.

Top Snowsight Improvements:

Here’s an article by Veronika Durgin on Medium where she gives her feedback on Snowsight, the Snowflake Web Interface.

However, here's my perspective:

1. AutoSuggest & AutoComplete

My vote by far is that the combination of what I call AutoSuggest and AutoComplete is a huge improvement and probably has already saved thousands of hours of errors and misspellings of tons of functions, tables, views, procedures, schemas, etc. This is significant and not found in the Classic version.

2. Collaboration

Collaboration is the future. While I still love using the Classic Console, it unfortunately lacks any collaboration capabilities. With Snowsight, I now have the ability to easily share Worksheets, which is huge for me because it allows me to collaborate on queries or help others with their Snowflake work - assuming we are in the same account. I'm hoping that at some point, we will be able to share it with other accounts as well.

3. Visualizations

Visualizations and collaboration using Dashboards. To be frank, when Snowflake first previewed its dashboards in 2020, I didn't find them compelling. After all, I already had tools like Sigma and Tableau that could perform all of those functions, so it was confusing to see this new feature arrive while core administration features were still missing. The Preview App wasn't fully functional for anyone doing real Snowflake administration until roughly June 2021.

4. Versioning of all Queries

Versioning of queries is a feature available in Snowsight that allows users to save and track different versions of their SQL queries. This feature allows users to easily revert to previous versions of their queries, compare different versions, and collaborate more efficiently with others. It is a significant improvement over the Classic Console, which lacks any versioning capabilities.

 

Conclusion:

In conclusion, Snowsight is an exciting new addition to the Snowflake Web User Interface that offers many valuable improvements over the Classic Console. With features like AutoSuggest and AutoComplete, collaboration capabilities, and query versioning, Snowsight is a more efficient and effective tool for working with Snowflake. While there are still some clunky problems that need to be addressed, Snowsight is definitely worth exploring for anyone who wants to optimize their Snowflake experience.

If you want more information on Snowflake updates be sure to check out our blog for more news.

Snowflake’s Financial Services Data Summit Recap

Snowflake Financial Data Summit:

 

Snowflake held an excellent virtual event this week, focusing on Snowflake's Data Cloud solutions for Financial Services. We appreciated the combination of business and technology content during this Industry Vertical Snowflake "Summit." Our mission, when launching our ITS solutions business, was to provide Business/Technology Focused Solutions.

We firmly believe in bringing the best of both worlds, where business teams and technology teams work together. We wanted to avoid the pitfalls of solutions that were solely business-focused, without considering technology or collaboration. We also sought to avoid technology solutions that lacked business value or had limited value, which led to the product/market fit concepts.

 

Financial Services Data Summit Highlights and Takeaways

 

  • Major emphasis on the Snowflake Financial Services Data Cloud and its partners such as BlackRock, Invesco, State Street, Fiserv, etc.
  • Financial Services Data Provider Presentations. During the conference, we were excited to attend the Data Provider Presentations from Acxiom. The presentations provided us with valuable insights into the data industry and the latest trends in data collection and analysis. Acxiom's experts shared their experiences and knowledge about the challenges and opportunities of working with large datasets, as well as the best practices for data management and security.
  • It is all about the customer. This was a recurring topic among Snowflake customers and partners such as Blackrock, State Street, and data providers. They highlighted the strong partnership with Snowflake and how it has facilitated new data collaborations that were previously unattainable.

 

Financial Services Data Summit By the Numbers:

 

  • Sessions – 17
  • Tracks – 4 [Customer Centricity, Risk and Data Governance, Digitalize Operations, Platform Monetization]
  • Speakers – 44
  • Speakers by type:

 

Presenter Company Type | Count | Breakdown %
Snowflake | 16 | 37.21%
Customers | 9 | 20.93%
Partners – Consulting | 6 | 13.95%
Partners – Data Providers | 7 | 16.28%
Partners – Products | 5 | 11.63%

 

Data Workload Financial Services Recap:

 

Data Warehousing, Engineering, Data Lakes:

 

We still think data warehousing is the workload that works best with Snowflake and what it was originally designed for. We see many businesses within Financial Services moving to the Snowflake Data Cloud for their data warehouse workloads. Many of the Financial Services companies who presented at the summit are also moving to a combination of data lakes and data warehousing. The presentations focused on a mix of financial services processes combined with data technologies to improve the financial services business. Snowflake's Data Cloud is accelerating how financial services companies transform. Capital One shared an interesting video, not included in the session recordings, about being the first to deliver financial services on the cloud.

 

Data Science and Data Applications:

 

Many of the presentations at the Financial Data Summit were related to building data applications on top of Snowflake.

 

Data Marketplace and Data Exchanges:

From our perspective, the focus of the Financial Services Summit was the Financial Services Data Cloud. BlackRock’s Aladdin Data Cloud, and the Q&A around it, was one of the main presentations. There was also a very large focus on Financial Services Data Providers such as Acxiom, Intelligence, and FactSet, and on the data and services they provide through the Data Cloud.

 

Snowflake Data Provider Presentations:

 

Acxiom – Recording Link

Intelligent & FactSet – Recording Link

S&P Global – Recording Link

 

Financial Services Data Cloud Announced:

 

Besides the summit event, Snowflake announced the Snowflake Financial Services Data Cloud as well. We view this as really just a subset of the overall Snowflake Data Cloud vision and definition. We assume Snowflake will continue to roll out industry-vertical Data Cloud concepts. This is super interesting and transformative at many levels. It is a massive movement toward more centralized and shared data versus the historical data silos that have developed within companies.

This is the statement from the press release: “Financial Services Data Cloud, which unites Snowflake’s industry-tailored platform governance capabilities, Snowflake- and partner-delivered solutions, and industry-critical datasets, to help Financial Services organizations revolutionize how they use data to drive business growth and deliver better customer experiences.”

At a high level, this is pretty awesome “theoretically” and aligns with a lot of the thought leadership work I’m doing around moving from a paradigm of on-premise closed data systems and silos to an ever-evolving worldwide concept of integrated data.

 

Conclusion:

 

The Snowflake Financial Services Data Summit was an excellent first Vertical Industry Summit with major Financial Services customers and partners such as BlackRock, State Street, Invesco, Fiserv, Square, NYSE, Western Union, Acxiom, etc. Our favorites [from a practical learning perspective] were:

  1. Fiserv CTO Marc did a great job in this presentation demonstrating Fiserv Applications/Tools on top of Snowflake. Recording Link
  2. Building on Snowflake: Driving Platform Monetization with the Data Cloud. Recording Link

There were many other great presentations as well from providers and Snowflake partners like Alation, AWS, etc.

What is the Snowflake Data Cloud?

Introduction:

 

The Snowflake Data Cloud is an ecosystem that enables thousands of Snowflake customers, partners, data providers, and data service providers to collaborate on data, derive insights, and create value from rapidly growing data sets in a secure, compliant, and seamless manner.

“What is the Snowflake Data Cloud?” is sort of a loaded question for long-time Snowflake professionals, including a large chunk of our Snowflake Solutions community. From a Snowflake veteran's standpoint [anyone who has been working deeply with Snowflake since the fall of 2019 or earlier], this is a massive rebranding. Many of us view the Snowflake Data Cloud as a strategic evolution of Snowflake that can take on more database processing workloads. Previously, most Snowflake customers regarded Snowflake as an analytical RDBMS meant for data warehousing purposes.

My take is that I, along with many other strategic thinkers, saw that Snowflake needed to evolve into a larger strategic analytical data solution before its IPO in the fall of 2020. Therefore, while many of the pieces were already there, the terminology of the Snowflake Data Cloud was born on June 2, 2020, at the virtual summit, with this announcement:

https://www.snowflake.com/news/snowflake-unveils-the-data-cloud-so-organizations-can-connect-collaborate-and-deliver-value-with-data/

 

Part 1: What the Snowflake Data Cloud looks like in 2023:

The Snowflake Data Cloud in 2023 is a mature, integrated data platform that enables organizations to break down data silos, derive insights, and deliver value from their data. It has evolved into a global data exchange where thousands of customers, partners, and data providers can securely and seamlessly share and transact data. The Snowflake Data Cloud now supports all major data workloads, including data warehousing, data lake, data engineering, data science, data applications, and data sharing. It is a powerful, scalable, and compliant platform for data-driven digital transformation.

 

Part 2: The Snowflake Data Cloud Six Major Workloads:

 

Data Workload 1 – Data Warehousing

I believe that Benoit/Thierry's main focus was initially on this workload or use case. In my opinion, this remains Snowflake's best use case and it is still the best cloud data warehouse in 2021. I am trying to be unbiased in my assessment. As someone with hands-on experience building many data warehouses on various platforms, I can attest that Snowflake excels in this regard.

 

Data Workload 2 – Data Engineering

Data Engineering is a good workload for Snowflake.

 

Data Workload 3 – Data Lake

I will postpone our analysis of Snowflake as a data lake. I have been advocating for what I refer to as SnowLake for quite some time, likely since late 2018. However, we are presently in the process of conducting a comparative study of various potential data lake solutions.

 

Data Workload 4 – Data Science

Data Science is a newer area of focus for Snowflake. They recently introduced Snowpark as their preferred solution for data science workloads. However, it is still in its early stages, and Python language support is not fully integrated yet.

 

Data Workload 5 – Data Applications

This is a fascinating workload for Snowflake. They have been handling certain aspects of it for a considerable time. However, in my opinion, the field of data applications is quite extensive, and many of its subsets may still require an OLTP-type database.

 

Data Workload 6 – Data Exchange

If you couldn't guess, this is probably my favorite upcoming workload and use case. This is another area where I believe Snowflake excels and is currently leading the competition by a significant margin.

At Snowflake Solutions, we have been working with Snowflake to develop Data Sharing, Data Exchange, and Data Marketplace solutions since early 2018. We have been monitoring the growth of the Snowflake Marketplace since its launch in July 2020.

In addition to the Snowflake Data Cloud Platform, it seems that Snowflake customers often consider the Data Cloud concept to encompass Data Sharing, Data Exchanges, and the Data Marketplace as a whole.

 

Conclusion:

 

The Snowflake Data Cloud is becoming more popular in the market. We have personally witnessed the rapid growth of the Data Marketplace and Data Sharing features of the cloud. However, there is still a long way to go before owning and dominating these six major data workloads. Overall, the data cloud is an intriguing strategy with compelling value if you are aligned with Snowflake and its partnerships. The announcement of the Aladdin Data Cloud is probably the most significant market confirmation of this vision.

 

Thank you for taking the time to read our article. We hope you found it informative and gained insight into our viewpoint on the Snowflake Data Cloud. If you would like to learn more, we encourage you to explore additional articles in the Snowflake universe.

What is Snoptimizer?

Part 1: What is Snoptimizer™?

Snoptimizer is an application developed by our team at ITS - Snowflake Solutions, led by our Founder, Frank Bell. Frank is undoubtedly one of the world's foremost experts in Snowflake data optimization and has leveraged his mastery of Snowflake to create this one-of-a-kind automated solution for companies who want to streamline their Snowflake usage.

Snoptimizer™ aggregates the best practices and lessons from Frank’s leading expertise in optimizing Snowflake accounts for companies like Nissan, Fox, Yahoo, and Ticketmaster.

Snoptimizer™ is the first automated cost, performance, and security optimization application for Snowflake accounts. It is by far the easiest and fastest way to optimize your Snowflake account; the service can optimize your account within minutes.

Part 2: Why did we build Snoptimizer?

We built Snoptimizer™ because we kept seeing Snowflake customers whose accounts were not optimized nearly as well as they could be.

All too often, we were brought in by Snowflake customers for health checks. In 98% of these cases, the customer’s accounts were not optimized as much as we could optimize them. Most of the time their Snowflake usage was highly inefficient, and the customer was not using Snowflake as effectively as possible in one or more areas. To address this need, we created Snoptimizer™.

Snoptimizer™ was built by Snowflake Data Heroes, consultants and product builders who use Snowflake and belong to an elite group of only about 50 worldwide. Our Snoptimizer™ team comprises some of the most experienced Snowflake optimization and migration experts. We have specialized in Snowflake for years, studying every aspect in depth, to provide unparalleled optimization services.

Part 3: The Problem Snoptimizer Solves:

Snowflake is an incredibly scalable and easy-to-use data platform. That said, Snowflake’s Data Cloud offering is constantly evolving with new features and cost-saving services, and the Snowflake database and Data Cloud concept are still relatively new to many administrators and users. While the basics are easy to use compared to other options, optimizing a Snowflake account to maximize efficiency and cost savings is challenging. It requires a deep understanding of hundreds of objects and views, including warehouses, resource monitors, query history, materialized views, search optimization, Snowpipe, load history, and more.

A few common customer optimization issues we've encountered:

  1. Poorly configured usage. All too often, we see consumption credits wasted on incorrectly configured warehouses. Remember, consumption-based services are great until they are misused, and an unoptimized account may also run into performance or security issues. That is why we analyzed every area of the Snowflake metadata views and developed advanced cost, performance, and security optimizations beyond anything documented or available elsewhere.
  2. Incorrect storage settings or architecture. We often find suboptimal Snowflake settings during health checks, like 10- to 90-day time travel enabled for objects that don’t need it. We also see inefficient lift-and-shift migrations that keep drop-and-recreate architectures which make no sense in Snowflake.
  3. Inefficient warehouse setup. This is one of the first issues we typically fix, often saving our customers hundreds of dollars each day.
  4. Accounts with significant cost risks. As we stated in previous blog posts here at ITS Snowflake Solutions, Snowflake enables awesome scale, but if misconfigured it also carries major cost risks by default due to its consumption-based pricing, especially for compute and larger warehouses. These are the Snowflake Cost Risks we discussed previously.

 

Part 4: What does Snoptimizer™ do?

Snoptimizer continuously monitors your Snowflake account in three major areas: cost, security, and performance. It scans over 40 Snowflake views to detect anti-patterns that waste resources or hurt performance and security. Snoptimizer is the only service that continuously optimizes your Snowflake account to maximize efficiency and cost savings.
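As a rough illustration of the kind of metadata scan involved, here is a minimal sketch that totals credit consumption per warehouse over the last 30 days from the ACCOUNT_USAGE schema (this is our own example query, not Snoptimizer's internal logic):

SELECT warehouse_name,
       SUM(credits_used) AS credits_last_30_days
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY credits_last_30_days DESC;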

Let’s dive deeper into the three main areas Snoptimizer streamlines on Snowflake usage:

 

Cost Optimization:

The Snoptimizer Cost Optimization service regularly reviews ongoing Warehouse and Resource Monitor configurations. It promptly fixes any incorrectly configured Account settings.

The Snoptimizer service continually optimizes your Snowflake account(s) to reduce costs. It can automatically apply optimizations or require approval before changes. Snoptimizer is your best tool for minimizing Snowflake costs.

Snowflake's RDBMS and DDL/DML are easy to use, but warehouses and compute are also easy to misconfigure. Snoptimizer eliminates this inefficiency and waste in Snowflake compute and storage.

Performance Optimization:

The Snoptimizer team analyzes your Snowflake query history and related data to identify warehouses that are over-provisioned or under-provisioned. We are the only service that automates Snowflake performance optimization and provides recommendations, such as:

  • Right-sizing warehouses to match your workload
  • Leveraging other Snowflake cost-saving features
  • Consolidating unused warehouses
  • Enabling Auto Clustering
  • Using Materialized Views
  • Enabling Search Optimization

We review your Snowflake account to find ways to improve performance and lower costs. Our recommendations are tailored to your specific usage patterns and needs.
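For example, a simple starting point for this kind of analysis is to look at average query times per warehouse over the past week from the ACCOUNT_USAGE query history (again, an illustrative query of our own, not Snoptimizer's internal logic):

SELECT warehouse_name,
       COUNT(*) AS query_count,
       AVG(total_elapsed_time) / 1000 AS avg_elapsed_seconds
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY avg_elapsed_seconds DESC;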

Security Optimization:

Snoptimizer is one of your best tools for improving Snowflake security. It continuously monitors your Snowflake account for risks that could compromise your data or account. Since security often depends on company culture, we provide recommendations and best practices to help prevent account breaches and data leaks. Snoptimizer Security Optimization performs frequent checks to identify misconfigurations or vulnerabilities that could be exploited.
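One simple example of the kind of check this involves is scanning recent failed login attempts from the ACCOUNT_USAGE login history (an illustrative query, not Snoptimizer's internal logic):

SELECT user_name,
       COUNT(*) AS failed_logins
FROM snowflake.account_usage.login_history
WHERE is_success = 'NO'
  AND event_timestamp >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY user_name
ORDER BY failed_logins DESC;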

Snoptimizer Core Features:

  • Analyzes Snowflake data warehouses to identify inefficient settings
  • Immediately limits “cost exposure” from computing resources
  • Reviews previous queries and usage to optimize performance and efficiency
  • Provides regular reports on Snowflake usage
  • Creates effective monitors for each warehouse’s resources
  • Offers recommendations and automation to optimize your setup
  • Incorporates Snowflake’s best practices for cost optimization, including some undocumented tips

 

Part 5: What results can you expect from using Snoptimizer?

  • On average, we've seen 10-30% cost savings, thousands of security issues fixed, and hundreds of performance problems solved in our tests.
  • In some implementations, we've achieved up to 43% cost savings.

 

Part 6: Try Snoptimizer™ today!


Try Snoptimizer today. Sign up and schedule a personal demo with us!

Visit our Website to Explore Snoptimizer

To Recap... How Snoptimizer Helps You

Snoptimizer quickly and automatically optimizes your Snowflake account for security, cost, and performance.
It eliminates headaches and concerns about security risks and cost overruns across your Snowflake account.
It prevents you from making costly mistakes.

In short, Snoptimizer makes managing your Snowflake cost, performance, and security much easier and more automated.
Optimization in a few hours, hassle-free. Get optimized today!

Part 7: Conclusion

The Snowflake Data Cloud continues expanding, and though easy to use, optimizing for cost, security, and performance remains challenging. Snoptimizer makes optimization effortless and affordable, saving you from cost overruns and security issues.

We’d love to help streamline your Snowflake use, optimize data cloud costs, and leverage this tech to boost your business as much as possible.

Sign up and schedule a complimentary consultation with us to streamline your Snowflake usage.

Snowflake Cost Risks

Introduction:

Today’s article discusses the cost risks associated with using the Snowflake Data Cloud. It emphasizes the importance of proper administration and resource monitoring to mitigate these risks. We will also mention our service, Snoptimizer, which can help automate cost optimization and risk minimization related to Snowflake accounts.

Part 1: My Experience with Snowflake’s Data Cloud

As a Snowflake Data Superhero for the past 4 years, I can’t deny it: I love using Snowflake. I fell in love with it at the beginning of 2018 when I realized how easily I could deliver all of the big data consulting solutions we had been building for 18+ years. In the past, we would often run into scale challenges as data sizes grew, but Snowflake brought both ease of use and amazing scale to almost all of my big data consulting projects.

Over the last three years, my team and I have worked on hundreds of Snowflake accounts. I’ve come to realize that if Snowflake Anti-patterns occur or if poor compute security practices are used, Snowflake accounts are exposed to large cost risks, particularly with regard to computing costs. While Snowflake is an amazingly scalable cloud database and is the best cloud data warehouse I’ve used in the last 3+ years, the deployment of a Snowflake Account without proper settings and administration exposes a company to these cost risks.

Part 2: Examples of Data Cloud Cost Risks

Let’s say you used the Classic Console to create a new warehouse with all the default settings. Even if you never ran a query, the cost for those defaults would be 10 minutes * XL warehouse (16 credits/hour) @ $3/credit. That is only about $8 for those 10 minutes, but it was $8 spent on nothing. Now let’s say a rogue (or curious) trainee on the account, not understanding what they were doing, does the same thing but changes the size to a 6XL. Your 10-minute run-for-nothing cost exposure becomes 10 minutes * 6XL warehouse (512 credits/hour) @ $3/credit, and your account just spent about $256 for 10 minutes of nothing.

Snowflake Cost Risk Use Case 6XL – 1 cluster:

  • Cost per hour @ $3/credit = 3 * 512 = $1,536
  • Cost per day @ $3/credit = $36,864

Snowflake Cost Risk Use Case 6XL – 5 clusters:

[We know this is a worst-case scenario on AWS/Snowflake and would be rare, BUT without resource monitors and correct permissions it is possible.]

  • Cost per hour @ $3/credit = 3 * 512 * 5 = $7,680
  • Cost per day @ $3/credit = $184,320

Snowflake Cost Risk Use Case 6XL – 10 clusters:

[We know this is a worst-case scenario on AWS/Snowflake and would be rare, BUT without resource monitors and correct permissions it is possible.]

  • Cost per hour @ $3/credit = 3 * 512 * 10 = $15,360
  • Cost per day @ $3/credit = $368,640

As you can see, this is unreasonable exposure to cost risks.  If you are a Snowflake administrator, make sure you make appropriate changes to control costs and cost risk.  If you want an automated approach to Cost Risk Management that can be set up easily in a few hours then try our Snoptimizer Cost Risk Solution.

Snowflake Cost Risk Mitigation – Administration – ACCOUNT ADMIN – MUST DO

The most important way to minimize Snowflake cost risk is to create Resource Monitors with suspend triggers for every warehouse. Here is the code to do that [replace 50 with your daily credit limit]:

CREATE RESOURCE MONITOR "REMOVE_SNOWFLAKE_COST_RISK_EXAMPLE_RM" WITH CREDIT_QUOTA = 50
  FREQUENCY = DAILY
  START_TIMESTAMP = IMMEDIATELY
  TRIGGERS
    ON 80 PERCENT DO NOTIFY
    ON 95 PERCENT DO SUSPEND
    ON 100 PERCENT DO SUSPEND_IMMEDIATE;

ALTER WAREHOUSE "TEST_WH" SET RESOURCE_MONITOR = "REMOVE_SNOWFLAKE_COST_RISK_EXAMPLE_RM";

Part 3: Use Snoptimizer

Snoptimizer offers Snowflake users a unique solution to enhance the cost, performance, and security of their Snowflake accounts.


It provides you with a consistent and reliable daily analysis of how your Snowflake account is being optimized and rightsized to achieve maximum efficiency.

Try Snoptimizer today. Sign up and schedule a personal demo with us!

Conclusion:

To conclude, Snowflake Data Cloud cost risk is a real issue that requires proper administration. While the Snowflake Data Cloud offers immense scale and power for any analytical data processing need, it must be optimized and continuously monitored, whether by an administrator or by a service like Snoptimizer. Remember, with great data processing power comes great cost management responsibility. If an administrator mistakenly allows an untrained user to create a 6XL warehouse they don't need and that isn't within the business budget, the result can be significant costs.

Try Snoptimizer today to avoid a data-driven cost catastrophe.  If a new warehouse appears, we have you covered!

Snowflake Cost Guardrails – Resource Monitors

Introduction:

 

The Snowflake Data Cloud provides impressive scalability and processing power for analytical data. It's an incredible advancement that lets you launch T-shirt-sized warehouses ranging from XS (1 virtual instance) up to 6XL (512 EC2 instances per cluster when running on AWS). However, it's important to set up your Snowflake account with resource monitors to control costs. Snowflake Resource Monitors are your primary way to control costs within the Snowflake Data Cloud.

Let’s show you how easy it is to set up your Snowflake cost guardrails so your costs don’t go beyond what you expect.

We recommend either hiring a full or part-time Snowflake administrator focused on cost optimization and database organization or using our Snowflake Cost Optimization Tool – Snoptimizer.  Snoptimizer automates setting up resource monitors on your Snowflake account for each warehouse and tons of other cost optimizations and controls on your Snowflake Account. Let’s dig into the only true Snowflake Cost Risk Guardrails you have had for a while, Resource Monitors.

 

Resource Monitors – Your Snowflake Data Cloud Cost Guardrails

Resource Monitors are technically easy to set up from the Snowflake web GUI or the command line. Even so, it's easy to make incorrect assumptions and end up without enough effective monitoring and suspend triggers in place, which is like having no guardrails installed at all. A minimal setup sketch is shown after the next paragraph.

Finding out that Snowflake's consumption-based pricing was so reasonable was game-changing for me and my consulting company. We could finally provide scale for any analytical challenge and solution we needed to create, which was never possible before. I remember building predictive marketing tools where we had to crunch large data sets; we would often run into scaling challenges and spend tons of time and engineering effort just to engineer for scale.
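As a concrete starting point, here is a minimal sketch of basic cost guardrails: a daily per-warehouse resource monitor plus an account-level monitor as a backstop. The names and credit quotas are illustrative; adjust them to your own budget.

CREATE RESOURCE MONITOR wh_daily_guardrail_rm WITH CREDIT_QUOTA = 25
  FREQUENCY = DAILY
  START_TIMESTAMP = IMMEDIATELY
  TRIGGERS
    ON 80 PERCENT DO NOTIFY
    ON 100 PERCENT DO SUSPEND_IMMEDIATE;

ALTER WAREHOUSE analytics_wh SET RESOURCE_MONITOR = wh_daily_guardrail_rm;

CREATE RESOURCE MONITOR account_monthly_guardrail_rm WITH CREDIT_QUOTA = 1000
  FREQUENCY = MONTHLY
  START_TIMESTAMP = IMMEDIATELY
  TRIGGERS
    ON 90 PERCENT DO NOTIFY;

ALTER ACCOUNT SET RESOURCE_MONITOR = account_monthly_guardrail_rm;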

 

Try Snoptimizer:

 


 

Try Snoptimizer today. Sign up and schedule a personal demo with us!

 

To Recap - How Snoptimizer Helps You

Snoptimizer quickly and automatically optimizes your Snowflake account for security, cost, and performance. It eliminates headaches and concerns about security risks and cost overruns across your Snowflake account.

In short, Snoptimizer makes managing your Snowflake cost, performance, and security much easier and more automated.

 

Conclusion:

 

Having properly set up Snowflake Guardrails in the form of Resource Monitors is extremely important. If you're unsure whether or not you have these in place, it's time to take action. Activate Snoptimizer today to optimize your system in just a few hours, and ensure continuous and regular cost optimization monitoring. If a new warehouse appears, we've got you covered!

In conclusion, setting up Snowflake resource monitors is crucial for controlling costs in the Snowflake Data Cloud.

Snowflake Cost Optimization Best Practices

Introduction:

 

I have been working with Snowflake since the beginning of 2018 and it has been one of the most enjoyable and scalable data solutions I have encountered in my 27+ year career as a data engineer, data architect, data entrepreneur, and data thought leader. It is an extremely powerful platform (with nearly unlimited scalability, limited only by Snowflake's allocation of Compute within an Availability Zone) that requires responsible usage.

In the past 3 years, I've analyzed more than 100 Snowflake accounts and found that about 90% of them were not completely optimized for cloud data costs. That's why my team and I are thrilled to introduce Snoptimizer, the first automated Snowflake cost optimization service.

One of the reasons why 90% of those accounts did not have resource monitors or regular optimizations is that Snowflake is initially cost-effective and typically provides significant savings, especially for on-prem migrations that we have completed. However, companies that do not optimize their Data Cloud Costs are missing out on big opportunities! That's why we created Snoptimizer, and I'm also sharing my top 6 Snowflake cost and risk optimizations below. Hope you find them helpful!

 

Part 1: My Best Practices for Optimizing Snowflake Costs and Reducing Cost Risks:

 

Best Practice #1 – Resource Monitors.

One of the initial features of Snoptimizer is the automation of daily Resource Monitors at a warehouse level, which is based on the Snowflake Metadata database history and warehouse and Resource Monitor settings. This is set up immediately following the purchase of Snoptimizer.

By doing this, cost risk is reduced and guardrails are put in place for all warehouse computing. Snoptimizer utilizes Snowflake Metadata and warehouse/Resource Monitor settings to automatically monitor resources daily at the warehouse level. This helps limit risks and ensures that spending constraints are not exceeded.
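
As a rough illustration of the idea (not Snoptimizer's actual logic), you can derive a starting per-warehouse daily quota from recent usage in the ACCOUNT_USAGE metadata:

/* Average daily credits per warehouse over the last 30 days --
   a reasonable starting point for a per-warehouse daily CREDIT_QUOTA */
SELECT WAREHOUSE_NAME,
       ROUND(SUM(CREDITS_USED) / 30, 2) AS AVG_DAILY_CREDITS
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE START_TIME >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY WAREHOUSE_NAME
ORDER BY AVG_DAILY_CREDITS DESC;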

 

Best Practice #2 – Auto Suspend Setting Optimization.

Snoptimizer automates another optimization by analyzing the workloads in the Warehouse and making changes to the Auto Suspend settings. Depending on the workload, Snoptimizer can also automate additional cost savings for you.
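
For example, lowering an over-generous auto-suspend setting is a one-line change (the 60-second value here is illustrative; the right number depends on your workload):

/* Suspend the warehouse after 60 seconds of inactivity instead of the default 600 */
ALTER WAREHOUSE "TEST_WH" SET AUTO_SUSPEND = 60;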

 

Best Practice #3 – Monitoring Cloud Services Consumption Optimization

Snoptimizer analyzes your Snowflake Account's Cloud Services consumption to quickly identify opportunities for cost savings. We thoroughly review usage and billing details for each service to ensure that only what is necessary is provisioned, reducing waste and minimizing costs. Optimizing Cloud Services is one of the most effective ways to lower your Snowflake spending while still meeting your data and compute demands.
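
One way to see how much of your bill comes from Cloud Services is a simple sketch against the standard ACCOUNT_USAGE metering view (adjust the date range as needed):

/* Daily Cloud Services credits vs. total credits over the last 30 days */
SELECT USAGE_DATE,
       SUM(CREDITS_USED_CLOUD_SERVICES) AS CLOUD_SERVICES_CREDITS,
       SUM(CREDITS_USED) AS TOTAL_CREDITS
FROM SNOWFLAKE.ACCOUNT_USAGE.METERING_DAILY_HISTORY
WHERE USAGE_DATE >= DATEADD('day', -30, CURRENT_DATE())
GROUP BY USAGE_DATE
ORDER BY USAGE_DATE;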

 

Best Practice #4 – Regular Monitoring of Storage Usage Across your Entire Snowflake Account

At Snoptimizer, our goal is to help you save on Storage costs. We start by reviewing your Storage History for the past 60 days to identify any settings that may be causing you to overpay. We commonly find that charges related to Time Travel and/or Snowflake Stages are unnecessary and can be avoided.

At Snoptimizer, we can help you make the most of your storage space. Our service optimizes your settings based on your actual usage, so you're only paying for what you need. By analyzing 60 days of storage history, we're able to find ways to reduce costs by up to 25%, all without sacrificing performance or features.
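
To get a quick picture of where your storage bytes are going (active data, stages, Fail-safe), a simple sketch against ACCOUNT_USAGE looks like this:

/* Storage breakdown for the last 60 days: database storage, stage storage, Fail-safe */
SELECT USAGE_DATE,
       ROUND(STORAGE_BYTES / POWER(1024, 4), 3) AS DATABASE_TB,
       ROUND(STAGE_BYTES / POWER(1024, 4), 3) AS STAGE_TB,
       ROUND(FAILSAFE_BYTES / POWER(1024, 4), 3) AS FAILSAFE_TB
FROM SNOWFLAKE.ACCOUNT_USAGE.STORAGE_USAGE
WHERE USAGE_DATE >= DATEADD('day', -60, CURRENT_DATE())
ORDER BY USAGE_DATE;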

 

Best Practice #5 – Daily Monitoring of Warehouse Compute.

Besides adding Resource Monitors that suspend warehouses, we also provide daily monitoring of Snowflake warehouse consumption, reporting daily spikes, anomalies, and changes in rolling averages. Most accounts we come across do not have regular, proactive monitoring of warehouse usage in place. A minimal version of this kind of check is sketched below.
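
This sketch compares yesterday's credits per warehouse against its trailing daily average, which is one simple way to flag spikes (the 2x threshold and 30-day window are illustrative):

/* Flag warehouses whose credits yesterday exceeded 2x their recent daily average */
WITH DAILY AS (
    SELECT WAREHOUSE_NAME,
           TO_DATE(START_TIME) AS USAGE_DAY,
           SUM(CREDITS_USED) AS CREDITS
    FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
    WHERE START_TIME >= DATEADD('day', -31, CURRENT_DATE())
    GROUP BY 1, 2
),
SUMMARY AS (
    SELECT WAREHOUSE_NAME,
           MAX(IFF(USAGE_DAY = DATEADD('day', -1, CURRENT_DATE()), CREDITS, NULL)) AS YESTERDAY_CREDITS,
           AVG(IFF(USAGE_DAY < DATEADD('day', -1, CURRENT_DATE()), CREDITS, NULL)) AS AVG_DAILY_CREDITS
    FROM DAILY
    GROUP BY WAREHOUSE_NAME
)
SELECT *
FROM SUMMARY
WHERE YESTERDAY_CREDITS > 2 * AVG_DAILY_CREDITS;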

 

Best Practice #6 – Regular Monitoring of New Snowflake Services.

Besides monitoring compute warehouses, Snoptimizer also immediately starts monitoring consumption on all existing and new cost-incurring Snowflake services (from private preview onward), from Automatic Clustering to Search Optimization to Materialized Views. Snoptimizer automates this, which is a huge benefit. We are ALWAYS optimizing your cost consumption and reducing cost risk! We are always there for you! A simple manual version of this check is sketched below.
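
If you want to eyeball these serverless costs yourself, a hedged sketch that unions a few of the relevant ACCOUNT_USAGE views looks like this (the list of services is not exhaustive):

/* Credits consumed by serverless features over the last 30 days */
SELECT 'AUTOMATIC_CLUSTERING' AS SERVICE, SUM(CREDITS_USED) AS CREDITS
FROM SNOWFLAKE.ACCOUNT_USAGE.AUTOMATIC_CLUSTERING_HISTORY
WHERE START_TIME >= DATEADD('day', -30, CURRENT_TIMESTAMP())
UNION ALL
SELECT 'MATERIALIZED_VIEWS', SUM(CREDITS_USED)
FROM SNOWFLAKE.ACCOUNT_USAGE.MATERIALIZED_VIEW_REFRESH_HISTORY
WHERE START_TIME >= DATEADD('day', -30, CURRENT_TIMESTAMP())
UNION ALL
SELECT 'SEARCH_OPTIMIZATION', SUM(CREDITS_USED)
FROM SNOWFLAKE.ACCOUNT_USAGE.SEARCH_OPTIMIZATION_HISTORY
WHERE START_TIME >= DATEADD('day', -30, CURRENT_TIMESTAMP());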

 

Part 2: Try Snoptimizer today

 

At Snoptimizer, we offer Snowflake users a one-of-a-kind tool to enhance their Snowflake accounts' cost efficiency, performance, and security.

 

Our tool offers a dependable and regular daily analysis of your Snowflake account optimization and rightsizing, ensuring maximum efficiency.

Try Snoptimizer today. Sign up and schedule a personal demo with us!

 

Conclusion:

 

After reading this article, we hope you have a better understanding of the best practices for optimizing Snowflake costs and reducing cost risks. These practices include using resource monitors, optimizing auto-suspend settings, monitoring cloud service consumption, regularly monitoring storage usage, daily monitoring of warehouse computing, and keeping track of new Snowflake services.

We encourage you to try out our automated Snowflake optimization tool for better cost, security, and performance efficiency.

Try Snoptimizer today! Sign up and schedule a personal demo with us!

Snowflake Create Warehouse Defaults

Overview:

 

I have been working with the Snowflake Data Cloud since it was just an Analytical RDBMS. Since the beginning of 2018, Snowflake has been pretty fun to work with as a data professional and data entrepreneur. It allows data professionals amazing flexible data processing power in the cloud. The key to a successful Snowflake deployment is setting up security and account optimizations correctly from the beginning. In this article, we will discuss the 'CREATE WAREHOUSE' default settings.

 

Snowflake Cost and Workload Optimization is the Key

 

After analyzing hundreds of Snowflake customer accounts, we found key processes to optimize Snowflake for computing and storage costs. The best way to successfully deploy Snowflake is to ensure you set it up for cost and workload optimization.

The Snowflake default "create warehouse" settings are not optimized to limit costs. That is why we built our Snoptimizer service (Snowflake Cost Optimization Service) to automatically and easily optimize your Snowflake Account(s). There is no other way to continuously optimize queries and costs so your Snowflake Cloud Data solution runs as efficiently as possible.

Let's quickly review how Snowflake Accounts' default settings are currently set.

Here is the default screen that comes up when I click +Warehouse in the Classic Console.

 

https://snowflakesolutions.net/wp-content/uploads/Snowflake-Create-Warehouse-Default-Options-Classic-Console-1024x640.jpg

Create Warehouse-Default Options for the Classic Console

Okay, for those already in Snowsight (aka the Preview App), here is the default screen - it is nearly identical.

 

https://snowflakesolutions.net/wp-content/uploads/Snowflake-Create-Warehouse-Default-Options-Snowsight-1-1024x640.jpg

Create Warehouse Default Options for Snowsight

So let's dig into the default settings these Web UIs apply if you just choose a name and click "Create Warehouse", and evaluate what happens with our Snowflake compute if you leave them in place.

These default settings will establish the initial configuration for our Snowflake Compute. By understanding the defaults, we can determine if any changes are needed to optimize performance, security, cost, or other factors that are important for our specific use case. The defaults are designed to work out of the box for most general-purpose workloads but rarely meet every need.

 

Create Warehouse - Default Setting #1

 

Size (the warehouse compute size): X-Large. I assume you understand how Snowflake compute works and know the Snowflake warehouse T-shirt sizes. Notice that the default is an X-Large warehouse rather than one of the smaller T-shirt sizes (XS, S, M, L). This default is the same for both the Classic Console and Snowsight (the Preview App).

 

Create Warehouse - Default Setting #2

 

Maximum Clusters: 2

While multi-cluster scaling makes sense if you actually need it, enabling it by default has significant cost implications. It assumes the data cloud customer wants to launch a second cluster on this warehouse, and pay for it, whenever statements queue beyond a certain level. Sticking with the X-Large default, duplicating a cluster has serious cost consequences of $X/hr.

This setting only applies to the Classic Console. It also only takes effect if you have Enterprise Edition or higher, since Standard Edition does not offer multi-cluster warehouses.

 

Create Warehouse - Default Setting #3

 

Minimum Clusters: 1

This is only the default setting for the Classic Console.

 

Create Warehouse - Default Setting #4

 

Scaling Policy: Standard. This setting is hard to rate, but if you are a cost-conscious customer you would want to change it to "Economy" rather than leaving it on "Standard". With "Standard", the second cluster (enabled by default) kicks in as soon as queuing happens on your Snowflake warehouse; with "Economy", Snowflake does not launch a second cluster until it estimates there is at least 6 minutes of work for that cluster to perform.

This is only the default setting for the Classic Console, but when you toggle on the "Multi-cluster Warehouse" setting in Snowsight, it also defaults to "Standard" rather than "Economy".

 

Create Warehouse - Default Setting #5

 

Auto Suspend: 10 minutes. For many warehouses, especially ELT/ETL warehouses, this default is typically too high. Loading warehouses that run at regular intervals rarely benefit from keeping the warehouse (and its cache) alive that long; a loading warehouse that runs on a schedule never needs extensive caching. Our Snoptimizer service finds inefficient and potentially costly settings like this.

For a loading warehouse, Snoptimizer immediately saves 599 seconds of computing time for every interval. As discussed in the Snowflake Warehouse Best Practice Auto Suspend article, this can significantly reduce costs, especially for larger load warehouses.

We talk more about optimizing warehouse settings in this article but reducing this setting can substantially lower expenses with no impact on performance.

NOTE: This defaults to the same setting for both the Classic Console and Snowsight (the Preview App).
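
For example, dropping a scheduled loading warehouse from the 10-minute default down to 60 seconds is a one-line change (the warehouse name and value here are illustrative):

/* Reduce auto-suspend for a scheduled loading warehouse */
ALTER WAREHOUSE "LOAD_WH" SET AUTO_SUSPEND = 60;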

 

Snowflake Create Warehouse - Default Setting #6

 

Auto Resume Checkbox: Checked by default. This setting is fine as is. I do not recall the last time I created a warehouse without "Auto Resume" checked. Snowflake's ability to resume a warehouse within milliseconds to seconds of a query being executed brings compute to users automatically, exactly when they need it. This is revolutionary and useful!

NOTE: This defaults to the same for both the Classic Console and Snowsight (the Preview App).

 

Snowflake Create Warehouse - Default Setting #7

 

Click "Create Warehouse": The Snowflake Warehouse is immediately started. This setting I do not prefer. I do not think it should immediately start to consume credits and go into the Running state. It is too easy for a new SYSADMIN to start a warehouse they do not need. The default setting before this is already set to "Resume". The Snowflake Warehouse will already resume when a job is sent to it so there is no need to automatically start.

NOTE: This defaults to the same execution for both the Classic Console and Snowsight (the Preview App).

 

One last thing...

 

As an extra bonus, here is the equivalent SQL for those of you who just do not do "GUI".

Let's go to the Snowflake CREATE WAREHOUSE code to see what is happening...

DEFAULT SETTINGS:

CREATE WAREHOUSE XLARGE_BY_DEFAULT WITH
  WAREHOUSE_SIZE = 'XLARGE'
  WAREHOUSE_TYPE = 'STANDARD'
  AUTO_SUSPEND = 600
  AUTO_RESUME = TRUE
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 2
  SCALING_POLICY = 'STANDARD'
  COMMENT = 'This sucker will consume a lot of credits fast';
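
For contrast, a more cost-conscious starting point might look like the sketch below (the name and values are illustrative; size up only when a workload proves it needs it):

COST-CONSCIOUS SETTINGS (EXAMPLE):

CREATE WAREHOUSE XSMALL_BY_CHOICE WITH
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE
  COMMENT = 'Start small, suspend fast, stay suspended until a query arrives';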

 

Conclusion:

 

Snowflake default warehouse settings are not optimized for cost and workload. The default settings establish an X-Large warehouse, allow up to 2 clusters which increases costs, use a "Standard" scaling policy and 10-minute auto-suspend, and immediately start the warehouse upon creation. These defaults work for general use but rarely meet specific needs. Optimizing settings can significantly reduce costs with no impact on performance.

Too Busy for Snowflake’s Summit? We Feel You.

Introduction:

 

This year's Snowflake Summit was a roller coaster of emotions, but more often than not, we were thrilled with all the new announcements. With 61+ sessions, we got to see some of Snowflake's amazing new features, tons of use cases, and first-hand looks at how to use their new tools with step-by-step labs. Most of us are too busy to watch two days' worth of webinars, but that's where we come in – providing you with your weekly dose of Snowflake Solutions! We decided to help out by highlighting the most important announcements, as well as the sessions we thought were worth the watch!

This time around Snowflake announced that they have five main areas of innovation: data programmability, global data governance, platform optimization, connected industries, and powered by Snowflake. While magical upgrades and new tools mean more flexibility for users, the reality is that most of these new features are still in private preview, so we, the public, won’t see them in action for some time. Regardless, we’ll still go through the top areas of innovation:

 

Platform optimization:

One of the most significant improvements this year is the enhanced storage economics, resulting in reduced storage costs due to improved data compression. As a result, many will begin to see savings on storage for new data. Additionally, Snowflake has developed new usage dashboards, enabling users to better monitor and comprehend their usage and costs across the platform. While it appears that Snowflake is making progress in the direction of cost optimization, the subject has been challenging so far, and there are not enough safeguards in place to prevent warehouse sizes (and bills) from skyrocketing. If you're interested in discovering the various ways your company can inadvertently lose money on Snowflake, as well as strategies for avoiding them, we invite you to register for our upcoming Cost Optimization webinar.

 

Global Data Governance:

Moving forward, we will discuss the six new data governance capabilities that have been added to the Snowflake platform. We will focus on the three that are most exciting.

 

1. Classification:

Automatically detects personally identifiable information.

  • Why is this cool? We can apply specific security controls to protect that data!

 

2. Row access policies:

Dynamically restrict the rows of data in the query based on the username, role, or other custom attributes.

  • Why is this cool? We no longer need multiple secure views and can eliminate the need for maintaining data silos. That’s a win in our book.

 

3. Access History:

A new view that shows used and unused tables to produce reports.

  • Why is this cool? You can see what’s bringing value and optimize storage costs based on what is frequently accessed or completely abandoned data. Who doesn’t love to save money?

 

Connected Industries:

Next, we have two upcoming features that we thought were worth mentioning since they will be game-changers: Discover & Transact and Try Before You Buy, both of which will ease collaboration and data procurement between connected industries.

 

1. Discover and Transact:

Directly within the Snowflake Data Marketplace, a consumer can now discover data and purchase with a usage-based pricing model.

  • This is truly cool because of the self-service aspect! By providing this feature, we can significantly reduce the cost of selling and delivering data to our valuable clients.

 

2. Try Before You Buy:

Now consumers can access sample data to make sure they’re getting all they need before signing that check.

  • Why is this interesting? Everyone loves a free sample!

 

Data Programmability:

 

Probably the most important updates are under the data programmability umbrella. So, if you’re still with me, hang on a little longer, this is about to get interesting!

Some innovations are ready to be used now in public preview, so let’s check them out:

  1. SQL API: This new API enables customers to automate administrative tasks without having to manage infrastructure, there’s no need to maintain an external API management hub!
  2. Schema Detection: Now supports Parquet, ORC, Avro, and hopefully more file formats in the future.

 

Exciting things to look forward to soon:

  1. Serverless Tasks: Snowflake will determine and schedule the right amount of compute resources needed for your tasks.
  2. Snowpark and Java UDFs: Snowpark is going to be the Snowflake developer’s new playground. It allows developers to bring their preferred languages directly into the platform. Java UDFs will also enable data engineers and developers to bring their custom code to Snowflake. This enables better performance on both sides!
  3. Unstructured Data Support: Soon, we will be able to treat unstructured data the same as structured data, with the ability to store, govern, process, and share.
  4. Machine Learning with Amazon SageMaker: A tool that will automatically build and insert the best machine-learning models into Snowflake!

 

Conclusion:

 

In summary, Snowflake's 2022 Summit exhibited several noteworthy novel features and updates, particularly in the domains of platform optimization, global data governance, and data programmability. Although a significant number of these features are still in private preview, they provide a glimpse into Snowflake's future direction and potential.

Keep an eye out for more updates and guidance from IT Strategists on how to leverage Snowflake's tools and solutions to their fullest potential. Be sure to check out our blog for more news and information.

Snowflake Data Masking

Introduction:

 

Today’s article discusses Snowflake Data Cloud's implementation of dynamic data masking, which is a column-level security feature used to mask data at query runtime. We provide a step-by-step guide on how to create and apply a data masking policy for email addresses in a stored procedure. The article also highlights the benefits of using dynamic data masking policies to secure and obfuscate PII data for different roles without access while displaying the data to roles that need access to it.

Last week, the United States Centers for Disease Control and Prevention (CDC) issued new guidance regarding COVID-19 masks. Here, though, we will focus on a different kind of masking: how to implement the Snowflake Data Cloud's "Data Masking". Let's get started!

 

What is Data Masking?

 

Data Masking is just what it sounds like: the hiding or masking of data. It is a practical method for adding column-level security, and while the concept is simple, it has caught on in our new age of GDPR and PII regulations. What is Snowflake's version of Data Masking? Snowflake's implementation is… Dynamic Data Masking.

Dynamic Data Masking is column-level security that uses masking policies to mask data at query runtime. Snowflake's version has several characteristics: masking policies are schema-level objects, masking currently applies to columns of tables or views, policies are applied at query runtime, and they are applied in every location where the column is displayed. Depending on your role, your role hierarchy, your masking policy conditions, and the SQL execution context, you will see fully masked data, partially masked data, or just plain text!

Now that you know what Snowflake Data Cloud Dynamic Data Masking is… how do you use it? Data Masking within Snowflake is managed with Data Definition Language (DDL). The masking policy object uses the typical object commands: CREATE, ALTER, DROP, SHOW, and DESCRIBE. This consistency is common across most Snowflake objects and is one of the reasons I prefer Snowflake: most of the time, it's reliable, easy to use, and consistent. The basic lifecycle of a masking policy looks like the sketch below.
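
A minimal sketch of that lifecycle, assuming the policy and table names used later in this article:

/* List and inspect masking policies in the current schema */
SHOW MASKING POLICIES;
DESCRIBE MASKING POLICY MASK_FOR_EMAIL;

/* Detach a policy from a column, then drop the policy when it is no longer needed */
ALTER TABLE EMPLOYEE MODIFY COLUMN EMAIL UNSET MASKING POLICY;
DROP MASKING POLICY MASK_FOR_EMAIL;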

So, let’s have some fun and create a data masking policy for email addresses in a simple example. There are 3 main parts for creating and applying a dynamic data mask on Snowflake to a column. Here we go:

 

PART 1 – Enable and Grant Masking Policy

 

To enable masking policy on Snowflake, follow these steps:

  1. Grant create masking policy on schema to a role. For example: GRANT CREATE MASKING POLICY ON SCHEMA DEMO_MASKING_DB.DEMO TO ROLE "DATA_MASKING_ADMIN_ROLE";
  2. Use the account admin role to grant apply masking policy on account to the role. For example: GRANT APPLY MASKING POLICY ON ACCOUNT TO ROLE "DATA_MASKING_ADMIN_ROLE";

Replace "DEMO_MASKING_DB.DEMO" with the actual schema name and "DATA_MASKING_ADMIN_ROLE" with the actual role name.

Remember to grant the necessary privileges to the roles that will use the masking policy.

 

PART 2 – Create a Masking Policy

To create a masking policy in Snowflake, follow these steps:

  1. Use a role that has the necessary privileges to create a masking policy.
  2. Use the schema where the table or view that needs the masking policy is located.
  3. Use the CREATE MASKING POLICY statement to create the policy. For example:
CREATE OR REPLACE MASKING POLICY MASK_FOR_EMAIL AS (VAL STRING) RETURNS STRING ->
CASE
WHEN CURRENT_ROLE() IN ('HR_ROLE') THEN VAL
ELSE '*********'
END;

Replace MASK_FOR_EMAIL with the name of your masking policy. In this example, the policy masks the email column with asterisks for all roles except for the HR_ROLE.

Remember to grant the necessary privileges to the roles that will use the masking policy.

 

PART 3 – Apply the Masking Policy to a Column in a View or Table

 

To apply the masking policy to a column in a view or table in Snowflake:

  1. Use a role that has the necessary privileges to modify the table or view.
  2. Use the schema where the table or view that needs the masking policy is located.
  3. Use the ALTER TABLE or ALTER VIEW statement to modify the column and apply the masking policy. For example:
ALTER TABLE IF EXISTS EMPLOYEE MODIFY COLUMN EMAIL SET MASKING POLICY MASK_FOR_EMAIL;

Replace EMPLOYEE with the name of your table and EMAIL with the name of the column that needs the masking policy. Replace MASK_FOR_EMAIL with the name of your masking policy.

Remember to grant the necessary privileges to the roles that will use the masking policy.

(Just creating a masking policy is not enough. It's kind of like wearing a COVID mask below your mouth and nose: you technically have a mask, but it isn't actually applied, so it isn't working.)

 

 

 

We will show you how to do all of this in detail below.

 

Dynamic Data Masking Example

Let’s say we want to create a data mask for the email addresses in our employee table.

If you have not been using our Snowflake Solutions Demo Database Training Example then let’s create a database, schema, and table to use.


/* SETUP DEMO DATABASE AND TABLE FOR DATA MASKING DEMO and PROOF OF CONCEPT */
USE ROLE SYSADMIN;  /*use this role or equivalent */
CREATE OR REPLACE DATABASE DEMO_MASKING_DB;
CREATE SCHEMA DEMO;
CREATE OR REPLACE TABLE EMPLOYEE(ID INT, FULLNAME VARCHAR,HOME_ADDRESS VARCHAR,EMAIL VARCHAR);
INSERT INTO EMPLOYEE VALUES(1,'Frank Bell','1000 Snowflake Lane North Pole, Alaska', 'fbell@snowflake.com');
INSERT INTO EMPLOYEE VALUES(2,'Frank S','1000 Snowflake Lane North Pole, Alaska', 'franks@snowflake.com');
INSERT INTO EMPLOYEE VALUES(3,'Craig Stevens','1000 Snowflake Lane North Pole, Alaska', 'craig@snowflake.com');
CREATE WAREHOUSE IF NOT EXISTS MASK_WH WITH WAREHOUSE_SIZE = XSMALL, INITIALLY_SUSPENDED = TRUE, auto_suspend = 60;


/* PART 0 – create and grant roles for DATA MASKING DEMO – REPLACE FREDDY WITH YOUR USERNAME – there is more to do when you use custom roles with no privileges */
USE ROLE SECURITYADMIN;
CREATE ROLE IF NOT EXISTS EMPLOYEE_ROLE;
CREATE ROLE IF NOT EXISTS MANAGER_ROLE;
CREATE ROLE IF NOT EXISTS HR_ROLE;
CREATE ROLE IF NOT EXISTS DATA_MASKING_ADMIN_ROLE;
GRANT USAGE ON DATABASE DEMO_MASKING_DB TO ROLE EMPLOYEE_ROLE;
GRANT USAGE ON SCHEMA DEMO_MASKING_DB.DEMO TO ROLE EMPLOYEE_ROLE;
GRANT SELECT ON TABLE DEMO_MASKING_DB.DEMO.EMPLOYEE TO ROLE EMPLOYEE_ROLE;
GRANT USAGE ON DATABASE DEMO_MASKING_DB TO ROLE HR_ROLE;
GRANT USAGE ON SCHEMA DEMO_MASKING_DB.DEMO TO ROLE HR_ROLE;
GRANT SELECT ON TABLE DEMO_MASKING_DB.DEMO.EMPLOYEE TO ROLE HR_ROLE;
GRANT USAGE, MODIFY ON DATABASE DEMO_MASKING_DB TO ROLE "DATA_MASKING_ADMIN_ROLE";
GRANT USAGE, MODIFY ON SCHEMA DEMO_MASKING_DB.DEMO TO ROLE "DATA_MASKING_ADMIN_ROLE";
GRANT USAGE ON WAREHOUSE MASK_WH TO ROLE EMPLOYEE_ROLE;
GRANT USAGE ON WAREHOUSE MASK_WH TO ROLE HR_ROLE;
GRANT ROLE EMPLOYEE_ROLE TO USER FREDDY;
GRANT ROLE MANAGER_ROLE TO USER FREDDY;
GRANT ROLE HR_ROLE TO USER FREDDY;
GRANT ROLE DATA_MASKING_ADMIN_ROLE TO USER FREDDY;



/* PART 1 – enable masking policy ON ACCOUNT AND GRANT ACCESS TO ROLE */
GRANT CREATE MASKING POLICY ON SCHEMA DEMO_MASKING_DB.DEMO TO ROLE "DATA_MASKING_ADMIN_ROLE";
USE ROLE ACCOUNTADMIN;
GRANT APPLY MASKING POLICY ON ACCOUNT TO ROLE "DATA_MASKING_ADMIN_ROLE";



/* PART 2 – CREATE MASKING POLICY */
USE ROLE DATA_MASKING_ADMIN_ROLE;
USE SCHEMA DEMO_MASKING_DB.DEMO;
CREATE OR REPLACE MASKING POLICY MASK_FOR_EMAIL AS (VAL STRING) RETURNS STRING ->
CASE
  WHEN CURRENT_ROLE() IN ('HR_ROLE') THEN VAL
  ELSE '********'
END;


/* PART 3 - APPLY MASKING POLICY TO EMAIL COLUMN IN EMPLOYEE TABLE */
ALTER TABLE IF EXISTS EMPLOYEE MODIFY COLUMN EMAIL SET MASKING POLICY MASK_FOR_EMAIL;



AWESOME - NOW YOU HAVE CREATED AND APPLIED YOUR DATA MASK! Let's test it out.



/* TEST YOUR DATA MASK !!! --> TEST by QUERYING TABLE WITH DIFFERENT ROLES AND SEE RESULTS */
/* Notice the EMAIL is MASKED with ******* */
USE ROLE EMPLOYEE_ROLE;
SELECT * FROM DEMO_MASKING_DB.DEMO.EMPLOYEE;
/* Notice the EMAIL is NOT MASKED */
USE ROLE HR_ROLE;
SELECT * FROM DEMO_MASKING_DB.DEMO.EMPLOYEE;

ADDITIONAL DETAILS:

  • Masking policies are really custom data definition language (DDL) objects in Snowflake. You can always get their DDL by using the standard GET_DDL function or by using DESCRIBE. Examples for reviewing the masking policy are below; when using SECURITYADMIN or other roles without USAGE, you must use the full DATABASE.SCHEMA.POLICY path.

USE ROLE SECURITYADMIN;
DESCRIBE MASKING POLICY DEMO_MASKING_DB.DEMO.MASK_FOR_EMAIL;

USE ROLE ACCOUNTADMIN; /* when using SELECT that means the ROLE MUST HAVE USAGE enabled which the SECURITYADMIN role does not have by default */

SELECT GET_DDL('POLICY', 'DEMO_MASKING_DB.DEMO.MASK_FOR_EMAIL');

 

Conclusion:

 

Dynamic Data Masking Policies are a great way to secure and obfuscate your PII data to different roles without access where necessary while at the same time displaying the PII data to the roles that need access to it. We hope this tutorial has helped you understand Dynamic Data Masking on Snowflake. For further information on Snowflake, check out our blog for more tips and tricks.

New Data Shares Added in January 2021 on the Snowflake Data Marketplace


Data Shares Removed from the Snowflake Data Marketplace

If we have made any errors, then let us know.

New Data Shares Available in November 2020 on the Snowflake Data Marketplace


New Data Shares Available in January 2021 on the Snowflake Data Marketplace


New Data Shares Available in December 2020 on the Snowflake Data Marketplace


How Snowflake Pricing Works

Introduction:

 

Snowflake pricing is determined by how much you use compute resources such as warehouses (virtual compute instances) and storage, as well as other costs like cloud services and data transfer. Most of your Snowflake costs will be for compute resources, which typically account for 90% or more of your monthly costs. With Snowflake, you don't have to make any upfront commitments or sign any long-term contracts.

You can start with a free trial account, and you won't be charged if you don't use any billable services. You only pay for what you use. Snowflake pricing may vary depending on the platform and region you're using.

 

Which tools can help optimize your Snowflake costs?

 

There are a multitude of tools that you can leverage to confidently optimize and minimize your Snowflake costs. One such tool is Snoptimizer, which can be a game-changer for your organization.

Snoptimizer is the first automated Snowflake Cost Optimization Service that ensures significant cost savings (up to 50% on Snowflake compute) without sacrificing performance.

We built Snoptimizer because we saw a significant need in the marketplace. We were often called in by Snowflake customers for Snowflake Health checks, and 98% of the time, their accounts were not fully optimized.

Snoptimizer runs regularly and scours your Snowflake Operations Account Meta Data (over 40 views) continuously looking for Snowflake storage and computing anti-patterns and inefficiencies related to cost.

 

Usage-Based Pricing:

 

Usage Based Pricing in cloud services, especially in Snowflake, can be incredibly awesome sometimes. The fact that we can even start an account off with 400 credits for 30 days for a Proof of Concept (POC) is just amazing to me. Before this, our consulting company hesitated to introduce these more expensive solutions to our consulting clients which were small or medium size businesses because these solutions were out of their pricing comfort zone (especially when working with analytical databases that could scale like Exadata, Teradata, and Netezza).

 

What is the pricing on Snowflake?

 

For those of you who are new to Snowflake, let's start with Snowflake consumption pricing basics. Snowflake overall is usage- or consumption-based pricing. This means you only pay for what you use. Technically, you could set up a free Snowflake Trial Account and never pay anything because you never used any of the services that have a cost.

For most Snowflake accounts, Snowflake Compute, i.e. the Snowflake warehouses (virtual compute engines), is where 90% or more of your costs are. The other cost areas (Storage, Cloud Computing, Cloud Services, and Data Transfer) are typically 10% or less of the monthly Snowflake SaaS costs. Often they can even be 1% or less, unless you have certain use cases or end up mistakenly using Snowflake cost anti-patterns.
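
If you want to verify that split on your own account, a simple sketch against the ACCOUNT_USAGE metering view breaks credits down by service type:

/* Credits by service type over the last 30 days */
SELECT SERVICE_TYPE,
       ROUND(SUM(CREDITS_USED), 2) AS CREDITS
FROM SNOWFLAKE.ACCOUNT_USAGE.METERING_DAILY_HISTORY
WHERE USAGE_DATE >= DATEADD('day', -30, CURRENT_DATE())
GROUP BY SERVICE_TYPE
ORDER BY CREDITS DESC;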

Please keep in mind that as soon as your Snowflake account is provisioned, you as the administrator (or whoever has their credit card associated with the account) carry significant cost risk by default. Our best practice is to enable Snowflake Cost Optimization with Snoptimizer immediately after provisioning a Snowflake account. If you decide against that, then at the very least you should limit access and set up standard Snowflake cost minimization guardrails and follow Snowflake cost optimization and cost minimization best practices.

For those of you who are more Snowflake savvy and already know the basics then let's cover more advanced Snowflake pricing details.

Snowflake Compute Pricing - Advanced

 

One of the first things that Snoptimizer does is automate daily Resource Monitors at a warehouse level based on all the Snowflake Metadata Database history and warehouse and Resource Monitor settings. This gets set almost immediately after you purchase Snoptimizer. This has both huge cost risk reduction limits and guardrails for all of your warehouse compute.

One cool thing you can do is reduce your default query timeout from the 2-day default to 4 hours or less with the following code (substitute your warehouse name):

ALTER WAREHOUSE "TEST_WH" SET STATEMENT_TIMEOUT_IN_SECONDS = 14400;
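
The same parameter can also be set at the account level so that every warehouse and session inherits it unless overridden; a hedged sketch:

/* Cap all statements account-wide at 4 hours (requires an appropriately privileged role) */
ALTER ACCOUNT SET STATEMENT_TIMEOUT_IN_SECONDS = 14400;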


 

How to Optimize Your Costs?

 

Over the last 3 years, my teams and I have analyzed over 100 Snowflake accounts, and about 95% of them were not fully optimized for both Cloud data costs and Cloud cost risk minimization. This is why my team and I are so excited to have created Snoptimizer (the first automated Snowflake Cost Optimization Service) - Easily optimize your Snowflake Data Cloud Account here in a few hours.

I think the reason why so many of those accounts didn't have resource monitors or regular optimizations in place is that Snowflake is initially incredibly cost-effective and typically delivers massive savings, especially for the on-prem migrations we have done. However, companies that do not optimize their Data Cloud costs are missing out big time!

 

Try Snoptimizer today:

 

Snoptimizer quickly and automatically optimizes your Snowflake account for security, cost, and performance. It eliminates headaches and concerns about security risks and cost overruns across your Snowflake account.

 

 

Try Snoptimizer today. Sign up and schedule a personal demo with us!

Optimization in a few hours, hassle-free!

 

Conclusion:

 

I hope the Snowflake basic and advanced pricing information above is useful to you on your Snowflake journey. For me, finding out that Snowflake consumption-based pricing was so reasonable was game-changing for both myself and my consulting company. Before Snowflake, we couldn't provide compute at the scale and speed required by many of the largest analytical challenges and solutions our clients needed.

I remember building predictive marketing tools: we often had to crunch large data sets, ran into scaling challenges, and had to spend tons of time and engineering effort just to engineer for scale. Keep in mind that if you don't use Snowflake's services smartly, you can end up spending a lot of money. Therefore, we recommend using Snoptimizer to help you reduce your costs.

 

If you're looking to optimize your Snowflake account costs, try Snoptimizer today!

Sign up and schedule a personal demo with us!

 

SnowCompare

SnowCompare is the easiest and fastest way to compare & deploy Snowflake data from one database or schema to another.

For example, when you clone a database several times and you want to understand the differences between the clone and the original database, you can use SnowCompare to easily view the difference.

Even though Snowflake allows you to write SQL to compare data, doing so is often cumbersome for a regular user or analyst. Therefore, we highly recommend using SnowCompare since it is easier to use.
Get on the waiting list for this free tool!  We plan to release it in October.

Find out more about all the benefits SnowCompare has to offer you and your business. Sign up for a free proof of concept!

Also, if you want to learn about more news or features, be sure to check out our blog on a regular basis.