Understanding end-to-end encryption in Snowflake

End-to-End Encryption in Snowflake: Your Data, Always Secure
What is it?

End-to-end encryption (E2EE) in Snowflake safeguards your data at every step, from your device to Snowflake and back, so that no intermediary can intercept or read it in plaintext, minimizing security risks.

How it works:

Data Upload:
Upload data files to either Snowflake's internal stage (automatically encrypted) or your own external stage on a cloud storage service (optional client-side encryption recommended).

Data Processing:
Data loaded from a stage into a table is encrypted by Snowflake, regardless of whether the stage is internal or external.
All data at rest remains encrypted in Snowflake's secure cloud storage.
During transformations and other operations, data is decrypted only transiently for processing and re-encrypted before results are written back to storage.

Data Output:
Unload query results to either an internal or external stage, with optional client-side encryption for external stages.
Downloaded data files remain encrypted until decrypted on your device.

Client-Side Encryption:

This optional layer adds another security blanket for data in external stages:

You create a secret master key shared with Snowflake.
Your cloud storage service client encrypts data with a random key, then encrypts that key with your master key.
The encrypted data file and the encrypted random key are both uploaded to the cloud storage service.
Downloading involves decrypting the random key with your master key, then using it to decrypt the data file – all on your device.

Ingesting Client-Side Encrypted Data:

Create a named stage object in Snowflake with the CREATE STAGE command, specifying the cloud storage location, credentials, and your Base64-encoded master key as the MASTER_KEY value in the stage's encryption settings, as sketched below.
Load data from the stage into your Snowflake tables like usual.
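A minimal sketch of both steps, assuming a hypothetical S3 bucket, placeholder credentials, and a target table named customers (the MASTER_KEY value must be the same Base64-encoded AES key used by your storage client):

SQL
-- Named stage pointing at client-side-encrypted files (hypothetical names/values)
CREATE STAGE encrypted_customers_stage
  URL = 's3://mybucket/encrypted_files/'
  CREDENTIALS = (AWS_KEY_ID = '...' AWS_SECRET_KEY = '...')
  ENCRYPTION = (TYPE = 'AWS_CSE' MASTER_KEY = '...base64-encoded-key...');

-- Load the data into a table as usual
COPY INTO customers FROM @encrypted_customers_stage;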

Benefits:

Stronger data security throughout its journey.
Reduced attack surface by minimizing exposure to unencrypted data.
Flexibility to use any client or tool supporting client-side encryption.
Named stage objects simplify data access control without revealing encryption keys.

Remember:

- Snowflake always encrypts data at rest and in transit; optional client-side encryption adds a further layer.
- You control your secret master key for client-side encryption.
- Snowflake adheres to the specific client-side encryption protocol of your chosen cloud storage service.
- With E2EE in Snowflake, your data is protected throughout its journey, so you can focus on insights rather than security worries.

Snowflake Access Control: Best Practices

Snowflake Access Control Best Practices: Securing Your Account and Data

This guide outlines key best practices and considerations for managing access control in your Snowflake account, ensuring secure access to data. It focuses on role-based access control (RBAC) for granular object access based on user roles.

Understanding the ACCOUNTADMIN Role:

Most powerful role: Grants access to account-level configuration and billing.
Not a super-user: Requires object-specific privileges for viewing/managing objects.
Hierarchy and precautions:
Other administrator roles (USERADMIN, SECURITYADMIN, SYSADMIN) are children.
Assign cautiously – limited users, multi-factor authentication (MFA), redundancy.
Avoid using for object creation – delegate to business-aligned roles.

Avoiding ACCOUNTADMIN Misuse:

Focus on account management: Use for initial setup and ongoing oversight.
Create object access roles: Align with business functions and grant to SYSADMIN.
Use alternate roles for automated scripts: Leverage SYSADMIN hierarchy for object operations.

Accessing Database Objects:

Securable objects: Tables, functions, stages, etc., within schemas and databases.
Required privileges:

USAGE on container database and schema.
Specific object privileges (e.g., SELECT for tables).
Example: Accessing mytable in mydb.myschema requires USAGE on both the database and the schema, plus SELECT on mytable; the grants below illustrate this.
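A brief sketch of the grants that would satisfy this example, assuming a hypothetical role named analyst:

SQL
GRANT USAGE ON DATABASE mydb TO ROLE analyst;
GRANT USAGE ON SCHEMA mydb.myschema TO ROLE analyst;
GRANT SELECT ON TABLE mydb.myschema.mytable TO ROLE analyst;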

Managing Custom Roles:

Initial state: Isolated; requires assignment to users and to managing roles.
ACCOUNTADMIN limitations: Cannot modify/drop objects created by custom roles that sit outside its hierarchy.
Grant to SYSADMIN or its hierarchy: Enables full control and inheritance by lower roles.

Aligning Object Access with Business Functions:

Role hierarchies: Grant roles to other roles for inheritance and flexibility.
Object access roles vs. functional roles:
- Object access roles: Grant permissions on specific objects.
- Functional roles: Group object access roles for related business functions.
Assign lower-level functions to higher-level functions as needed.
Grant top-level functional roles to SYSADMIN for comprehensive control.

Example: Managing access to fin and hr databases with different data sensitivities:

Access roles:
db_hr_r: Read-only access to hr tables.
db_fin_r: Read-only access to fin tables.
db_fin_rw: Read-write access to fin tables.
Functional roles:
accountant: Requires db_fin_rw and may need db_hr_r.
analyst: Requires both db_hr_r and db_fin_r.

Remember:

There is no technical difference between object access roles and functional roles; the distinction lies in how they are logically used and organized.
Customize these best practices to fit your specific security needs and organizational structure.

By implementing these best practices, you can effectively manage access control in your Snowflake account, ensuring secure access to sensitive data and maintaining a robust security posture.

1. Create Roles:

Use a user administrator (USERADMIN role) or role with CREATE ROLE privilege to establish access and functional roles:
db_hr_r: Read-only access to hr database tables.
db_fin_r: Read-only access to fin database tables.
db_fin_rw: Read-write access to fin database tables.
accountant: For financial tasks, requiring db_fin_rw and potentially db_hr_r.
analyst: For data analysis, requiring both db_hr_r and db_fin_r.

2. Grant Object Privileges:

Use a security administrator (SECURITYADMIN role) or role with MANAGE GRANTS privilege to grant minimum necessary permissions to each access role (e.g., USAGE on database and schemas, SELECT on tables).

3. Construct Role Hierarchy:

Grant access roles to functional roles (e.g., db_fin_rw to accountant).
Grant functional roles to the system administrator (SYSADMIN) role for comprehensive control.

4. Assign Roles to Users:

Grant functional roles to users based on their job duties (e.g., accountant to user1, analyst to user2).
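Putting steps 1-4 together, here is a minimal sketch for the fin/hr scenario (database, schema, and user names are illustrative; adjust the exact privilege set to your environment):

SQL
-- 1. Create roles (USERADMIN or a role with CREATE ROLE)
USE ROLE USERADMIN;
CREATE ROLE db_hr_r;
CREATE ROLE db_fin_r;
CREATE ROLE db_fin_rw;
CREATE ROLE accountant;
CREATE ROLE analyst;

-- 2. Grant object privileges (SECURITYADMIN or a role with MANAGE GRANTS)
USE ROLE SECURITYADMIN;
GRANT USAGE ON DATABASE fin TO ROLE db_fin_r;
GRANT USAGE ON ALL SCHEMAS IN DATABASE fin TO ROLE db_fin_r;
GRANT SELECT ON ALL TABLES IN DATABASE fin TO ROLE db_fin_r;
-- repeat the pattern for db_hr_r on hr, and add
-- INSERT/UPDATE/DELETE grants for db_fin_rw on fin

-- 3. Construct the role hierarchy
GRANT ROLE db_fin_rw TO ROLE accountant;
GRANT ROLE db_hr_r   TO ROLE accountant;
GRANT ROLE db_hr_r   TO ROLE analyst;
GRANT ROLE db_fin_r  TO ROLE analyst;
GRANT ROLE accountant TO ROLE SYSADMIN;
GRANT ROLE analyst    TO ROLE SYSADMIN;

-- 4. Assign functional roles to users
GRANT ROLE accountant TO USER user1;
GRANT ROLE analyst    TO USER user2;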

Additional Considerations:

Database Roles: Enable database owners to manage access within their databases and align with data sharing needs; note that database roles cannot be directly activated in sessions.
Managed Access Schemas: Restrict grant decisions to schema owners or roles with MANAGE GRANTS privilege, enhancing security.
Future Grants: Simplify management by automatically granting privileges on newly created objects of specified types in a schema (see the example after this list).
Query Results: Accessible only to the executing user for security reasons.
Cloned Objects: Retain privileges granted on contained objects, but not those granted on the source object itself.
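As an illustration of future grants, assuming the db_fin_r role and a hypothetical fin.public schema from the earlier example:

SQL
-- Automatically grant SELECT on any table created later in the schema
GRANT SELECT ON FUTURE TABLES IN SCHEMA fin.public TO ROLE db_fin_r;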

Remember:

- Regularly review and update access controls as needed.
- Align role assignments with current business needs and security requirements.

Overview of Access Control

Demystifying Snowflake Access Control: Key Concepts and Framework

This topic dives into the core principles of access control in Snowflake, where granularity and flexibility reign supreme.

Snowflake's hybrid approach:

Borrows from Discretionary Access Control (DAC): Object owners grant access rights directly.
Leverages Role-based Access Control (RBAC): Access privileges are assigned to roles, then granted to users.

Essential building blocks:

Securable Object: Any entity requiring access control (default: access denied).
Role: A container for privileges, assigned to users and other roles (forming a hierarchy).
Privilege: A specific level of access granted for an object (granular control).
User: An identity recognized by Snowflake, representing individuals or programs.

The Snowflake Access Control Model:

- Access to securable objects is granted through privileges assigned to roles, which are then assigned to users or other roles (nesting creates a hierarchy).
- Each securable object has an owner who can directly grant access to roles (distinct from granting roles to users).

Key Difference:

- Snowflake prioritizes roles, unlike user-based models where users or groups directly hold rights and privileges. This fosters centralized control while offering users flexibility through assigned roles.

Outcome:

Snowflake's hybrid access control framework delivers both strict control and adaptable access permissions, empowering users while maintaining security.

Further Exploration:

For a deeper understanding, explore the Role hierarchy and privilege inheritance section within this topic to delve into nested roles and inherited privileges.

Understanding the Hierarchy of Securable Objects in Snowflake

In Snowflake, securable objects reside within a structured hierarchy of containers, ensuring organized access control. Here's how it unfolds:

Top-Level Container: The customer organization sits at the apex, encompassing all objects within your Snowflake account.
Databases: Each database acts as a major container, housing multiple schemas and their associated objects.
Schemas: Schemas reside within databases, providing a logical grouping for related objects.
Securable Objects: These include tables, views, functions, stages, and more, residing within schemas.

Visualizing the Structure:

Customer Organization
├── Database 1
│   ├── Schema 1
│   │   ├── Table 1
│   │   ├── View 1
│   │   └── Function 1
│   └── Schema 2
│       ├── Table 2
│       └── Stage 1
└── Database 2
    ├── Schema 3
    │   └── ... (additional securable objects)
    └── ... (additional schemas and their objects)

Key Points:

This clear organization streamlines access control management.
Privileges are granted at appropriate levels within this structure (e.g., database-level, schema-level, or object-level).

Understanding this hierarchy is crucial for effective security administration in Snowflake.

Understanding Roles, Privileges, and Ownership in Snowflake's Access Control

Key Concepts:

Ownership:
- A role "owns" an object by having the OWNERSHIP privilege on it.
- Ownership grants full control over the object, including granting or revoking privileges to other roles.
- Ownership can be transferred using GRANT OWNERSHIP.
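For instance, with hypothetical object and role names (COPY CURRENT GRANTS preserves existing grants on the object):

SQL
GRANT OWNERSHIP ON TABLE mydb.myschema.mytable TO ROLE data_owner COPY CURRENT GRANTS;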
Roles:
- Entities that hold privileges on objects.
- Roles are assigned to users, allowing them to perform actions on objects.
Types:
- Account roles (apply to any object in the account).
- Database roles (apply to objects within a specific database).
- Instance roles (for access to class instances).

Role hierarchies:
- Roles can be granted to other roles, creating a hierarchy.
- Privileges are inherited from higher roles in the hierarchy.

System-defined roles:
- ORGADMIN (manages organization-level operations)
- ACCOUNTADMIN (encapsulates SYSADMIN and SECURITYADMIN, highest level)
- SECURITYADMIN (manages grants and users/roles)
- USERADMIN (manages users and roles)
- SYSADMIN (creates warehouses, databases, and other objects)
- PUBLIC (granted to everyone by default)
- Custom roles can be created for specific access control needs.

Privileges:
- Define who can access and perform operations on objects.
- Managed using GRANT and REVOKE commands.
- Granted at the object level or through future grants (for new objects).
- Managed access schemas centralize privilege management with the schema owner or MANAGE GRANTS role.

Key Points:

- Snowflake's access control combines aspects of DAC and RBAC for granular control.
- Understanding roles, privileges, and ownership is crucial for effective security administration.
- Role hierarchies facilitate privilege inheritance and efficient management.
- Consider best practices for custom role creation and hierarchy structures.

Understanding Database Roles and Enforcement Models in Snowflake

Key Points Regarding Database Roles:

Limitations:
- Database roles granted to shares cannot be granted to other database roles (restriction on nesting).
- Database roles granted to other database roles cannot be granted to shares.
- Account roles cannot be granted to database roles in a hierarchy.
Activation:
- Database roles cannot be directly activated in a session.
- To leverage database role privileges, grant them to account roles, which can then be activated.
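A brief sketch of this pattern with hypothetical names: a database role is created, granted privileges, and then granted to an account role so its privileges can be used in a session:

SQL
CREATE DATABASE ROLE mydb.dr_reader;
GRANT USAGE ON SCHEMA mydb.myschema TO DATABASE ROLE mydb.dr_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA mydb.myschema TO DATABASE ROLE mydb.dr_reader;

-- Database roles cannot be activated directly; grant to an account role instead
GRANT DATABASE ROLE mydb.dr_reader TO ROLE analyst;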

Enforcement Model with Primary and Secondary Roles:

Primary Role:
- Every session has a single "current role," also known as the primary role.
- Determines authorization for CREATE statements (object ownership).
- Determined at session initiation based on connection settings or user defaults.

Secondary Roles:
- Multiple secondary roles can be activated within a session.
- Aggregate privileges from primary and secondary roles determine SQL action authorization (excluding object creation).
- Authorization for actions other than object creation can come from primary or secondary roles.

Key Takeaways:

- Database roles offer granular access control within databases.
- Understand their limitations and activation methods for effective use.
- Snowflake's enforcement model with primary and secondary roles provides flexibility in privilege management and session-level authorization.

Simplifying Role Management and Authorizing Actions with Secondary Roles

Key Benefits of Secondary Roles:

Streamlined Role Management: In organizations with numerous roles and granular authorization, secondary roles allow users to activate multiple roles within a session, simplifying access control without creating complex role hierarchies.
Cross-Database Operations: Empower users to perform SQL actions spanning multiple databases, such as cross-database joins, without requiring a single parent role with access to all involved databases.

Activation and Management:

Enabling Secondary Roles: Activate secondary roles using the USE SECONDARY ROLES command.
Viewing Active Roles: Use the CURRENT_SECONDARY_ROLES function to list active secondary roles in a session.
Changing Roles: The USE ROLE command allows switching the primary role during a session.
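For example, assuming the analyst role from earlier sections:

SQL
USE ROLE analyst;              -- primary role for the session
USE SECONDARY ROLES ALL;       -- activate all other roles granted to the user
SELECT CURRENT_SECONDARY_ROLES();
USE SECONDARY ROLES NONE;      -- deactivate secondary roles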

Authorization Considerations:

Object Creation: Only the primary role and its inherited roles are considered for authorization when creating objects.
Other Actions: For actions like querying tables, privileges from both primary and secondary roles, as well as their inherited roles, are considered.

No Super-User Privileges:

Snowflake emphasizes security by design. There's no "super-user" or "super-role" concept that bypasses authorization checks. All actions require explicit access privileges.

Managing Session Policies (Snowflake)

Managing Session Policies in Snowflake: Key Privileges and Commands:

Essential Privileges

CREATE SESSION POLICY: Enables creating new session policies in a schema.
APPLY SESSION POLICY: Enables applying policies at the account or user level.
OWNERSHIP: Grants full control over a policy, required for most alterations.

Important Note: Operating on any object in a schema also requires the USAGE privilege on the parent database and schema.

Key DDL Commands

CREATE SESSION POLICY
ALTER SESSION POLICY
DROP SESSION POLICY
SHOW SESSION POLICIES
DESCRIBE SESSION POLICY
ALTER ACCOUNT (to set or unset account-level policies)
ALTER USER (to set or unset user-level policies)
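A minimal sketch tying these commands together (database, schema, policy, and user names are hypothetical; timeout values are in minutes):

SQL
CREATE SESSION POLICY my_db.policies.prod_session_policy
  SESSION_IDLE_TIMEOUT_MINS = 30
  SESSION_UI_IDLE_TIMEOUT_MINS = 30
  COMMENT = 'Session policy for production users';

-- Attach at the account level (only one active account-level policy at a time)
ALTER ACCOUNT SET SESSION POLICY my_db.policies.prod_session_policy;

-- Or attach to a single user instead
ALTER USER jdoe SET SESSION POLICY my_db.policies.prod_session_policy;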

Troubleshooting Common Issues

Cannot create a session policy:
- Ensure a database is specified or use a fully qualified object name.
- Verify the role has the CREATE SESSION POLICY privilege on the schema.
- Check database existence and USAGE privilege on the schema.

Cannot drop a session policy:
- Ensure the role has the OWNERSHIP privilege on the policy.
- Unset the policy from the account (if attached) before dropping.

Cannot set a session policy on an account:
- Verify the role has the OWNERSHIP or APPLY privilege on the policy.
- An account can have only one active policy; unset the current one first.

Cannot set a timeout value:
- The timeout value (in minutes) must be an integer between 5 and 240.

Cannot update an existing session policy:
- Verify the policy name, ALTER SESSION POLICY syntax, and privileges.

Additional Information:

- For a detailed summary of DDL operations and required privileges, refer to the Snowflake documentation.
- For instructions on replicating session policies via account and database replication, consult the Snowflake documentation.

Snowflake Sessions & Session Policies

Understanding and Managing Snowflake Sessions and Session Policies

Key Concepts:

Sessions: Independent of IdP sessions, lasting indefinitely with activity or expiring after an idle session timeout (default 4 hours).
Session Policies: Customizable idle timeout periods (5-minute minimum) for accounts or users to address compliance requirements. User-level policies take precedence.
Key Properties:
- SESSION_IDLE_TIMEOUT_MINS: For programmatic and Snowflake clients.
- SESSION_UI_IDLE_TIMEOUT_MINS: For the Classic Console and Snowsight.

Client Considerations:

- Avoid using CLIENT_SESSION_KEEP_ALIVE to prevent excessive open sessions and potential performance degradation.
- Use CLIENT_SESSION_KEEP_ALIVE_HEARTBEAT_FREQUENCY to control token update frequency.

Interface Behavior:

- Web interface sessions refresh with continued object usage.
- New or opened worksheets reuse existing sessions with a reset idle timeout.

Tracking Session Policy Usage:

- SESSION_POLICIES view for account-level policies.
- POLICY_REFERENCES table function for user-level policies.

Limitations:

- Future grants on session policies are unsupported.
- Workaround: Grant the APPLY SESSION POLICY privilege to a custom role for applying policies.

Implementation Steps (Centralized Management Approach):

- Create a custom role (policy_admin) with ownership of the session policy and privileges to apply it to the account or users.
- Grant necessary permissions to policy_admin for account-level policy setting.
- Follow the Snowflake documentation for specific configuration steps.
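A hedged sketch of that centralized setup (database and schema names are illustrative):

SQL
USE ROLE USERADMIN;
CREATE ROLE policy_admin;

USE ROLE SECURITYADMIN;
GRANT USAGE ON DATABASE my_db TO ROLE policy_admin;
GRANT USAGE ON SCHEMA my_db.policies TO ROLE policy_admin;
GRANT CREATE SESSION POLICY ON SCHEMA my_db.policies TO ROLE policy_admin;
GRANT APPLY SESSION POLICY ON ACCOUNT TO ROLE policy_admin;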

Overview of federated authentication and SSO

Federated Authentication and SSO in Snowflake: A Concise Overview

Centralized Authentication for Streamlined Access

Snowflake embraces federated authentication, enabling you to leverage external identity providers (IdPs) for user authentication and single sign-on (SSO) access. This approach streamlines user management and enhances security.

Key Concepts:

- Service Provider (SP): Snowflake acts as the SP, receiving authenticated user information from the IdP.
- Identity Provider (IdP): An external entity responsible for:
Creating and maintaining user credentials and profiles.
Authenticating users for SSO access to Snowflake.

- Supported IdPs:
- Native Support: Okta (hosted service), Microsoft AD FS (on-premises).
- Most SAML 2.0-compliant vendors, including Google G Suite, Microsoft Azure Active Directory, OneLogin, and Ping Identity PingOne (custom application setup required).

SSO Workflows:

Federated authentication supports these SSO workflows:

Login: Users authenticate through the IdP, seamlessly accessing Snowflake.
Logout: Users can initiate logout from either Snowflake or the IdP, terminating sessions across both platforms.
System Timeout: Inactive sessions automatically expire based on configured settings.

Configuration:

- Choose a compatible IdP.
- Establish a trust relationship between Snowflake and the IdP.
- Configure Snowflake to use federated authentication.

For detailed configuration steps, refer to the Snowflake documentation on configuring IdPs.

Federated Authentication Login and Logout Workflows: A Concise Guide

Login Workflows:

Snowflake-Initiated Login:

- User accesses the Snowflake web interface.
- User selects login using the configured IdP.
- User authenticates with the IdP.

Upon successful authentication, the IdP sends a SAML response to Snowflake, initiating a session and displaying the Snowflake web interface.

IdP-Initiated Login:

- User authenticates with the IdP.
- User selects the Snowflake application within the IdP.
- The IdP sends a SAML response to Snowflake, initiating a session and displaying the Snowflake web interface.

Logout Workflows:

- Standard Logout: Requires users to explicitly log out of both Snowflake and the IdP (supported by all IdPs).
- Global Logout: Logs the user out of the IdP and all Snowflake sessions (support varies by IdP).

Key Points:

Snowflake-Initiated Logout: Terminates only the current Snowflake session; other sessions and the IdP session remain active. Global logout is not supported from within Snowflake.

IdP-Initiated Logout: Behavior depends on IdP capabilities:
AD FS supports both standard and global logout.
Okta supports standard logout only.
Custom providers support standard logout, with global logout varying by provider.

Important Note: Closing a browser tab/window doesn't always end an IdP session. Users might still access Snowflake until the IdP session times out.

Securing Snowflake (Guide)

Snowflake Security Overview:

Snowflake offers comprehensive security features designed to protect your account, users, and data at the highest standards. This section primarily targets administrators with roles like ACCOUNTADMIN, SYSADMIN, or SECURITYADMIN.

Topics:

Federated Authentication & SSO: Configure and manage federated authentication with external identity providers.
Key-pair Authentication: Implement and rotate key-pair authentication for enhanced security.
Multi-factor Authentication (MFA): Enforce multi-factor authentication for additional user verification.
Snowflake OAuth & External OAuth: Leverage OAuth for authentication and access control.
Network Policies & Network Rules: Define network restrictions for inbound and outbound traffic.
Private Connectivity: Securely connect to Snowflake stages and accounts on AWS, Azure, and Google Cloud Platform.
Snowflake Sessions & Session Policies: Manage session settings and access privileges.
SCIM Provisioning: Simplify user and group management with SCIM integration.
Access Control (RBAC): Implement granular access control using roles and privileges.
End-to-End Encryption: Encrypt data at rest and in transit for robust protection.
Encryption Key Management: Manage and control encryption keys for your data.

Understanding Snowflake Table Structures

Understanding Snowflake's Physical Table Structure:

While data appears as familiar rows and columns in Snowflake tables, its physical storage involves innovative concepts like micro-partitions and data clustering.

This section dives into these crucial elements, revealing the inner workings of Snowflake's table structure. Furthermore, it provides expert guidance on explicitly defining clustering keys for massive tables (multi-terabytes) to optimize maintenance and query performance.

Micro-partitioning: Beyond Traditional Data Warehouses

Traditional data warehouses often struggle with static partitioning, where large tables are divided into fixed units managed independently. This static approach poses challenges like high maintenance overhead and data skew, leading to unevenly sized partitions and performance bottlenecks.

Databases, Tables, and Views – Overview

Snowflake Data Organization:

Snowflake stores all data within databases, each containing schemas that group logical data objects like tables and views. There are no inherent limits on their creation.

Explore the following resources to delve deeper:

- Tables: Understand Snowflake's unique table structures, including micro-partitions and data clustering.

- Temporary & Transient Tables: Learn about storing temporary data for specific needs like ETL or short-lived tasks.

- External Tables: Discover how to reference read-only files stored in external stages.

- Iceberg Tables: Leverage the flexible Apache Iceberg format for data in external cloud storage, using Snowflake as the Iceberg catalog or creating tables from existing object storage files.

- Views: Access the results of a query as a virtual table. Views can combine, segregate, and protect data effectively.

- Secure Views: Enhance data privacy by using secure views to restrict access to sensitive information within underlying tables.

- Materialized Views: Improve query performance with pre-computed results stored in materialized views, derived from specified queries.

- Table Design Best Practices: Gain valuable insights on designing and managing efficient and sustainable tables.

- Cloning Best Practices: Learn the best practices and considerations for successfully cloning databases, schemas, and permanent tables within Snowflake.

- Data Storage Considerations: Discover effective strategies for optimizing data storage costs associated with Continuous Data Protection (CDP), especially for tables.

Using a Warehouse (Snowflake)

Executing Queries and DML Statements:

Warehouse Requirements: Running queries and DML statements in Snowflake requires two conditions:

- A running warehouse.
- The warehouse specified as the current warehouse for the session.

Session Scope: Each Snowflake session can only have one current warehouse at a time.
Warehouse Selection: Use the USE WAREHOUSE command to set or change the current warehouse for the session (see the example below).
Query Processing: Once set, queries and DML statements within the session are processed by the current warehouse. You can view the warehouse used for each query/statement in the Classic Console's History and Worksheets pages.
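For example, with a hypothetical warehouse named my_wh:

SQL
USE WAREHOUSE my_wh;
SELECT COUNT(*) FROM mydb.myschema.mytable;  -- processed by my_wh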

Delegating Warehouse Management:

Default Access: The ACCOUNTADMIN role holds default privileges to manage all warehouses in the account (alter, suspend, describe, etc.).
Custom Role Delegation: For finer control, use the GRANT command to grant the MANAGE WAREHOUSES privilege to a custom role. This grants equivalent privileges (MODIFY, MONITOR, OPERATE) to all account warehouses.

Example: The following demonstrates delegating warehouse management to a custom role named manage_wh_role:

1. Create the roles and grant the required privileges:

SQL
CREATE ROLE create_wh_role;
GRANT CREATE WAREHOUSE ON ACCOUNT TO ROLE create_wh_role;
GRANT ROLE create_wh_role TO ROLE SYSADMIN;

CREATE ROLE manage_wh_role;
GRANT MANAGE WAREHOUSES ON ACCOUNT TO ROLE manage_wh_role;
GRANT ROLE manage_wh_role TO ROLE SYSADMIN;

2. Use manage_wh_role to modify "test_wh" owned by create_wh_role:

SQL
USE ROLE manage_wh_role;
ALTER WAREHOUSE test_wh ...;

Delegating Warehouse Management with Roles:

This example demonstrates creating two roles to manage Snowflake warehouses:

1. create_wh_role:

This role has the CREATE WAREHOUSE privilege, allowing it to:

Create and own new warehouses.
Grant ownership of warehouses to other roles.

2. manage_wh_role:

This role has the MANAGE WAREHOUSES privilege, granting it comprehensive control over all warehouses in the account, regardless of ownership. This includes the ability to:

Suspend and resume warehouses.
Resize warehouses.
Describe warehouse properties.

Scenario:

Create both roles and grant them the necessary privileges:
SQL
CREATE ROLE create_wh_role;
GRANT CREATE WAREHOUSE ON ACCOUNT TO ROLE create_wh_role;
GRANT ROLE create_wh_role TO ROLE SYSADMIN;

CREATE ROLE manage_wh_role;
GRANT MANAGE WAREHOUSES ON ACCOUNT TO ROLE manage_wh_role;
GRANT ROLE manage_wh_role TO ROLE SYSADMIN;

Create a new warehouse with create_wh_role:
SQL
USE ROLE create_wh_role;
CREATE OR REPLACE WAREHOUSE test_wh WITH WAREHOUSE_SIZE = XSMALL;

Switch to manage_wh_role and manage the warehouse:
SQL
USE ROLE manage_wh_role;

ALTER WAREHOUSE test_wh SUSPEND; -- Suspend the warehouse
ALTER WAREHOUSE test_wh SET WAREHOUSE_SIZE = SMALL; -- Resize the warehouse
DESC WAREHOUSE test_wh; -- Describe the warehouse properties

Working with Warehouses

Managing Warehouses:

Snowflake offers two options for managing your warehouses:

1. Web Interface:

In Snowsight, navigate to Admin > Warehouses and select + Warehouse.

In the Classic Console, navigate to Warehouses > Create.

2. DDL Commands:

Utilize the CREATE WAREHOUSE command.

Warehouse Creation:

Specify the initial state of the warehouse during creation:

Started: Resources provisioned immediately, incurring credit consumption.
Suspended: Delayed resource provisioning, minimizing initial costs.

Starting and Resuming Warehouses:

Initial Creation: Warehouses can be started during initial creation or at any later point.
Resuming: Once created, resuming a warehouse is identical to starting.

Suspended Warehouses:

Use the following methods to resume a suspended warehouse:

Web Interface: Navigate to Admin > Warehouses > <warehouse name> > Resume in Snowsight, or Warehouses > <warehouse name> > Resume in the Classic Console.
SQL: Execute an ALTER WAREHOUSE <name> RESUME command.

Startup and Credit Consumption:

Resource Provisioning: Starting typically takes seconds, but may occasionally be longer during resource provisioning.
Credit Billing: Warehouses begin consuming credits when all resources are provisioned, except in rare cases of partial provisioning where billing reflects only provisioned resources. Once remaining resources are provisioned, full billing resumes.
Billing Minimum: Resumes incur a 1-minute minimum billing, though usage is reported in hourly increments.

SQL Processing:

Snowflake delays executing submitted statements until all resources are provisioned, except during rare provisioning failures.
Partial Provisioning: In such cases, processing begins once 50% or more of resources are available.

Suspending Warehouses:

Running warehouses can be suspended at any time, even during active SQL execution. This action halts credit consumption upon resource shutdown.

Suspension Methods:

Web Interface: Navigate to Admin > Warehouses > <warehouse name> > Suspend in Snowsight, or Warehouses > <warehouse name> > Suspend in the Classic Console.
SQL: Execute an ALTER WAREHOUSE <name> SUSPEND command.

Suspension Process:

Idle Resource Shutdown: Snowflake immediately closes all idle compute resources.
Active Statement Completion: Resources actively executing statements are allowed to finish.
Resource Shutdown & Suspension: Upon statement completion, remaining resources shut down, transitioning the warehouse to "Suspended" status.

Note: Resources awaiting shutdown are in "quiesce" mode.

Resizing Warehouses for Dynamic Resource Allocation:

Snowflake empowers you to adjust warehouse size (up or down) at any time, even during active query processing.

Resizing Methods:

Web Interface:
Snowsight: Navigate to Admin > Warehouses > <warehouse name> > Edit.
Classic Console: Access Warehouses > <warehouse name> > Configure.
SQL: Execute ALTER WAREHOUSE <name> SET WAREHOUSE_SIZE = <size>;

Benefits of Resizing Up:

Consider increasing warehouse size to enhance performance in scenarios like:

Executing large, complex queries against extensive datasets.
Loading or unloading significant data volumes.

Scaling up vs. Scaling out Warehouses

Scaling Warehouse Resources:

Snowflake offers two approaches to scaling warehouse resources:

Vertical Scaling (Resizing): Increase compute resources within a single warehouse. Improves performance, especially for large complex queries, and reduces queuing due to resource limitations.

Note: Resizing is not intended for handling high concurrency; consider additional warehouses or multi-cluster options.

Horizontal Scaling (Adding Clusters): Requires Snowflake Enterprise Edition and expands a multi-cluster warehouse with additional compute units.

Resizing Considerations:

Limited benefit for small, basic queries. Larger size doesn't guarantee faster execution.
Running queries unaffected. Resizing only impacts queued and new queries after additional resources are provisioned.

5XL/6XL to 4XL or smaller: Brief dual billing period while the old warehouse shuts down.

Multi-cluster Warehouses Improve Concurrency

Leveraging Multi-Cluster Warehouses:

Snowflake Enterprise Edition users can opt for multi-cluster warehouses specifically designed to handle:

High concurrency: Ideal for scenarios with numerous concurrent users or queries.
Dynamic workloads: Adapts automatically to fluctuating user/query demands.

Considerations for Multi-cluster Warehouses:

Configuration:
Enterprise Edition users: Configure all warehouses as multi-cluster.
Mode: Prefer Auto-scale for automatic cluster adjustments (unless Maximized mode is vital).

Number of Clusters:
Minimum: Keep default (1) for on-demand scaling. Increase for high-availability needs.
Maximum: Set based on warehouse size and anticipated costs. Consider potential credit consumption with all clusters running concurrently.
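A minimal sketch of an auto-scaling multi-cluster warehouse reflecting these guidelines (names and limits are illustrative; requires Enterprise Edition):

SQL
CREATE WAREHOUSE bi_wh
  WAREHOUSE_SIZE = 'LARGE'
  MIN_CLUSTER_COUNT = 1         -- keep the default minimum for on-demand scaling
  MAX_CLUSTER_COUNT = 4         -- cap potential credit consumption
  SCALING_POLICY = 'STANDARD';  -- Auto-scale mode, since MIN < MAX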

Creating a Warehouse

Warehouse creation requires careful consideration of two key factors for optimal cost and performance:

1. Warehouse Size: Determines available compute resources, with larger sizes offering increased processing power but higher ongoing costs.

2. Management Approach: Choose between manual control over warehouse state (e.g., start/resume/suspend) or automated options for flexibility and resource optimization.

Multi-cluster Warehouses: For users with Snowflake Enterprise Edition (or higher), the number of clusters within a warehouse adds another layer of scalability beyond size. Further details on this advanced feature are available in the "Scaling Up vs Scaling Out" section of this topic.

Selecting an Initial Warehouse Size

Choosing the Initial Warehouse Size:

The optimal warehouse size depends on its intended function and processing workload. Consider these guidelines:

Data Loading: Match size to the number and size of files being loaded. See "Planning a Data Load" for details.
Small-Scale Testing: X-Small, Small, or Medium sizes are typically sufficient.
Large-Scale Production: Larger sizes (Large, X-Large, etc.) may be more cost-effective.

However, remember:

Per-second billing and auto-suspend allow starting large and adjusting down as needed.
Decreasing size is always possible.
Larger isn't always faster for simple queries. They may not benefit from additional resources, regardless of concurrency.

In general:

Aim to match warehouse size to the expected size and complexity of queries.

Automatic Warehouse Suspension:

Snowflake allows enabling automatic suspension based on a specified period of inactivity (minutes, hours, etc.). We recommend tailoring this setting to your workload and availability requirements:

Frequent Workloads: Consider a low inactivity threshold (e.g., 5-10 minutes) to minimize unnecessary credit consumption through per-second billing.
Gaps in Queries: Ensure the threshold aligns with natural gaps in your workload to avoid frequent and resource-intensive suspension/resumption cycles (minimum 60 seconds billed each time).
Heavy Workloads: Disable auto-suspension for warehouses with continuous processing needs or zero tolerance for delays. Note that provisioning time, though typically fast (1-2 seconds), can vary based on warehouse size and available resources.

Automatic Warehouse Resumption:

Snowflake offers automatic resumption upon receiving new queries. We recommend tailoring this setting based on your desired control over usage:

Priority on Convenience: Enable auto-resume for immediate availability even with potential minor provisioning delays. This reduces manual intervention but may incur unnecessary costs if unused.
Focus on Cost Control or Access: Disable auto-resume and manually initiate restarts. This provides granular control over resource utilization and user access but requires proactive management.
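These recommendations can be combined at creation time; for example (hypothetical warehouse name; AUTO_SUSPEND is specified in seconds):

SQL
CREATE WAREHOUSE analytics_wh
  WAREHOUSE_SIZE = 'MEDIUM'
  AUTO_SUSPEND = 300            -- suspend after 5 minutes of inactivity
  AUTO_RESUME = TRUE            -- resume automatically on the next query
  INITIALLY_SUSPENDED = TRUE;   -- avoid consuming credits until first use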

How Does Warehouse Caching Impact Queries?

Warehouse Caching:

Active warehouses maintain a table data cache, improving subsequent query performance by reducing table reads.
Cache size scales with warehouse compute resources: larger warehouses have larger caches.
Suspending a warehouse drops the cache, potentially impacting initial query performance upon resumption.
Resumed warehouses rebuild the cache: queries benefit as the cache grows.

Consider this trade-off:

Suspend to save credits: immediate cost savings, but potentially slower initial queries.
Leave running to maintain cache: faster initial queries, but ongoing credit consumption.

How Does Query Composition Impact Warehouse Processing?

Query Processing and Warehouse Size:

Resource demand: Compute requirements for queries depend on their size and complexity.
Scaling: Generally, queries scale linearly with warehouse size, especially complex ones.

Key factors:
Table size: Overall size outweighs the number of rows.
Filtering: Predicates affect processing, as does the number of joins/tables.
Optimization tip: Execute similar queries (size, complexity, data) on the same warehouse. Mixed workload diversity complicates load analysis and optimal size selection.

How are Credits Charged for Warehouses?

Credit Billing:

Computed based on:
- Warehouse size (credits per hour per cluster)
- Number of clusters (multi-cluster only)
- Running time of individual cluster resources
Example: X-Small (1 credit/hour/cluster), 4X-Large (128 credits/hour/cluster).

Billing Notes:

- Minimum: 1 minute for initial resource provisioning.
- Stopping: No benefit before 60 seconds (the initial charge is already incurred).
- Per-second: After 60 seconds, billing increments by the second.
- Restarting: Credits accumulate across restarts under 60 seconds.
- Resizing: Increases billing proportionally while additional resources run.

5XL/6XL → 4XL or smaller: Brief overlap billing while old warehouse shuts down.
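Worked example: an X-Small warehouse (1 credit/hour) running for 30 minutes consumes 0.5 credits; if it runs for only 45 seconds, it is billed the 60-second minimum (1 × 60/3600 ≈ 0.017 credits), and only after the first minute does billing proceed per second.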

Credit Display:

Shown in hourly increments, but reflects fractional usage due to per-second billing.

Warehouse considerations

1. Introduction:

This guide presents general best practices for leveraging virtual warehouses in Snowflake for efficient query processing. It refrains from absolute recommendations as optimal configurations depend on diverse factors, including:

- User/query concurrency
- Number of accessed tables
- Data size and structure
- Specific needs for availability, latency, and cost

Warehouse considerations for data loading are addressed in a separate topic (see sidebar).

2. Key Principles:

Experimentation: Execute diverse queries on various warehouse sizes to identify your ideal setups for specific workloads.
Size Flexibility: Prioritize efficient resource utilization over constant warehouse sizing. Snowflake's per-second billing allows suspending larger warehouses (Large, X-Large, etc.) during inactivity, minimizing cost.

3. Conclusion:

By implementing these principles, you can optimize warehouse usage and achieve cost-effective, high-performance query processing in Snowflake.

Note: These guidelines encompass both standard single-cluster warehouses and the advanced multi-cluster warehouses offered in Snowflake Enterprise edition.

Warehouses (Comprehensive Overview)

1. Warehouses: Essential Processing Units

Snowflake virtual warehouses serve as the core computational units responsible for running all SQL queries and data manipulation language (DML) operations, including loading, unloading, and updating data within tables.

2. Warehouse Properties and Functionality

Each warehouse is characterized by two key attributes:

Type: Categorized as either Standard or Snowpark-optimized, catering to different workload requirements.
Size: Defines the available compute resources, measured in virtual CPUs and memory.
Additionally, various configurable properties enable granular control and automation of warehouse activity.

3. Dynamic Resource Management

Snowflake offers unparalleled flexibility in managing warehouses:

Start and Stop on Demand: Warehouses can be activated and deactivated instantly as needed.
Responsive Scaling: Resizing is possible at any time, even during active operation, to adjust compute resources according to the intensity and type of warehouse tasks.

Warehouse Size:

- Determines available compute resources per cluster.
- Sizes: X-Small to 6X-Large, with credits per hour (and per second) increasing with size.
- Notes:
  - X-Small is the default size for warehouses created in Snowsight or with CREATE WAREHOUSE.
  - X-Large is the default for the Classic Console.
  - 5X-Large and 6X-Large are in preview for US Government and Azure regions.

Impact on Credit Usage and Billing:

- Billing: Per-second, with a 60-second minimum on startup.
- Impact: Credit consumption roughly doubles with each size increase.
- Example: running for a full hour, an X-Small consumes 1 credit, a Small 2, a Medium 4, and so on up the size ladder.

Impact on Data Loading:

- Increasing size doesn't always improve loading performance.
- Tip: Smaller warehouses (Small to Large) suffice for most loading scenarios.

Impact on Query Processing:

- Larger sizes can improve execution time, especially for complex queries.
- Resizing: Provides more resources for running and queued queries.
- Tip: A larger size is not necessarily faster for small, basic queries.

Auto-suspension and Auto-resumption:

- Auto-suspend: Deactivates the warehouse after a period of inactivity (enabled by default).
- Auto-resume: Activates the warehouse when a query arrives (enabled by default).
- Benefit: Simplifies monitoring and usage, avoiding unnecessary credit consumption.

Query Processing and Concurrency:

- Warehouse size and query complexity determine how many queries run concurrently.
- Queued queries: Wait for resources as others complete.
- Control mechanisms: STATEMENT_QUEUED_TIMEOUT_IN_SECONDS and STATEMENT_TIMEOUT_IN_SECONDS.
- Alternatives: Create a new warehouse, resize the existing one, or leverage multi-cluster warehouses (Enterprise Edition).

Warehouse Usage in Sessions:

- Sessions initially have no associated warehouse, so queries cannot run until one is set.
- Default warehouses:
  - User-specific: Set during user creation/modification.
  - Client utilities/drivers/connectors: Specified in configuration files or connection parameters.
- Precedence: Command-line/driver parameters override client configuration defaults, which override the user's default warehouse.
- Change: Use the USE WAREHOUSE command within a session.

Virtual Warehouses (on Snowflake)

Imagine Snowflake as a vast data lake, and virtual warehouses as your handy boats. These "warehouses" are actually clusters of processing power, available in two flavors: the trusty Standard and the cutting-edge Snowpark-optimized. These boats equip you with the muscle (CPU and memory) and temporary storage needed to sail through various tasks in your Snowflake journey:

Dive into data: Run SELECT statements to retrieve rows from tables and views like treasures from the depths.
Shape your data: Craft your data with DML operations like updating rows, inserting new ones, or simply unloading them onto dry land.
Bring data aboard: Load fresh data into your tables, filling your boat with new discoveries.
So, whether you're a seasoned data explorer or just setting sail, virtual warehouses are your ticket to navigating the Snowflake data ocean efficiently and effectively.

Heads up! These operations need a running warehouse, and active warehouses chew through Snowflake credits.

1. Overview of Warehouses

Virtual warehouses are the computational units in Snowflake responsible for executing all SQL queries and Data Manipulation Language (DML) operations (e.g., loading, unloading, updating data).
Warehouses are categorized by type (Standard or Snowpark-optimized) and size, with additional properties enabling control and automation.

2. Snowpark-optimized Warehouses

While both Standard and Snowpark-optimized warehouses can handle Snowpark workloads, the latter is optimized for scenarios with significant memory demands, such as machine learning training.

3. Warehouse Considerations

This section presents best practices and general guidelines for managing virtual warehouses effectively in Snowflake to optimize query processing.

4. Multi-cluster Warehouses

Multi-cluster warehouses provide dynamic scaling of compute resources to address fluctuating user and query concurrency demands, particularly during peak or off-peak periods.

5. Working with Warehouses

This section offers comprehensive guidance on creating, stopping, starting, and managing Snowflake warehouses efficiently.

6. Using the Query Acceleration Service

The Query Acceleration Service can enhance performance by offloading resource-intensive portions of queries in a warehouse. Enabling it can improve overall warehouse efficiency by mitigating the impact of outlier queries.

7. Monitoring Warehouse Load

Warehouse query load provides insights into the average number of concurrent or queued queries within a specified timeframe, enabling performance analysis and optimization.

Connecting to Snowflake: SnowSQL (CLI Client)

SnowSQL serves as the command-line interface for connecting to Snowflake, enabling the execution of SQL queries and performing various DDL and DML operations, including data loading and unloading from database tables.

The snowsql executable, representing SnowSQL, can operate either interactively as a shell or in batch mode through stdin or the -f option.

While SnowSQL is an application developed using the Snowflake Connector for Python, it's important to note that the connector is not mandatory for SnowSQL installation. All necessary software for SnowSQL is bundled within the installers.
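For example, typical invocations of both modes might look like the following (the account identifier, user, and script name are hypothetical):

snowsql -a myorg-myaccount -u jdoe             # interactive shell
snowsql -a myorg-myaccount -u jdoe -f load.sql # batch mode via the -f option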

Snowflake offers platform-specific versions of SnowSQL for download, catering to the following platforms:

Linux:
- CentOS 7, 8
- Red Hat Enterprise Linux (RHEL) 7, 8
- Ubuntu 16.04, 18.04, 20.04 or later

macOS:
- macOS 10.14 or later

Microsoft Windows:
- Microsoft Windows 8 or later
- Microsoft Windows Server 2012, 2016, 2019, 2022

Related Videos:

How to Install, Configure & Use SnowSQL