Snowflake Snowday — Summary
Snowflake's semiannual product announcement, Snowflake Snowday, took place on November 7, 2022, the same day as the end of Snowflake's Data Cloud World Tour (DCWT).
I attended 5 DCWT events across the globe in 2022. It was fascinating to see how much Snowflake has grown since the 2019 tour. Many improvements and new features are being added to the Snowflake Data Cloud. It's hard to keep up! These announcements should further improve Snowflake's ability to turn data into value.
Let's summarize the exciting Snowflake announcements from Snowday. The features we're most enthusiastic about that improve Data to Value are:
- Snowflake's Python SDK (Snowpark) is now generally available.
- Private data sharing significantly accelerates collaborative data work.
- The Snowflake Kafka Connector, Dynamic Tables, and Snowpipe Streaming enable real-time data integration.
- Streamlit integration simplifies dashboard and app development.
All of these features substantially improve Data to Value for organizations.
Snowflake Snowday Summary - Top Announcements
TOP announcement! – whoop whoop – SNOWPARK FOR PYTHON! (General Availability – GA)
I believe this was the announcement all Python data scientists were anticipating (myself included). Snowpark for Python now enables every Snowflake customer to develop and deploy Python-based apps, pipelines, and machine-learning models directly in Snowflake. In addition to Snowpark for Python becoming generally available on all Snowflake editions, these other Python-related announcements were made:
- Snowpark Python UDFs for unstructured data (Private Preview)
- Python Worksheets – The improved Snowsight worksheet now supports Python so you don't need an additional development environment. This simplifies getting started with Snowpark for Python development. (Private preview)
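To give a feel for what "pipelines directly in Snowflake" looks like, here is a minimal Snowpark for Python sketch. It is an illustration under assumptions, not something shown at Snowday: the ORDERS table, its CUSTOMER_ID and AMOUNT columns, and the function name top_customers are all hypothetical, and the function expects an already-created Snowpark Session.

```python
def top_customers(session, min_amount=100, limit=10):
    """Return the customers with the highest total order amount.

    `session` is an existing snowflake.snowpark.Session. The ORDERS table and
    its CUSTOMER_ID / AMOUNT columns are hypothetical names for illustration.
    """
    # Imported lazily so this module still loads where Snowpark is not installed.
    from snowflake.snowpark.functions import col, sum as sum_

    return (
        session.table("ORDERS")                 # lazy reference, nothing runs yet
        .filter(col("AMOUNT") > min_amount)     # keep only larger orders
        .group_by("CUSTOMER_ID")
        .agg(sum_(col("AMOUNT")).alias("TOTAL_AMOUNT"))
        .sort(col("TOTAL_AMOUNT").desc())
        .limit(limit)
        .collect()                              # executes in Snowflake, returns rows
    )
```

A session would typically be created with Session.builder.configs(connection_parameters).create(), where connection_parameters is a dict holding your own account, user, and credential settings. The DataFrame operations are translated to SQL and run inside Snowflake, so the data does not need to be pulled out to a separate Python environment first.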
One Product. One Platform.
- Snowflake’s major push is to make its platform increasingly easy to use for most or all of its customers’ data cloud needs.
- With Hybrid Tables for OLTP workloads and Snowpark for AI/ML, Snowflake is expanding its core platform to handle online transaction processing (OLTP) and AI/ML workloads. This significantly increases Snowflake’s total addressable market.
- Snowflake acquired Streamlit earlier this year for one main reason: to integrate Streamlit's data application frontend and backend into the platform so that it can also handle data application use cases.
- Snowflake is investing heavily to evolve from primarily a data store to a data platform for building frontend and backend data applications. This includes web/data apps needing millisecond OLTP inserts or AI/ML workloads.
Additionally, Snowflake continually improves the core Snowflake Platform in the following ways:
The Cross-Cloud Snowgrid:
Replication Improvements and Snowgrid Updates:
These improvements to Snowflake, the cross-cloud data platform, significantly boost performance and replication. If you're unfamiliar with Snowgrid, we explain what Snowgrid is here.
- Cross-Cloud Business Continuity – Stream & Task Replication (PUBLIC PREVIEW) – This enables seamless pipeline failover, which is fantastic. It takes replication beyond just accounts, databases, policies, and metadata.
- Cross-Cloud Business Continuity – Replication GUI (PRIVATE PREVIEW) – You can now set up, manage, and fail over an account's global replication from a single interface.
- Cross-Cloud Collaboration – Discovery Controls (PUBLIC PREVIEW)
- Cross-Cloud Collaboration – Cross-Cloud Auto-Fulfillment (PUBLIC PREVIEW)
- Cross-Cloud Collaboration – Provider Analytics (PUBLIC PREVIEW)
- Cross-Cloud Governance – Tag-Based Masking (GA)
- Cross-Cloud Governance – Masking and Row-Access Policies in Search Optimization (PRIVATE PREVIEW)
- Replication Groups – Looking forward to updates on this as well. These can enable sharing and simple database replication in all editions.
- The above are available in all editions EXCEPT:
  - Enterprise or higher needed for Failover/Failback (including Failover Groups)
  - Business Critical or higher needed for Client Redirect functionality
Snowflake Performance Improvements:
New performance improvements and performance-transparency features were announced:
- Query Acceleration (public preview): Offloads portions of eligible queries to additional compute resources to speed them up.
- Search Optimization Enhancements (public preview): Extends the kinds of selective lookup queries that search optimization can speed up.
- Join eliminations (GA): Removes unnecessary table joins.
- Top results queries (GA): Speeds up queries that return only the top rows, such as ORDER BY ... LIMIT queries.
- Cost Optimizations: Account usage details (private preview): Gives more visibility into account usage for managing costs.
- History views (in development): Provides query history views.
- Programmatic query metrics (public preview): Offers programmatic access to query performance metrics.
Available on all editions EXCEPT: ENTERPRISE OR HIGHER REQUIRED for Search Optimization and Query Acceleration
Data Listings and Cross-Cloud Updates
I’m thrilled about Snowflake’s announcement regarding Private Listings. Many of you know that Data Sharing, which I’ve been writing about for over 4 years, is one of my favorite Snowflake features. My latest article is “The Future of Data Collaboration.” Data Sharing is a game-changer for data professionals.
Snowflake’s announcement makes private data-sharing scenarios much easier to implement. Fulfilling different regional requirements is now simpler too (even 1-2 years ago, we had to write replication commands). I’ll provide more details on how this simplifies data sharing and collaboration.
I appreciated Snowflake incorporating some of my Data to Value concepts in the announcement, like “Time to value is significantly reduced for the consuming party.” Even better, this functionality is now available for ALL SNOWFLAKE EDITIONS.
[Figure: Private Listings]
Snowflake Data Governance Improvements
These features build data governance and protection natively into Snowflake.
- Tag-based Masking automatically applies designated policies to sensitive columns using tags.
- Search Optimization now supports tables with masking and row access policies.
- FedRAMP High for AWS Government (authorization pending).
- Available ONLY on ENTERPRISE OR HIGHER
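As a rough illustration of how tag-based masking fits together, the sketch below builds the SQL statements for a minimal setup: create a tag, create a masking policy, attach the policy to the tag, then tag a column. The helper name tag_based_masking_sql and every object name passed to it (pii, mask_pii, ANALYST, customers, email) are hypothetical, and the policy body is deliberately simple. Treat it as a starting point to check against the Snowflake documentation, not a production setup.

```python
def tag_based_masking_sql(tag, policy, allowed_role, table, column):
    """Build the SQL statements for a minimal tag-based masking setup.

    All object names are supplied by the caller and are interpolated as-is,
    so only pass trusted identifiers.
    """
    return [
        # 1. The tag that marks sensitive columns.
        f"CREATE TAG IF NOT EXISTS {tag}",
        # 2. A string masking policy: one role sees the value, others see a mask.
        (
            f"CREATE MASKING POLICY IF NOT EXISTS {policy} "
            f"AS (val STRING) RETURNS STRING -> "
            f"CASE WHEN CURRENT_ROLE() = '{allowed_role}' "
            f"THEN val ELSE '***MASKED***' END"
        ),
        # 3. Attach the policy to the tag (this is the tag-based part).
        f"ALTER TAG {tag} SET MASKING POLICY {policy}",
        # 4. Tag a column; the policy now applies to it through the tag.
        f"ALTER TABLE {table} MODIFY COLUMN {column} SET TAG {tag} = 'sensitive'",
    ]


for statement in tag_based_masking_sql("pii", "mask_pii", "ANALYST", "customers", "email"):
    print(statement + ";")
```

The point of the pattern is step 3: once the policy is attached to the tag, protecting another column only requires tagging it, rather than assigning a policy column by column.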
Building on Snowflake
New announcements related to:
- Streamlit integration (PRIVATE PREVIEW in January 2023) – This integration will be exciting. The private preview can’t come soon enough.
- Snowpark-Optimized Warehouses (PUBLIC PREVIEW) – This was a smart move by Snowflake to support the needs of AI/ML Snowpark customers. Great to see it rolled out, giving customers access to higher-memory warehouses better suited to training ML/AI models at scale. Snowpark code can run on both standard and Snowpark-optimized warehouses.
- Available for all Snowflake Editions
Streaming and Dynamic Table Announcements:
- Snowpipe Streaming (public preview soon): Stream data into Snowflake
- Snowflake Kafka Connector (public preview soon): Stream data from Kafka into Snowflake
- Snowflake Dynamic Tables (formerly Materialized Tables, private preview): Check out Dan Galvin’s article for details: https://medium.com/snowflake/️-snowflake-in-a-nutshell-the-snowpipe-streaming-api-dynamic-tables-ae33567b42e8
- Available for all Snowflake Editions
Conclusion:
Overall, I'm thrilled with where this is headed. These enhancements greatly improve Snowflake's streaming data integration, especially with Kafka. Now, Snowflake customers can get real-time data streams and transform data with low latency. When fully implemented, this will enable more cost-effective and high-performance data lake solutions.
If you missed Snowday and want to watch the recording, here's the link: https://www.snowflake.com/snowday/agenda/
We'll cover more updates from Snowday and Snowflake BUILD in depth this week in the Snowflake Solutions Community.