Shortest Snowflake Summit 2022 Recap from a Snowflake Data Superhero
If you missed Snowflake Summit or any part of the Snowflake Summit Opening Keynote, here are the key feature announcements and a recap [in “brief” but “useful” detail].
KEY FEATURE ANNOUNCEMENTS: EXECUTIVE SUMMARY. [Mostly in chronological order of when they were announced. My top ~20. The number of announcements this week was overwhelming!]
#1. A new Resource Groups concept was announced that lets you combine all sorts of Snowflake data objects to monitor their resource usage. [This is huge since Resource Monitors were pretty primitive.]
#2. A concept of Budgets that you can track usage against. [Both Resource Groups and Budgets are coming into Private Preview in the next few weeks.]
#3. More Usage Metrics are being made available as well, for SnowPros like us and for monitoring tools to use. This is important since many enterprise businesses were looking for this.
Replication Improvements on SnowGrid:
#4. Account Level Object Replication. (Previously, Snowflake allowed data replication but not replication of other account-level objects. Now, objects that are not just data can supposedly be replicated as well, such as Users.)
#5. Pipeline Replication and Pipeline Failover. Stages and Pipes can now be replicated as well. [Kleinerman stated this is coming soon to Preview. I’m assuming Private Preview?] DR people will love this!
Data Management and Governance Improvements:
#6. The combination of tags and policies. You can now do — [Private Preview now and will go into public preview very soon]
Expanding External Table Support and Native ICEBERG Tables:
#7. External Table Support for Apache Iceberg is coming shortly. Remember, though, that External Tables are READ-ONLY and have other limitations, so see what Snowflake did in #9 below. [pretty amazing]
#8. EXPANDING Snowflake to handle on-premises data with storage vendor partners, so far Dell Technologies and Pure Storage. [Their integration will be in Private Preview in the next few weeks.]
#9. Supporting ICEBERG TABLES with FULL STANDARD TABLE support in Snowflake, so these tables will support replication, time travel, etc. [very huge]. This makes a Data Lake style deployment so much easier to use. EXPERT IN THIS AREA: Polita Paulus
Improved Streaming Data Pipeline Support:
#10. New Streaming Data Pipelines. The main innovation is the new concept of MATERIALIZED TABLES. Now you can ingest streaming data as row sets. [very huge]. EXPERT IN THIS AREA: Tyler Akidau
- Funny: I did a presentation at Snowflake Summit 2019 on Snowflake’s Kafka connector. Now that feels like ancient history.
Application Development Disruption with Streamlit and Native Apps:
#11. Low-code data application development via Streamlit. The combination of this and the Native Application Framework allows Snowflake to disrupt the entire Application Development environment. I would watch closely for how this evolves. It’s still very early but this is super interesting.
#12. Native Application Framework. I have been working with this for about 3 months and I think it’s a game-changer. It allows all of us data people to create Data Apps, share them on a marketplace, and monetize them as well. It really starts to position Snowflake and its new name (UGH! 3rd name change: 2019=Data Exchange, 2020=Data Marketplace, 2022=
Expanded Snowpark and Python Support:
#13. Python Support in the Snowflake Data Cloud. More importantly, this is a MAJOR MOVE to make it much easier for all “data constituents” to work seamlessly within Snowflake for ALL workloads, including Machine Learning. This continues Snowflake’s ongoing push to make it much, much easier to run data-science-type workloads within Snowflake itself.
#14. Snowflake Python Worksheets. This is really combined with the above announcement and enables data scientists who are used to Jupyter notebooks to more easily work in a fully integrated environment in Snowflake.
New Workloads: Cybersecurity and OLTP! Boom!
#15. CYBERSECURITY. This was announced a while back but I wanted to include it here to be complete since it was emphasized again.
#16. UNISTORE. OLTP-type support based on Snowflake’s Hybrid Table features. This was one of the biggest announcements by far. Snowflake is now entering a much, much larger part of data and application workloads by extending its capabilities BEYOND OLAP [big data, online analytical processing] into the OLTP space, which is still dominated by Oracle, SQL Server, MySQL, PostgreSQL, etc. This is a massive move that positions Snowflake as a single integrated data cloud for all data and all workloads.
#17. Snowflake Overall Data Cloud Performance Improvements. This is cool but given all the other “more transformative” announcements I’m just bundling this together. Performance improvements included improvements on AWS related to new AWS capabilities as well as more power per credit with internal optimizations. [Since Snowflake is a closed system, though, I think it’s hard for customers to see and verify this.]
#19. Large Memory Instances. [Not much more to say. They did this to handle more data science workloads, but it shows Snowflake’s continued focus on customers when they need something more.]
#20. ̶D̶a̶t̶a̶ Marketplace Improvements. The Marketplace is one of my favorite things about Snowflake. They mostly announced incremental changes.
Final Note: I hope you find this article useful and please let me know in the comments if you feel I missed anything really important.
I attempted to make it as short as possible while still providing enough detail so that you could understand that Snowflake Summit 2022 contained many significant announcements and moves forward by the company.
Quick “Top 3” Takeaways for me from Snowflake Summit 2022:
- Snowflake is now positioning itself way, way beyond a cloud database or data warehouse. It is now defining itself as a full-stack business solution environment capable of creating business applications.
- Snowflake is emphasizing that it is not just about data: it can handle “ALL WORKLOADS” (Machine Learning, Traditional Data Workloads, Data Warehouse, Data Lake, Data Applications), and it now has a Native App and Streamlit development toolset.
- Snowflake is expanding wherever it needs to be in order to be a full data-anywhere, anytime data cloud. The push into better streaming data pipelines from Kafka, etc., and the new on-prem connectors allow Snowflake to take over more and more customer data cloud needs.
Snowflake at a very high level wants to:
- Disrupt Data Analytics
- Disrupt Data Collaboration
- Disrupt Data Application Development
Want more recap beyond JUST THE FEATURES?
Here is a more in-depth take on the Keynote 7 Pillars that were mentioned:
Frank Slootman Recap:
MINUTE: ~2 to ~15 in the video
Snowflake related Growth Stats Summary:
2019: 938 Employees
2022 at Summit: 3992 Employees
2019: 948 Customers
2022 at Summit: 5944 Customers
*Total Revenue Growth:
2022 at Summit: 1.2B
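To put the headcount and customer numbers above in perspective, here is a minimal Python sketch that turns them into growth multiples. The helper names (`growth_multiple`, `percent_growth`) are my own for illustration and are not from the keynote, and revenue is left out because only the 2022 figure is quoted above.

```python
def growth_multiple(start: float, end: float) -> float:
    """Return how many times larger `end` is than `start`."""
    if start <= 0:
        raise ValueError("start must be positive")
    return end / start


def percent_growth(start: float, end: float) -> float:
    """Return the percentage increase from `start` to `end`."""
    return (growth_multiple(start, end) - 1.0) * 100.0


# Figures as quoted in this recap: (2019, 2022 at Summit).
stats = {
    "Employees": (938, 3992),
    "Customers": (948, 5944),
}

for label, (y2019, y2022) in stats.items():
    multiple = growth_multiple(y2019, y2022)
    pct = percent_growth(y2019, y2022)
    print(f"{label}: {multiple:.2f}x (+{pct:.0f}%)")
```

On these figures, that works out to roughly 4.3x the employees and roughly 6.3x the customers between the 2019 and 2022 Summits.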
Large emphasis on MISSION PLAN and INDUSTRY/VERTICAL Alignment.
MINUTE: ~15 to ~53 – Frank Slootman and Benoit
MINUTE: ~53 to 57:45 – Christian intros.
Frank introduces the pillars of Snowflake INNOVATION and then Benoit and Christian delve into these 7 Pillars in more depth.
Let’s go through the 7 PILLARS OF SNOWFLAKE INNOVATIONS!
- ALL DATA – Snowflake is emphasizing that they can handle not only Structured and Semi-Structured Data but also Unstructured Data at ANY SCALE. Benoit even said companies can scale out to 100s of Petabytes.
- ALL WORKLOADS – There is a massive push by Snowflake to provide an integrated “all workload” platform. They define this as all types of data and all types of workloads (emphasizing that it can now handle all ML/AI-type workloads via Snowpark and most ). [My take: Snowflake’s original architectural separation of compute and storage is still what makes it so, so powerful.]
- GLOBAL – An emphasis that Snowflake, based on SnowGrid, is a fully global Data Cloud platform. As of today, Snowflake is deployed in over 30 cloud regions across the three main cloud providers. Snowflake works to deliver a unified global experience with full replication and failover to multiple regions, based on its unique SnowGrid architecture.
- SELF-MANAGED – Snowflake still is focusing a TON on continuing to make Snowflake SIMPLE and easy to use.
- MARKETPLACE – Snowflake emphasizes its continued focus on building more and more functionality on the Snowflake Marketplace (rebranded now since it will contain both native apps and data shares). Snowflake continues to make the integrated marketplace as easy as possible for sharing data and data applications.
- GOVERNED – Frank’s story from the 2019 keynote: someone grabbed him and said, “You didn’t talk about GOVERNANCE!” [so Frank and everyone talked a ton about it this time!] – Snowflake and Frank state that there is a continuous heavy focus on Data Security and Governance.
OTHER KEY PARTS OF THE KEYNOTE VIDEO:
[ fyi – if you didn’t access it already the FULL Snowflake Summit 2022 Opening Keynote is here:
MINUTE: ~57:45 to 67 (1:07) – Linda Appsley – GEICO testimonial on Snowflake.
MINUTE: Goldman Executive presentation.