Welcome to our Frank’s Future of Data four-part series. In these articles, we will cover a few tips on how to get value out of your Snowflake data.
I spend a ton of time reviewing and evaluating the ideas, concepts, and tools around data. The “data concept” space has been exploding with new concepts and ideas, and there are so many new data “this” and data “that” tools as well. So I wanted to bring data professionals and business leaders back to the core concept that matters in the creation, collection, and usage of data: Data to Value.
In layman’s terms, the main concept is that we need to remember that the entire point of collecting and using data is to create business, organizational, and/or individual value. This is the core principle that we should keep in mind when contemplating the value that data provides.
The truth is that while the technical details and jargon involved in creating and collecting data, as well as realizing its value, are important, many users find them overly complex.
For a moment, let’s set aside the technical jargon that can be overused and misused, such as Data Warehouse, Data Lake, Data Mesh, and Data Observability. I’ve noticed that data experts and practitioners often have differing views on the latest concepts. These views can be influenced by their data education background and the types of technologies they were exposed to.
Therefore, I created these articles to prepare myself for taking advantage of the new paradigms that Snowflake and other “Modern Data Stack” tools and clouds provide.
In Part 1 of the Data to Value series, we will cover the Data to Value trends you need to be aware of.
Data to Value Trends:
In 2018, I had the opportunity to consult with some highly advanced and mature data engineering solutions. Some of these solutions were actively adopting Kafka/Confluent to achieve true “event-driven data processing”. This represented a significant departure from the traditional batch processing that had been used in 98% of the implementations I had previously encountered. I found the idea of using continuous streams of data from different parts of the organization, delivered via Kafka topics, to be quite impressive. At the same time, these concepts and paradigm shifts were quite advanced and likely only accessible to very experienced data engineering teams.
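To make the batch versus event-driven difference concrete, here is a toy sketch in Python. It is not Kafka code and it does not model any real implementation I worked on. The `Event` class, the logical "ticks", and both function names are assumptions made up for illustration. It only shows how long each piece of data waits before it can be used.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    """A hypothetical business event with a logical arrival time."""
    key: str
    arrived_at: int  # logical tick, e.g. minutes since midnight


def batch_delays(events: List[Event], batch_close_tick: int) -> List[int]:
    """In a batch model, every event waits until the batch closes."""
    return [batch_close_tick - e.arrived_at for e in events]


def streaming_delays(events: List[Event], per_event_cost: int = 0) -> List[int]:
    """In an event-driven model, each event is handled as it arrives,
    so its delay is only the (assumed) per-event processing cost."""
    return [per_event_cost for _ in events]


events = [Event("order-1", 0), Event("order-2", 10), Event("order-3", 20)]
print(batch_delays(events, batch_close_tick=24))
print(streaming_delays(events))
```

In this simplified model, the earliest event in a batch waits the longest, while the event-driven version keeps the wait flat no matter when the data shows up.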
1) – Non-stop push for faster speed of Data to Value.
Within our non-stop dominantly capitalist world, faster is better and often provides advantages to organizations, especially around improved value chains and concepts such as supply chains. Businesses and organizations continuously look for any advantage they can get. I kinda hate linking to McKinsey for backup but here it goes. Their number 2 characteristic for the data-driven enterprise of 2025 is “Data is processed and delivered in real-time”.
2) – Data Sharing.
More and more Snowflake customers are realizing the massive advantage of data sharing, which lets them share “no-copy,” in-place data in near real-time. Data Sharing is a massive competitive advantage if set up and used appropriately. You can securely provide or receive access to data sets and streams across your entire business or organization value chain, as long as those parties are also on Snowflake. This allows access to data sets at reduced cost and risk thanks to micro-partitioned, zero-copy, securely governed data access.
3) – Creating Data with the End in Mind.
When you think about using data for value and logically think through the creation and consumption life cycle, it becomes clear why data professionals and organizations are realizing there are advantages to capturing data in formats that are ready for immediate processing. If you design your data creation and capture as logs or other outputs that can be easily and immediately consumed, you can gain faster data-to-value cycles, creating competitive advantages with certain data streams and sets.
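As a minimal sketch of what “ready for immediate processing” can look like, the Python below writes each event as one self-describing JSON line. The field names (`event_type`, `event_time`, `schema_version`) and the `make_event` and `consume` helpers are assumptions for illustration, not a Snowflake requirement or a standard.

```python
import json
from datetime import datetime, timezone
from typing import Iterable, List, Optional


def make_event(event_type: str, payload: dict,
               now: Optional[datetime] = None) -> str:
    """Serialize one event as a single JSON line with a timestamp
    and a schema version, so a consumer can parse it without guessing."""
    ts = (now or datetime.now(timezone.utc)).isoformat()
    record = {
        "event_type": event_type,
        "event_time": ts,
        "schema_version": 1,  # hypothetical versioning convention
        **payload,
    }
    return json.dumps(record, sort_keys=True)


def consume(lines: Iterable[str]) -> List[dict]:
    """A downstream consumer needs no custom parsing, only json.loads."""
    return [json.loads(line) for line in lines]


line = make_event("order_created", {"order_id": 42, "amount": 19.99})
print(consume([line])[0]["order_id"])
```

Newline-delimited JSON like this can typically be loaded by most data platforms with little or no transformation, which is the whole point of creating data with the end in mind.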
4) – Automated Data Applications.
I see some really big opportunities with Snowflake’s Native Applications and the Streamlit integration. Bottom line, there is a need for consolidated “best-of-breed” data applications that can have a low-cost price point due to massive volumes of customers.
5) – Fully Automated Data Copying Tools.
The growth of Fivetran and Stitch (now part of Talend) has been amazing. We are now also seeing huge growth in automated data copy pipelines going the other way, like Hightouch. At IT Strategists, we became a partner with Stitch, Fivetran, and Matillion back in 2018.
6) – Full Automation of Data Pipelines and more integrated ML and Data Pipelines.
With the introduction of a fully automated data object and pipeline service at Coalesce, we saw for the first time how data professionals can improve Data to Value through fully automated data objects and pipelines. Some of our customers are referring to parts of Coalesce as a Terraform-like product for data engineering. What I see is a massive removal of data engineering friction, similar to what Fivetran and Hightouch did but in a separate area of the data processing stack. We became an early partner with Coalesce because we view it much as we viewed Snowflake at the beginning of 2018. We view Coalesce as just making Snowflake even more amazing to use.
7) – The Data Mesh Concept(s) and Data Observability.
Love these concepts or hate them, they are taking hold within the overall data professionals’ brain trust. Zhamak Dehghani (previously at Thoughtworks) and Thoughtworks have, from 2019 until now, succeeded in communicating the concept of a Data Mesh to the market. Meanwhile, Barr Moses of Monte Carlo has been beating the drum very hard on the concept of Data Observability. I’m highlighting these data concepts as trends that are aligned with improving Data to Value speed, quality, and accessibility. There are many more data concepts besides these two. Time will reveal which of these will gain mind and market share and which will go by the wayside.
That is it for Part 1 of the Frank’s Future of Data series. In Part 2, we will continue exploring more trends that we should keep in mind, as well as Snowflake’s announcements related to Data to Value.