Data to Value helps prioritize data-related investments
This is the third article in my three-part series on Data to Value. The key takeaway from this series is that we always need to understand the value of our data. We also need to measure how fast we can go from data to business value. C-level execs and others focused on strategic data initiatives need to use Data to Value metrics. Then we can understand the true value derived from our data creation, collection, extraction, transformation, loading, and analytics, which allows us to invest better in data initiatives for our organizations and ourselves. Finally, data can only produce true value if it is accurate and of known quality.
I summarize my 7 Data to Value trends below. If you want to check out my initial Data to Value trends, 1 to 4 are covered in the Data to Value – Part 1 and Data to Value – Part 2 articles.
Here are the Data to Value Trends that I think you need to be aware of (there are a few others though as well!):
Trend #1 – Non-stop push for faster speed of Data to Value.
In our nonstop, predominantly capitalist world, faster is better! Data to Value speed advantages, especially around improved value chains, can create massive business advantages for organizations.
Trend #2 – Data Sharing. See Part 2
Trend #3 – Creating Data with the End in Mind. See Part 2
Trend #4 – Automated Data Applications. See Part 2
Trend #5 – Fully Automated Data Direct Copy Tools.
Trend #6 – Full Automation of Data Pipelines and more integrated ML and Data Pipelines.
With the introduction of a fully automated data object and pipeline service at Coalesce, we saw for the first time data professionals improving Data to Value through fully automated data objects and pipelines. Some of our customers are referring to parts of Coalesce as a Terraform-like product for data engineering. What I see is a massive removal of data engineering friction, similar to what Fivetran and Hightouch did but in a different area of the data processing stack. We have become an early partner with Coalesce because we think it is similar to how we viewed Snowflake at the beginning of 2018. We view Coalesce as just making Snowflake even more amazing to use.
Trend #7 – Data Mesh, Data Observability, and similar concepts.
Love these concepts or hate them, they are taking hold within the overall data professionals’ brain trust. From 2019 until now, Zhamak Dehghani (previously at Thoughtworks) and Thoughtworks have succeeded in communicating the concept of a Data Mesh to the market. Meanwhile, Barr Moses from Monte Carlo has been beating the drum very hard on the concept of Data Observability. I’m highlighting these data concepts as trends that are aligned with improving Data to Value speed, quality, and accessibility. There are many more data concepts besides these two. Time will reveal which of these will gain mind and market share and which will go by the wayside.
Some other things that we should keep in mind are:
– Growth of Fivetran and now Hightouch.
The growth of Fivetran and Stitch (now Talend) has been amazing. We are now also seeing huge growth in automated data copy pipelines going the other way, focused on Reverse ETL (Reverse Extract, Transform, and Load), like our partner Hightouch. At our IT Strategists consulting firm, we became a partner with Stitch, Fivetran, and Matillion back in 2018. At Snowflake’s Partner Summit back in 2018, I sat next to Jake Stein, one of the founders of Stitch, on the bus from San Francisco to the event in Sonoma, and we quickly became good friends. (Jake is an excellent entrepreneur and, after selling Stitch to Talend, is now focused on a new startup, Common Paper, a structured contracts platform.) I also met George Fraser from Fivetran at the event and mentioned how he was killing it with his post comparing all the cloud databases back in 2018 [there was no other content like that back then].
– Resistance to “ease of use” and “cost reductions” is futile.
As a consultant at the time, part of me wanted to resist these “Automated EL Tools”: EL (Extract and Load), as opposed to ETL (Extract, Transform, and Load) or ELT (Extract, Load, and then Transform within the database). As I tested out Stitch and Fivetran though, I knew that resistance was futile. The ease of use of these tools and the reduction of development and maintenance costs cannot be overlooked. There was no way to stop the data market from embracing these easier-to-use data pipeline automation tools. What was even more compelling is that you can set up automated extract and load jobs within minutes or hours most of the time. This is UNLIKE any of the previous ETL tools we have been using for decades, which were mostly software installations. These installations took capacity planning, procurement, and all sorts of organizational business friction to EVEN get started at all. With Fivetran and Hightouch, there is no engineering or developer expertise needed for almost all of the work. [In certain situations, it helps to have data engineers and architects involved.] Overall though, it is just a simple concept of connecting DESTINATIONS and CONNECTORS. In Fivetran, DESTINATIONS are databases or data stores. CONNECTORS are sources of data (Zendesk, Salesforce, or one of the hundreds of other connectors in Fivetran). Fivetran and Hightouch are excellent examples of data service/tool trends that truly improve the speed of Data to Value.
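To make the CONNECTORS and DESTINATIONS idea concrete, here is a minimal Python sketch of the concept. It is only an illustration under my own assumptions: the Connector, Destination, and sync names are hypothetical and are not Fivetran's or Hightouch's actual API, and the rows are canned sample data rather than a real Zendesk extract.

```python
from dataclasses import dataclass, field


@dataclass
class Connector:
    """Hypothetical model of a data source, e.g. Zendesk or Salesforce."""
    name: str
    rows: list[dict] = field(default_factory=list)

    def extract(self) -> list[dict]:
        # A real tool would call the source's API; here we return canned rows.
        return list(self.rows)


@dataclass
class Destination:
    """Hypothetical model of a database or data store that receives rows."""
    name: str
    tables: dict[str, list[dict]] = field(default_factory=dict)

    def load(self, table: str, rows: list[dict]) -> int:
        # Append the rows to the named table, creating it if needed.
        self.tables.setdefault(table, []).extend(rows)
        return len(rows)


def sync(connector: Connector, destination: Destination) -> int:
    """Extract and Load only (EL): nothing is transformed in flight."""
    return destination.load(connector.name, connector.extract())


# Sample data is made up for illustration.
zendesk = Connector("zendesk", rows=[{"ticket_id": 1}, {"ticket_id": 2}])
warehouse = Destination("warehouse")
loaded = sync(zendesk, warehouse)
```

The point of the sketch is how little there is to configure: pick a source, pick a destination, and the tool handles the copy. Any transformation happens later, inside the destination.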
Also, a MAJOR MAJOR trend that has been “trying” to push the needle forward on Data to Value for quite a while is the growth of automated, integrated Machine Learning pipelines with data. This is what DataRobot, Dataiku, H2O, SageMaker, and tons and tons of others are attempting to do. It still seems very early stage, and no single vendor has large mindshare or adoption yet. Overall the space is fragmented right now, and it’s hard to tell which of these tools and vendors will survive and thrive.
This article is part of my Frank’s Future of Data series, which I put together to prepare myself for taking advantage of the new paradigms that the “Snowflake Data Cloud” and other “Modern Data Stack” tools/clouds provide. Before I started my Snowflake journey, I was often speaking about the intersection of Data, Automation, and AI/ML. I truly believe these forces have been changing our world everywhere and will continue to do so for many years.
Data to Value is a key concept that helps us prioritize how to invest in our data-related initiatives.
I hope you found this useful for thinking about how you should decide on data-related investments and initiatives. Focusing specifically on Data to Value can help you prioritize and simplify what is truly most important for your organization! Did I miss any Data to Value trends? Hit me up in the comments or directly if you have additional trends.
Good Luck to you all!