What is Data Sharing?
Let’s begin with data. Enterprises generate data from the software they use in their business: for example, how many people view a website, or what kinds of people are most interested in a certain brand. At its simplest, data sharing means making data resources available to many users or applications while ensuring data fidelity for everyone participating.
Why is this relevant today? Data sources are now continuous, so the volume of data coming from them keeps growing. The main focus of data sharing has become how to move these increasing volumes of data and how to keep the data accurate and secure. The cloud expands what data sharing is capable of: with the modern cloud, people can share live data inside and outside their business, eliminate data silos, grant access to specific data sets, and more. This requires a platform that can put data sharing into motion and make it work to its potential, and this is where Snowflake comes into the picture.
Snowflake and Data Sharing
Snowflake enables data collaboration while lowering costs. It gives organizations the ability to securely share and access live data. Access to shared data is secured and governed, and you can also publish data sets. That is only a brief preview of what data sharing in Snowflake can do, so let’s take a deeper look at the parts that play a role and how they come together.
What are Data Providers and Consumers?
A data provider is a Snowflake account that creates shares and makes them available to other Snowflake accounts. A provider shares a database with one or more Snowflake accounts, and for each shared database Snowflake supports grants that control access to objects within the database. There is no limit on how many shares a provider can create or how many accounts can be added to a share.
A data consumer is an account that creates a database from a share made available by a data provider. Once you add a shared database to your account, you can access and query the objects within it. There is no limit on how many shares you can consume from data providers, but you can create only one database per share.
What is a Share?
In Snowflake, shares are objects that encapsulate all the information needed to share a database. A share contains the privileges that grant access to the database and schema containing the objects to share, the privileges that grant access to specific objects within the database, and the consumer accounts with which the database and its objects are shared.
When a database is created from a share, the shared objects become available to users in the consumer account. Shares are customizable, secure, and fully controlled from the provider account. This means that objects newly added to a share become available to consumers right away, giving them real-time access to the data, and that access to a share or to any object within it can be revoked at any time.
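To make the idea of a share more concrete, here is a minimal Python sketch of a share as a set of granted objects plus a set of consumer accounts. It is a toy model for illustration only, not Snowflake's implementation; the class, the method names, and the object and account names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Share:
    """Toy model of a share: granted objects plus the accounts allowed to see them."""
    name: str
    objects: set = field(default_factory=set)
    accounts: set = field(default_factory=set)

    def grant(self, obj: str) -> None:
        # The provider adds an object to the share.
        self.objects.add(obj)

    def revoke(self, obj: str) -> None:
        # The provider rescinds access to an object.
        self.objects.discard(obj)

    def add_account(self, account: str) -> None:
        self.accounts.add(account)

    def remove_account(self, account: str) -> None:
        self.accounts.discard(account)

    def visible_to(self, account: str) -> set:
        # Consumers see the provider's current grants; nothing is copied.
        return set(self.objects) if account in self.accounts else set()


# Hypothetical names, used only for illustration.
share = Share("sales_share")
share.grant("sales_db.public.orders")
share.add_account("consumer_a")
share.grant("sales_db.public.customers")  # visible to consumer_a right away
share.revoke("sales_db.public.orders")    # access rescinded by the provider
```

Because the consumer's view is computed from the provider's current grants, adding or revoking an object takes effect without any action on the consumer side, which mirrors the provider-controlled behavior described above.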
How does Secure Data Sharing Function in Snowflake?
With secure data sharing, the data itself is not copied or moved between accounts, as one might expect. Sharing is done through Snowflake’s services layer and metadata store. As a result, shared data does not take up storage in a consumer account and does not add to the consumer’s monthly data storage costs. The only charges are for the compute resources used to query the shared data.
Because the data is not copied or exchanged, secure data sharing is quick and easy for providers to set up, and shared data is quickly available to consumers. Let’s take a closer look at how data sharing works for both the provider and the consumer:
Provider: Creates a share of a database within their account and grants access to objects within the database. A provider can also share data from multiple databases, as long as the databases belong to the same account. Lastly, the provider adds one or more accounts to the share; these can include your own accounts if you have several within Snowflake.
Consumer: Creates a read-only database from the share. Access to the database can be configured using the same access control that Snowflake provides for other objects.
Snowflake’s architecture allows a provider to share data with many consumers (even those within its own organization) and allows a consumer to access shared data from many providers.
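The provider and consumer steps above can be sketched as the SQL statements each side might run. The Python helpers below only build the statement strings and do not connect to Snowflake. The statement shapes reflect my understanding of Snowflake's share-related SQL commands and should be checked against the official documentation; every share, database, table, account, and role name is a hypothetical placeholder.

```python
def provider_statements(share, database, schema, table, consumer_accounts):
    """Build the SQL a provider might run to share one table (sketch only)."""
    accounts = ", ".join(consumer_accounts)
    return [
        # 1. Create the share object.
        f"CREATE SHARE {share};",
        # 2. Grant access to the database, the schema, and the object to share.
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share};",
        f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO SHARE {share};",
        # 3. Add one or more consumer accounts to the share.
        f"ALTER SHARE {share} ADD ACCOUNTS = {accounts};",
    ]


def consumer_statements(provider_account, share, local_database, role):
    """Build the SQL a consumer might run to use a share (sketch only)."""
    return [
        # Create the read-only database from the share.
        f"CREATE DATABASE {local_database} FROM SHARE {provider_account}.{share};",
        # Let a role in the consumer account query the shared objects.
        f"GRANT IMPORTED PRIVILEGES ON DATABASE {local_database} TO ROLE {role};",
    ]


# Hypothetical identifiers, used only for illustration.
provider_sql = provider_statements(
    "sales_share", "sales_db", "public", "orders", ["consumer_acct_1", "consumer_acct_2"]
)
consumer_sql = consumer_statements("provider_acct", "sales_share", "shared_sales", "analyst")

for statement in provider_sql + consumer_sql:
    print(statement)
```

Keeping the statements as plain strings makes it easy to review them before running them through whichever Snowflake client you use.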
What Information is Shared with Providers?
Providers in Snowflake can view a few details about the consumers who have access to their data.
Providers can see the consumer’s Snowflake account name and Snowflake organization name. They can also see statistics on data consumption, including the date of consumption and the number of queries a consumer account runs on a provider’s share.
Lastly, providers can see any information that a consumer provides when submitting a data request, such as the consumer’s business email and company name.
Can I Share with Third Parties?
Data sharing can only occur between Snowflake accounts. However, as a provider, you might want to share data with a consumer who does not have a Snowflake account, and there is a way to do this.
To share data with such consumers, Snowflake provides reader accounts. These accounts allow data to be shared without requiring the consumer to become a Snowflake customer. A reader account belongs to the provider account that created it. The provider account uses shares to share databases with its reader accounts, and a reader account can only consume data from the provider account that created it.
Users in a reader account can query data that has been shared with it, but they cannot perform the DML tasks that are allowed in a full account.
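As a rough illustration of reader accounts, the Python sketch below builds the statement a provider might use to create one and adds a toy check for the query-only restriction. The CREATE MANAGED ACCOUNT shape reflects my understanding of Snowflake's SQL and should be verified against the documentation. The DML check is a simplification written for this article, not how Snowflake enforces the restriction, and all names and the password placeholder are hypothetical.

```python
# Statement keywords treated as DML in this toy check (an assumption for illustration).
DML_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "MERGE"}


def reader_account_statement(name, admin_name, admin_password):
    """Build the SQL a provider might run to create a reader account (sketch only)."""
    return (
        f"CREATE MANAGED ACCOUNT {name} "
        f"ADMIN_NAME = {admin_name}, "
        f"ADMIN_PASSWORD = '{admin_password}', "
        f"TYPE = READER;"
    )


def allowed_in_reader_account(sql):
    """Toy check: reject statements whose first keyword is a DML keyword."""
    words = sql.split()
    return bool(words) and words[0].upper() not in DML_KEYWORDS


# Hypothetical values; never hard-code a real password.
create_reader = reader_account_statement("reader_acct1", "reader_admin", "<choose-a-password>")
print(create_reader)
print(allowed_in_reader_account("SELECT * FROM shared_sales.public.orders"))
print(allowed_in_reader_account("INSERT INTO shared_sales.public.orders VALUES (1)"))
```

Checking only the first keyword is deliberately naive; it is meant to show the query-only idea, not to act as a SQL parser.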
Now that we have covered an overview of data sharing and how it works within Snowflake, let’s take a look at some other features that come with Snowflake’s data sharing.
Products that use Secure Data Sharing in Snowflake
Snowflake provides several products that use Secure Data Sharing to connect data providers with consumers: Direct Share, Snowflake Data Marketplace, and Data Exchange.
Direct Share is one of the easiest ways to share data: it allows account-to-account sharing using Snowflake’s Secure Data Sharing. As the provider, you can share data with other companies so that it is viewable in their Snowflake accounts without having to move or copy it.
Snowflake Data Marketplace
All Snowflake accounts can use the Snowflake Data Marketplace, as long as they are in non-VPS regions on supported cloud platforms. The Data Marketplace uses Snowflake’s Secure Data Sharing to connect providers with consumers, just like Direct Share.
You can discover and access third-party data and have the data sets available in your Snowflake account to query without transformation and to join with your own data. The Data Marketplace gives you a single location from which to source your data, which helps when you use several different data vendors.
Lastly, as a provider you can publish data in the Data Marketplace, which is useful for data monetization and as a different route to market.
Data Exchange lets you collaborate securely around data with a group of members that you invite, which helps providers publish data that those consumers can discover. You can share data across your entire business, including customers, partners, and units within your own organization. It also lets you control who can publish, consume, or simply access data. Specifically, you can invite members and decide whether they may provide data, consume data, or both. Data Exchange is supported for all Snowflake accounts hosted in non-VPS regions on all supported cloud platforms.
These three products are useful to both provider and consumer accounts. Now that we have looked at how data sharing functions and which products use it, let’s look at how you actually work with data that is shared with you and data that you share with others.
Working with Shared Data
With a foundation in what Direct Share, Snowflake Marketplace, and Data Exchange are and how they function, we can look at the concepts and tools available within them.
In Snowflake, if you have the ACCOUNTADMIN role, you can use the Shared Data page in the new Snowflake web interface to complete most tasks for creating and managing shares. As we continue, keep in mind that inbound refers to data that is shared with you and outbound refers to data that your account has shared.
Data Shared with You
Provider accounts can use Direct Share, Data Exchange, or the Snowflake Marketplace to share data with your account as inbound shares. With inbound shares, you can view the shares from providers, including who provided a share and how the data was shared, and you can create a database from a share.
Within the Snowflake web interface, the “Shared With Me” tab shows inbound shared data for:
Direct shares that are shared with you. These are placed into two groups: 1. direct shares that are ready to get, and 2. direct shares that have been imported into a database and can be queried.
Listings for data exchanges that you can access. The data is shown under the name of the originating data exchange; if you have access to more than one data exchange, each is shown in a separate section.
Listings for Snowflake Marketplace data that has been imported into a database and can be queried. The tab does not show Marketplace shares that are ready to get; you can find those data listings in the Marketplace menu.
Data You Shared
Outbound shares are created within your account to share data with consumers. As with inbound shares, you can share data through Direct Share, Data Exchange, and the Snowflake Marketplace.
With outbound shares you can:
See the shares you created or have access to. The information shown includes the database for the share, the consumer accounts that can access the share, the date the share was created, and the objects that are shared.
Create and edit both shares and data listings.
Remove access to a share for individual consumer accounts.
Back in the web interface, the “Shared by My Account” tab shows outbound shares from the Snowflake Marketplace, Data Exchange, and Direct Share.
Icons next to each share indicate its sharing mechanism: Direct Share, Data Exchange, or Snowflake Marketplace.
Lastly, you can apply these filters when viewing your shared data:
Type, shown as the “All Types” drop-down, which can be used to distinguish direct shares from listings.
Consumer, shown as the “Shared With” drop-down, which can be used to choose a specific consumer or data exchange with which the data has been shared.
Data that is Shared
There are several ways to share data:
1. Use direct share to directly share data with consumers
2. In the Snowflake Marketplace, post a listing
3. In data exchange, post a listing
Furthermore, when you want to share data from the web interface, use the “Share Data” drop-down and choose from the list of platforms where you can share data.
Within the web interface, inbound and outbound requests appear in the “Requests” tab. However, this tab does not show data requests from the Snowflake Data Marketplace.
Let’s also take a step back and look at what exactly inbound and outbound requests are.
Inbound requests come from consumers who are requesting access to your data. You can sort these requests by status and then review them. Outbound requests are requests that you submit for data listings from other providers. As with inbound requests, you can sort them by status. Keep in mind that requests you make can be rejected, but you can resubmit them.
With certain roles, such as the Data Exchange Admin role, or with Provider Profile Level Privileges, you can create and manage provider profiles within the “Manage Exchanges” tab. However, if your organization does not have a data exchange, you will not see the “Manage Exchanges” tab.
With a provider profile, you can perform the following tasks within a data exchange:
Create, update, and delete a profile
Update contact email
Manage profile editors
Now that we have gone through an overview of data sharing, you should understand the parts that make up data sharing in Snowflake and the various functions it offers!