Introduction:
Big data refers to extremely large datasets that can be analyzed to identify patterns, trends, and associations. The analysis of big data provides insights into various fields, including business, science, and government. However, the challenge with big data is not just analyzing it, but also storing, managing, and sharing it. This is where technologies like Snowflake come into play, as they offer a secure platform for storing and sharing large amounts of data.
Part 1: What is Data Sharing?
Let’s begin with data itself. Data can come from the software that enterprises use in their business: for example, how many people are viewing a website, or what kind of people are most interested in a certain brand. At a basic level, data sharing means making the same data resources available to many users or applications while ensuring data fidelity for everyone involved.
How is this relevant today? Data sources now produce data continuously, so the volume of data from every source keeps growing. The main focus of data sharing has become how to move these increasing volumes of data and how to ensure that the data stays accurate and secure. The cloud expands what data sharing is capable of. With the modern cloud, organizations can share live data inside and outside the business, eliminate data silos, grant access to specific data sets, and more. This requires a platform that can put data sharing into practice and let it reach its potential, and this is where Snowflake comes into the picture.
Snowflake and Data Sharing
Snowflake enables data collaboration while lowering costs. It gives organizations the ability to securely share data and access live data. Access to shared data is secured and governed, and you can also publish data sets. That is only a brief preview of what data sharing in Snowflake can do, so let’s take a deeper look at the parts involved and how they fit together.
Part 2: What are Data Providers and Consumers?
A data provider is an account in Snowflake that creates shares that can be accessed by other accounts in Snowflake. For each shared database, Snowflake uses grants to control access to selected objects within the database. There are no restrictions on the number of shares that can be created or accounts that can be added to a share.
A data consumer is an account that creates a database from a Share made accessible by a data provider. When you add a shared database to your account, you can access and query the objects within it. There are no limitations on how many Shares you can consume from data providers, but you can only create one database for each Share.
What is a Share?
In Snowflake, Shares are objects that contain all the necessary information for sharing a database. Shares include permissions that provide access to the database and schema containing the objects to be shared, as well as access to the specific objects themselves. A Share also records the consumer accounts with which the database and its objects are shared.
When a database is created from a Share, the objects shared within it become available to any users within the consumer account. Shares can be customized, are secure, and are fully controlled by the provider account. Objects added to a Share become available to consumers in real time, and the provider account can rescind access to a Share or any of its objects at any time.
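To make the idea of a Share concrete, here is a small toy model in Python. It is not Snowflake code and does not talk to Snowflake; the class, the share name, and the object and account names are all made up for illustration. It only mirrors the behavior described above: a Share bundles a database, the granted objects, and the consumer accounts, and any change the provider makes is visible to consumers right away.

```python
from dataclasses import dataclass, field


@dataclass
class Share:
    """Toy stand-in for a share: which objects are granted, and to which accounts."""
    name: str
    database: str
    objects: set = field(default_factory=set)    # e.g. "PUBLIC.ORDERS"
    accounts: set = field(default_factory=set)   # consumer account names

    def visible_objects(self, account):
        """Objects an account can see right now; empty if it is not on the share."""
        return set(self.objects) if account in self.accounts else set()


# Hypothetical names throughout.
share = Share(name="SALES_SHARE", database="SALES_DB")
share.objects.add("PUBLIC.ORDERS")
share.accounts.add("CONSUMER_A")

# A newly granted object is visible immediately; nothing is copied to the consumer.
share.objects.add("PUBLIC.CUSTOMERS")
after_grant = share.visible_objects("CONSUMER_A")

# The provider can rescind a single object, or an account's access entirely.
share.objects.discard("PUBLIC.ORDERS")
after_object_revoke = share.visible_objects("CONSUMER_A")
share.accounts.discard("CONSUMER_A")
after_account_removal = share.visible_objects("CONSUMER_A")
```

The consumer never holds its own copy of the object list; it always reads the provider's current state, which is why revoking access takes effect at once.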
Part 3: How Does Secure Data Sharing Work in Snowflake?
When securely sharing data, the data is not copied or transferred between accounts, as one might assume. Rather, sharing is accomplished through Snowflake’s services layer and metadata store. As a result, shared data does not occupy storage space within a consumer account, and therefore does not contribute to the consumer's monthly data storage costs. However, the consumer is charged for the compute resources required to query the shared data.
Because the data itself is not copied or exchanged, Secure Data Sharing is quick and easy for providers to set up, and shared data is available to consumers almost immediately. Let’s take a closer look at how data sharing works for both the provider and the consumer:
Provider:
As a provider, you create a share of a database in your account and then grant the share access to specific objects within that database. You can also share data from multiple databases, as long as those databases belong to the same account. Finally, you add one or more accounts to the share, which can include any other accounts that you have within Snowflake.
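The provider steps above map onto a handful of Snowflake SQL commands: CREATE SHARE, GRANT ... TO SHARE, and ALTER SHARE ... ADD ACCOUNTS. The Python sketch below only builds those statements as strings so you can see the order; it does not connect to Snowflake. The helper function and every share, database, schema, table, and account name are hypothetical, and the identifiers are assumed to be trusted values rather than user input.

```python
def provider_share_statements(share, database, schema, tables, accounts):
    """Build the SQL a provider might run to create and populate a share."""
    statements = [
        # 1. Create the (initially empty) share.
        f"CREATE SHARE {share};",
        # 2. Let the share use the database and schema that hold the objects.
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share};",
    ]
    # 3. Grant read access to each table that should be visible to consumers.
    for table in tables:
        statements.append(
            f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO SHARE {share};"
        )
    # 4. Add the consumer accounts that may create a database from the share.
    statements.append(f"ALTER SHARE {share} ADD ACCOUNTS = {', '.join(accounts)};")
    return statements


# Hypothetical example values.
provider_sql = provider_share_statements(
    share="sales_share",
    database="sales_db",
    schema="public",
    tables=["orders", "customers"],
    accounts=["myorg.consumer1", "myorg.consumer2"],
)
for statement in provider_sql:
    print(statement)
```

Note that the grants go to the share itself, not to a role. The share is the unit that carries permissions across the account boundary.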
Consumer:
As a consumer, you create a read-only database from the Share. You can then control access to that database using the same access control that Snowflake provides for other objects.
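On the consumer side, this usually comes down to two statements: CREATE DATABASE ... FROM SHARE to mount the share as a read-only database, and GRANT IMPORTED PRIVILEGES to let a role in your account query it. As a rough sketch, the Python helper below just assembles those statements as text; the function and all names passed to it are hypothetical.

```python
def consumer_database_statements(database, provider_account, share, role):
    """Build the SQL a consumer might run to use an inbound share."""
    return [
        # Create the read-only database backed by the provider's share.
        f"CREATE DATABASE {database} FROM SHARE {provider_account}.{share};",
        # Allow a role in the consumer account to query the shared objects.
        f"GRANT IMPORTED PRIVILEGES ON DATABASE {database} TO ROLE {role};",
    ]


# Hypothetical example values.
consumer_sql = consumer_database_statements(
    database="sales_from_partner",
    provider_account="myorg.provider1",
    share="sales_share",
    role="analyst",
)
for statement in consumer_sql:
    print(statement)
```

Because the database is read-only, the role can query the shared objects but cannot modify them.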
Snowflake's architecture lets a provider share data with many consumers, including accounts within the provider's own organization, and lets a consumer access shared data from many providers.
What Information is shared with Providers?
Snowflake providers have access to certain information about consumers who access their data. This includes the consumer's Snowflake account and organization names. Providers can also view statistical data about data consumption, such as the date of consumption and the number of queries generated by the consumer account on a provider's Share.
In addition, providers can see any information that a consumer provides at the time of data request submissions, such as the consumer's business email and company name.
Can I share with Third Parties?
Sharing data is only possible between Snowflake accounts. However, if you're a provider within Snowflake, you may want to share data with a consumer outside of Snowflake. Luckily, Snowflake has created reader accounts to facilitate this process.
Reader accounts enable data to be shared with consumers who are not Snowflake customers without the need for them to become one. These accounts are owned by the provider account that created them. While the provider account uses Shares to share databases with reader accounts, the reader account can only receive data from the provider account that created it.
Users with a reader account can query the shared data, but they cannot perform the DML operations that are available in a full account, such as loading, inserting, or updating data.
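A provider creates a reader account with the CREATE MANAGED ACCOUNT command, using TYPE = READER. The sketch below builds that statement in Python without running it. The helper, the account name, and the admin name are hypothetical, and the password is a placeholder that you would replace with a real secret handled outside your source code.

```python
def reader_account_statement(account, admin_name, admin_password):
    """Build the SQL for creating a reader (managed) account."""
    # Double any single quotes so the password stays inside the SQL string literal.
    escaped = admin_password.replace("'", "''")
    return (
        f"CREATE MANAGED ACCOUNT {account} "
        f"ADMIN_NAME = {admin_name}, "
        f"ADMIN_PASSWORD = '{escaped}', "
        f"TYPE = READER;"
    )


# Hypothetical example values; the password is a placeholder, not a real secret.
reader_sql = reader_account_statement(
    account="partner_reader",
    admin_name="reader_admin",
    admin_password="<choose-a-password>",
)
print(reader_sql)
```

Once the reader account exists, the provider adds it to a Share in the same way as any other consumer account.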
Having introduced data sharing and its workings within Snowflake, let's explore other features that come with Snowflake's data sharing.
Part 4: Products that use Secure Data Sharing in Snowflake
Snowflake offers additional products that enable data sharing between providers and consumers. These products include Direct Share, Snowflake Data Marketplace, and Data Exchange.
Direct Share:
Direct Share is a simple method of sharing data that enables account-to-account data sharing while utilizing Snowflake's Secure Data Sharing. As the provider (account on Snowflake), you can grant access to your data to other companies, allowing them to view your data within their Snowflake account without the need to move or copy any data.
Snowflake Data Marketplace:
All accounts in Snowflake can access the Snowflake Data Marketplace, provided they are in non-VPS regions on supported cloud platforms. The Data Marketplace uses Snowflake's Secure Data Sharing to facilitate connections between providers and consumers, similar to the Direct Share product.
You have the option to access third-party data and import the datasets into your Snowflake account without the need for transformation. This allows you to easily combine it with your existing data. The Data Marketplace provides a central location to obtain data from multiple sellers, simplifying the process of data sourcing.
Additionally, becoming a provider and publishing data within the Data Marketplace is a great way to monetize your data and reach a wider audience.
Data Exchange:
Data Exchange enables secure collaboration around data among a group of invited members. Providers can share data with consumers in the exchange, whether that means your entire organization, your customers and partners, or just your own business unit. It also lets you control who has access to your data, and who can publish, consume, or simply view it. Specifically, you can invite others and determine whether they are authorized to provide or consume data. Data Exchange is available for all Snowflake accounts hosted on non-VPS regions and supported cloud platforms.
These three products all build on Secure Data Sharing and are useful for both provider and consumer accounts. Now that we have seen how data sharing works and which products use it, let's look at how to work with data that was shared with you and data that you share with others.
Working with Shared Data:
Once you have a grasp of the fundamentals of Direct Share, Snowflake Marketplace, and Data Exchange, there are additional concepts and tools available for you to explore.
Within Snowflake, those with an ACCOUNTADMIN role can utilize the Shared Data page on the new web interface to manage and create shares. As we delve further, please note that "inbound" refers to data that has been shared with you, while "outbound" refers to data shared from your account.
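The same inbound and outbound distinction shows up outside the web interface as well. The SHOW SHARES command lists shares together with a kind for each one. As a minimal sketch, assuming each result row has been read into a Python dictionary with "kind" and "name" keys (the rows below are made up, and the helper function is hypothetical), you could separate the two groups like this:

```python
def split_shares_by_kind(rows):
    """Separate share names into inbound and outbound lists."""
    inbound, outbound = [], []
    for row in rows:
        kind = row["kind"].upper()
        if kind == "INBOUND":
            inbound.append(row["name"])      # shared with your account
        elif kind == "OUTBOUND":
            outbound.append(row["name"])     # shared from your account
    return inbound, outbound


# Hypothetical rows, shaped like a simplified SHOW SHARES result.
rows = [
    {"kind": "INBOUND", "name": "WEATHER_SHARE"},
    {"kind": "OUTBOUND", "name": "SALES_SHARE"},
    {"kind": "INBOUND", "name": "PARTNER_SHARE"},
]
inbound, outbound = split_shares_by_kind(rows)
print("Inbound:", inbound)
print("Outbound:", outbound)
```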
Data Shared with You:
Inbound shares are shares that provider accounts have made available to your account through Direct Share, Data Exchange, or the Snowflake Marketplace. For each inbound share, you can view the shared data, see who provided it, and see how it was shared. You can also create a database from a share.
To access your inbound shares, go to the "Shared With Me" tab within the Snowflake web interface. Here you will find:
- Direct shares that are shared with you. These shares are placed into two groups: 1. Direct shares that are ready to be used, and 2. Direct shares that have been imported into a database and can be queried.
- Listings for data exchanges that you have access to. The data is shown under the name of the initial data exchange. If you have more than one data exchange, each data exchange is shown in a separate section.
- Listings for Snowflake Marketplace data that have been imported into a database and can be queried. This tab does not show Marketplace shares that are ready to be used but not yet imported; you can find those data listings in the Marketplace menu.
Data You Shared:
Your account allows you to share data with consumers through outbound shares. You can share data directly, through data exchange, or via the Snowflake Marketplace (as previously mentioned for inbound shares).
With outbound shares, you can:
- View the shares you have created or have access to, including information such as the database for the share, consumer accounts that can access the share, the date when the share was created, and the objects that are being shared.
- Create and edit both a share and its data listing.
- Remove access to the share for individual consumer accounts.
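If you prefer SQL to the web interface, the viewing and removal tasks in this list correspond roughly to SHOW SHARES, DESCRIBE SHARE, SHOW GRANTS OF SHARE, and ALTER SHARE ... REMOVE ACCOUNTS. The Python sketch below only assembles those statements as strings. The two helper functions and the share and account names are hypothetical.

```python
def inspect_share_statements(share):
    """Build SQL for reviewing a share's objects and consumer accounts."""
    return [
        "SHOW SHARES;",                    # list the shares visible to your account
        f"DESCRIBE SHARE {share};",        # list the objects included in the share
        f"SHOW GRANTS OF SHARE {share};",  # list the accounts that can access it
    ]


def remove_consumers_statement(share, accounts):
    """Build SQL that removes individual consumer accounts from a share."""
    if not accounts:
        raise ValueError("at least one account is required")
    return f"ALTER SHARE {share} REMOVE ACCOUNTS = {', '.join(accounts)};"


# Hypothetical example values.
inspect_sql = inspect_share_statements("sales_share")
remove_sql = remove_consumers_statement("sales_share", ["myorg.consumer2"])
for statement in inspect_sql + [remove_sql]:
    print(statement)
```

Removing an account from a share leaves the share and its other consumers untouched.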
Returning to the web interface, the "Shared by My Account" tab displays outbound shares from Snowflake Marketplace, data exchange, and direct shares.
Icons beside each share indicate the sharing mechanism: direct share, data exchange, or Snowflake Marketplace.
Lastly, there are filters available when viewing your shared data:
- Type: This is presented as the "All Types" drop-down and allows you to differentiate direct shares from listings.
- Consumer: This is presented as the "Shared With" drop-down and allows you to select a specific consumer or data exchange (where the data has been shared).
Data that is Shared
You can share data in several ways:
- Use direct share to directly share data with consumers
- In the Snowflake Marketplace, post a listing
- In data exchange, post a listing
To share data from the web interface, use the “Share Data” drop-down and choose from the list of platforms where you can share data.
Requesting Data
In the web interface, you can view inbound and outbound requests in the "Requests" tab. However, this tab does not display data requests from the Snowflake Data Marketplace.
Let's take a moment to review what inbound and outbound requests mean.
Inbound requests are made by consumers who are seeking access to your data. You can organize these requests by status and review them accordingly. Outbound requests, on the other hand, are requests made by you to obtain data listings from other providers. Just like inbound requests, you can sort them by status. Keep in mind that the requests you make may be rejected, but you can always resubmit them.
Managing Exchanges
If you have the Data Exchange Admin role or Provider Profile Level Privileges, you can create and organize provider profiles within the “Manage Exchanges” tab. However, if your organization does not have a data exchange, the “Manage Exchanges” tab will not be visible.
With these privileges, you can perform the following provider profile tasks within a data exchange:
- Create, update, and delete a profile
- Update contact email
- Manage profile editors
Now that we have reviewed data sharing, you should be able to understand all its components and the different functions it offers!
To keep up with new features, regularly visit our website for more information and tips.
Conclusion:
This article provides a deep dive into data sharing and how it works within the Snowflake ecosystem. It covers the basics of data sharing, the role of data providers and consumers, and how Secure Data Sharing functions. Additionally, it explores Snowflake's products that use Secure Data Sharing, such as Direct Share, Snowflake Data Marketplace, and Data Exchange. The article also explains how to work with shared data, including managing inbound and outbound requests and managing exchanges.