Secure Data Sharing with Snowflake

Secure Data Sharing

Introduction

Big Data. Internet of Things. Social Media. Everyday millions of data points are generated and moved across the internet, from the viral video you send to your friend on TikTok to critical business data to make decisions. The acceleration of data use and utility is only getting faster. That’s why security and protecting your data is of the utmost importance. That’s where Secure Data Sharing comes in play.

The cloud is the solution. Snowflake’s platform offers instant access to live data in a secure format. This feature can not only streamline data sharing inside organizations but also how we share data to the outside. Many companies continue to use outdated data sharing technologies which are more costly and less secure.

How Does Secure Data Sharing Work?

Sharing involves two parties: the data provider and one or more consumers. Snowflake enables sharing of database tables, views and user-defined functions (UDFs) using a Secure Share object. A data consumer given a Secure Share has access to a read-only version of the database in their own account to operate on.

With Snowflake’s data sharing, no actual data is copied or transferred between accounts. All sharing is accomplished through the service layer and metadata store. Thanks to Snowflake’s architecture and separate compute from storage, data sharing is instant and prevents storage costs.

Sharing is done with a Secure Share object. Each Secure Share includes: privileges that grant access to the database, schema, consumer accounts and objects being shared (tables, views, UDFS). The share object represents a connection between the provider and consumer. New objects or data can be added to the object and will be available to the consumer in real time. Access to a share can be revoked at any time as well.

Two Types of Consumers

Participates in Secure Data Sharing are either data provider or consumer. A provider creates a Secure Share to export, and the consumer imports the Secure Share. However, there are two different types of consumers: reader accounts and full consumer. The difference affects who pays for the storage resources.

Full consumer accounts are existing Snowflake customers. Data can be shared directly to the existing account and the existing account pays for all computer resources incurred by querying the database.

Reader accounts are from consumers not on Snowflake’s platform. If the consumer is not a Snowflake customer, the provider can create Reader Accounts. For a Reader Account, all costs would be paid by the provider who shared the data. All costs can be tracked and invoiced back to the consumer.

Setting up a Secure Share with Reader Account

Let’s begin with an example using my personal database (BDU), which has a schema containing flight data (FAA). There are eight tables in the schema:

Click the Shares tab at the top of the user console, making sure that you toggle the view to Outbound since you are a provider creating a Secure Share that will go to an outside account. Then hit the Create button to open the menu to create a Secure Share. Please note that you need ACCOUNTADMIN privileges, or a custom role that has been granted these specific privileges, to create a Secure Share. SYSADMIN privileges will not be enough:

In the menu, insert the Secure Share Name (SHAREDEMO), and click the button to Select Tables & Secure Views to select the eight tables. Please note that you can select or deselect tables or views as needed. Next, hit the Create button to complete the process:

Now you see that Secure Share has been created and the data can be previewed. Please note that you need an available Virtual Warehouse that is usable by your current role in order to run this query. Next, click the button to Add Consumers:

As you can see, there are two Account Types: Reader and Full. Full is an existing user which will assume costs. Reader does not, the data provider pays for them. In this case, the intended consumer is not a Snowflake customer, so click the Create a Reader account link:

To create a Reader Account, insert the name (READERACCOUNT) and a comment (optional), along with credentials for the ACCOUNTADMIN user (READER1). Then click Create Account:

Going back to a worksheet and executing the command SHOW MANAGED ACCOUNTS will show all details needed:

Going back to the previous Add Consumers menu, you can now see that the new Reader Account has been added as a Consumer of the Secure Share. Clicking the blue link with the account locator (XB28199) will also take you the login for the Reader Account:

Enter the credentials for the ACCOUNTADMIN user:

Once you are in your Reader Account, you will need to do some setup. First, ensure that your role is ACCOUNTADMIN (it will default to SYSADMIN). Then click the Warehouses tab and create a warehouse to use (DEMO_WH). You will need compute resources to run queries from your provider.

Click the Shares account, making sure that you are toggled to Inbound. We see the SHAREDEMO has been shared to us by the ITSTRATEGISTS account we were using, which is the data provider in this case. Click Create Database From Secure Share:

In the menu, name your database accordingly and grant access to the desired roles. You can change the name of the database given by the provider. You can grant access to the database to multiple roles. Click Create Database:

Navigate to Worksheets. The shared database (SHARE_FAA) now has read-only tables you can query using the SYSADMIN role. As the consumer, you should be able to read from it using a reporting tool like Tableau or an ETL tool. If you’re a Full Consumer, you can query this data along with your existing data while not paying any storage costs.

Database tables are read-only. To change this, you can create another database and table by selecting from the shared tables. In this case, a table that has 148 million records took 27 seconds to move over using a medium-sized warehouse. Unfortunately, you cannot use Snowflake’s clone feature on a shared database.

Frequently Asked Questions

What is the difference between sharing data with existing Snowflake customers versus non-Snowflake customers?

Existing Snowflake customers bear the costs of any storage or compute incurred from querying or copying the shared database. For a non-Snowflake customer, the provider would create Reader Accounts and pay for all the costs incurred by those accounts. As the provider, you would be able to track account usage and bill the consumer if that was part of the set business agreement.

How fast would the data update for consumers?

Instant! The data is stored and shared on the cloud so the provider and consumer see the same data. The Secure Share is a metadata wrapper that points to the correct data that is still sitting with the provider. So any changes to the provider’s dataset would be reflected instantly from the consumer’s viewpoint.

Once the consumers can see the data in their account, what can they do with it?

They can query and run any analytics they desire such as Tableau or Sigma. If they wish to manipulate the data, they can copy that data into their own database, so they would have write access as.

Can I allow my users to see only selected tables or views?

Yes! Snowflake has a role-based security framework, so the Account Administrator can limit which roles are accessible to certain users. Roles are a collection of permissions granted on objects. Therefore, what a user can see depends on what permissions have been granted to the user’s role.

What if my consumers go overboard with how much they are using the Reader Account?

As the data provider for Reader Accounts you control usage. Set up Resource Monitors, which impose limits on the number of credits that virtual warehouses use within a specified interval or date range. When these limits are reached or are approaching, the Resource Monitor can trigger an alert and suspension of the warehouse.