Snowflake Data Masking

Snowflake Data Masking

Last week, the United States CDC issued new COVID-19 mask policies.  I will leave that for many others to discuss, but for the COOL Data People reading this we will focus on how easy it is to implement Snowflake Data Cloud “Data Masking”.   Ready? – Let’s “Data Mask” it UP!  

 

What is Data Masking? Data Masking is just like it sounds… the hiding or masking of data.  It is a convenient way to add additional masking of data for column level security.  Data Masking overall is a simple concept.  It has really caught on in our new age of GDPR, PII.   What is Snowflake’s Version of Data Masking? Snowflake’s implementation of this is….  Dynamic Data Masking. Dynamic Data Masking is column-level security which uses masking policies to mask data at your query run time. Pretty cool, eh?   Snowflake’s version of data masking, has several features including: *Masking policies that are at the schema-level. *Data Masking currently works to mask data at either the table or view object. *The masking policies are applied at query runtime.   *The masking policies are applied to EVERY location where the column is displayed. *Depending on all the variables of your role, your role hierarchy, your masking policy conditions, and sql execution content then you will see fully masked data, partially masked data, or just PLAIN TEXT!   Now that you know what Snowflake Data Cloud Dynamic Data Masking is then …. how do you use it?   Data Masking within Snowflake is enabled with Data Definition Language (DDL). Here are the basic syntax constructs you use for the MASKING POLICY object. It is your typical object CREATE, ALTER, DROP, SHOW, DESCRIBE  (this is pretty standard for most Snowflake objects and one of the reasons I like snowflake … most of the time is the consistency and the simplicity and ease of use).  

So, let’s have some fun and create a data masking policy for email addresses in a simple example.   There are 3 main PARTS for creating and applying a dynamic data mask on Snowflake to a column.  Here we go:

PART 1 – Enable and Grant Masking Policy
PART 2 – Create a Masking Policy
PART 3 – Apply the Masking Policy to a Column in a View or Table
(just creating a masking policy is not enough.  kind of like wearing a covid mask under your mouth and nose.  Even though you have a mask, its not applied really so its not working)

We will show you how to do all of this in detail below.

Dynamic Data Masking Example

Let’s say we want to create a data mask for email addresses in our  a row using a stored procedure.

If you have not been using our Snowflake Solutions Demo Database Training Example then let’s create a database, schema, and table to use.


/* SETUP DEMO DATABASE AND TABLE FOR DATA MASKING DEMO and PROOF OF CONCEPT */
USE ROLE SYSADMIN; /*use this role or equivalent */
CREATE OR REPLACE DATABASE DEMO_MASKING_DB;
CREATE SCHEMA DEMO;
CREATE OR REPLACE TABLE EMPLOYEE(ID INT, FULLNAME VARCHAR,HOME_ADDRESS VARCHAR,EMAIL VARCHAR);
INSERT INTO EMPLOYEE VALUES(1,'Frank Bell','1000 Snowflake Lane North Pole, Alaska', 'fbell@snowflake.com');
INSERT INTO EMPLOYEE VALUES(2,'Frank S','1000 Snowflake Lane North Pole, Alaska', 'franks@snowflake.com');
INSERT INTO EMPLOYEE VALUES(3,'Craig Stevens','1000 Snowflake Lane North Pole, Alaska', 'craig@snowflake.com');
CREATE WAREHOUSE IF NOT EXISTS MASK_WH WITH WAREHOUSE_SIZE = XSMALL, INITIALLY_SUSPENDED = TRUE, auto_suspend = 60;

 

 

/* PART 0 – create and grant roles for DATA MASKING DEMO – REPLACE FREDDY WITH YOUR USERNAME
– there is more to do when you use custom roles with no privileges */
USE ROLE SECURITYADMIN;
CREATE ROLE IF NOT EXISTS EMPLOYEE_ROLE;
CREATE ROLE IF NOT EXISTS MANAGER_ROLE;
CREATE ROLE IF NOT EXISTS HR_ROLE;
CREATE ROLE IF NOT EXISTS DATA_MASKING_ADMIN_ROLE;
GRANT USAGE ON DATABASE DEMO_MASKING_DB TO ROLE EMPLOYEE_ROLE;
GRANT USAGE ON SCHEMA DEMO_MASKING_DB.DEMO TO ROLE EMPLOYEE_ROLE;
GRANT SELECT ON TABLE DEMO_MASKING_DB.DEMO.EMPLOYEE TO ROLE EMPLOYEE_ROLE;
GRANT USAGE ON DATABASE DEMO_MASKING_DB TO ROLE HR_ROLE;
GRANT USAGE ON SCHEMA DEMO_MASKING_DB.DEMO TO ROLE HR_ROLE;
GRANT SELECT ON TABLE DEMO_MASKING_DB.DEMO.EMPLOYEE TO ROLE HR_ROLE;
GRANT USAGE,MODIFY ON DATABASE DEMO_MASKING_DB TO ROLE “DATA_MASKING_ADMIN_ROLE”;
GRANT USAGE,MODIFY ON SCHEMA DEMO_MASKING_DB.DEMO TO ROLE “DATA_MASKING_ADMIN_ROLE”;
GRANT USAGE ON WAREHOUSE MASK_WH TO ROLE EMPLOYEE_ROLE;
GRANT USAGE ON WAREHOUSE MASK_WH TO ROLE HR_ROLE;
GRANT ROLE EMPLOYEE_ROLE TO USER FREDDY;
GRANT ROLE MANAGER_ROLE TO USER FREDDY;
GRANT ROLE HR_ROLE TO USER FREDDY;
GRANT ROLE DATA_MASKING_ADMIN_ROLE TO USER FREDDY;

 

/* PART 1 – enable masking policy ON ACCOUNT AND GRANT ACCESS TO ROLE */
GRANT CREATE MASKING POLICY ON SCHEMA DEMO_MASKING_DB.DEMO TO ROLE “DATA_MASKING_ADMIN_ROLE”;
USE ROLE ACCOUNTADMIN;
GRANT APPLY MASKING POLICY ON ACCOUNT TO ROLE “DATA_MASKING_ADMIN_ROLE”;

 

/* PART 2 – CREATE MASKING POLICY */
USE ROLE DATA_MASKING_ADMIN_ROLE;
USE SCHEMA DEMO_MASKING_DB.DEMO;
CREATE OR REPLACE MASKING POLICY MASK_FOR_EMAIL AS (VAL STRING) RETURNS STRING ->
CASE
WHEN CURRENT_ROLE() IN (‘HR_ROLE’) THEN VAL
ELSE ‘*********’
END;


/* PART 3 - APPLY MASKING POLICY TO EMAIL COLUMN IN EMP:LOYEE TABLE */
ALTER TABLE IF EXISTS EMPLOYEE MODIFY COLUMN EMAIL SET MASKING POLICY MASK_FOR_EMAIL;


***AWESOME - NOW YOU NOW HAVE CREATED AND APPLIED YOUR DATA MASK! Let's Test it out.



/* TEST YOUR DATA MASK !!! --> TEST by QUERYING TABLE WITH DIFFERENT ROLES AND SEE RESULTS */
/* Notice the EMAIL is MASKED with ******* */
USE ROLE EMPLOYEE_ROLE;
SELECT * FROM DEMO_MASKING_DB.DEMO.EMPLOYEE;
/* Notice the EMAIL is NOT MASKED */
USE ROLE HR_ROLE;
SELECT * FROM DEMO_MASKING_DB.DEMO.EMPLOYEE;

ADDITIONAL DETAILS:

*MASKS are really custom data definition language (DDL) objects in Snowflake.
YOU can always get their DDL by using the Snowflake standard GET_DDL function or using DESCRIBE.
/* EXAMPLES for reviewing the MASKING POLICY */
/* when using SECURITYADMIN or other roles without USAGE you must use the full DATABASE.SCHEMA.POLICY PATH */

USE ROLE SECURITYADMIN;
DESCRIBE MASKING POLICY DEMO_MASKING_DB.DEMO.MASK_FOR_EMAIL;

USE ROLE ACCOUNTADMIN; /* when using SELECT that means the ROLE MUST HAVE USAGE enabled which SECURITYADMIN role does not have by default */

SELECT GET_DDL(‘POLICY’,’DEMO_MASKING_DB.DEMO.MASK_FOR_EMAIL’);

Conclusion

Dynamic Data Masking Policies are a great way to secure and obfuscate your PII data to different roles without access where necessary while at the same time displaying the PII data to the roles that need access to it.  I hope this tutorial has helped you understand Dynamic Data Masking on Snowflake. Thanks for checking out this article.  For further information on Dynamic Data Masking Training then check out our new Free Snowflake Training Series.  

Leave a Reply

Snowflake Cost Saving

we automate snowflakeDB data cloud cost saving. sign our free 7 days no risk trail now