How can you model slowly changing dimensions (SCD Type 1, Type 2, and Type 3) on Snowflake’s?

550 viewsData Modeling
0

How can you model slowly changing dimensions (SCD Type 1, Type 2, and Type 3) using Snowflake's features?

Daniel Steinhold Answered question August 4, 2023
0

You can model slowly changing dimensions (SCD Type 1, Type 2, and Type 3) using Snowflake's features and capabilities. Snowflake offers several functionalities and best practices to handle SCDs efficiently. Let's explore how to model each type:

**1. SCD Type 1 (Overwrite):**
In SCD Type 1, the existing dimension record is updated with the new data, and historical changes are not preserved.

**Modeling in Snowflake:**
For SCD Type 1, you can simply update the existing dimension record directly using standard SQL **`UPDATE`** statements.

**Example:**

```sql
sqlCopy code
-- Update the customer's address directly in the dimension table (no history preservation).
UPDATE customer_dimension
SET address = 'New Address'
WHERE customer_id = 123;

```

**2. SCD Type 2 (Add Rows with Versioning):**
In SCD Type 2, a new record is added to the dimension table for each change, preserving historical versions of the data with additional versioning columns.

**Modeling in Snowflake:**
To model SCD Type 2 in Snowflake, you can create a surrogate key (e.g., a unique identifier) for each dimension record and add columns to track the version and effective dates.

**Example:**

```sql
sqlCopy code
-- Create a SCD Type 2 dimension table with versioning columns.
CREATE TABLE customer_dimension_type2 (
customer_key INT AUTOINCREMENT PRIMARY KEY,
customer_id INT,
name VARCHAR,
address VARCHAR,
valid_from TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP(),
valid_to TIMESTAMP_NTZ DEFAULT '9999-12-31 00:00:00',
is_current BOOLEAN DEFAULT TRUE
);

```

To update a record, you would first set the current record's **`is_current`** flag to **`FALSE`**, and then insert a new record with updated data and valid time ranges.

**3. SCD Type 3 (Add Columns for Changes):**
In SCD Type 3, additional columns are added to the dimension table to store specific historical changes.

**Modeling in Snowflake:**
To model SCD Type 3, you can add new columns to track specific historical changes and update the existing record with the latest data.

**Example:**

```sql
sqlCopy code
-- Create a SCD Type 3 dimension table with columns for specific historical changes.
CREATE TABLE customer_dimension_type3 (
customer_id INT PRIMARY KEY,
name VARCHAR,
address VARCHAR,
previous_address VARCHAR,
previous_update_date TIMESTAMP_NTZ
);

```

To update a record, you would first move the current address to the **`previous_address`** column, and then update the **`address`** column with the new data, along with the **`previous_update_date`**.

By implementing the appropriate SCD type, you can effectively manage changes to dimension data in Snowflake. Each SCD type offers a different balance between data preservation and storage efficiency. Carefully choose the approach that best aligns with your business requirements and data analysis needs.

Daniel Steinhold Answered question August 4, 2023

Maximize Your Data Potential With ITS

Feedback on Q&A