What’s the process of creating an external table in Snowflake and what is its use in data modeling?

123 viewsData Modeling

What’s the process of creating an external table in Snowflake and what is its use cases in data modeling?

Daniel Steinhold Answered question August 4, 2023

Creating an external table in Snowflake allows you to access and query data stored in external data sources, such as cloud storage (Amazon S3, Google Cloud Storage, or Azure Blob Storage). External tables provide a way to leverage existing data without the need to load it into Snowflake’s storage. Here’s the process of creating an external table in Snowflake and its use cases in data modeling:

**Process of Creating an External Table:**

1. **Create an External Stage:** Before creating an external table, you need to create an external stage to specify the location of the data in the external storage. The external stage acts as a reference to the external location.
2. **Grant Necessary Permissions:** Ensure that the necessary permissions are granted to the user or role to access the external stage and the data in the external storage.
3. **Create the External Table:** Use the **`CREATE EXTERNAL TABLE`** statement to define the external table’s schema, similar to creating a regular table in Snowflake. Specify the location of the data in the external stage and other relevant properties.
4. **Query the External Table:** Once the external table is created, you can query it using standard SQL statements like any other table in Snowflake.

**Use Cases of External Tables in Data Modeling:**

1. **Data Integration:** External tables are useful for integrating data from various sources without the need to physically load the data into Snowflake. You can query and join data from multiple external sources and internal tables in a single SQL query.
2. **Data Archiving and Historical Data:** External tables can be used to store historical data or archive data that is infrequently accessed. This helps manage storage costs by keeping historical data in low-cost external storage.
3. **Data Lake Integration:** If your organization uses a data lake on cloud storage, you can create external tables to access and analyze data in the data lake directly from Snowflake.
4. **Data Sharing:** External tables can be shared with other Snowflake accounts, allowing data consumers in other organizations to access and query the data without the need for data replication.
5. **ETL and Data Transformation:** External tables can be used as an intermediate step during ETL processes. You can transform and cleanse data in the external storage before loading it into Snowflake.
6. **Backup and Restore:** External tables can serve as an alternative or supplementary backup mechanism, allowing you to store critical data in a secure and resilient external storage.

**Important Considerations:**

– While external tables offer flexibility and integration capabilities, they may have some performance trade-offs compared to internal (native) tables. Data retrieval from external storage may be slightly slower than from internal storage, especially for frequent access.
– External tables are read-only in Snowflake, which means you can’t perform DML (Data Manipulation Language) operations like INSERT, UPDATE, or DELETE on them.
– Be mindful of data security and access controls when dealing with external tables, especially if the data resides outside your organization’s infrastructure.

In summary, external tables in Snowflake provide a powerful way to access and utilize data stored in external sources, enabling data integration, historical data management, data lake integration, and more. They complement Snowflake’s internal storage capabilities and enhance the versatility of data modeling and analytics in a cloud-based environment.

Daniel Steinhold Answered question August 4, 2023