What is Confluent, and why use it with Snowflake?
Confluent is a data streaming platform based on Apache Kafka. It integrates disparate data sources and processes large volumes of data in real time. This document provides instructions for setting up Confluent with Snowflake, a cloud-based data warehousing and analytics platform. The purpose of the integration is to move data generated in Confluent/Kafka into Snowflake for further analysis and insights.
How To Set Up Confluent with Snowflake:
Here is a practical guide to setting up Confluent with Snowflake.
- To start, you’ll need a Snowflake account. Don’t have one yet? Not to worry – you can sign up for a 30-day free trial with $400 in credits at Free Snowflake Account Setup.
- This setup requires Docker. (I’ll publish separate instructions for doing this without Docker later.)
- You also need Git.
Here we go – there are three main parts to this setup:
- Get the Docker version of Confluent/Kafka up and running.
- Create a topic on it and generate data to move into Snowflake.
- Set up the Kafka-to-Snowflake connector as the destination, with the right Snowflake connectivity.
Part 1 – Get the Docker version of Confluent/Kafka running
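The exact commands aren’t reproduced here, so here is a sketch of one common way to bring the stack up, using Confluent’s cp-all-in-one Docker Compose example (the repository URL, directory layout, and service names are assumptions based on Confluent’s public examples; check their docs for the current version):

```shell
# Clone Confluent's all-in-one Docker Compose example
# (repo URL and directory layout are assumptions; verify against Confluent's docs)
git clone https://github.com/confluentinc/cp-all-in-one.git
cd cp-all-in-one/cp-all-in-one

# Start the full stack (ZooKeeper, broker, Connect, Control Center, etc.)
# in detached mode
docker-compose up -d
```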
Okay… the first time, the images will take a few minutes to download, and eventually you will see output indicating that all the services have started.
If you want to verify that everything is up and running, execute this command:
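The verification command itself isn’t shown here, but for a Docker Compose stack a typical check is:

```shell
# List the Compose services and their state; each should report "Up"
docker-compose ps
```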
Part 2 – Create a topic to send data to Snowflake, and generate data for it with the DataGen functionality
Let’s go… execute these commands in sequence:
Create a topic:
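The topic-creation command isn’t shown; a sketch against the cp-all-in-one stack looks like this (the container name "broker", the listener broker:9092, and the topic name "users" are all assumptions; adjust to your environment):

```shell
# Create a single-partition topic inside the broker container
# ("broker", port 9092, and the topic name "users" are assumptions)
docker exec broker kafka-topics --create \
  --topic users \
  --bootstrap-server broker:9092 \
  --partitions 1 \
  --replication-factor 1
```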
Generate data for the topic: first, configure a JSON file describing the data we want to create. Use whatever editor you want and create the file u.config.
I’m using vi u.config and pasting this in there:
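The original file contents aren’t reproduced here; below is a plausible u.config, assuming the Datagen source connector with its built-in "users" quickstart schema (the connector class and quickstart name come from Confluent’s kafka-connect-datagen project; the topic name "users" matches the topic assumed earlier):

```json
{
  "name": "datagen-users",
  "config": {
    "connector.class": "io.confluent.kafka.connect.datagen.DatagenConnector",
    "kafka.topic": "users",
    "quickstart": "users",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
    "max.interval": 1000,
    "tasks.max": "1"
  }
}
```

To start generating data, you can POST this file to the Kafka Connect REST API, e.g. `curl -X POST -H "Content-Type: application/json" --data @u.config http://localhost:8083/connectors`.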
Part 3 – Download the Kafka to Snowflake Connector and configure it.
So you have Confluent/Kafka up and running, and data is being generated into a topic.
So now download the magical Kafka-to-Snowflake connector here: https://mvnrepository.com/artifact/com.snowflake/snowflake-kafka-connector/0.3.2. The version will likely have changed by the time this is published, but for now, assume it’s this one.
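If you prefer the command line, the JAR can also be fetched directly from Maven Central (the URL below simply follows the standard Maven repository layout for the coordinates above):

```shell
# Download the connector JAR (URL follows the standard Maven Central layout
# for groupId com.snowflake, artifactId snowflake-kafka-connector, version 0.3.2)
curl -O https://repo1.maven.org/maven2/com/snowflake/snowflake-kafka-connector/0.3.2/snowflake-kafka-connector-0.3.2.jar
```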
Once you have the file in the same directory we have been using for everything, copy it into the running Kafka Connect container, where it needs to be for the connector to load.
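A sketch of that copy step, assuming the cp-all-in-one container name "connect" and the Confluent images’ default plugin location:

```shell
# Copy the JAR into the Connect container's plugin path (container name
# "connect" and the path /usr/share/java are assumptions from cp-all-in-one),
# then restart Connect so it discovers the new plugin
docker cp snowflake-kafka-connector-0.3.2.jar connect:/usr/share/java/
docker-compose restart connect
```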
You now need to create the configuration file that sets up the connector and its associated sink connecting to the Snowflake database. This assumes you have already set up your RSA key pair for Snowflake key-pair authentication. You have to fill in six of the settings below for your specific configuration. Again, use your favorite editor; I’m using vi connector_snowflake.config and entering my specific details.
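The author’s specific values aren’t shown; here is a skeleton of the file with placeholders for the six settings you must fill in (the property names come from Snowflake’s Kafka connector documentation; the topic name "users" matches the one assumed earlier):

```properties
name=snowflake-sink
connector.class=com.snowflake.kafka.connector.SnowflakeSinkConnector
tasks.max=1
# Fill in these six values for your environment:
topics=users
snowflake.url.name=<your_account>.snowflakecomputing.com:443
snowflake.user.name=<your_user>
snowflake.private.key=<contents_of_your_rsa_private_key>
snowflake.database.name=<your_database>
snowflake.schema.name=<your_schema>
# Buffering defaults (tune as needed)
buffer.count.records=10000
buffer.flush.time=60
buffer.size.bytes=5000000
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=com.snowflake.kafka.connector.records.SnowflakeJsonConverter
```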
Almost there. Now use this configuration file to set up the sink.
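The loading command isn’t shown; one way to do it with this properties file is a standalone Connect worker inside the container. This is only a sketch: on a stack that already runs Connect in distributed mode, the usual alternative is to convert the config to JSON and POST it to the Connect REST API at http://localhost:8083/connectors.

```shell
# Copy the config into the Connect container and load it with a standalone
# worker (paths and container name are cp-all-in-one assumptions; a distributed
# setup would use the Connect REST API instead)
docker cp connector_snowflake.config connect:/tmp/
docker exec connect connect-standalone \
  /etc/kafka/connect-standalone.properties /tmp/connector_snowflake.config
```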
Within a few seconds or minutes, if you set everything up correctly, data from the topic should be written to a Snowflake table. Go into the database and schema you connected to, and you should be able to execute something like:
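For example (the table name here is an assumption: by default the Snowflake sink creates one table per topic, with the payload in a RECORD_CONTENT variant column and Kafka metadata in RECORD_METADATA):

```sql
-- Table/topic name "users" is an assumption; adjust to your topic mapping
SELECT RECORD_METADATA, RECORD_CONTENT
FROM users
LIMIT 10;
```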
Now you should see data flowing from Kafka to Snowflake. Enjoy!
Find out more about all the benefits Snoptimizer has to offer you and your business. Sign up for a free proof of concept!