Airflow is a platform for programmatically authoring, scheduling, and monitoring workflows. It is mainly used for data processing pipelines, computational workflows, and ETL processes.
Because workflows are defined in Python code, the tool can be customized to fit each use case, which simplifies many related tasks.
Some of Airflow's other features include:
1. Directed Acyclic Graphs (DAGs), which represent workflows in Airflow
2. Easy access to and interaction with logs
3. Configurable alerts
4. A monitoring interface
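A DAG is simply a set of tasks linked by one-way dependencies, with no cycles. The task names and dependency graph below are made up for illustration (this is plain Python, not the Airflow API), but they sketch how such a graph determines a valid execution order:

```python
# A DAG is a set of tasks plus one-way "runs after" edges and no cycles.
# Task names and dependencies here are illustrative placeholders; Airflow
# builds the same kind of structure from operators and their dependencies.
from graphlib import TopologicalSorter

# Maps each task to the set of tasks it depends on.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}

order = list(TopologicalSorter(dag).static_order())
print(order)  # ['extract', 'transform', 'load', 'notify']
```

Because the graph is acyclic, a topological order always exists; Airflow's scheduler uses the same property to decide which task runs next.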
You will also need Python, a Snowflake account, and a recent Apache Airflow installation.
To integrate Apache Airflow with Snowflake, first configure a Snowflake connection in Airflow:
1. Open "localhost:8080" in your browser
2. Go to "Admin" and then "Connections"
3. Click the + symbol to add a new record
4. Choose "Snowflake" as the connection type
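Besides the UI, Airflow can also read connections from environment variables named AIRFLOW_CONN_&lt;CONN_ID&gt;, holding a connection URI. A minimal sketch of building such a URI for Snowflake; the user, password, account, warehouse, and database values are all placeholders, and the exact query parameters Snowflake expects can vary between provider versions:

```python
# Airflow resolves connections from env vars of the form AIRFLOW_CONN_<ID>.
# Every credential value below is a placeholder, not a real one.
import os
from urllib.parse import quote

user = "MY_USER"                # placeholder
password = "my p@ssword"        # placeholder; quote() escapes special chars
account = "xy12345.eu-west-1"   # placeholder Snowflake account identifier

uri = (
    f"snowflake://{quote(user)}:{quote(password)}@"
    f"?account={quote(account)}&warehouse=COMPUTE_WH&database=DEMO_DB"
)
os.environ["AIRFLOW_CONN_SNOWFLAKE_DEFAULT"] = uri
print(uri)
```

URL-quoting the credentials matters because characters such as "@" or spaces in a password would otherwise break the URI.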
Then you will need to create a DAG file:
1. Go to the folder you have chosen as your AIRFLOW_HOME
2. Locate the dags subfolder (create it if it does not exist)
3. Create a Python file named "snowflake_airflow.py"
4. Paste your DAG code into this file and save it inside the dags folder
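The file itself defines the DAG in Python. Below is a minimal sketch of what "snowflake_airflow.py" could contain, assuming the apache-airflow-providers-snowflake package is installed; the connection ID, table name, and schedule are placeholders, not values prescribed by this tutorial:

```python
# snowflake_airflow.py -- a minimal sketch, not a production DAG.
# Requires apache-airflow and apache-airflow-providers-snowflake.
# "my_snowflake_conn" and "my_table" are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="snowflake_airflow",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",  # newer Airflow versions use `schedule`
    catchup=False,
) as dag:
    count_rows = SnowflakeOperator(
        task_id="count_rows",
        snowflake_conn_id="my_snowflake_conn",  # the connection set up above
        sql="SELECT COUNT(*) FROM my_table;",
    )
```

The snowflake_conn_id must match the connection ID you created in the Airflow UI (or the AIRFLOW_CONN_* environment variable) for the operator to authenticate.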
The scheduler will then automatically pick up the DAG and schedule it to run.