A Beginner's Guide to Apache Airflow

DEV Community·GeraldM·about 1 month ago

Introduction

In data engineering, we build data pipelines using approaches such as ETL (extract, transform, load) and ELT (extract, load, transform). These pipelines are Python programs that perform individually defined tasks, which together make up a workflow. A program, however, runs only once each time you start it manually. Data pipelines extract data from sources such as websites and payment systems that continuously record new data. The data we extract therefore keeps changing, and we need to run our Python code again and again to pick up the new data. How do we do this? This is where Apache Airflow comes in. To understand more about ETL and ELT, read through my article ETL vs ELT.

What is Apache Airflow?

Apache Airflow is an open-source platform used to schedule, monitor, and manage workflows. It was created by Maxime Beauchemin at Airbnb in 2014 with the aim of managing increasingly complex data workflows.…
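
The problem described in the introduction, a pipeline that has to be rerun because its source keeps growing, can be sketched in plain Python. This is only an illustration and does not use Airflow itself: the in-memory `source_records` list, the `warehouse` dictionary, and every function name here are hypothetical stand-ins for a real source system and destination.

```python
# Hypothetical in-memory "source" standing in for a website or payment system.
source_records = [{"id": 1, "amount": "10.50"}, {"id": 2, "amount": "4.00"}]
warehouse = {}  # hypothetical destination, keyed by record id


def extract(source):
    """Read every record currently available in the source."""
    return list(source)


def transform(records):
    """Convert the amount field from text to a float."""
    return [{"id": r["id"], "amount": float(r["amount"])} for r in records]


def load(records, destination):
    """Upsert records by id so that reruns do not create duplicates."""
    for r in records:
        destination[r["id"]] = r


def run_pipeline():
    """Run the three ETL steps once, in order."""
    load(transform(extract(source_records)), warehouse)


run_pipeline()
rows_after_first_run = len(warehouse)

# The source keeps recording new data after the first run...
source_records.append({"id": 3, "amount": "7.25"})
rows_before_rerun = len(warehouse)

# ...so the pipeline has to run again to pick it up.
run_pipeline()
rows_after_second_run = len(warehouse)
```

Between the two calls the destination is stale: the new record only appears once `run_pipeline` is called a second time. Triggering that call on a schedule, instead of by hand, is the job a tool like Airflow takes over.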
