:
- Airflow,
- Airflow: , DAG, Operator
- Airflow
- « »
- Airflow
Airflow
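Airflow is a platform for authoring, scheduling, and monitoring workflows. It is an open source project written in Python, created in 2014 at Airbnb. In 2016 Airflow joined the Apache Software Foundation incubator, and in 2019 it became a top-level Apache project.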
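It is most often used to build ETL processes, but it is not an ETL tool in the classic sense, as are, for example, Pentaho, Informatica PowerCenter, and Talend. Airflow is an orchestrator, a "cron on batteries": it does not do the heavy lifting of moving and processing data itself, but tells other systems and frameworks what to do and tracks the execution status. We mostly use it to launch Hive queries and Spark jobs.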
Airflow, worker ( ), . , , .
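Airflow is not tied to any particular ecosystem such as Hadoop. It can run Python code, Bash commands, Docker containers, and Kubernetes pods, among other things.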
Airflow
Airflow, Lamoda . - scheduler, . , ML Vowpal Wabbit. .
Airflow ( ) , - . , .
Airflow
Webserver
Webserver – -, , . :
- . . , : , , , .
, Graph View. .
Graph View Tree View. , . , – .
– , – . – . , , , , .
Scheduler – , , . Python-, , , . , Scheduler – Airflow.
- Only one Scheduler instance can run at a time, so there is no High Availability (Scheduler HA is introduced in Airflow 2.0).
- : , - . , - , .
- Airflow, . , Airflow – real-time . ( ), . , 5 – , 10 . , 10 , .
Worker
Worker is the process in which tasks are executed. Airflow supports several executors:
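- The first one, which is also the default, is SequentialExecutor. It runs tasks strictly one at a time, with no parallelism.
- LocalExecutor runs tasks in parallel as separate processes, but all of them on the same machine where LocalExecutor itself runs. Note: if the metadata database is SQLite, LocalExecutor cannot be used and you are effectively limited to SequentialExecutor.
- CeleryExecutor distributes tasks across several machines. Celery is a distributed task queue that needs a message broker, RabbitMQ or Redis.
- DaskExecutor runs tasks on a Dask cluster. Dask is a library for parallel and distributed computing.
- KubernetesExecutor launches a separate pod in Kubernetes for each task.
- DebugExecutor is meant for debugging DAGs from an IDE.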
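The executor is chosen through configuration: the `executor` option in the `[core]` section of airflow.cfg, which can be overridden by an environment variable named `AIRFLOW__{SECTION}__{KEY}`. Below is a minimal sketch of that override done from Python. The helper `airflow_env_var` is a hypothetical name introduced only for illustration, and the choice of LocalExecutor assumes a single machine with a metadata database other than SQLite.

```python
import os


def airflow_env_var(section: str, key: str) -> str:
    """Build the name of the environment variable that overrides
    option `key` in section `[section]` of airflow.cfg."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# Assumption for this sketch: one machine, non-SQLite metadata database,
# so LocalExecutor is a reasonable choice.
os.environ[airflow_env_var("core", "executor")] = "LocalExecutor"

print(airflow_env_var("core", "executor"))  # AIRFLOW__CORE__EXECUTOR
```

The variable has to be set before the Airflow processes start, since they read their configuration at startup. In practice it is usually exported in the shell or in the service definition rather than set from Python.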
Apache Airflow
Pipeline, or DAG
The key entity in Airflow is the DAG, also known as a pipeline, that is, a directed acyclic graph.
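To make the idea of a directed acyclic graph concrete, here is a small sketch in plain Python that does not use Airflow at all. It models a pipeline as a mapping from each task to the tasks it depends on and derives a valid execution order with the standard `graphlib` module (Python 3.9+). The task names and the helpers `execution_order` and `is_acyclic` are hypothetical and exist only for this illustration. Airflow's own scheduling is far more involved.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}


def execution_order(graph):
    """Return one valid order in which the tasks can run."""
    return list(TopologicalSorter(graph).static_order())


def is_acyclic(graph):
    """A graph is a valid DAG only if it contains no dependency cycles."""
    try:
        execution_order(graph)
        return True
    except CycleError:
        return False


order = execution_order(pipeline)
print(order)  # ['extract', 'transform', 'load', 'notify']
print(is_acyclic({"a": {"b"}, "b": {"a"}}))  # False
```

The "acyclic" part is what guarantees that such an order exists: if two tasks depended on each other, neither could ever start.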
, . : , , SLA. , .