The simplest and fastest way to start Airflow is to run it with the CeleryExecutor in Docker. We assume that you have a basic understanding of Docker and that you have already installed the Docker Community Edition (CE) on your computer.
Create an Airflow Folder
It is convenient to create an Airflow directory that will hold folders such as dags, etc. So, open your terminal and run:
mkdir airflow-docker
cd airflow-docker
I created a folder called airflow-docker.
Download the docker-compose.yaml
To deploy Airflow on Docker Compose, you should fetch docker-compose.yaml. Let's download it with the curl command:
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.0.1/docker-compose.yaml'
Some directories in the container are mounted, which means that their contents are synchronized between your computer and the container.
./dags – you can put your DAG files here.
./logs – contains logs from task execution and the scheduler.
./plugins – you can put your custom plugins here.
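Since these folders are mounted into the containers, it helps to create them before the first run. On Linux, the official Airflow quick-start also suggests setting AIRFLOW_UID so that files written to the mounts are owned by your host user. A minimal sketch, assuming a bash shell in the airflow-docker folder:

```shell
# Create the mounted folders up front so Docker does not
# create them as root-owned directories on the first run.
mkdir -p ./dags ./logs ./plugins

# On Linux: run the containers with your host user id so files
# written to the mounted folders stay owned by you.
echo -e "AIRFLOW_UID=$(id -u)" > .env
```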
This file contains several service definitions:
airflow-scheduler – The scheduler monitors all tasks and DAGs, then triggers the task instances once their dependencies are complete.
airflow-webserver – The webserver, available at http://localhost:8080.
airflow-worker – The worker that executes the tasks given by the scheduler.
airflow-init – The initialization service.
flower – The Flower app for monitoring the environment, available at http://localhost:5555.
postgres – The database.
redis – The broker that forwards messages from the scheduler to the workers.
Initialize the Environment
To initialize the environment, run the following command:
docker-compose up airflow-init
After initialization is complete, you should see a message similar to the one below:
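The exact output depends on your Airflow version and folder name, but the tail of the log looks roughly like this (the container name prefix comes from your folder, and the version line reflects your release):

```
airflow-init_1 | Upgrades done
airflow-init_1 | Admin user airflow created
airflow-init_1 | 2.0.1
airflow-docker_airflow-init_1 exited with code 0
```

Note that exiting with code 0 is expected: airflow-init is a one-off initialization service, not a long-running container.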
Run Airflow
You can start Airflow by running:
docker-compose up
If you want to make sure that the containers are running, open a new terminal and run:
$ docker ps
Accessing the Web Interface
Once the cluster has started up, you can log in to the web interface and try to run some tasks. The webserver is available at http://localhost:8080. The default account has the login airflow and the password airflow.
How to interact with the Airflow Command Line
After running the command docker ps, we get the names and IDs of the running containers. If we want to interact with a particular container, we can run docker exec <container_name> <command>. Let's get the Airflow version:
docker exec airflow-docker_airflow-webserver_1 airflow version
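The same pattern works for any Airflow CLI command. A couple of examples, assuming the container name is airflow-docker_airflow-webserver_1 (yours depends on the folder name and is shown by docker ps):

```shell
# List the DAGs known to this Airflow instance
docker exec airflow-docker_airflow-webserver_1 airflow dags list

# Or open an interactive shell inside the webserver container
docker exec -it airflow-docker_airflow-webserver_1 bash
```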
Notice that in your airflow-docker folder you should find the following files and folders.
Cleaning Up
To stop and delete the containers, delete the volumes with database data, and remove the downloaded images, run:
docker-compose down --volumes --rmi all
Are you ready to run your first DAG?
If you feel ready to run your first DAG, you can have a look at our walk-through tutorial.
3 thoughts on “How to Start Running Apache Airflow in Docker”
Wow, after all that I went through, it was your article which saved me. It was so frustrating, but I finally got through. Thank you so much for writing this piece.
Happy to hear that!
Hi, I seem to be getting this error here on cmd: docker-compose up airflow-init
airflow-init_1 | Upgrades done
airflow-init_1 | [2021-06-24 10:57:26,973] {manager.py:784} WARNING – No user yet created, use flask fab command to do it.
airflow-init_1 | Admin user airflow created
airflow-init_1 | 2.1.0
dockers_airflow-init_1 exited with code 0
This causes my airflow-init_1 container to stop working. I also cannot open localhost:8080.
Do you have any advice?