Airflow-based ETL system for the InfoDengue epidemiological surveillance project.
- Docker & Docker Compose
- GNU Make
- Python 3.14 (for local development)
- Conda/Mamba
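For local development outside the containers, one possible environment setup looks like the following; the environment name and exact steps are illustrative, assuming Poetry manages the dependencies declared in pyproject.toml:

```bash
# Hypothetical local setup; adjust the name and Python version to your needs
conda create -n alertflow python=3.14
conda activate alertflow
pip install poetry
poetry install
```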
Create a .env file in the project root and populate the variables:

```bash
envsubst < .env.tpl > .env
```
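The authoritative variable list lives in .env.tpl; as a rough illustration, the values referenced elsewhere in this README would look something like this (the AIRFLOW__CORE__FERNET_KEY name is Airflow's standard config variable, but whether this project sets it in .env is an assumption):

```bash
# Illustrative values only; see .env.tpl for the authoritative list
AIRFLOW_PORT=8080
AIRFLOW__CORE__FERNET_KEY=<output of the Fernet command below>
_AIRFLOW_WWW_USER_USERNAME=admin
_AIRFLOW_WWW_USER_PASSWORD=<choose a password>
```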
Generate a Fernet key:

```bash
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
```

Build the images:

```bash
docker compose build
```

The build process:
- Uses multi-stage Dockerfile
- Installs system dependencies (git, postgres-client, build tools)
- Installs Python dependencies via Poetry
- Configures Airflow with Celery executor
```bash
docker compose up -d
```

Services started:
- postgres - Metadata database (port 5432)
- redis - Message broker (port 6379)
- airflow-webserver - API server (port 8080)
- airflow-scheduler - DAG scheduler
- airflow-worker - Celery worker
- airflow-dag-processor - DAG file processor
- airflow-triggerer - Deferrable task triggerer
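To confirm the stack came up, listing the compose services and their status is usually enough:

```bash
# Show status/health of all compose services
docker compose ps
```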
On first run, the airflow-init service:
- Creates admin user
- Runs database migrations
- Sets up connections and variables
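Airflow also resolves connections and variables from environment variables at runtime (AIRFLOW_CONN_* and AIRFLOW_VAR_*), so entries like the following in .env are one plausible way such seeding works; the names and values here are hypothetical:

```bash
# Hypothetical examples; actual connection/variable names depend on the DAGs
AIRFLOW_CONN_INFODENGUE_DB=postgres://user:pass@host:5432/dengue
AIRFLOW_VAR_COPERNICUS_API_KEY=changeme
```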
Check initialization:
```bash
docker compose logs airflow-init
```

Open: http://localhost:${AIRFLOW_PORT}/alertflow
Default credentials (set in .env):
- Username: set via `_AIRFLOW_WWW_USER_USERNAME`
- Password: set via `_AIRFLOW_WWW_USER_PASSWORD`
```
AlertFlow/
├── alertflow/
│   ├── dags/            # Airflow DAG definitions
│   ├── plugins/         # Custom plugins
│   └── logs/            # Task logs
├── docker/
│   ├── compose.yaml     # Main compose file
│   ├── compose-dev.yaml # Development overrides
│   └── Dockerfile       # Container image definition
├── pyproject.toml       # Python dependencies (Poetry)
├── Makefile             # Build/test automation
└── .env                 # Environment variables (not committed)
```
Place DAG files in alertflow/dags/ (a minimal example follows this list):
- Changes are detected automatically (30s scan interval)
- No container restart needed
- View parsed DAGs at http://localhost:${AIRFLOW_PORT}/alertflow
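As a sketch, a minimal DAG dropped into that directory could look like this; the file name, DAG id, and schedule are illustrative and not part of the project:

```python
# alertflow/dags/example_hello.py -- hypothetical file, for illustration only
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule=None, start_date=datetime(2024, 1, 1), catchup=False)
def example_hello():
    @task
    def hello() -> None:
        # Task logs end up under alertflow/logs/
        print("hello from AlertFlow")

    hello()


example_hello()
```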
Verify the DAG is registered:

```bash
./airflow.sh dags list
```

View logs:

```bash
# All services
docker compose logs -f

# Specific service
docker compose logs -f airflow-webserver
```

Restart services:

```bash
# All services
docker compose restart

# or:
docker compose down && docker compose up -d
```

Override any config via environment variables in the compose file (docker/compose.yaml):
```yaml
environment:
  AIRFLOW__CORE__EXECUTOR: CeleryExecutor
  AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
```

Run database migrations manually:

```bash
docker compose run --rm airflow-cli db migrate
```

Create an admin user manually:

```bash
docker compose run --rm airflow-cli users create \
  --username admin \
  --firstname Admin \
  --lastname User \
  --role Admin \
  --email admin@example.com \
  --password admin
```

If the port is already in use, change it in .env and restart:

```bash
# Change port in .env
echo "AIRFLOW_PORT=8081" >> .env
docker compose down
docker compose up -d
```

Check postgres health:

```bash
docker compose exec postgres pg_isready -U airflow
# Reset database (WARNING: destroys data)
docker compose down -v
docker volume rm alertflow_postgres-db-volume
docker compose up -d
```

Monitor resource usage:

```bash
docker stats
```

| Target | Description |
|---|---|
| `make build` | Build all container images |
| `make up` | Start all services in background |
| `make down` | Stop all services |
| `make restart` | Restart all services |
| `make logs` | Tail logs from all services |
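Assuming these targets wrap the compose commands shown above, a typical first run is:

```bash
# Hypothetical first run using the Makefile targets
make build
make up
make logs
```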
This setup is tailored for the InfoDengue ETL pipeline:
- Processes epidemiological data for dengue, Zika, and chikungunya
- Integrates with climate data APIs (COPERNICUS)
- Uses geospatial analysis for disease mapping
- Outputs feed the InfoDengue dashboard
For more details on the ETL pipeline, see alertflow/dags/.