Variables
=========

Variables are useful for sharing parameters between DAGs. Say there is a DAG
with a task that stages API output in S3:

.. code-block:: python

    # Import paths assume Airflow 2 with the Amazon provider installed.
    from airflow.operators.python import PythonOperator
    from airflow.providers.amazon.aws.hooks.s3 import S3Hook


    def upload(**kwargs):
        s3 = S3Hook()
        # Jinja templates are not rendered inside a python_callable, so
        # build the key from the task context ('ds' is the run's date stamp).
        s3.load_file('./output.json',
                     f"api/{kwargs['ds']}.json",
                     bucket_name='staging-data')


    task = PythonOperator(
        python_callable=upload,
        task_id="stage_api",
        dag=dag)

Another DAG might then have a task that wants to use this data:

.. code-block:: python

    def download(**kwargs):
        s3 = S3Hook()
        obj = s3.get_key(f"api/{kwargs['ds']}.json",
                         bucket_name='staging-data')
        obj.download_file('./api_output.json')


    task = PythonOperator(
        python_callable=download,
        task_id="download_api_data",
        dag=dag)

The problem here is that, if the first DAG ever changes its output location,
say from ``api/<ds>.json`` to ``api/users/<ds>.json``, the second task would
still attempt to load the file from the old location. To avoid having to
update every task that relies on this output location, we can instead make it
a variable. That way we only change the variable's value, and every task
referencing it picks up the new value on its next run. Our tasks utilizing
variables now look something like this:

.. code-block:: python

    from airflow.models import Variable


    def upload(**kwargs):
        # Fetch inside the callable so the lookup happens at run time,
        # not every time the DAG file is parsed.
        key = Variable.get('output_key')
        bucket = Variable.get('output_bucket')
        s3 = S3Hook()
        s3.load_file('./output.json', key, bucket_name=bucket)


    task = PythonOperator(
        python_callable=upload,
        task_id="stage_api",
        dag=dag)


    def download(**kwargs):
        key = Variable.get('output_key')
        bucket = Variable.get('output_bucket')
        s3 = S3Hook()
        obj = s3.get_key(key, bucket_name=bucket)
        obj.download_file('./api_output.json')


    task = PythonOperator(
        python_callable=download,
        task_id="download_api_data",
        dag=dag)

Variable Definitions
--------------------

.. caution:: Variables should *not* contain secrets. Use connections for that.

Variables may be set through the admin UI. However, because Airflow is
deployed with near-identical configurations across environments, we require
that variables be set in the files in the ``variables/`` directory. These
files are loaded into Airflow on container start, depending on the current
environment (a sketch of this step appears at the end of this page). This
allows variables to be tracked in our git flow and lets us bootstrap a new
deployment without manually configuring every setting.

Common Variables
++++++++++++++++

Variables that are consistent across all environments may be set in the
``common.json`` variable file. These are values such as Slack channels or
external accounts that only ever have one value.

Environment Specific Variables
++++++++++++++++++++++++++++++

These variables change depending on the deployment environment: ``dev``,
``qa``, or ``prd``. They are often values such as bucket locations or service
names. Example files are sketched below.
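
As a concrete illustration (the variable names and values below are
hypothetical, and the exact layout is whatever our loader expects), each file
could be a flat JSON object mapping variable names to values:

``variables/common.json``:

.. code-block:: json

    {
        "alerts_slack_channel": "#data-eng-alerts"
    }

``variables/dev.json``:

.. code-block:: json

    {
        "output_key": "api/output.json",
        "output_bucket": "staging-data-dev"
    }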
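
How the files are applied at container start is deployment specific, but a
minimal bootstrap sketch, assuming a hypothetical ``ENVIRONMENT`` setting of
``dev``, ``qa``, or ``prd`` and flat JSON files as above, might look like:

.. code-block:: python

    # Hypothetical bootstrap step, run inside the container entrypoint
    # where the Airflow metadata database is reachable.
    import json
    import os

    from airflow.models import Variable

    env = os.environ["ENVIRONMENT"]  # assumed: dev, qa, or prd

    # Load shared values first, then the environment-specific file so
    # its values win on any overlap.
    for name in ("common", env):
        with open(f"variables/{name}.json") as f:
            for key, value in json.load(f).items():
                Variable.set(key, value)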