In Airflow, you can create a connection to S3 in order to, for instance, store logs in an S3 bucket. To do so, open the Airflow web interface, go to the "Admin" menu, then the "Connections" submenu, and click on the blue + sign.
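If you prefer the command line, the same connection can also be created with the Airflow CLI. The following is a minimal sketch, assuming Airflow 2.x and a connection that only carries a region in its extras (credentials could be added with flags such as --conn-login and --conn-password); your_connection_id is a placeholder:

airflow connections add your_connection_id \
    --conn-type aws \
    --conn-extra '{"region_name": "eu-central-1"}'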

However, once you’ve created your connection, there is no easy way to check from the interface that it actually works. Here is a small procedure to test a newly created S3 connection, provided you have SSH access to the server where Airflow is deployed:

  • Connect to the machine where Airflow is deployed:

ssh your_login@your_airflow_server
  • Create a test.py file with the following content, replacing your_connection_id with the connection id you’ve just created and your_s3_bucket with the name of the bucket you want to connect to (a more defensive variant of this script is sketched at the end of this post):

from airflow.providers.amazon.aws.hooks.s3 import S3Hook

remote_conn_id = 'your_connection_id'
remote_location = 'your_s3_bucket'

hook = S3Hook(remote_conn_id, transfer_config_args={'use_threads': False})  # use_threads=False disables boto3 threaded transfers
print(hook.list_keys(remote_location)[0:10])  # print the first 10 keys found in the bucket
  • Execute the test.py script:

python3 test.py

If your connection is working, you should get the first 10 keys of your S3 bucket:

[2021-10-11 15:33:12,934] {base_aws.py:368} INFO - Airflow Connection: aws_conn_id=your_connection_id
[2021-10-11 15:33:12,972] {base_aws.py:179} INFO - No credentials retrieved from Connection
[2021-10-11 15:33:12,973] {base_aws.py:82} INFO - Retrieving region_name from Connection.extra_config['region_name']
[2021-10-11 15:33:12,973] {base_aws.py:84} INFO - Creating session with aws_access_key_id=None region_name=eu-central-1
[2021-10-11 15:33:12,980] {base_aws.py:157} INFO - role_arn is None
['directory1/', 'directory1/file1.txt', 'directory1/file2.txt', 'directory1/file3.txt', 'directory1/file4.txt',
'directory1/file5.txt', 'directory1/file6.txt', 'directory1/file7.txt', 'directory1/file8.txt', 'directory1/file9.txt']

If you can’t connect to your S3 bucket, you will get a Python stack trace instead. For instance, if the bucket you are trying to connect to does not exist:

[2021-10-11 15:39:45,558] {base_aws.py:368} INFO - Airflow Connection: aws_conn_id=your_connection_id
[2021-10-11 15:39:45,588] {base_aws.py:179} INFO - No credentials retrieved from Connection
[2021-10-11 15:39:45,588] {base_aws.py:82} INFO - Retrieving region_name from Connection.extra_config['region_name']
[2021-10-11 15:39:45,588] {base_aws.py:84} INFO - Creating session with aws_access_key_id=None region_name=eu-central-1
[2021-10-11 15:39:45,596] {base_aws.py:157} INFO - role_arn is None
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    print(hook.list_keys(remote_location)[0:10])
  File "/home/ubuntu/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 62, in wrapper
    return func(*bound_args.args, **bound_args.kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 302, in list_keys
    for page in response:
  File "/home/ubuntu/.local/lib/python3.8/site-packages/botocore∕paginate.py", line 255, in __iter__
    response = self._make_request(current_kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/botocore∕paginate.py", line 332, in _make_request_
    return self._method(**current_kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/botocore∕client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/botocore∕client.py", line 676, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.NoSuchBucket: An error occurred (NoSuchBucket) when calling the ListObjectsV2 operation: The specified bucket does not exist
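Finally, if you want a clearer pass/fail signal than a raw key listing, the script can be made slightly more defensive. Here is a minimal sketch under the same assumptions as above (your_connection_id and your_s3_bucket remain placeholders); it relies on S3Hook.check_for_bucket, which returns a boolean instead of raising when the bucket is missing:

from airflow.providers.amazon.aws.hooks.s3 import S3Hook

remote_conn_id = 'your_connection_id'  # placeholder: your connection id
remote_location = 'your_s3_bucket'     # placeholder: your bucket name

hook = S3Hook(remote_conn_id, transfer_config_args={'use_threads': False})

# check_for_bucket returns True/False instead of raising on a missing bucket
if not hook.check_for_bucket(remote_location):
    raise SystemExit(f"Bucket {remote_location!r} does not exist or is not reachable")

keys = hook.list_keys(remote_location)
print(f"Connection OK: {len(keys)} keys found, first 10: {keys[:10]}")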