AWS S3

Scheme

The scheme you can use to access your files in Amazon Simple Storage Service (S3) is:

s3://

Configuration

Access to Amazon S3 can be configured in several ways. Most require you to have an Access Key and a Secret Key.

Best practice is to create a dedicated IAM user for Apache Hop, so that you can fine-tune its permissions (for example, make it read-only) or delete the user when it’s no longer needed.

For a complete list see Working with credentials in the AWS documentation.

Below are two popular ways of configuring access.

Credentials file

You can create a credentials file at .aws/credentials in your home folder, with content like this:

[default]
aws_access_key_id = yourAccessKey
aws_secret_access_key = a-long/series-of-characters-for-your-secret-key
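
Before starting Hop, it can help to confirm that the credentials file contains both entries. The following is a minimal sketch, not part of Hop: the function name missing_credential_keys is hypothetical, and it only checks that the two keys exist and are non-empty, not that AWS accepts them.

```python
import configparser
from pathlib import Path

# The two entries shown in the credentials file above.
REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_credential_keys(path, profile="default"):
    """Return the required keys that are absent or empty in the given profile.

    Hypothetical helper for illustration only.
    """
    # interpolation=None keeps characters such as '%' in a value literal.
    parser = configparser.ConfigParser(interpolation=None)
    # A file that does not exist is skipped, so every key is reported missing.
    parser.read(path)
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [
        key
        for key in REQUIRED_KEYS
        if not parser.get(profile, key, fallback="").strip()
    ]
```

Calling missing_credential_keys(Path.home() / ".aws" / "credentials") returns an empty list when both entries are present in the [default] section.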

Variables

You can set the following system environment variables:

  • AWS_ACCESS_KEY_ID : set it to your AWS access key

  • AWS_SECRET_ACCESS_KEY : set it to your secret access key
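
Environment variables are only visible to Hop if they are set in the environment it is started from. As a quick check, the sketch below reports which of the two variables are missing or empty. The function name missing_aws_variables is hypothetical and the check is not part of Hop.

```python
import os

# The two variables listed above.
REQUIRED_VARIABLES = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_variables(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper; pass a dict to check something other than
    the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARIABLES if not env.get(name, "").strip()]


if __name__ == "__main__":
    # Prints an empty list when both variables are set.
    print(missing_aws_variables())
```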

Part size

You can set the default part size for new files on S3 by setting the following variable:

HOP_S3_VFS_PART_SIZE

This needs to be set as a global Hop configuration variable (in hop-config.json). You can use the Tools/Edit config variables menu in Hop GUI or you can use the hop-conf command line tool to do so.

Acceptable values range from a minimum of 5MB to a maximum of 5GB.

If this variable is not set, a part size of 5MB is used and the following message is printed on the console when files are created on S3:

Part size null less than minimum of 5MB, set to minimum.
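
The limits described above can be illustrated with a small sketch. It is not Hop's implementation: the function name effective_part_size is hypothetical, sizes are plain byte counts with 1MB taken as 1024 * 1024 bytes (an assumption), and rejecting values above 5GB is a choice made for this sketch, since the text does not say how Hop handles them.

```python
# Limits from the text above; the byte conversions are assumptions.
MIN_PART_SIZE = 5 * 1024 ** 2  # 5MB
MAX_PART_SIZE = 5 * 1024 ** 3  # 5GB


def effective_part_size(requested_bytes):
    """Return the part size in bytes that this sketch would use.

    None stands for "variable not set". Unset or too-small values fall
    back to the minimum, mirroring the console message shown above.
    """
    if requested_bytes is None or requested_bytes < MIN_PART_SIZE:
        return MIN_PART_SIZE
    if requested_bytes > MAX_PART_SIZE:
        # Assumption of this sketch: treat oversized values as an error.
        raise ValueError("part size exceeds the 5GB maximum")
    return requested_bytes
```

For example, effective_part_size(None) and effective_part_size(1024) both return 5242880, while a request of 10 * 1024 ** 2 bytes is returned unchanged.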

Usage and testing

To test whether the configuration works, upload a small CSV file to an S3 bucket and then use File/Open in Hop GUI. Type s3:// as the file location and press Enter (or click the refresh button). Browse to the CSV file you uploaded and open it. If everything is configured correctly, you should see its content in Hop GUI.