Predictive Hacks

How to Interact with S3 using AWS CLI


In this post, I gather some useful commands and examples from the official AWS documentation. I believe the following examples cover the basics that a Data Scientist working with AWS needs. If you do not feel comfortable with the command line, you can jump to the Basic Introduction to Boto3 tutorial, where we explain how to interact with S3 using Boto3.

AWS Command Line Interface

The AWS Command Line Interface (CLI) is a unified tool to manage your AWS services. With just one tool to download and configure, you can control multiple AWS services from the command line and automate them through scripts.

Since we work with Python, I would suggest installing the awscli package with pip (for example, inside your Anaconda environment):

pip install awscli

Once it is installed, you can check the version with the command:

aws --version

How to Configure the AWS CLI

There are several ways to configure the AWS CLI. You can do it from the command line (I use Anaconda) with the configure command, which creates the credentials file:

$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-west-2
Default output format [None]: json

You can set up credentials for different users with profiles. By default, the AWS CLI uses the default profile. You can create and use additional named profiles with different credentials and settings by specifying the --profile option and assigning a name.

The following example creates a profile named produser.

$ aws configure --profile produser
AWS Access Key ID [None]: AKIAI44QH8DHBEXAMPLE
AWS Secret Access Key [None]: je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY
Default region name [None]: us-east-1
Default output format [None]: text

You can then specify --profile profilename to use the credentials and settings stored under that name:

$ aws s3 ls --profile produser
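As an alternative to passing --profile on every command, you can export the AWS_PROFILE environment variable for the current shell session (assuming the produser profile from above exists):

```shell
# Apply the produser profile to every aws command in this shell session
export AWS_PROFILE=produser
```

With AWS_PROFILE set, a plain `aws s3 ls` uses produser without needing the --profile flag.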

CLI credentials file

The credentials and config files are updated when you run the command aws configure. The credentials file is located at ~/.aws/credentials on Linux or macOS, or at C:\Users\USERNAME\.aws\credentials on Windows. This file can contain the credential details for the default profile and any named profiles.
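For reference, a credentials file holding the two profiles configured above would look roughly like this (the keys are the same placeholder examples used earlier):

```ini
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

[produser]
aws_access_key_id = AKIAI44QH8DHBEXAMPLE
aws_secret_access_key = je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY
```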

CLI configuration file

The credentials and config files are updated when you run the command aws configure. The config file is located at ~/.aws/config on Linux or macOS, or at C:\Users\USERNAME\.aws\config on Windows. This file contains the configuration settings for the default profile and any named profiles.
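A matching config file for the same two profiles would look roughly like this. Note that in the config file (unlike the credentials file), named profiles are prefixed with the word profile:

```ini
[default]
region = us-west-2
output = json

[profile produser]
region = us-east-1
output = text
```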

Basic AWS CLI commands

How to list the S3 buckets

$ aws s3 ls

How to get the documentation

$ aws help
$ aws <command> help
$ aws <command> <subcommand> help

For example, if you want help for the s3api or s3:

$ aws s3api help
$ aws s3 help

How to create a bucket

$ aws s3 mb s3://bucket-name
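The bucket is created in your configured default region unless you pass --region explicitly. For example, to create it in us-west-2 (bucket-name is a placeholder; names must be globally unique):

```shell
# Create the bucket in a specific region instead of the default one
aws s3 mb s3://bucket-name --region us-west-2
```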

Bear in mind that there are some restrictions on bucket names.

Rules for bucket naming

The following rules apply for naming S3 buckets:

  • Bucket names must be between 3 and 63 characters long.
  • Bucket names can consist only of lowercase letters, numbers, dots (.), and hyphens (-).
  • Bucket names must begin and end with a letter or number.
  • Bucket names must not be formatted as an IP address (for example, 192.168.5.4).
  • Bucket names can’t begin with xn-- (for buckets created after February 2020).
  • Bucket names must be unique within a partition. A partition is a grouping of Regions. AWS currently has three partitions: aws (Standard Regions), aws-cn (China Regions), and aws-us-gov (AWS GovCloud [US] Regions).
  • Buckets used with Amazon S3 Transfer Acceleration can’t have dots (.) in their names. For more information about transfer acceleration, see Amazon S3 Transfer Acceleration.
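Several of these rules can be sanity-checked locally before calling mb. The following is a rough sketch only (it covers the length, the allowed character set, the start/end rule, and the xn-- rule, but not every case; my-example-bucket is just a placeholder):

```shell
name="my-example-bucket"
# 3-63 chars of lowercase letters, digits, dots, and hyphens,
# starting and ending with a letter or digit; reject the xn-- prefix.
if echo "$name" | grep -Eq '^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$' \
   && ! echo "$name" | grep -q '^xn--'; then
  echo "name looks valid"
else
  echo "name is invalid"
fi
```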

How to delete a bucket

$ aws s3 rb s3://bucket-name --force

The --force option first deletes all the objects in the bucket and then removes the bucket itself, so the command also works on non-empty buckets.

How to delete objects

$ aws s3 rm s3://bucket-name/example

How to move objects

The following example moves all objects under the prefix s3://bucket-name/example to s3://my-bucket/. Note the --recursive flag, which is required to act on more than one object:

$ aws s3 mv s3://bucket-name/example s3://my-bucket/ --recursive

The following example moves a local file from your current working directory to the Amazon S3 bucket.

$ aws s3 mv filename.txt s3://bucket-name

The following example moves a file from your Amazon S3 bucket to your current working directory, where ./ specifies your current working directory.

$ aws s3 mv s3://bucket-name/filename.txt ./

How to copy objects

The following example copies all objects under the prefix s3://bucket-name/example to s3://my-bucket/, again with --recursive:

$ aws s3 cp s3://bucket-name/example s3://my-bucket/ --recursive

The following example copies a local file from your current working directory to the Amazon S3 bucket with the s3 cp command.

$ aws s3 cp filename.txt s3://bucket-name

The following example copies a file from your Amazon S3 bucket to your current working directory, where ./ specifies your current working directory.

$ aws s3 cp s3://bucket-name/filename.txt ./

The --recursive option

When you use this option, the command is performed on all files or objects under the specified directory or prefix. The following example deletes s3://my-bucket/path and all of its contents.

$ aws s3 rm s3://my-bucket/path --recursive
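Before running a destructive recursive command like the one above, it can be worth previewing it. The s3 commands accept a --dryrun flag that prints the operations that would be performed without actually executing them:

```shell
# Preview the recursive deletion without removing anything
aws s3 rm s3://my-bucket/path --recursive --dryrun
```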

Recursively copying local files to S3

When passed with the parameter --recursive, the following cp command recursively copies all files under a specified directory to a specified bucket and prefix while excluding some files by using an --exclude parameter. In this example, the directory myDir has the files test1.txt and test2.jpg:

aws s3 cp myDir s3://mybucket/ --recursive --exclude "*.jpg"

Output:

upload: myDir/test1.txt to s3://mybucket/test1.txt

Recursively copying S3 objects to another bucket

When passed with the parameter --recursive, the following cp command recursively copies all objects under a specified bucket to another bucket while excluding some objects by using an --exclude parameter. In this example, the bucket mybucket has the objects test1.txt and another/test1.txt:

aws s3 cp s3://mybucket/ s3://mybucket2/ --recursive --exclude "another/*"

Output:

copy: s3://mybucket/test1.txt to s3://mybucket2/test1.txt

You can combine --exclude and --include options to copy only objects that match a pattern, excluding all others:

aws s3 cp s3://mybucket/logs/ s3://mybucket2/logs/ --recursive --exclude "*" --include "*.log"

Output:

copy: s3://mybucket/logs/test/test.log to s3://mybucket2/logs/test/test.log
copy: s3://mybucket/logs/test3.log to s3://mybucket2/logs/test3.log

Recursively copying S3 objects to a local directory

When passed with the parameter --recursive, the following cp command recursively copies all objects under a specified prefix and bucket to a specified directory. In this example, the bucket mybucket has the objects test1.txt and test2.txt:

aws s3 cp s3://mybucket . --recursive

Output:

download: s3://mybucket/test1.txt to test1.txt
download: s3://mybucket/test2.txt to test2.txt
