In this post, I have gathered some useful commands and examples from the official AWS documentation. I believe the following examples cover the basics a Data Scientist needs when working with AWS. If you do not feel comfortable with the command line, you can jump to the Basic Introduction to Boto3 tutorial, where we explain how to interact with S3 using Boto3.
AWS Command Line Interface
The AWS Command Line Interface (CLI) is a unified tool to manage your AWS services. With just one tool to download and configure, you can control multiple AWS services from the command line and automate them through scripts.
Since we work with Python, I suggest installing the awscli package with pip (I use the Anaconda distribution). Once you have installed it, you can check the version with the command:
aws --version
How to Configure the AWS CLI
There are several ways to configure the AWS CLI. You can do it from the command line (I use Anaconda) with the configure command, which creates the credentials file:
$ aws configure
AWS Access Key ID [None]: YOUR_ACCESS_KEY
AWS Secret Access Key [None]: YOUR_SECRET_ACCESS
Default region name [None]: us-west-2
Default output format [None]: json
You can create different users by applying profiles. By default, the AWS CLI uses the default profile. You can create and use additional named profiles with varying credentials and settings by specifying the --profile option and assigning a name.
The following example creates a profile named produser.
$ aws configure --profile produser
AWS Access Key ID [None]: AKIAI44QH8DHBEXAMPLE
AWS Secret Access Key [None]: je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY
Default region name [None]: us-east-1
Default output format [None]: text
You can then specify --profile profilename to use the credentials and settings stored under that name:
$ aws s3 ls --profile produser
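The same named profiles work from Python; a minimal boto3 sketch using the produser profile created above:

import boto3

# Pick up the credentials and region stored under the "produser" profile
session = boto3.Session(profile_name="produser")
s3 = session.client("s3")
print([b["Name"] for b in s3.list_buckets()["Buckets"]])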
If you do not want to add the --profile option to every single command when you are dealing with different users, you can set the AWS_PROFILE environment variable at the command line.
Linux or macOS
$ export AWS_PROFILE=user1
Windows
C:\> setx AWS_PROFILE user1
Using set to set an environment variable changes the value used until the end of the current command prompt session, or until you set the variable to a different value.
Using setx to set an environment variable changes the value in all command shells that you create after running the command. It does not affect any command shell that is already running at the time you run the command. Close and restart the command shell to see the effects of the change.
CLI credentials file
The credentials and config files are updated when you run the command aws configure. The credentials file is located at ~/.aws/credentials on Linux or macOS, or at C:\Users\USERNAME\.aws\credentials on Windows. This file can contain the credential details for the default profile and any named profiles.
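For reference, a credentials file holding the default profile plus the produser profile from above looks like this (the keys are the placeholder examples used earlier):

[default]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_ACCESS

[produser]
aws_access_key_id = AKIAI44QH8DHBEXAMPLE
aws_secret_access_key = je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY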
CLI configuration file
The credentials and config files are updated when you run the command aws configure. The config file is located at ~/.aws/config on Linux or macOS, or at C:\Users\USERNAME\.aws\config on Windows. This file contains the configuration settings for the default profile and any named profiles.
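The matching config file would look like this; note that, unlike in the credentials file, named profiles here take a profile prefix:

[default]
region = us-west-2
output = json

[profile produser]
region = us-east-1
output = text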
Basic AWS CLI commands
How to list the S3 buckets
aws s3api list-buckets
Or:
aws s3 ls
In case you want to list the contents of a specific bucket:
aws s3 ls s3://bucket-name
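The boto3 equivalents, with bucket-name as a placeholder:

import boto3

s3 = boto3.client("s3")

# List all buckets in the account
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])

# List the objects in one bucket (first page, up to 1,000 keys)
response = s3.list_objects_v2(Bucket="bucket-name")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])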
How to get the documentation
$ aws help
$ aws <command> help
$ aws <command> <subcommand> help
For example, if you want help for s3api or s3:
$ aws s3api help
$ aws s3 help
How to create a bucket
$ aws s3 mb s3://bucket-name
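The boto3 counterpart is create_bucket. A sketch, assuming the us-west-2 region (outside us-east-1 you must pass a LocationConstraint):

import boto3

s3 = boto3.client("s3")

# Outside us-east-1 the region must be given explicitly
s3.create_bucket(
    Bucket="bucket-name",
    CreateBucketConfiguration={"LocationConstraint": "us-west-2"},
)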
Bear in mind that there are some restrictions on bucket names. Let's outline the rules for bucket naming:
Rules for bucket naming
The following rules apply for naming S3 buckets:
- Bucket names must be between 3 and 63 characters long.
- Bucket names can consist only of lowercase letters, numbers, dots (.), and hyphens (-).
- Bucket names must begin and end with a letter or number.
- Bucket names must not be formatted as an IP address (for example, 192.168.5.4).
- Bucket names can't begin with xn-- (for buckets created after February 2020).
- Bucket names must be unique within a partition. A partition is a grouping of Regions. AWS currently has three partitions: aws (Standard Regions), aws-cn (China Regions), and aws-us-gov (AWS GovCloud [US] Regions).
- Buckets used with Amazon S3 Transfer Acceleration can't have dots (.) in their names. For more information about transfer acceleration, see Amazon S3 Transfer Acceleration.
How to delete a bucket
$ aws s3 rb s3://bucket-name --force
The --force option first removes all objects in the bucket and then removes the bucket itself; without it, rb only works on an empty bucket.
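A boto3 sketch of the same operation; as with the CLI, the bucket must be emptied before it can be deleted:

import boto3

bucket = boto3.resource("s3").Bucket("bucket-name")

# Empty the bucket first (a versioned bucket would also need
# bucket.object_versions.delete()), then remove the bucket itself
bucket.objects.all().delete()
bucket.delete()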
How to delete objects
$ aws s3 rm s3://bucket-name/example
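The boto3 counterpart removes a single object by key:

import boto3

s3 = boto3.client("s3")

# Delete the object stored under the key "example"
s3.delete_object(Bucket="bucket-name", Key="example")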
How to move objects
The following example moves an object from s3://bucket-name/example to s3://my-bucket/ (to move everything under a prefix, add the --recursive option).
$ aws s3 mv s3://bucket-name/example s3://my-bucket/
The following example moves a local file from your current working directory to an Amazon S3 bucket.
$ aws s3 mv filename.txt s3://bucket-name
The following example moves a file from your Amazon S3 bucket to your current working directory, where ./ specifies your current working directory.
$ aws s3 mv s3://bucket-name/filename.txt ./
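boto3 has no single "move" call; the usual pattern is a copy followed by a delete. A minimal sketch with placeholder bucket and key names:

import boto3

s3 = boto3.client("s3")

# "Move" = copy to the destination, then delete the source.
# copy_object handles objects up to 5 GB; for larger objects use the
# managed s3.copy(), which performs a multipart copy.
s3.copy_object(
    Bucket="my-bucket",
    Key="example",
    CopySource={"Bucket": "bucket-name", "Key": "example"},
)
s3.delete_object(Bucket="bucket-name", Key="example")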
How to copy objects
The following example copies an object from s3://bucket-name/example to s3://my-bucket/ (again, add --recursive to copy everything under a prefix).
$ aws s3 cp s3://bucket-name/example s3://my-bucket/
The following example copies a local file from your current working directory to the Amazon S3 bucket with the s3 cp command.
$ aws s3 cp filename.txt s3://bucket-name
The following example copies a file from your Amazon S3 bucket to your current working directory, where ./ specifies your current working directory.
$ aws s3 cp s3://bucket-name/filename.txt ./
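In boto3, these single-file transfers map to upload_file and download_file, the managed transfer methods (they switch to multipart transfers for large files automatically):

import boto3

s3 = boto3.client("s3")

# Local file -> S3: (Filename, Bucket, Key)
s3.upload_file("filename.txt", "bucket-name", "filename.txt")

# S3 -> local file: (Bucket, Key, Filename)
s3.download_file("bucket-name", "filename.txt", "./filename.txt")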
--recursive
When you use this option, the command is performed on all files or objects under the specified directory or prefix. The following example deletes s3://my-bucket/path and all of its contents.
$ aws s3 rm s3://my-bucket/path --recursive
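The boto3 equivalent filters on the key prefix; a sketch (note the trailing slash, so that keys under, say, path2/ are not caught):

import boto3

bucket = boto3.resource("s3").Bucket("my-bucket")

# Delete every object whose key starts with "path/"
bucket.objects.filter(Prefix="path/").delete()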
Recursively copying local files to S3
When passed with the parameter --recursive, the following cp command recursively copies all files under a specified directory to a specified bucket and prefix while excluding some files by using an --exclude parameter. In this example, the directory myDir has the files test1.txt and test2.jpg:
aws s3 cp myDir s3://mybucket/ --recursive --exclude "*.jpg"
Output:
upload: myDir/test1.txt to s3://mybucket/test1.txt
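A rough boto3 equivalent of this recursive upload; the exclusion filter here is hand-rolled, since boto3 itself has no --exclude option:

import os
import boto3

s3 = boto3.client("s3")

# Walk myDir and upload everything except *.jpg files
for root, _dirs, files in os.walk("myDir"):
    for name in files:
        if name.endswith(".jpg"):  # mirrors --exclude "*.jpg"
            continue
        local_path = os.path.join(root, name)
        key = os.path.relpath(local_path, "myDir").replace(os.sep, "/")
        s3.upload_file(local_path, "mybucket", key)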
Recursively copying S3 objects to another bucket
When passed with the parameter --recursive, the following cp command recursively copies all objects under a specified bucket to another bucket while excluding some objects by using an --exclude parameter. In this example, the bucket mybucket has the objects test1.txt and another/test1.txt:
aws s3 cp s3://mybucket/ s3://mybucket2/ --recursive --exclude "another/*"
Output:
copy: s3://mybucket/test1.txt to s3://mybucket2/test1.txt
You can combine the --exclude and --include options to copy only objects that match a pattern, excluding all others:
aws s3 cp s3://mybucket/logs/ s3://mybucket2/logs/ --recursive --exclude "*" --include "*.log"
Output:
copy: s3://mybucket/logs/test/test.log to s3://mybucket2/logs/test/test.log
copy: s3://mybucket/logs/test3.log to s3://mybucket2/logs/test3.log
Recursively copying S3 objects to a local directory
When passed with the parameter --recursive, the following cp command recursively copies all objects under a specified prefix and bucket to a specified directory. In this example, the bucket mybucket has the objects test1.txt and test2.txt:
aws s3 cp s3://mybucket . --recursive
Output:
download: s3://mybucket/test1.txt to test1.txt
download: s3://mybucket/test2.txt to test2.txt
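And a boto3 sketch of the recursive download, recreating key prefixes as local directories:

import os
import boto3

bucket = boto3.resource("s3").Bucket("mybucket")

# Download every object, recreating prefixes as directories
for obj in bucket.objects.all():
    if obj.key.endswith("/"):  # skip zero-byte "folder" placeholders
        continue
    local_path = os.path.join(".", obj.key)
    os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
    bucket.download_file(obj.key, local_path)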