Predictive Hacks

Example of External Tables with Snowflake

Snowflake gives us the option to create external tables. We will provide an example of how to create an external table in Snowflake over data located in S3. Assume we have a CSV file called Example.csv with the following content:

ID,Name,DOB
1,George,1980-03-04
2,Billy,1983-10-22
3,Marco,1990-01-01
4,Alex,1970-12-07
5,Maria,2000-01-01
6,Christina,1990-06-01

and it is placed in an S3 bucket. We have already created an S3 stage, and we are ready to create the external table. There are two options: one is to create the table directly, and the other is to specify its columns explicitly.
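For completeness, the stage can be created roughly as follows. This is only a sketch: the bucket URL and credentials below are placeholders, not the actual values used in this example, and a storage integration is the recommended alternative to inline credentials.

```sql
-- sketch: create the external stage pointing at the bucket (values are hypothetical)
create or replace stage my_s3_test_snowflake_stage
  url = 's3://my-bucket/'
  credentials = (aws_key_id = 'xxx' aws_secret_key = 'xxx');
```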

Option A

Note that the data are under the path demo_data and our external stage is called my_s3_test_snowflake_stage.

create or replace external table my_ext_table
with location = @my_s3_test_snowflake_stage/demo_data/
auto_refresh = true
file_format = (type = CSV, skip_header = 1);

select * from my_ext_table;

Above, you can see the output of the my_ext_table table, and it is most probably not what we expected, since each row is returned as a single semi-structured VALUE column rather than separate fields. However, we can convert it to tabular data as follows:

select value:c1::int as ID,
       value:c2::varchar as Name,
       value:c3::date as Date
from my_ext_table;

Option B

Alternatively, we can specify the column names and types up front, as shown below:

create or replace external table my_ext_table (
  ID int as (value:c1::int),
  Name varchar as (value:c2::varchar),
  Date date as (value:c3::date)
)
with location = @my_s3_test_snowflake_stage/demo_data/
auto_refresh = true
file_format = (type = CSV, skip_header = 1);

select * from my_ext_table;
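One benefit of Option B is that the columns are already typed, so we can filter on them directly. For example, to get everyone born in 1990 or later, a sketch based on the sample data above:

```sql
select Name, Date
from my_ext_table
where Date >= '1990-01-01';
-- with the sample data: Marco, Maria and Christina
```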

Things to Know

Refresh the Data

Unlike Hive, the external table is not updated automatically when the data changes in the S3 bucket. So, in order to update it, you need to run the following command:

 alter external table my_ext_table refresh;

However, if you want your external table to refresh automatically, you can achieve it by configuring S3 event notifications that publish to Snowflake's SQS queue.
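The SQS queue that the bucket's event notifications should target is managed by Snowflake. One way to find its ARN is via the notification_channel column of the SHOW output (a sketch; run it in the schema that holds the table):

```sql
show external tables;
-- the notification_channel column of the output holds the ARN of the
-- Snowflake-managed SQS queue to use in the S3 event notification
```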

Multiple Files

You can have multiple files in the S3 bucket, and Snowflake will skip the header of each file, since skip_header = 1 in the file format applies to every file individually. Let’s place the same file twice in the S3 bucket, of course with a different name.

create or replace external table my_ext_table (
  ID int as (value:c1::int),
  Name varchar as (value:c2::varchar),
  Date date as (value:c3::date)
)
with location = @my_s3_test_snowflake_stage/demo_data/
auto_refresh = true
file_format = (type = CSV, skip_header = 1);

select * from my_ext_table;
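After uploading the second file, a quick sanity check (assuming both copies contain the six data rows shown earlier) verifies that the header was skipped in each file:

```sql
alter external table my_ext_table refresh;
select count(*) from my_ext_table;
-- expect 12 rows (6 per file), i.e. no header rows leaked in
```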

Get the filename

You can also get the file that each row of the external table came from, using the metadata$filename pseudocolumn.

select metadata$filename, name from my_ext_table;
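The pseudocolumn can also be used in aggregations, for example to count how many rows came from each file (a sketch continuing the two-file example above):

```sql
select metadata$filename, count(*) as n_rows
from my_ext_table
group by metadata$filename;
```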

