Predictive Hacks

How to create external tables in Hive

Assume that you want to get data from S3 and create an external table in Hive. The syntax is the following:

CREATE EXTERNAL TABLE [IF NOT EXISTS] [db_name.] table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[ROW FORMAT row_format]
[FIELDS TERMINATED BY char]
[STORED AS file_format]
[LOCATION hdfs_path];

You can have a look at Hive create table documentation. Let’s create an external table from the data which are in S3 bucket s3://my-bucket/my_key/my_file.

CREATE EXTERNAL TABLE IF NOT EXISTS my_table(
        ID INT,
        NAME STRING,
        SURNAME STRING)
    COMMENT 'staples_configfile'
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    location "s3a://my-bucket/my_key/my_file"
    tblproperties ("skip.header.line.count"="1");

You can also create an empty table and then fill it with data as follows:

--create raw results
CREATE EXTERNAL TABLE IF NOT EXISTS my_table(
        ID INT,
        NAME STRING,
        SURNAME STRING,
        opened_dt STRING)
    STORED AS ORC
    LOCATION "s3a://my-bucket/my_key/folder/my_table";
 
     
insert overwrite table my_table
select * from existing_table;

Share This Post

Share on facebook
Share on linkedin
Share on twitter
Share on email

Subscribe To Our Newsletter

Get updates and learn from the best

More To Explore

Python

Get Started with Hugging Face Auto Train

Hugging Face has launched the auto train, which is a new way to automatically train, evaluate and deploy state-of-the-art Machine