Assume that you want to get data from S3 and create an external table in Hive. The syntax is the following:
CREATE EXTERNAL TABLE [IF NOT EXISTS] [db_name.] table_name [(col_name data_type [COMMENT col_comment], ...)] [COMMENT table_comment] [ROW FORMAT row_format] [FIELDS TERMINATED BY char] [STORED AS file_format] [LOCATION hdfs_path];
You can have a look at Hive create table documentation. Let’s create an external table from the data which are in S3 bucket s3://my-bucket/data_folder
.
CREATE EXTERNAL TABLE IF NOT EXISTS my_table( ID INT, NAME STRING, SURNAME STRING) COMMENT 'staples_configfile' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE location "s3a://my-bucket/data_folder" tblproperties ("skip.header.line.count"="1");
You can also create an empty table and then fill it with data as follows:
--create raw results CREATE EXTERNAL TABLE IF NOT EXISTS my_table( ID INT, NAME STRING, SURNAME STRING, opened_dt STRING) STORED AS ORC LOCATION "s3a://my-bucket/my_key/folder/my_table"; insert overwrite table my_table select * from existing_table;