We will provide a walk-through example of how to create a partition table in Hive. Note that the partition is done on columns, where you can have multiple partition keys. Let say that we want to partition our table on the Country
field. Assume that we have already created an external table called MY_TABLE
and we have to create a partitioned one called MY_PARTITIONED_TABLE
.
--- Create the partitioned table CREATE TABLE MY_PARTITIONED_TABLE ( Name STRING, Surname STRING, UserID INTEGER) PARTITIONED BY (Country STRING) STORED AS PARQUET LOCATION "s3://my-bucket/mypath/"
So far, the table is empty, and the location is where the data will be stored once we start inserting data in the external table. Note that we want a dynamic partition.
SET hive.exec.dynamic.partition.mode=nonstrict; INSERT OVERWRITE TABLE MY_PARTITIONED_TABLE PARTITION(Country) SELECT * FROM MY_TABLE; SET hive.exec.dynamic.partition.mode=strict;
And voilà! The MY_PARTITIONED_TABLE
is a partitioned table on the Country
column.