Hive – Load Data Into Table

In Apache Hive, you can load data into a table using the LOAD DATA statement. The basic syntax for loading data into a table is as follows:


LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename;

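LOCAL is an optional keyword. With it, filepath refers to the local file system and the file is copied into the table; without it, filepath refers to HDFS and the file is moved into the table.
INPATH specifies the path of the file (or directory of files) you want to load.
OVERWRITE is an optional keyword that replaces the existing data in the table; without it, the new file is added to the data already there.
tablename is the name of the table where the data will be loaded.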
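
As a rough illustration of how the optional keywords combine, the two statements below load the same placeholder file in different ways. The path '/tmp/data.txt' and the table name employees are example names only; adjust them to your environment.

```sql
-- Copy a file from the local file system and replace
-- whatever the table currently holds.
LOAD DATA LOCAL INPATH '/tmp/data.txt' OVERWRITE INTO TABLE employees;

-- Move a file that is already in HDFS into the table,
-- keeping the rows that are already there.
LOAD DATA INPATH '/tmp/data.txt' INTO TABLE employees;
```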

For example, to load data from a file called "data.txt" located in the "/tmp" directory into a table called "employees", you would use the following statement:


LOAD DATA INPATH '/tmp/data.txt' INTO TABLE employees;
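
LOAD DATA generally copies or moves the file as-is without parsing it, so the table's storage format has to match the file. The sketch below assumes "data.txt" is a comma-separated text file with three fields; the column names, types, and delimiter are hypothetical and only illustrate the idea.

```sql
-- Hypothetical schema: the columns and the comma delimiter are assumptions,
-- chosen so that the table layout matches a comma-separated text file.
CREATE TABLE IF NOT EXISTS employees (
  id INT,
  name STRING,
  salary DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

LOAD DATA INPATH '/tmp/data.txt' INTO TABLE employees;

-- Quick check that the rows were split into columns as expected.
SELECT * FROM employees LIMIT 5;
```

If the delimiter in the table definition does not match the file, the load still succeeds, but queries typically return NULL for the columns that could not be parsed.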

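If the table does not exist, Hive will return an error, because LOAD DATA does not create tables; create the table first with CREATE TABLE. If you want to create a table and fill it in one step from the result of a query (rather than from a file), you can use the CREATE TABLE AS SELECT statement: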


CREATE TABLE new_table
AS SELECT * FROM existing_table;

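Please note that Hive uses the Hadoop Distributed File System (HDFS) to store data. Without the LOCAL keyword, the filepath specified in the LOAD DATA statement must be a valid HDFS path, and Hive moves the file into the table's storage directory, so it no longer exists at its original location afterwards. With LOCAL, the filepath refers to the local file system and the file is copied instead.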

You can also use the INSERT INTO statement to append the result of a query to a table. The syntax is similar to the SQL INSERT INTO ... SELECT statement:


INSERT INTO TABLE tablename
[PARTITION (partcol1[=val1], partcol2[=val2] ...)]
select_statement;

For example, you can use the following statement to insert data into the "employees" table:


INSERT INTO TABLE employees
SELECT * FROM tmp_employees;

In this example, the rows from the "tmp_employees" table are appended to the "employees" table; rows already in "employees" are kept.
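
The optional PARTITION clause from the syntax above can be sketched as follows. The table employees_by_country, its columns, and the partition key country are hypothetical, and the example assumes that "tmp_employees" has id, name, and country columns.

```sql
-- Hypothetical partitioned table; columns and partition key are assumptions.
CREATE TABLE IF NOT EXISTS employees_by_country (
  id INT,
  name STRING
)
PARTITIONED BY (country STRING);

-- Static partition: the partition value is fixed in the statement, so the
-- SELECT list contains only the non-partition columns.
INSERT INTO TABLE employees_by_country
PARTITION (country = 'US')
SELECT id, name
FROM tmp_employees
WHERE country = 'US';
```

Because the partition value is given explicitly, every row returned by the query lands in the country = 'US' partition, which is why the WHERE clause filters on the same value.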
