In Apache Hive, you can load data into a table using the LOAD DATA
statement. The basic syntax for loading data into a table is as follows:
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename;
LOCAL is an optional keyword that indicates the data file is on the local file system rather than in HDFS.
INPATH specifies the path of the file you want to load.
OVERWRITE is an optional keyword that replaces the existing data in the table; without it, the new data is added to what is already there.
tablename is the name of the table where the data will be loaded.
For example, to load data from a file called "data.txt" located in the HDFS directory "/tmp" into a table called "employees", you would use the following statement:
LOAD DATA INPATH '/tmp/data.txt' INTO TABLE employees;
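The optional keywords from the syntax above can be combined with the same example. The following is a sketch only, reusing the "/tmp/data.txt" path and "employees" table from the example; it assumes the file exists at that path on the relevant file system.

```sql
-- With LOCAL, the path is resolved on the local file system instead of HDFS.
LOAD DATA LOCAL INPATH '/tmp/data.txt' INTO TABLE employees;

-- With OVERWRITE, the current contents of the table are replaced
-- rather than added to.
LOAD DATA INPATH '/tmp/data.txt' OVERWRITE INTO TABLE employees;
```

Note that "local" depends on where the statement runs: when you connect through a remote server, it may mean that server's file system rather than your own machine.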
If the table does not exist, Hive will return an error. If you want to create a table and fill it in one step with the result of a query, you can use the CREATE TABLE AS SELECT (CTAS) statement:
CREATE TABLE new_table
AS SELECT * FROM existing_table;
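Please note that Hive typically stores table data in the Hadoop Distributed File System (HDFS). Without the LOCAL keyword, the filepath specified in the LOAD DATA statement must be a valid HDFS path, and the file is moved into the table's storage directory. With LOCAL, the path refers to the local file system and the file is copied instead.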
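Because LOAD DATA requires the table to exist, and only moves or copies the file without converting it, the table definition has to match the file layout. Here is a minimal sketch, assuming "data.txt" is a comma-separated text file; the column names and types are hypothetical and not taken from the original example.

```sql
-- Hypothetical schema: adjust the columns to match the actual file.
CREATE TABLE IF NOT EXISTS employees (
  id     INT,
  name   STRING,
  salary DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','   -- assumes comma-separated fields
STORED AS TEXTFILE;

-- The file is placed into the table's directory as-is.
LOAD DATA INPATH '/tmp/data.txt' INTO TABLE employees;
```

If the delimiter or column order does not match the file, the load still succeeds, but queries may return NULL values for fields that cannot be parsed.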
You can also use the INSERT INTO statement to add the rows returned by a query to a table. The syntax is similar to the SQL INSERT INTO ... SELECT statement:
INSERT INTO TABLE tablename
[PARTITION (partcol1[=val1], partcol2[=val2] ...)]
select_statement;
For example, you can use the following statement to insert data into the "employees" table:
INSERT INTO TABLE employees
SELECT * FROM tmp_employees;
In this example, the rows from the "tmp_employees" table are appended to the "employees" table; rows already in "employees" are kept.
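The optional PARTITION clause in the syntax above can be illustrated with a partitioned table. This is a sketch under stated assumptions: the "employees_by_dept" table and the "id", "name", and "dept" columns are hypothetical, and "tmp_employees" is assumed to contain those columns.

```sql
-- Hypothetical partitioned table; dept is the partition column.
CREATE TABLE IF NOT EXISTS employees_by_dept (
  id   INT,
  name STRING
)
PARTITIONED BY (dept STRING);

-- Static partition insert: the selected rows are added to dept='sales'.
INSERT INTO TABLE employees_by_dept PARTITION (dept = 'sales')
SELECT id, name
FROM tmp_employees
WHERE dept = 'sales';

-- INSERT OVERWRITE replaces the partition's contents instead of appending.
INSERT OVERWRITE TABLE employees_by_dept PARTITION (dept = 'sales')
SELECT id, name
FROM tmp_employees
WHERE dept = 'sales';
```

The partition column is left out of the SELECT list here because its value is fixed in the PARTITION clause.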