
How to load data into Hive

Loading data from flat files into Hive:

LOAD DATA LOCAL INPATH 'absolute_filename' OVERWRITE INTO TABLE table_name;

Loads a file that contains two columns separated by ctrl-a into the specified table. 'local' signifies that the input file is on the local file system; if 'local' is omitted, Hive looks for the file in HDFS.

The keyword 'overwrite' signifies that existing data in the table is deleted. If the 'overwrite' keyword is omitted, data files are appended to existing data sets.
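As a concrete sketch of the whole sequence (the table name pokes, its columns, and the file paths below are hypothetical, chosen only to illustrate the statements above):

-- A two-column table whose fields are separated by ctrl-a ('\001'), matching the flat-file format described above.
CREATE TABLE pokes (foo INT, bar STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001';

-- Replace any existing contents of the table with the rows from this local file.
LOAD DATA LOCAL INPATH '/tmp/kv1.txt' OVERWRITE INTO TABLE pokes;

-- Without OVERWRITE, the file is appended to the data already in the table.
LOAD DATA LOCAL INPATH '/tmp/kv2.txt' INTO TABLE pokes;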

Note
  1. NO verification of data against the schema is performed by the load command.
  2. If the file is in HDFS, it is moved into the Hive-controlled file system namespace. The root of the Hive warehouse directory is specified by the option 'hive.metastore.warehouse.dir' in hive-default.xml. We advise users to create this directory before trying to create tables via Hive.

Loading data into partitions of a table:

LOAD DATA LOCAL INPATH 'absolute_filename' OVERWRITE INTO TABLE table_name PARTITION (ds='2008-08-15');
LOAD DATA LOCAL INPATH 'absolute_filename' OVERWRITE INTO TABLE table_name PARTITION (ds='2008-08-08');

The two LOAD statements above load data into two different partitions of the same table. The table must have been created as partitioned by the key ds for this to succeed.
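For reference, a partitioned table that would accept the loads above could be declared as follows (the table name invites and its columns are hypothetical; only the ds partition key is taken from the statements above):

-- The partition column ds is declared in PARTITIONED BY and must not also appear in the regular column list.
CREATE TABLE invites (foo INT, bar STRING)
PARTITIONED BY (ds STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001';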

LOAD DATA INPATH 'hadoop_filename' OVERWRITE INTO TABLE table_name PARTITION (ds='2008-08-15');

The above command loads data from an HDFS file or directory into the table. Note that loading data from HDFS moves the file or directory rather than copying it; as a result, the operation is almost instantaneous.
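After loading, the populated partitions can be checked from the Hive shell. A minimal verification sketch, assuming the hypothetical invites table from above:

-- List all partitions that now exist for the table.
SHOW PARTITIONS invites;

-- Count the rows that landed in one specific partition.
SELECT COUNT(*) FROM invites WHERE ds = '2008-08-15';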

 
