Install Hive with MySQL metadata store
Successfully installed Hive 0.11 on a 3-node Hadoop cluster with a standalone MySQL metadata store.
1. Download a stable release hive-x.y.z.tar.gz
2. Unpack the gz file and move the hive folder to /usr/local.
tar -xzf hive-x.y.z.tar.gz
sudo mv hive-x.y.z /usr/local/hive
sudo chown -R hduser:hadoop /usr/local/hive
3. Set HIVE_HOME in .bashrc. Make sure HADOOP_HOME has been set as well.
export HIVE_HOME=/usr/local/hive
export PATH=$HIVE_HOME/bin:$PATH
4. Go to /usr/local/hive/conf. If there is no hive-default.xml or hive-env.sh, create them from the templates:
cp hive-default.xml.template hive-default.xml
cp hive-env.sh.template hive-env.sh
Note: hive-default.xml is for documentation purposes only. You must make your configuration
changes in hive-site.xml instead.
5. Configure hive-env.sh
export HADOOP_HOME=/usr/local/hadoop
export HIVE_CONF_DIR=/usr/local/hive/conf
6. Create the /tmp and warehouse folders on HDFS. Make sure the Hadoop instance has been started.
$HADOOP_HOME/bin/hadoop dfs -mkdir /tmp
$HADOOP_HOME/bin/hadoop dfs -mkdir /user/hive/warehouse
$HADOOP_HOME/bin/hadoop dfs -chmod g+w /tmp
$HADOOP_HOME/bin/hadoop dfs -chmod g+w /user/hive/warehouse
The default warehouse location (hive.metastore.warehouse.dir) is /user/hive/warehouse.
7. Create hive-site.xml in /usr/local/hive/conf and set the mapred.job.tracker property. This should match the mapred.job.tracker value (the JobTracker's host and RPC port) in your Hadoop mapred-site.xml.
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master:54311</value> <!-- use the JobTracker host:port from mapred-site.xml; 50030 is the JobTracker web UI port, not the RPC port -->
</property>
</configuration>
8. At this point, Hive is ready to run:
/usr/local/hive/bin/hive
9. Test HiveQL. By default Hive delimits the fields of a row with Ctrl-A; here we use "," instead.
>create table test (a int, b string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
>show tables;
>describe test;
>load data local inpath '/usr/local/hive/test.txt' overwrite into table test;
>select * from test;
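The load statement above expects a comma-delimited text file. A minimal sample, assuming made-up rows that match the two-column schema (a int, b string) — place the file at the path used in the load statement:

```shell
# Sample comma-delimited data for the 'test' table (a int, b string).
# The rows are made-up values for illustration.
printf '1,apple\n2,banana\n3,cherry\n' > test.txt

# Show the file contents
cat test.txt
```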
10. Have a look at HDFS:
$HADOOP_HOME/bin/hadoop dfs -ls /user/hive/warehouse/test
Hive creates a new folder 'test' under 'warehouse' to store the raw txt file, and maps the
table schema (metadata) onto the raw file so users can run SQL-like queries over it.
11. With the default settings, Hive uses an embedded Derby database instance to keep
the metadata. Derby runs in the same JVM and supports only one session at a time.
To support multiple sessions (users), configure a standalone database (MySQL) for the metadata.
11.1 Download and install MySQL.
11.2 Download the MySQL JDBC driver mysql-connector-java-x.y.z-bin.jar and put it in /usr/local/hive/lib/.
11.3 Add the following properties to hive-site.xml, specifying the MySQL URL, driver class, username,
and password. Don't use the root account, as by default it cannot access the database remotely.
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.10.10/metastore_db?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
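The hive/hive credentials above assume a dedicated MySQL account already exists. A sketch of creating it from the MySQL console — the '%' host wildcard and the password are placeholder choices, not values from this setup:

```
>create user 'hive'@'%' identified by 'hive';
>grant all privileges on metastore_db.* to 'hive'@'%';
>flush privileges;
```

Restricting '%' to the Hive host (e.g. 'hive'@'192.168.10.x') is tighter if you know where connections will come from.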
12. From the MySQL console, check that metastore_db has been created and modify the character set.
>show databases;
>alter database metastore_db character set latin1;
Sometimes the default character set can cause an exception:
Error in metadata: MetaException(message:Got exception:
org.apache.hadoop.hive.metastore.api.MetaException javax.jdo.JDODataStoreException:
An exception was thrown while adding/validating class(es) :
Specified key was too long; max key length is 767 bytes
13. Test create table, load, select, etc. from Hive.