:original_name: mrs_01_0442.html
.. _mrs_01_0442:
Using Hive from Scratch
=======================
Hive is a data warehouse framework built on Hadoop. It maps structured data files to a database table and provides SQL-like functions to analyze and process data. It also allows you to quickly perform simple MapReduce statistics using SQL-like statements without the need of developing a specific MapReduce application. It is suitable for statistical analysis of data warehouses.
Background
----------
Suppose a user develops an application to manage users who use service A in an enterprise. The procedure of operating service A on the Hive client is as follows:
**Operations on common tables**:
- Create the **user_info** table.
- Add users' educational backgrounds and professional titles to the table.
- Query user names and addresses by user ID.
- Delete the user information table after service A ends.
.. _mrs_01_0442__en-us_topic_0000001219230755_en-us_topic_0037446806_table27353390:
.. table:: **Table 1** User information
=========== ==== ====== === =======
ID Name Gender Age Address
=========== ==== ====== === =======
12005000201 A Male 19 City A
12005000202 B Female 23 City B
12005000203 C Male 26 City C
12005000204 D Male 18 City D
12005000205 E Female 21 City E
12005000206 F Male 32 City F
12005000207 G Female 29 City G
12005000208 H Female 30 City H
12005000209 I Male 26 City I
12005000210 J Female 25 City J
=========== ==== ====== === =======
Procedure
---------
#. .. _mrs_01_0442__en-us_topic_0000001219230755_l6b58a848ef0f4fe6a361d4ef0ac39fb8:
Download the client configuration file.
a. Log in to FusionInsight Manager. For details, see :ref:`Accessing FusionInsight Manager `.
b. Choose **Cluster** > *Name of the desired cluster* > **Dashboard** > **More** > **Download Client**.
c. Download the cluster client.
d. Set **Select Client Type** to **Configuration Files Only** , select a platform type, and click **OK** to generate the client configuration file which is then saved in the **/tmp/FusionInsight-Client/** directory on the active management node by default.
#. Log in to the active management node of Manager.
a. Log in to any node where Manager is deployed as user **root**.
b. Run the following command to identify the active and standby nodes:
**sh ${BIGDATA_HOME}/om-server/om/sbin/status-oms.sh**
In the command output, the value of **HAActive** for the active management node is **active**, and that for the standby management node is **standby**. In the following example, **node-master1** is the active management node, and **node-master2** is the standby management node.
.. code-block::
HAMode
double
NodeName HostName HAVersion StartTime HAActive HAAllResOK HARunPhase
192-168-0-30 node-master1 V100R001C01 2020-05-01 23:43:02 active normal Actived
192-168-0-24 node-master2 V100R001C01 2020-05-01 07:14:02 standby normal Deactived
c. Log in to the primary management node as user **root** and run the following command to switch to user **omm**:
**sudo su - omm**
#. Run the following command to go to the client installation directory:
**cd /opt/client**
The cluster client has been installed in advance. The following client installation directory is used as an example. Change it based on the site requirements.
#. .. _mrs_01_0442__en-us_topic_0000001219230755_li15639738131312:
Run the following command to update the client configuration for the active management node.
**sh refreshConfig.sh /opt/client** *Full path of the client configuration file package*
For example, run the following command:
**sh refreshConfig.sh /opt/client** **/tmp/FusionInsight-Client/FusionInsight_Cluster_1_Services_Client.tar**
If the following information is displayed, the configurations have been updated successfully.
.. code-block::
ReFresh components client config is complete.
Succeed to refresh components client config.
.. note::
You can refer to Method 2 in `Updating a Client `__ to perform operations in steps :ref:`1 ` to :ref:`4 `.
#. Use the client on a Master node.
a. On the active management node, for example, **192-168-0-30**, run the following command to switch to the client directory, for example, **/opt/client**.
**cd /opt/client**
b. Run the following command to configure environment variables:
**source bigdata_env**
c. If Kerberos authentication is enabled for the current cluster, run the following command to authenticate the current user:
**kinit** *MRS cluster user*
Example: user **kinit hiveuser**
The current user must have the permission to create Hive tables. To create a role with the permission, refer to `Creating a Role `__. To bind the role to the current user, refer to `Creating a User `__. If Kerberos authentication is disabled, skip this step.
d. Run the client command of the Hive component directly.
**beeline**
#. Run the Hive client command to implement service A.
**Operations on internal tables**:
a. Create the **user_info** user information table according to :ref:`Table 1 ` and add data to it.
**create table user_info(id string,name string,gender string,age int,addr string);**
**insert into table user_info(id,name,gender,age,addr) values("12005000201","A","Male",19,"City A");**
b. Add users' educational backgrounds and professional titles to the **user_info** table.
For example, to add educational background and title information about user 12005000201, run the following command:
**alter table user_info add columns(education string,technical string);**
c. Query user names and addresses by user ID.
For example, to query the name and address of user 12005000201, run the following command:
**select name,addr from user_info where id='12005000201';**
d. Delete the user information table.
**drop table user_info;**
**Operations on external partition tables**:
Create an external partition table and import data.
a. Create a path for storing external table data.
**hdfs dfs -mkdir /hive/user_info**
b. Create a table.
**create external table user_info(id string,name string,gender string,age int,addr string) partitioned by(year string) row format delimited fields terminated by ' ' lines terminated by '\\n' stored as textfile location '/hive/user_info';**
.. note::
**fields terminated** indicates delimiters, for example, spaces.
**lines terminated** indicates line breaks, for example, **\\n**.
**/hive/user_info** indicates the path of the data file.
c. Import data.
#. Execute the insert statement to insert data.
**insert into user_info partition(year="2018") values ("12005000201","A","Male",19,"City A");**
#. Run the **load data** command to import file data.
#. Create a file based on the data in :ref:`Table 1 `. For example, the file name is **txt.log**. Fields are separated by space, and the line feed characters are used as the line breaks.
#. Upload the file to HDFS.
**hdfs dfs -put txt.log /tmp**
#. Load data to the table.
**load data inpath '/tmp/txt.log' into table user_info partition (year='2011');**
d. Query the imported data.
**select \* from user_info;**
e. Delete the user information table.
**drop table user_info;**