
Advanced Hive Concepts and Data File Partitioning Tutorial

Last updated on 29th Sep 2020, Blog, Tutorials

About author

Vishnu Vinoth (Big Data Engineer )

Vishnu Vinoth is a Big Data Engineer with 7+ years of experience in Big Data, Spark, PySpark, Flink, SparkML, NiFi, Hive, Python, and NoSQL DBs. His articles help to impart knowledge and skills in core fields and provide informative knowledge to students.


Introduction:

Hive is a data warehouse infrastructure tool that sits on top of Hadoop to summarize big data. It processes structured data and makes querying and analysis easier. Hive follows a "schema on read" approach: it doesn't verify data when it is loaded; verification happens only when a query is issued. This makes the initial load fast, because loading is like copying or moving a file without applying any constraints or checks. Hive was initially developed by Facebook. The Apache Software Foundation took it up later and developed it further.

Features of Hive Commands:

Here are a few of its features:

  • Hive stores raw and processed datasets in Hadoop.
  • It is designed for online analytical processing (OLAP) over large datasets, not for online transaction processing (OLTP), which serves high volumes of short transactions with very low latency.
  • It is fast, scalable and reliable.
  • The SQL-like query language provided here is called HiveQL or HQL. This makes ETL tasks and other analysis easier.

Hive Properties:

There are a few limitations of Hive as well, listed below:

  • Hive has limited subquery support: subqueries are allowed in the FROM clause and, since Hive 0.13, in WHERE clauses with IN and EXISTS, but not everywhere standard SQL allows them.
  • Hive supports overwriting data, but row-level updates and deletes are available only on transactional (ACID) tables, added in Hive 0.14.
  • Hive isn't designed for OLTP; it is used for OLAP.

To enter Hive's interactive shell:

  • $HIVE_HOME/bin/hive

Basic Hive Commands:

The basic commands are explained below:

1. Create: This creates a new database in Hive.

2. Drop: The DROP command removes a table from Hive.

3. Alter: The ALTER command helps you rename a table or its columns.

For example:

  • ALTER TABLE worker RENAME TO employee1;

4. Show: The SHOW command lists all the databases residing in Hive.

5. Describe: The DESCRIBE command gives you information about the schema of a table.
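The five basic commands above can be tried together in one short session. The following is a minimal sketch, assuming a working Hive shell; the names demo_db, emp_demo, and the column definitions are hypothetical and chosen only for illustration.

```sql
-- Hypothetical database and table names, used only for illustration.
CREATE DATABASE IF NOT EXISTS demo_db;   -- Create
SHOW DATABASES;                          -- Show: lists demo_db among the others
USE demo_db;

CREATE TABLE IF NOT EXISTS emp_demo (id INT, name STRING, salary DOUBLE);
DESCRIBE emp_demo;                       -- Describe: prints column names and types

-- Alter: rename the table, then rename one of its columns.
ALTER TABLE emp_demo RENAME TO emp_demo1;
ALTER TABLE emp_demo1 CHANGE salary monthly_salary DOUBLE;

-- Drop: remove the table, then the now-empty database.
DROP TABLE IF EXISTS emp_demo1;
DROP DATABASE IF EXISTS demo_db;
```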

Intermediate Hive Commands:

Hive divides a table into related partitions based on the values of chosen columns. Using these partitions, it gets easier to query the data. Partitions can be further divided into buckets so that queries run efficiently on the data. In other words, buckets distribute the data into a fixed number of groups by calculating the hash code of the bucketing key.
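As a rough illustration of how partitions and buckets are declared, here is a minimal sketch. The table name emp_part, the column types, the partition column yoj (year of joining), the bucket count of 4, and the ORC storage format are all assumptions made for the example.

```sql
-- Hypothetical table: one directory per yoj value, 4 bucket files per partition.
CREATE TABLE IF NOT EXISTS emp_part (
  id          INT,
  name        STRING,
  salary      DOUBLE,
  dept        STRING,
  designation STRING
)
PARTITIONED BY (yoj STRING)          -- partition column lives in the directory path
CLUSTERED BY (id) INTO 4 BUCKETS     -- rows are assigned to buckets by hashing id
STORED AS ORC;

-- Filtering on the partition column lets Hive read only the matching partition.
SELECT name, salary
FROM emp_part
WHERE yoj = '2012';
```

Note that the partition column is declared only in PARTITIONED BY, not in the regular column list.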

1. Adding Partition: Adding partitions can be accomplished by altering the table. Say you have a table "EMP", with fields like Id, Name, Salary, Dept, Designation, and Yoj.

  • ALTER TABLE worker
  • ADD PARTITION (year='2012')
  • LOCATION '/2012/part2012';

2. Renaming Partition:

  • ALTER TABLE worker PARTITION (year='1203')
  • RENAME TO PARTITION (Yoj='1203');

3. Drop Partition:

  • ALTER TABLE worker DROP [IF EXISTS] PARTITION (year='1203');

4. Relational Operators: Relational operators are a set of operators that help in fetching relevant information. Let's execute a Hive query that fetches the employees whose salary is greater than or equal to 40000.

  • SELECT * FROM EMP WHERE Salary>=40000;

5. Arithmetic Operators: These operators help in executing arithmetic operations on the operands and, in turn, always return number types.

For example, to add two numbers like 22 and 33:

  • SELECT 22+33 ADD FROM temp;

6. Logical Operators: These operators execute logical operations, which in turn always return True/False.

  • SELECT * FROM EMP WHERE Salary>40000 AND Dept='TP';
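The three operator families can appear in a single query. The sketch below assumes the EMP table has Name, Salary, and Dept columns as described earlier; the department value 'HR' and the column aliases are hypothetical.

```sql
SELECT Name,
       Salary,
       Salary * 12      AS annual_salary,  -- arithmetic: returns a number
       Salary >= 40000  AS is_high_paid    -- relational: returns a BOOLEAN
FROM EMP
-- logical: AND / OR combine the relational tests into one TRUE/FALSE filter
WHERE Salary > 30000
  AND (Dept = 'TP' OR Dept = 'HR');
```

String values such as 'TP' need quotes; an unquoted TP would be read as a column name.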

Advanced Hive Commands:


The advanced commands are explained below:

1. View: The view concept in Hive is similar to SQL. A view can be created at the time of executing a SELECT statement.

Example:

  • CREATE VIEW EMP_30000 AS
  • SELECT * FROM EMP
  • WHERE salary>30000;

2. Loading data into a table: Data is loaded with a statement of the form LOAD DATA LOCAL INPATH <file path> INTO TABLE States; here "States" is a table already created in Hive. Hive also has some built-in functions that help you shape your results, such as round and floor.
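A minimal sketch of loading a local file and applying built-in functions follows. The file path '/tmp/states.csv', the column layout of States, and the comma delimiter are assumptions for illustration; the source does not give them.

```sql
-- Assumed schema for the States table; adjust to match the real file.
CREATE TABLE IF NOT EXISTS States (
  name       STRING,
  area_sq_km DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- LOCAL reads from the client machine's filesystem rather than HDFS.
-- '/tmp/states.csv' is a placeholder path.
LOAD DATA LOCAL INPATH '/tmp/states.csv' INTO TABLE States;

-- Built-in functions applied to the loaded data.
SELECT name,
       round(area_sq_km) AS area_rounded,
       floor(area_sq_km) AS area_floor
FROM States;
```

Because Hive is schema on read, the LOAD statement succeeds even if the file does not match the schema; mismatched fields show up as NULL only when queried.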

3. Join: The JOIN clause helps in joining two tables based on a common column.

Example:

  • SELECT c.ID, c.NAME, c.AGE, o.AMOUNT
  • FROM CUSTOMERS c JOIN ORDERS o
  • ON (c.ID = o.CUSTOMER_ID);

Hive supports all the common kinds of joins: left outer join, right outer join, and full outer join.
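The outer join variants use the same CUSTOMERS and ORDERS tables as the example above. This is a short sketch of two of them, not an exhaustive list:

```sql
-- Left outer join: every customer appears, with NULL AMOUNT if there is no order.
SELECT c.ID, c.NAME, o.AMOUNT
FROM CUSTOMERS c
LEFT OUTER JOIN ORDERS o
  ON (c.ID = o.CUSTOMER_ID);

-- Full outer join: unmatched rows from both sides are kept.
SELECT c.ID, c.NAME, o.AMOUNT
FROM CUSTOMERS c
FULL OUTER JOIN ORDERS o
  ON (c.ID = o.CUSTOMER_ID);
```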

Tips and Tricks:

Hive makes data processing so simple, straightforward and extensible that users pay less attention to optimizing Hive queries. However, paying attention to a few things while writing Hive queries can bring great success in managing the workload and saving money. Below are a few tips:

1. Partitions & Buckets: Hive is a big data tool that can query large datasets. However, writing queries without understanding the domain can create a large number of partitions in Hive. If the user is aware of the dataset, then relevant and frequently used columns can be grouped into the same partition. This helps queries run faster and more efficiently. Ultimately, the number of mappers and I/O operations will also be reduced.

2. Parallel Execution: Hive runs a query in multiple stages. In some cases these stages depend on other stages and cannot start until the previous stage is completed. However, independent stages can run in parallel to save overall run time. To enable parallel execution in Hive:

  • set hive.exec.parallel=true;

This improves cluster utilization.
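As a sketch of where parallel execution can help, the query below contains two aggregations that do not depend on each other, so Hive may schedule their stages at the same time. The EMP table and Salary column come from the earlier examples; the thread count of 8 and the band labels are illustrative.

```sql
SET hive.exec.parallel=true;
-- Upper limit on how many stages may run at once; 8 is an illustrative value.
SET hive.exec.parallel.thread.number=8;

-- The two branches of the UNION ALL are independent of each other.
SELECT u.band, u.n
FROM (
  SELECT 'high' AS band, count(*) AS n FROM EMP WHERE Salary > 40000
  UNION ALL
  SELECT 'low'  AS band, count(*) AS n FROM EMP WHERE Salary <= 40000
) u;
```

Whether the stages actually overlap depends on the execution engine and on free cluster capacity.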

3. Block Sampling: Sampling data from a table allows exploratory queries on the data.

What will you learn from this Hive tutorial?

This Hadoop Hive tutorial shows how to use various Hive commands in HQL to perform operations such as creating a table in Hive, deleting a table in Hive, altering a table in Hive, etc.

Prerequisites to follow this Hive Tutorial:

  • Hive installation must be completed successfully.
  • Basic knowledge of SQL is required to follow this Hadoop Hive tutorial.

Learn the Basics of Hive Hadoop:

Hive makes data processing on Hadoop easier by providing a database query interface to Hadoop. Hive is a friendlier data warehouse tool for users from an ETL or database background who are accustomed to using SQL for querying data.


DDL Commands in Hive:

SQL users may already be familiar with DDL commands, but for readers who are new to SQL, DDL refers to Data Definition Language. DDL commands are the statements responsible for defining and changing the structure of a database or table in Hive.

DDL Commands in Hive:

    DDL Command    Used With
    CREATE         Database, Table
    DROP           Database, Table
    TRUNCATE       Table
    ALTER          Database, Table
    SHOW           Databases, Tables, Table Properties, Partitions, Functions, Index
    DESCRIBE       Database, Table, View

Let's look at the usage of the top Hive commands in HQL on both databases and tables.

DDL Commands on Databases in Hive:

Create Database in Hive:

As the name implies, this DDL command in Hive is used for creating databases.

  • CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
  • [COMMENT database_comment]
  • [LOCATION hdfs_path]
  • [WITH DBPROPERTIES (property_name=property_value, ...)];

In the above syntax for the create database command, the values in square brackets [] are optional.

Usage of Create Database Command in Hive:

  • hive> create database if not exists firstDB comment "This is my first demo" location '/user/hive/warehouse/newdb' with DBPROPERTIES ('created by'='abhay','created for'='dezyre');
  • OK
  • Time taken: 0.092 seconds

Drop Database in Hive:

This command is used for deleting an already created database in Hive, and the syntax is as follows:

  • DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];

Usage of Drop Database Command in Hive:

  • hive> drop database if exists firstDB CASCADE;
  • OK
  • Time taken: 0.099 seconds

In Hadoop Hive, the mode is set to RESTRICT by default, and users cannot delete a database unless it is empty. To delete a database in Hive along with its existing tables, users must change the mode from RESTRICT to CASCADE.

In the syntax for the drop database Hive command, the "if exists" clause is used to avoid any errors that may occur if the programmer tries to delete a database that doesn't exist.

Describe Database Command in Hive:

This command is used to check any associated metadata for the databases.
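A minimal sketch of the describe database command, assuming a database named firstDB exists as in the create example:

```sql
-- Shows the database name, comment, HDFS location, and owner.
DESCRIBE DATABASE firstDB;

-- EXTENDED additionally shows the DBPROPERTIES key/value pairs.
DESCRIBE DATABASE EXTENDED firstDB;
```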

Alter Database Command in Hive:

Whenever developers need to change the metadata of any of the databases, the ALTER Hive DDL command can be used as follows:

  • ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...);

Usage of ALTER Database Command in Hive:

Let's use the ALTER command to modify the OWNER property and specify the role for the owner:

  • ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role;

Show Databases Command in Hive:

Programmers can view the list of existing databases.

Usage of Show Databases Command:

  • SHOW DATABASES;
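The alter and show commands can be sketched with concrete values as follows. The database firstDB and the user abhay come from the create example; the property key 'edited by', the role name admin, and the 'first*' pattern are assumptions for illustration.

```sql
-- Add or overwrite a database property (hypothetical key).
ALTER DATABASE firstDB SET DBPROPERTIES ('edited by' = 'abhay');

-- Hand ownership to a role; assumes a role named admin exists.
ALTER DATABASE firstDB SET OWNER ROLE admin;

-- List every database, then only those whose names start with "first".
SHOW DATABASES;
SHOW DATABASES LIKE 'first*';
```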

Use Database Command in Hive:

This Hive command is used to select a specific database for the session in which Hive queries will be executed.

Usage of Use Database Command in Hive:

  • USE database_name;

DDL Commands on Tables in Hive:

Create Table Command in Hive:

The Hive create table command is used to create a table in the existing database that is in use for a particular session.

  • CREATE TABLE [IF NOT EXISTS] [db_name.]table_name
  • [(col_name data_type [COMMENT col_comment], ...)]
  • [COMMENT table_comment]
  • [LOCATION hdfs_path]

Hive Create Table Usage:

In the above step, we have created a Hive table named Students in the database college with various fields like ID, Name, fee, city, etc. Comments have been added for each column so that anybody referring to the table gets an overview of what the columns mean. The LOCATION keyword is used for specifying where the table should be stored on HDFS.
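The original statement is not reproduced here, so the following is only a sketch of what such a create table command might look like. The database name college, the column types, the comment texts, and the HDFS path are assumptions; only the table name Students and the field names ID, Name, fee, and city come from the description above.

```sql
-- Select the database for this session (assumed name).
USE college;

CREATE TABLE IF NOT EXISTS Students (
  ID   INT    COMMENT 'Unique student identifier',
  Name STRING COMMENT 'Full name of the student',
  fee  DOUBLE COMMENT 'Fee paid by the student',
  city STRING COMMENT 'City the student lives in'
)
COMMENT 'Student records'
LOCATION '/user/hive/warehouse/college.db/students';  -- placeholder HDFS path

-- Column comments appear next to each column in the output.
DESCRIBE Students;
```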

How to create a table in Hive by copying an existing table schema:

Hive lets programmers create a new table by replicating the schema of an existing table, but remember that only the schema is replicated, not the data. When creating the new table, the location parameter can be specified.

  • CREATE TABLE [IF NOT EXISTS] [db_name.]table_name LIKE [db_name.]existing_table [LOCATION hdfs_path]
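A minimal sketch of copying a schema with LIKE follows. The names college.Students and college.students_copy and the HDFS path are hypothetical.

```sql
-- Copies column names, types, and comments; no rows are copied.
CREATE TABLE IF NOT EXISTS college.students_copy
LIKE college.Students
LOCATION '/user/hive/warehouse/college.db/students_copy';  -- placeholder path

-- Expected to return 0 as long as the target location holds no files yet.
SELECT count(*) FROM college.students_copy;
```

To copy the data as well, a separate INSERT ... SELECT from the original table would be needed.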

DROP Table Command in Hive:

Drops the table and all the data associated with it in the Hive metastore.

  • DROP TABLE [IF EXISTS] table_name [PURGE];

Usage of DROP Table Command in Hive:

The DROP TABLE command removes the metadata and data for a particular table. Data is usually moved to the .Trash/Current directory if Trash is configured. If the PURGE option is specified, the table data will not go to the trash directory, and there will be no way to retrieve the data if a DROP command is executed by mistake.

TRUNCATE Table Command in Hive:
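A short sketch contrasting TRUNCATE and DROP, with and without PURGE. The table names students_copy and students_old are hypothetical, and the tables are assumed to be managed (not external) tables.

```sql
-- TRUNCATE removes all rows but keeps the table definition.
TRUNCATE TABLE students_copy;

-- DROP removes metadata and data; data goes to .Trash/Current if Trash is configured.
DROP TABLE IF EXISTS students_copy;

-- With PURGE the data skips the trash and cannot be recovered afterwards.
DROP TABLE IF EXISTS students_old PURGE;
```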
