Introduction to HBase and Its Architecture | A Complete Guide For Beginners
Last updated on 02nd Nov 2022, Artciles, Blog
- In this article you will learn:
- 1.Introduction to HBase.
- 2.HBase History.
- 3.What is HBase?
- 4.Why HBase?
- 5.HBase Architecture.
- 6.Characteristics of HBase.
- 7.Applications of HBase .
- 8.Features of HBase .
- 9.Conclusion.
Introduction to HBase:
A few decades ago, an internet wasn’t available, that is also when a data generated was much lesser and also was structured in a nature.Structured data means data that has a definite structure and which has to standard order.This data was stored in a Relational Database (RDBMS) without any hassle. With an evolution of the internet, large volumes of structured and semi-structured data started getting generated.Semi-structured data includes the emails, JSON, XML, and .csv files to name a few.Loads of a semi-structured data was created across globe. As a result, storing and processing this data became a main challenge.
HBase History:
- Back in a November 2006, Google released a paper on BigTable.
- Then in February 2007, HBase prototype was created as Hadoop contribution.
- In October 2007, first usable HBase along with the Hadoop 0.15.0 was released, and a HBase became a subproject of Hadoop in January 2008. HBase 0.81.1, 0.19.0 and 0.20.0 were released between Oct 2008 and Sep 2009.
- Finally, in May 2010, HBase became an Apache top-level project.
What is a HBase?
- HBase is modeled after a Google’s Bigtable, which is the distributed storage system for structured data.
- Just as a Bigtable leverages the distributed data storage provided by a Google File System, Apache HBase provides a Bigtable-like capabilities on top of a Hadoop and HDFS.
- Some of companies that use a HBase as their core program are be Facebook, Netflix, Yahoo, Adobe, and Twitter.
- The goal of HBase is to host large tables with the billions of rows and millions of columns on a top of clusters of commodity hardware.
Why HBase?
- It can save huge amounts of data in the tabular format for extremely fast reads and writes.
- HBase is mostly used in the scenario that requires regular, consistent insertion and overwriting of a data.
- However, it performs the only batch processing where the data is accessed in the sequential manner.
- This means one has to search an entire dataset for even simplest of jobs.
- Hence, a solution was a required to access, read, or write data any time regardless of its sequence in a clusters of data.
HBase Real Life Connect – Example:
May be aware that Facebook has introduced the new Social Inbox integrating email, IM, SMS, text messages, and on-site Facebook messages. They need to save over a 135 billion messages a month.Facebook chose a HBase because it needed the system that could handle the two types of data patterns:
- An ever-growing dataset that is be rarely accessed.
- An ever-growing dataset that is highly volatile read what’s in a Inbox, and then rarely look at it again.
HBase Architecture:
- The Apache Zookeeper monitors a system, and the HBase Master assigns regions and also load balancing.
- The Region server serves a data to read and write. The Region Server is all various computers in a Hadoop cluster.
- It consists of a Region, HLog, Store, MemoryStore, and different files. All this is a part of HDFS storage system.
Characteristics of HBase:
- HBase is the type of NoSQL database and is classified as key-value store.
- Value is identified with the key.
- Both key and values are be Byte Array, which means binary formats can be saved easily.
- Values are saved in key-orders.
- Values can be quickly accessed by keys.
- HBase is the database in which tables have no schema; column families and not columns are explained at the time of table creation.
Applications of HBase :
There are number of HBase applications across the different industries, from healthcare to e-commerce to the sports sector. For instance:
- In healthcare sector, HBase is used for the storing genome sequences and disease history of a people or a particular area.
- In a field of e-commerce, HBase is used for saving logs about customer search history and it also performs the analytics and target advertisement for the better business insights.
- In sports, HBase is used to save match details and the history of every match. It uses this data for a better prediction.
Features of HBase :
Scalable: HBase allows the data to be scaled across different nodes as it is stored in HDFS.
Automatic failure support: Write a ahead Log across clusters are present that offers an automatic support against failure.
Consistent read and write: HBase offers consistent read and write of data.
JAVA API for client access: HBase offers easy to use a JAVA API for clients.
Block cache and Bloom filters: It supports the block cache and bloom filters for more volume query optimization.
Conclusion:
HBase style components:
- A HMaster, HRegionServer, HRegions, ZooKeeper, HDFS.
- HMaster in HBase is a first server implementation for a HBase vogue.
- Regions are the needed building blocks of HBase cluster that has distribution tables and is built by a Column families.
- HDFS offera high level of error tolerance and uses a foremost reasonable hardware.
- HBase information Model may be a collection of the elements that embrace Tables, Rows, Column Families, Cells, Columns, and Versions.
- The column and end-line endings disagree in final path.
Are you looking training with Right Jobs?
Contact Us- Hadoop Tutorial
- Hadoop Interview Questions and Answers
- How to Become a Hadoop Developer?
- Hadoop Architecture Tutorial
- What Are the Skills Needed to Learn Hadoop?
Related Articles
Popular Courses
- Hadoop Developer Training
11025 Learners
- Apache Spark With Scala Training
12022 Learners
- Apache Storm Training
11141 Learners
- What is Dimension Reduction? | Know the techniques
- Difference between Data Lake vs Data Warehouse: A Complete Guide For Beginners with Best Practices
- What is Dimension Reduction? | Know the techniques
- What does the Yield keyword do and How to use Yield in python ? [ OverView ]
- Agile Sprint Planning | Everything You Need to Know