HBase Basic Principles

HBase is an open-source, column-oriented, distributed storage system suited to storing massive amounts of unstructured or semi-structured data. It features high reliability, high performance, and flexible scalability, and supports real-time data reads and writes. For more information about HBase, see https://hbase.apache.org/.
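To make the column-oriented, semi-structured model concrete, the following is a minimal in-memory sketch in Python. It is not HBase code and does not reflect the on-disk format. The class ToyTable and the sample row keys and columns are hypothetical. It only assumes the general HBase data model, in which a cell is addressed by row key, column (family:qualifier), and timestamp, and in which a row stores only the columns that were actually written.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of a sparse, column-oriented table (illustration only, not HBase)."""

    def __init__(self):
        # row key -> "family:qualifier" -> list of (timestamp, value), newest first
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value, timestamp):
        """Store one cell version; older versions of the same cell are kept."""
        versions = self._rows[row_key].setdefault(column, [])
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)

    def get(self, row_key, column):
        """Return the newest value of a cell, or None if the cell was never written."""
        versions = self._rows.get(row_key, {}).get(column)
        return versions[0][1] if versions else None

    def columns(self, row_key):
        """List the columns that actually exist for a row."""
        return sorted(self._rows.get(row_key, {}))


table = ToyTable()
table.put("user#001", "info:name", "Alice", timestamp=1)
table.put("user#001", "info:email", "alice@example.com", timestamp=1)
table.put("user#002", "info:name", "Bob", timestamp=1)
table.put("user#002", "stats:logins", 7, timestamp=1)
table.put("user#001", "info:name", "Alice B.", timestamp=2)  # newer version of the same cell
```

In this sketch the two rows hold different sets of columns, and a column that was never written for a row takes up no space at all, which is what makes the model a good fit for semi-structured data.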

Typical features of a table stored in HBase are as follows:

The HBase component of MRS separates computing from storage. Data can be stored at low cost in cloud storage services such as Object Storage Service (OBS) and can be backed up across availability zones (AZs). MRS also supports secondary indexes for HBase: indexes can be created on column values so that data can be filtered by column through native HBase APIs.
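The idea behind a secondary index on column values can be sketched as follows. This is a conceptual illustration in Python, not the MRS implementation and not an HBase API. The functions build_index and full_scan and the sample rows are hypothetical. The sketch only shows why an index helps: a lookup by column value goes straight to the matching row keys instead of checking every row.

```python
from collections import defaultdict


def build_index(rows, column):
    """Map each value of `column` to the row keys (in sorted order) that hold it."""
    index = defaultdict(list)
    for row_key in sorted(rows):
        value = rows[row_key].get(column)
        if value is not None:
            index[value].append(row_key)
    return dict(index)


def full_scan(rows, column, value):
    """Baseline without an index: inspect every row and keep the matches."""
    return [row_key for row_key in sorted(rows) if rows[row_key].get(column) == value]


# Hypothetical sample data: row key -> {"family:qualifier": value}
rows = {
    "user#001": {"info:name": "Alice", "info:city": "Shenzhen"},
    "user#002": {"info:name": "Bob", "info:city": "Berlin"},
    "user#003": {"info:name": "Chen", "info:city": "Shenzhen"},
    "user#004": {"info:name": "Dana"},  # no city column stored for this row
}

city_index = build_index(rows, "info:city")
indexed_hits = city_index.get("Shenzhen", [])          # one dictionary lookup
scanned_hits = full_scan(rows, "info:city", "Shenzhen")  # touches all four rows
```

Both lookups return the same row keys. The trade-off, in this sketch as in any index, is that the index has to be kept up to date whenever the indexed column is written.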

HBase Architecture

An HBase cluster consists of active and standby HMaster processes and multiple RegionServer processes, as shown in Figure 1.

Figure 1 HBase architecture

Table 1 Module description

Module

Description

Master

Master is also called HMaster. In high availability (HA) mode, an active HMaster and a standby HMaster are deployed.

  • Active Master: manages the RegionServers in HBase. It handles table creation, deletion, modification, and query requests; balances load across RegionServers; adjusts Region distribution; splits Regions and assigns the resulting Regions; and migrates Regions away from a RegionServer that has expired.
  • Standby Master: takes over services when the active HMaster is faulty. After the fault is rectified, the original active HMaster becomes the standby HMaster.

Client

The client communicates with Master for management operations and with RegionServer for data read and write operations, using the Remote Procedure Call (RPC) mechanism of HBase.

RegionServer

RegionServer provides read and write services for table data and acts as the data processing and computing unit in HBase.

RegionServer is deployed on the same nodes as the DataNodes of the HDFS cluster to store data.

ZooKeeper cluster

ZooKeeper provides distributed coordination services for processes in HBase clusters. Each RegionServer is registered with ZooKeeper so that the active Master can obtain the health status of each RegionServer.

HDFS cluster

HDFS provides highly reliable file storage services for HBase. All HBase data is stored in HDFS.
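The interplay of the modules in Table 1 can be sketched with a small simulation. This is a toy model in Python, not HBase code. The class ToyCluster, its methods, the split keys, and the server names are hypothetical. A plain set stands in for the RegionServer registrations kept in ZooKeeper, and the reassignment rule (move each orphaned Region to the least-loaded live server) is an assumption chosen for simplicity, not the actual HMaster balancing policy.

```python
import bisect


class ToyCluster:
    """Toy model: Regions cover row-key ranges and are assigned to RegionServers."""

    def __init__(self, split_keys, servers):
        # Region i covers [start_keys[i], start_keys[i + 1]); the first Region starts at "".
        self.start_keys = [""] + sorted(split_keys)
        # Stand-in for the RegionServer registrations held in ZooKeeper.
        self.live_servers = set(servers)
        ordered = sorted(servers)
        # Initial round-robin assignment of Regions to RegionServers.
        self.assignment = {
            region: ordered[region % len(ordered)]
            for region in range(len(self.start_keys))
        }

    def locate(self, row_key):
        """Client side: find the Region for a row key and the server that hosts it."""
        region = bisect.bisect_right(self.start_keys, row_key) - 1
        return region, self.assignment[region]

    def expire(self, server):
        """Master side: a registration disappears, so move that server's Regions."""
        self.live_servers.discard(server)
        if not self.live_servers:
            raise RuntimeError("no live RegionServer left")
        for region, owner in sorted(self.assignment.items()):
            if owner != server:
                continue
            # Count Regions per live server, then pick the least-loaded one.
            load = {s: 0 for s in self.live_servers}
            for current_owner in self.assignment.values():
                if current_owner in load:
                    load[current_owner] += 1
            self.assignment[region] = min(sorted(load), key=load.get)


cluster = ToyCluster(split_keys=["g", "p"], servers=["rs1", "rs2", "rs3"])
before = cluster.locate("kiwi")   # Region 1 covers ["g", "p")
cluster.expire("rs2")             # rs2 stops being registered
after = cluster.locate("kiwi")    # same Region, now hosted elsewhere
```

The row key maps to the same Region before and after the failure; only the hosting RegionServer changes. In this sketch the client never sends data through the master, which mirrors the split described in Table 1 between management traffic and data traffic.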

HBase Principles