Elasticsearch is used as a supplement to relational databases, such as MySQL and GaussDB(for MySQL), to improve the full-text search and high-concurrency ad hoc query capabilities of the databases.
This chapter describes how to synchronize data from a MySQL database to CSS to accelerate full-text search and ad hoc query and analysis. The following figure shows the solution process.
CREATE TABLE `student` (
  `dsc` varchar(100) COLLATE utf8mb4_general_ci DEFAULT NULL,
  `age` smallint unsigned DEFAULT NULL,
  `name` varchar(32) COLLATE utf8mb4_general_ci NOT NULL,
  `id` int unsigned NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
INSERT INTO student (id,name,age,dsc)
VALUES
  ('1','Jack Ma Yun','50','Jack Ma Yun is a Chinese business magnate, investor and philanthropist.'),
  ('2','will smith','22','also known by his stage name the Fresh Prince, is an American actor, rapper, and producer.'),
  ('3','James Francis Cameron','68','the director of avatar');
PUT student
{
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 3
  },
  "mappings": {
    "properties": {
      "id": { "type": "keyword" },
      "name": { "type": "text" },
      "age": { "type": "short" },
      "dsc": { "type": "text" }
    }
  }
}
Configure number_of_shards and number_of_replicas as needed.
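To keep the index fields aligned with the MySQL columns, the mapping can be derived from the column list. The following is a minimal Python sketch under stated assumptions: the type table MYSQL_TO_ES and the helper build_index_body are illustrative names that are not part of CSS or DRS, and the chosen field types (for example, text for name) should be adjusted to how each field is queried.

```python
import json

# Hypothetical mapping from MySQL column types to Elasticsearch field types.
MYSQL_TO_ES = {
    "int unsigned": "keyword",    # primary key, used for exact lookups only
    "smallint unsigned": "short", # fits ages; values above 32767 would need "integer"
    "varchar": "text",            # analyzed for full-text search
}

def build_index_body(columns, shards=3, replicas=0):
    """Build the body of a `PUT <index>` request from (name, mysql_type) pairs."""
    properties = {}
    for name, mysql_type in columns:
        # Drop the length suffix, for example "varchar(32)" -> "varchar".
        base_type = mysql_type.split("(")[0].strip().lower()
        properties[name] = {"type": MYSQL_TO_ES[base_type]}
    return {
        "settings": {"number_of_shards": shards, "number_of_replicas": replicas},
        "mappings": {"properties": properties},
    }

# Columns of the student table created in MySQL.
student_columns = [
    ("id", "int unsigned"),
    ("name", "varchar(32)"),
    ("age", "smallint unsigned"),
    ("dsc", "varchar(100)"),
]
body = build_index_body(student_columns)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after PUT student in Kibana. Generating the field names from the table definition avoids mismatches between column names and index field names.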
| Module | Parameter | Suggestion |
|---|---|---|
| Create Synchronization Instance > Synchronize Instance Details | Network Type | Select VPC. |
| | Source DB Instance | Select the RDS for MySQL instance to be synchronized, that is, the MySQL database that stores service data. |
| | Synchronization Instance Subnet | Select the subnet where the synchronization instance is located. You are advised to select the subnet where the database instance and the CSS cluster are located. |
| Configure Source and Destination Databases > Destination Database | VPC and Subnet | Select the VPC and subnet of the CSS cluster. |
| | IP Address or Domain Name | Enter the IP address of the CSS cluster. For details, see Obtaining the IP address of a CSS cluster. |
| | Database Username and Database Password | Enter the administrator username (admin) and password of the CSS cluster. |
| | Encryption Certificate | Select the security certificate of the CSS cluster. If SSL Connection is not enabled, you do not need to select any certificate. For details, see Obtaining the security certificate of a CSS cluster. |
| Set Synchronization Task | Flow Control | Select No. |
| | Synchronization Object Type | Deselect Table structure, because the indexes matching MySQL tables have been created in the CSS cluster. |
| | Synchronization Object | Select Tables. Select the database and table name corresponding to CSS. NOTE: Ensure the type name in the configuration item is the same as the index name, that is, _doc. |
| Process Data | - | Click Next. |
After the synchronization task is started, wait until the Status of the task changes from Full synchronization to Incremental, indicating real-time synchronization has started.
Run the following command in Kibana of CSS to check whether full data has been synchronized to CSS:
GET student/_search
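One way to confirm the full synchronization is to compare the number of rows in the MySQL table with the hit count reported by the search response. The sketch below is a minimal Python example; hit_count and full_sync_complete are hypothetical helper names, and sample_response is an illustrative, trimmed response rather than real cluster output.

```python
def hit_count(search_response):
    """Return the number of matching documents reported by a _search response."""
    total = search_response["hits"]["total"]
    # Elasticsearch 7.x returns an object such as {"value": 3, "relation": "eq"};
    # earlier versions return a plain integer.
    return total["value"] if isinstance(total, dict) else total

def full_sync_complete(mysql_row_count, search_response):
    """True when the index holds as many documents as the MySQL table has rows."""
    return hit_count(search_response) == mysql_row_count

# Illustrative response, trimmed to the fields used here.
sample_response = {"hits": {"total": {"value": 3, "relation": "eq"}, "hits": []}}
print(full_sync_complete(3, sample_response))
```

The row count on the MySQL side can be obtained with SELECT COUNT(*) FROM student.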
INSERT INTO student (id,name,age,dsc) VALUES ('4','Bill Gates','50','Gates III is an American business magnate, software developer, investor, author, and philanthropist.');
Run the following command in Kibana of CSS to check whether new data is synchronized to CSS:
GET student/_search
UPDATE student SET age='55' WHERE id=4;
Run the following command in Kibana of CSS to check whether the data is updated in CSS:
GET student/_search
After deleting a record from the MySQL table, run the following command in Kibana of CSS to check whether the record is also deleted from CSS:
GET student/_search
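The insert, update, and delete checks above all come down to looking up one record in the search response. The following Python sketch shows one way to do that; find_student is a hypothetical helper, the two sample responses are illustrative rather than real cluster output, and the sketch assumes the synchronized documents carry the MySQL id column in _source.

```python
def find_student(search_response, student_id):
    """Return the _source of the document whose id field matches, or None."""
    for hit in search_response["hits"]["hits"]:
        source = hit["_source"]
        # Compare as strings because the id may arrive as a number or a string.
        if str(source.get("id")) == str(student_id):
            return source
    return None

# Illustrative response after the UPDATE: record 4 is present with age 55.
after_update = {"hits": {"hits": [
    {"_source": {"id": 3, "name": "James Francis Cameron", "age": 68}},
    {"_source": {"id": 4, "name": "Bill Gates", "age": 55}},
]}}

# Illustrative response after a delete: record 4 is no longer returned.
after_delete = {"hits": {"hits": [
    {"_source": {"id": 3, "name": "James Francis Cameron", "age": 68}},
]}}

print(find_student(after_update, 4))
print(find_student(after_delete, 4))
```

By default a search returns only the first 10 hits, so for larger tables it is more reliable to query the record directly, for example with a term query on id.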
For example, run the following command in CSS to query the records whose dsc field contains avatar:
GET student/_search
{
  "query": {
    "match": {
      "dsc": "avatar"
    }
  }
}
For example, run the following command in CSS to query philanthropists who are aged 40 or older:
GET student/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "dsc": "philanthropist" } },
        { "range": { "age": { "gte": 40 } } }
      ]
    }
  }
}
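When such combined queries are issued from an application rather than from Kibana, the request body can be assembled programmatically. The following is a minimal Python sketch; text_with_min_age is a hypothetical helper name and only builds the JSON body without sending it.

```python
import json

def text_with_min_age(field, text, min_age):
    """Build a bool query: full-text match on `field` AND age >= min_age."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {field: text}},
                    {"range": {"age": {"gte": min_age}}},
                ]
            }
        }
    }

body = text_with_min_age("dsc", "philanthropist", 40)
print(json.dumps(body, indent=2))
```

Replacing gte with gt excludes records whose age is exactly 40. Moving the range clause from must to filter keeps the same result set while leaving the age condition out of relevance scoring.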
For example, use CSS to collect statistics on the age distribution of all users.
GET student/_search
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "age_count": {
      "terms": {
        "field": "age",
        "size": 10
      }
    }
  }
}
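The terms aggregation returns one bucket per distinct age, each with a document count. The Python sketch below turns such a response into a simple age-to-count dictionary; age_distribution is a hypothetical helper and sample_response is an illustrative response, not real cluster output.

```python
def age_distribution(search_response, agg_name="age_count"):
    """Map each age bucket key to its document count."""
    buckets = search_response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}

# Illustrative response, trimmed to the aggregation part.
sample_response = {
    "aggregations": {
        "age_count": {
            "buckets": [
                {"key": 50, "doc_count": 2},
                {"key": 22, "doc_count": 1},
                {"key": 68, "doc_count": 1},
            ]
        }
    }
}
print(age_distribution(sample_response))
```

Because the request sets "size": 10 inside terms, at most the 10 most frequent ages are returned; raise that value if more distinct ages are expected.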
If the cluster has only one node, the IP address and port number of only one node are displayed, for example, 10.62.179.32:9200. If the cluster has multiple nodes, the IP addresses and port numbers of all nodes are displayed, for example, 10.62.179.32:9200,10.62.179.33:9200.
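A client application usually needs the displayed address list as separate endpoints. The following Python sketch splits it into URLs; parse_endpoints is a hypothetical helper, and the http scheme is an assumption (use https if HTTPS access is enabled for the cluster).

```python
def parse_endpoints(address_list, scheme="http"):
    """Split 'ip:port,ip:port' into a list of endpoint URLs."""
    urls = []
    for item in address_list.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas or stray spaces
        host, _, port = item.rpartition(":")
        # int() rejects a missing or non-numeric port early.
        urls.append(f"{scheme}://{host}:{int(port)}")
    return urls

print(parse_endpoints("10.62.179.32:9200,10.62.179.33:9200"))
```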