1. Synchronization principle

Synchronization is based on subscribing to MySQL's binlog, the log MySQL uses to record data changes in real time.

There are two kinds of MySQL-to-ES synchronization: full synchronization and incremental synchronization. Full synchronization means that, after the ES index is created for the first time, all existing data in MySQL is imported into ES at once. Incremental synchronization means that changes subsequently made in MySQL (newly inserted rows, updates to existing rows, and deleted rows) must also be propagated to ES.

2. logstash-input-jdbc

This is the official logstash plugin and is bundled with logstash. Just download logstash and synchronize MySQL and Elasticsearch data through a configuration file.

Advantages:
Disadvantages:
3. go-mysql-elasticsearch

go-mysql-elasticsearch is a tool developed by a Chinese developer (siddontang).

Advantages:
Disadvantages:
4. elasticsearch-jdbc

The latest version of elasticsearch-jdbc is 2.3.4, and the Elasticsearch version it supports is 2.3.4; support for newer Elasticsearch versions has not been implemented.

Advantages:
Disadvantages:
5. Logstash-input-jdbc synchronization

Step 1: Installation

Since logstash 5.x, the logstash-input-jdbc plugin has been bundled with logstash. After installing logstash, install the plugin with:

cd /logstash-6.4.2/bin
./logstash-plugin install logstash-input-jdbc

Step 2: Configuration

Create a new jdbc.conf in the logstash-6.4.2/config folder and configure it (a minimal example is sketched at the end of this section). Then create a new jdbc.sql file in the logstash-6.4.2/config directory containing the query to import:

select * from t_employee

Step 3: Run

cd logstash-6.4.2
# Check whether the configuration file syntax is correct
bin/logstash -f config/jdbc.conf --config.test_and_exit
# Start
bin/logstash -f config/jdbc.conf --config.reload.automatic

--config.reload.automatic: automatically reload the configuration file when it changes.

Create the index in Kibana and then view the synchronized data:

PUT octopus
GET octopus/_search
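For reference, here is a minimal sketch of what the jdbc.conf from Step 2 could look like. The connection string, credentials, driver jar path, schedule, and the use of an id column as the document id are assumptions; only the jdbc.sql statement file and the octopus index come from the steps above.

input {
  jdbc {
    # Path to the MySQL JDBC driver jar (placeholder path)
    jdbc_driver_library => "/path/to/mysql-connector-java-5.1.46.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # MySQL connection settings (placeholders)
    jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/test"
    jdbc_user => "root"
    jdbc_password => "root"
    # Run the query in config/jdbc.sql once a minute
    statement_filepath => "config/jdbc.sql"
    schedule => "* * * * *"
  }
}
output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "octopus"
    # Use the id column as the Elasticsearch document id (assumes t_employee has an id column)
    document_id => "%{id}"
  }
}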
6. go-mysql-elasticsearch synchronization

Step 1: MySQL binlog

go-mysql-elasticsearch keeps inserts, updates, and deletes in MySQL in sync with Elasticsearch by reading MySQL's binlog. The binlog is mainly used for master-slave replication and data recovery: it records data-changing operations, and during master-slave replication the master ships the binlog to the slave, which replays the recorded events to stay consistent with the master.

MySQL binlog has three modes:

ROW: records every modified row; the log volume is large.
STATEMENT: records every SQL statement that modifies data, which keeps the log small, but statements that use functions or triggers can easily cause master-slave inconsistency.
MIXED: combines the advantages of ROW and STATEMENT, choosing ROW or STATEMENT per statement according to the operation being performed.

To synchronize data to an ES cluster through the MySQL binlog, only ROW mode can be used, because only ROW mode exposes the actual data that was modified in MySQL.

Taking an UPDATE operation as an example, the binlog content in ROW mode looks like this:

SET TIMESTAMP=1527917394/*!*/;
BEGIN
/*!*/;
# at 3751
#180602 13:29:54 server id 1 end_log_pos 3819 CRC32 0x8dabdf01 Table_map: `webservice`.`building` mapped to number 74
# at 3819
#180602 13:29:54 server id 1 end_log_pos 3949 CRC32 0x59a8ed85 Update_rows: table id 74 flags: STMT_END_F

BINLOG '
UisSWxMBAAAARAAAAOsOAAAAAEoAAAAAAAEACndlYnNlcnZpY2UACGJ1aWxkaW5nAAYIDwEPEREG
wACAAQAAAAHfq40=
UisSWx8BAAAAggAAAG0PAAAAAEoAAAAAAAEAAgAG///A1gcAAAAAAAALYnVpbGRpbmctMTAADwB3
UkRNbjNLYlV5d1k3ajVbD64WWw+uFsDWBwAAAAAAAAtidWlsZGluZy0xMAEPAHdSRE1uM0tiVXl3
WTdqNVsPrhZbD64Whe2oWQ==
'/*!*/;
### UPDATE `webservice`.`building`
### WHERE
###   @1=2006 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2='building-10' /* VARSTRING(192) meta=192 nullable=0 is_null=0 */
###   @3=0 /* TINYINT meta=0 nullable=0 is_null=0 */
###   @4='wRDMn3KbUywY7j5' /* VARSTRING(384) meta=384 nullable=0 is_null=0 */
###   @5=1527754262 /* TIMESTAMP(0) meta=0 nullable=0 is_null=0 */
###   @6=1527754262 /* TIMESTAMP(0) meta=0 nullable=0 is_null=0 */
### SET
###   @1=2006 /* LONGINT meta=0 nullable=0 is_null=0 */
###   @2='building-10' /* VARSTRING(192) meta=192 nullable=0 is_null=0 */
###   @3=1 /* TINYINT meta=0 nullable=0 is_null=0 */
###   @4='wRDMn3KbUywY7j5' /* VARSTRING(384) meta=384 nullable=0 is_null=0 */
###   @5=1527754262 /* TIMESTAMP(0) meta=0 nullable=0 is_null=0 */
###   @6=1527754262 /* TIMESTAMP(0) meta=0 nullable=0 is_null=0 */
# at 3949
#180602 13:29:54 server id 1 end_log_pos 3980 CRC32 0x58226b8f Xid = 182
COMMIT /*!*/;

An example of binlog content in STATEMENT mode:

SET TIMESTAMP=1527919329/*!*/;
update building set Status=1 where Id=2000
/*!*/;
# at 688
#180602 14:02:09 server id 1 end_log_pos 719 CRC32 0x4c550a7d Xid = 200
COMMIT /*!*/;

Comparing the two, ROW mode records the values of every field of the affected row both before and after the update, while STATEMENT mode only records the UPDATE statement itself. So to synchronize MySQL data to ES in real time, we have to use ROW-mode binlog, obtain and parse its contents, and call the ES document APIs to apply the changes to the ES cluster.

View and modify the binlog mode:

# View the binlog mode
mysql> show variables like "%binlog_format%";
# Modify the binlog mode
mysql> set global binlog_format='ROW';
# Check whether binlog is enabled
mysql> show variables like 'log_bin';
# Enable binlog by adding the following to my.cnf
log-bin = mysql-bin

Step 2: Installation

# Install go
sudo apt-get install go
# Install godep
go get github.com/tools/godep
# Get the go-mysql-elasticsearch plugin
go get github.com/siddontang/go-mysql-elasticsearch
# Build the go-mysql-elasticsearch plugin
cd go/src/github.com/siddontang/go-mysql-elasticsearch
make
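The configuration in the next step notes that the MySQL user must have the replication privilege, since go-mysql-elasticsearch reads the binlog like a replica and uses mysqldump for the initial load. A minimal, hedged sketch of creating such a user; the user name, host, and password here are placeholders, not from the original article:

-- Hypothetical dedicated user for go-mysql-elasticsearch
CREATE USER 'es_sync'@'127.0.0.1' IDENTIFIED BY 'es_sync_password';
-- Replication privileges are needed to read the binlog; SELECT is needed for the initial dump
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'es_sync'@'127.0.0.1';
FLUSH PRIVILEGES;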
my_addr = "127.0.0.1:3306" # Basic mysql settings that need to be synchronized my_user = "root" my_pass = "root" # Elasticsearch address es_addr = "127.0.0.1:9200" # Local elasticsearch configuration # Path to store data, like master.info, and dump MySQL data data_dir = "./var" # data storage url # The following configuration is saved as default# Inner Http status address stat_addr = "127.0.0.1:12800" # pseudo server id like a slave server_id = 1001 #mysql or mariadb flavor = "mysql" #mysqldump execution path mysqldump = "mysqldump" #MySQL data source [[source]] schema = "test" //The corresponding database name when elasticsearch is synchronized with mysql # Only below tables will be synced into Elasticsearch. # To synchronize several tables in the test database. For some projects that use the table partitioning mechanism, we can use wildcards to match, for example, t_[0-9]{4} can match table t_0000 to t_9999. tables = ["t", "t_[0-9]{4}", "tfield", "tfilter"] # Below is for special rule mapping # For a table, we need to specify the type of the ES index to which its data should be synchronized. If not specified, we will use schema # name as the ES index and type by default [[rule]] schema = "test" //Database nametable = "t" //Table nameindex = "test" //Corresponding index nametype = "t" //Corresponding type name# Synchronize all tables that meet the format of t_[0-9]{4} to the ES with index test and type t. Of course, these tables need to ensure that the # schema is consistent [[rule]] schema = "test" table = "t_[0-9]{4}" index = "test" type = "t" # For table tfilter, we will only synchronize the id and name columns, and the others will not be synchronized filter = ["id", "name"] # The column id of table tfield is mapped to es_id, and tags are mapped to es_tags # list This field explicitly tells you that the corresponding column data needs to be converted into ES array type. This is now commonly used for MySQL's varchar # and other types. We may store data like "a, b, c", and then hope to convert it into a list format like [a, b, c] when synchronizing it to ES. [rule.field] # Map column `id` to ES field `es_id` id="es_id" # Map column `tags` to ES field `es_tags` with array type tags="es_tags,list" # Map column `keywords` to ES with array type keywords=",list" Step 4: Run cd go/src/github.com/siddontang/go-mysql-elasticsearch bin/go-mysql-elasticsearch -config=./etc/river.toml 7. 
7. Elasticsearch-jdbc synchronization

Download the tool and unzip it:

unzip elasticsearch-jdbc-2.3.2.0-dist.zip

Set the environment variable:

[root@autofelix /]# vi /etc/profile
export JDBC_IMPORTER_HOME=/elasticsearch-jdbc-2.3.2.0

Make the environment variable take effect:

[root@autofelix /]# source /etc/profile

Configuration reference:

Step 1: Create a new folder odbc_es in the root directory:

[root@autofelix /]# ll /odbc_es/
drwxr-xr-x 2 root root 4096 Jun 16 03:11 logs
-rwxrwxrwx 1 root root 542 Jun 16 04:03 mysql_import_es.sh

Step 2: Create a new script mysql_import_es.sh with the following content. In the JSON passed to the importer, elasticsearch.cluster must match the cluster name in /usr/local/elasticsearch/config/elasticsearch.yml, url, user and password are the MySQL connection settings, and index and type are the new index and type to create in Elasticsearch.

[root@autofelix odbc_es]# cat mysql_import_es.sh
#!/bin/sh
bin=$JDBC_IMPORTER_HOME/bin
lib=$JDBC_IMPORTER_HOME/lib
echo '{
  "type" : "jdbc",
  "jdbc": {
    "elasticsearch.autodiscover" : true,
    "elasticsearch.cluster" : "my-application",
    "url" : "jdbc:mysql://10.8.5.101:3306/test",
    "user" : "root",
    "password" : "123456",
    "sql" : "select * from cc",
    "elasticsearch" : {
      "host" : "10.8.5.101",
      "port" : 9300
    },
    "index" : "myindex",
    "type" : "mytype"
  }
}' | java \
  -cp "${lib}/*" \
  -Dlog4j.configurationFile=${bin}/log4j2.xml \
  org.xbib.tools.Runner \
  org.xbib.tools.JDBCImporter

Step 3: Add executable permissions to mysql_import_es.sh:

[root@autofelix odbc_es]# chmod a+x mysql_import_es.sh

Step 4: Execute the script:

[root@autofelix odbc_es]# ./mysql_import_es.sh
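After the script finishes, you can verify the import by querying the new index over Elasticsearch's HTTP API. Port 9200 is an assumption here: the script above talks to the transport port 9300, while HTTP queries normally go to 9200.

curl 'http://10.8.5.101:9200/myindex/_search?pretty'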
This is the end of this article on how to synchronize MySQL data to Elasticsearch. For more information on this topic, please search for previous articles on 123WORDPRESS.COM. I hope you will support 123WORDPRESS.COM in the future!