Data needs to be migrated from MySQL to ClickHouse. The plan below covers five approaches: three supported by ClickHouse itself and two third-party tools.

## create table with the MySQL engine

```sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
    ...
    INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1,
    INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2
) ENGINE = MySQL('host:port', 'database', 'table', 'user', 'password'[, replace_query, 'on_duplicate_clause']);
```

Official documentation: https://clickhouse.yandex/docs/en/operations/table_engines/mysql/

Note that the actual data is still stored in the remote MySQL database, so this works like a foreign table. You can verify this by adding and deleting rows on the MySQL side and querying again from ClickHouse.

## insert into select from

First create a table, then import the data:

```sql
-- Create the target table first
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE = engine

-- Import the data
INSERT INTO [db.]table [(c1, c2, c3)]
SELECT column_or_* FROM mysql('host:port', 'db', 'table_name', 'user', 'password')
```

You can customize the column types and the number of columns, and use ClickHouse functions to transform the data on the way in, for example:

```sql
SELECT toDate(xx) FROM mysql('host:port', 'db', 'table_name', 'user_name', 'password')
```

## create table as select from

```sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name
ENGINE = Log
AS SELECT * FROM mysql('host:port', 'db', 'article_clientuser_sum', 'user', 'password')
```

Reference article: http://jackpgao.github.io/2018/02/04/ClickHouse-Use-MySQL-Data/

Custom columns are not supported; this method can be understood as a combination of the previous two (create the table, then fill it from a SELECT).

## Altinity clickhouse-mysql-data-reader

Altinity has open-sourced a Python tool for migrating data from MySQL to ClickHouse (it supports both full imports and incremental updates via the binlog). However, the official README is out of sync with the code, and the quick start does not work as written.

```shell
## Generate the create-table statement
clickhouse-mysql \
    --src-host=127.0.0.1 \
    --src-user=reader \
    --src-password=Qwerty1# \
    --table-templates-with-create-database \
    --src-table=airline.ontime > create_clickhouse_table_template.sql

## Adjust the generated script
vim create_clickhouse_table_template.sql

## Create the table in ClickHouse
clickhouse-client -mn < create_clickhouse_table_template.sql

## Import the data
clickhouse-mysql \
    --src-host=127.0.0.1 \
    --src-user=reader \
    --src-password=Qwerty1# \
    --table-migrate \
    --dst-host=127.0.0.1 \
    --dst-table=logunified \
    --csvpool
```

Official documentation: https://github.com/Altinity/clickhouse-mysql-data-reader#mysql-migration-case-1—migrate-existing-data

Note that the above three methods all read directly from MySQL. If the amount of data is large, this puts a lot of pressure on MySQL.
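One common way to limit the pressure a large `INSERT INTO ... SELECT` puts on the MySQL source is to copy the data in bounded batches (for example, paging by primary key) instead of one huge SELECT. A minimal sketch of the batching part in Python — the `batched` helper and the batch size are my own illustration, not part of any of the tools above:

```python
from itertools import islice

def batched(rows, size):
    """Yield lists of at most `size` rows from any row iterator.

    `rows` would come from a paged MySQL cursor in a real migration;
    each yielded chunk would be one bulk INSERT into ClickHouse.
    """
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# Example: 10 fake rows in batches of 4 -> batch sizes 4, 4, 2
sizes = [len(b) for b in batched(range(10), 4)]
```

Keeping each batch small bounds both the MySQL read and the memory held on the migrating host.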
The following are two offline methods (StreamSets supports both real-time and offline).

## csv

```shell
## Assuming the table already exists, import directly
clickhouse-client \
    -h host \
    --query="INSERT INTO [db].table FORMAT CSV" < test.csv
```

However, if the source data is of low quality, there are often problems such as special characters (delimiters, escape characters) or embedded line breaks. I was badly burned by this.

- Custom delimiter: `--format_csv_delimiter="|"`
- Skip errors instead of aborting: `--input_format_allow_errors_num=10` allows up to 10 bad rows; `--input_format_allow_errors_ratio=0.1` allows up to 10% bad rows.

Empty CSV fields are not parsed as NULL and produce errors such as:

```
Code: 27. DB::Exception: Cannot parse input: expected , before: xxxx: (at row 69)
ERROR: garbage after Nullable(Date): "8,002<LINE FEED>0205"
```

One workaround is to replace `,,` with `,\N,` before piping the file in:

```shell
sed ':a;s/,,/,\\N,/g;ta' test.csv | clickhouse-client -h host --query "INSERT INTO [db].table FORMAT CSV"
```

Or clean the file with a script:

```shell
python clean_csv.py --src=src.csv --dest=dest.csv --chunksize=50000 --cols --encoding=utf-8 --delimiter=,
```

For `clean_csv.py`, refer to my other article, "032 - CSV file fault tolerance processing".

## streamsets

StreamSets supports full imports from MySQL or from CSV files, and also supports incremental inserts by subscribing to the binlog. Please refer to my other article, "025 - Big Data ETL Tool StreamSets: Installation and Subscribing to MySQL Binlog".
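The sed one-liner shown earlier for the CSV import breaks down when a delimiter appears inside a quoted field. A small Python sketch (my own illustration, not the `clean_csv.py` mentioned above) that does the same empty-field-to-`\N` rewrite with proper CSV parsing:

```python
import csv
import io

def fix_nulls(src_text, delimiter=","):
    """Rewrite empty CSV fields as \\N so ClickHouse parses them as NULL.

    Unlike a plain sed on ",,", the csv module respects quoting, so a
    delimiter inside a quoted field is not mistaken for a field boundary.
    """
    out = io.StringIO()
    reader = csv.reader(io.StringIO(src_text), delimiter=delimiter)
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for row in reader:
        writer.writerow([field if field != "" else r"\N" for field in row])
    return out.getvalue()
```

For large files the same loop can read and write file objects line by line instead of buffering the whole text in memory.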
This article only shows how to import data from MySQL into ClickHouse, and it assumes that you have already set up the StreamSets service.

Upload the MySQL and ClickHouse JDBC jars and their dependency packages. A convenient way is to create a `pom.xml` and let Maven download everything in one go:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.anjia</groupId>
    <artifactId>demo</artifactId>
    <packaging>jar</packaging>
    <version>1.0-SNAPSHOT</version>
    <name>demo</name>
    <url>http://maven.apache.org</url>
    <dependencies>
        <dependency>
            <groupId>ru.yandex.clickhouse</groupId>
            <artifactId>clickhouse-jdbc</artifactId>
            <version>0.1.54</version>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.47</version>
        </dependency>
    </dependencies>
</project>
```

If Maven is installed locally, run:

```shell
mvn dependency:copy-dependencies -DoutputDirectory=lib -DincludeScope=compile
```

All required jar files will be downloaded and copied into the `lib` directory. Then copy them into the appropriate StreamSets directory and restart the StreamSets service.
## Summary

The above are the five methods of migrating MySQL data to ClickHouse. I hope it is helpful to everyone. If you have any questions, please leave a message and I will reply in time. I would also like to thank everyone for their support of the 123WORDPRESS.COM website!