Background

Historical business data is stored in MySQL, including an operation record table, video_log. Whenever a user creates or updates a video, or an auditor reviews one, a row is appended to video_log; the table is insert-only. As you can imagine, one video corresponds to multiple log rows. With 100,000 videos a day and an average of 5 logs per video, that is 500,000 rows a day, 500,000 × 30 = 15 million rows a month, and 15 million × 12 = 180 million rows a year. There are currently more than 200 million rows online. Since the log is not C-end facing and is only used to troubleshoot problems, a slight delay can be tolerated, but as time goes by the table will inevitably get slower and slower to query and hurt efficiency, which is why the idea of restructuring it was raised.

Solution 1: Back up old data

Since the log is not the most critical data, yet still needs to be queryable in near real time (for troubleshooting), the initial idea was to keep the core storage unchanged and migrate the older data out. After all, the probability of suddenly needing operation records from a year ago is very small, and if that does happen, the offline copy can be queried. The design only needs a scheduled script that runs at around 4 a.m. every day (an off-peak period for the business) and ships the old data to offline storage (most companies have a Hive-based data warehouse or similar), so the online video_log table stops growing.

Solution 2: Split tables

Splitting the table into shards is also an option. Compared with Solution 1, the advantage is that all data remains queryable in real time; the disadvantage is that the application code has to be modified (a possible split is sketched below).
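The article does not show the actual split scheme, so the following Python sketch is only an illustration, assuming the log table is split by video_id modulo a fixed shard count; the shard count, helper names, and DDL are all assumptions rather than the author's design.

```python
# Illustrative sketch only: the split key (video_id) and shard count are
# assumptions, not the article's actual scheme.
SHARD_COUNT = 8

def shard_table(video_id: int) -> str:
    """Route a video_id to its video_log_N shard table."""
    return f"video_log_{video_id % SHARD_COUNT}"

def shard_ddl() -> list[str]:
    """DDL that clones the original video_log schema into each shard."""
    return [
        f"CREATE TABLE IF NOT EXISTS video_log_{i} LIKE video_log"
        for i in range(SHARD_COUNT)
    ]

if __name__ == "__main__":
    print(shard_table(123456))   # -> video_log_0
    for stmt in shard_ddl():
        print(stmt)
```

Splitting by a hash of video_id keeps all logs for one video in a single shard, so per-video queries still hit only one table.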
The next step is to modify the application code so that reads and writes are routed correctly between the new shard tables and the old data.
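The article does not include the code changes themselves, so this is only one common pattern, sketched under the assumptions above: new logs are written to the shard tables only, and during the transition reads merge the shard table with the legacy video_log. The cursor is any PEP 249 style cursor (for example PyMySQL), and the column names are hypothetical.

```python
from typing import Any, Sequence

def insert_log(cursor, video_id: int, action: str) -> None:
    """Write path: new log rows go only to the shard table."""
    table = shard_table(video_id)  # from the sharding sketch above
    cursor.execute(
        f"INSERT INTO {table} (video_id, action, created_at) VALUES (%s, %s, NOW())",
        (video_id, action),
    )

def query_logs(cursor, video_id: int) -> list[Sequence[Any]]:
    """Read path during the transition: merge shard rows with legacy rows."""
    table = shard_table(video_id)
    cursor.execute(f"SELECT * FROM {table} WHERE video_id = %s", (video_id,))
    new_rows = list(cursor.fetchall())
    cursor.execute("SELECT * FROM video_log WHERE video_id = %s", (video_id,))
    old_rows = list(cursor.fetchall())
    return old_rows + new_rows  # old data first, then new shard data
```

Once the old data has also been migrated into the shard tables (or archived), the legacy query can be dropped.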
Solution 3: Migrate to TiDB

The disadvantage of Solution 2 is obvious: what do we do after another 3 years? Keep splitting into more tables? It always feels like a piece of technical debt left sitting there. So we set our sights on TiDB, a distributed database. Once we move to TiDB we no longer need to worry about sharding; TiDB handles all of that for us and scales out by adding nodes. Because it is distributed, TiDB's primary key is not ordered, which is an important point.
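The practical consequence is that code which assumes the primary key reflects insertion order (for example, sorting by id to get the latest records) has to change. A small illustrative adjustment, with an assumed created_at column:

```python
# Before (MySQL): the auto-increment id is roughly insertion-ordered,
# so ORDER BY id DESC was a cheap way to get the newest logs.
# After (TiDB): the primary key is not insertion-ordered, so order by an
# explicit timestamp instead (created_at is an assumed column name).
LATEST_LOGS_SQL = (
    "SELECT * FROM video_log "
    "WHERE video_id = %s "
    "ORDER BY created_at DESC, id DESC "   # id only as a tie-breaker
    "LIMIT 20"
)
```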
Pitfalls when synchronizing the old data

Migrating to TiDB looks simple, but there are actually several pitfalls hidden in the synchronization job script.
Final synchronization script

Taking into account duplicate data, the efficiency of restarting the job, and the efficiency of the whole synchronization, the final plan was roughly as sketched below.
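The article's actual script is not reproduced here, so the following Python sketch is only an illustration of how those three concerns can be addressed: scan the source table in primary-key ranges (so each batch stays cheap), persist a checkpoint so a restarted job resumes where it stopped, and write with INSERT IGNORE so replayed rows do not create duplicates. Column names, file paths, and connection details are placeholders; TiDB is reached over the ordinary MySQL protocol (PyMySQL here).

```python
import json
import os

import pymysql  # assumed driver; TiDB speaks the MySQL protocol

BATCH_SIZE = 5000
CHECKPOINT_FILE = "video_log_sync.checkpoint"  # illustrative path

def load_checkpoint() -> int:
    """Resume from the last synced id; start from 0 on the first run."""
    if os.path.exists(CHECKPOINT_FILE):
        with open(CHECKPOINT_FILE) as f:
            return json.load(f)["last_id"]
    return 0

def save_checkpoint(last_id: int) -> None:
    with open(CHECKPOINT_FILE, "w") as f:
        json.dump({"last_id": last_id}, f)

def sync(mysql_conn, tidb_conn) -> None:
    last_id = load_checkpoint()
    while True:
        with mysql_conn.cursor() as cur:
            # Range scan on the primary key instead of LIMIT/OFFSET,
            # so batches stay cheap even deep into the table.
            cur.execute(
                "SELECT id, video_id, action, created_at FROM video_log "
                "WHERE id > %s ORDER BY id LIMIT %s",
                (last_id, BATCH_SIZE),
            )
            rows = cur.fetchall()
        if not rows:
            break
        with tidb_conn.cursor() as cur:
            # INSERT IGNORE keeps the job idempotent: rows replayed after
            # a restart are skipped instead of duplicated.
            cur.executemany(
                "INSERT IGNORE INTO video_log (id, video_id, action, created_at) "
                "VALUES (%s, %s, %s, %s)",
                rows,
            )
        tidb_conn.commit()
        last_id = rows[-1][0]
        save_checkpoint(last_id)

if __name__ == "__main__":
    src = pymysql.connect(host="mysql-host", user="u", password="p", database="video")
    dst = pymysql.connect(host="tidb-host", port=4000, user="u", password="p", database="video")
    sync(src, dst)
```

Running the job during off-peak hours and tuning BATCH_SIZE keeps the extra load on the online MySQL instance low.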
In the end, the data migration was completed smoothly through the four switchover steps of Solution 3 plus the efficient synchronization script.

Summary

This concludes the article on MySQL migration solutions and their pitfalls.