Let me tell you about a recent case. A game log database hosted on Alibaba Cloud RDS recently ran into trouble. As the number of players grew, so did the data volume: the log database has reached 50 GB, and online schema changes take a very long time. The problem had gone unnoticed because the log database had never needed changes before. As the business evolved, however, new game attributes had to be added, which meant altering the existing tables. The resulting long maintenance window made both the business team and the DBA reluctant to proceed, so optimizing the log database became urgent.

First, the current state of the log database:

1. Five large tables each hold more than 50 million rows.
2. Before new servers opened, these five tables grew by roughly 20 million rows per month; after opening, growth will be higher.
3. In two of the tables, the index size already exceeds the size of the data file.

We asked the business and operations teams about their requirements for these tables:

1. Keep the last three months of data online; older data can be rotated out so it does not affect the performance of the online business.
2. Data older than three months should be transferred to a local database that still supports queries. Query speed need not be fast; response times measured in minutes are acceptable.
3. During the migration, the log database can tolerate losing a few minutes of table data; real-time synchronization is not required.
4. The online log database must still support statistics such as user activity.
5. No database or table sharding. Many SQL queries cover the last few months, and there is a certain coupling between the tables, so splitting them would hinder related queries.
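As an illustration, the kind of size analysis above can be pulled from MySQL's information_schema. This is a generic sketch, not the queries actually used in this case; the schema name game_log is taken from the article and the row-count threshold is arbitrary:

```sql
-- List tables in the game_log schema with more than 50 million rows,
-- showing data and index sizes so tables whose indexes outgrow
-- their data files stand out
SELECT table_name,
       table_rows,
       ROUND(data_length  / 1024 / 1024 / 1024, 2) AS data_gb,
       ROUND(index_length / 1024 / 1024 / 1024, 2) AS index_gb
FROM   information_schema.tables
WHERE  table_schema = 'game_log'
  AND  table_rows > 50000000
ORDER  BY index_length DESC;
```

Note that table_rows is an estimate for InnoDB tables; for exact counts a SELECT COUNT(*) per table is needed.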
Based on the above analysis, combined with the actual situation, the initial plan is:

1. Rename each table in the online game_log database, then recreate the original table. This process is not seamless and may lose a few seconds of data. The specific operations are:

   #Step 1: rename the table into a backup database
   rename table game_log.table to game_log_bak.table;
   #Step 2: get the table structure; the important part is the AUTO_INCREMENT value,
   #to ensure there will be no conflicts when the last three months of data are imported later
   show create table game_log_bak.table\G
   #Step 3: recreate the table structure from step 2 in the game_log database

2. Transfer the data in the renamed game_log_bak database to a local offline database, which uses the Infobright storage engine to support fast queries over offline data.

3. Back up and purge the online data older than 3 months (about 40 GB), and re-import the data within 3 months (about 10 GB) from the online game_log_bak database into the game_log database.

4. Drop the game_log_bak database and build a read-only slave that synchronizes the game_log database from the master in real time.

5. Synchronize data from the read-only slave to the local Infobright database; the DataX tool can be used for this synchronization.

6. Set up scheduled tasks that periodically purge expired online data, so that only the last three months are kept online and the RDS disk space stays under control.

There are currently several problems with this approach:

1. The table space occupied by periodically purged data cannot be reclaimed immediately, which may cause table fragmentation.
2. If the scale of the game keeps increasing, this approach may run into the same problem again.
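The periodic cleanup in step 6 can be sketched with MySQL's event scheduler. This is a sketch under assumptions: the table name game_log.login_log and the column log_time are hypothetical, one such event would be needed per log table, and in production the delete should run in small batches to avoid long locks:

```sql
-- Hypothetical daily event that purges rows older than three months
-- from one log table. Requires event_scheduler = ON, and in production
-- the DELETE should be looped in small LIMIT-ed batches.
CREATE EVENT purge_game_log
ON SCHEDULE EVERY 1 DAY
STARTS CURRENT_TIMESTAMP
DO
  DELETE FROM game_log.login_log
  WHERE log_time < NOW() - INTERVAL 3 MONTH
  LIMIT 10000;
```

After large deletes, the fragmented space mentioned in problem 1 can be reclaimed with OPTIMIZE TABLE (or an online rebuild tool), at the cost of another maintenance operation.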
If that happens, the cleanup cycle of the log tables can be shortened accordingly, and if the data volume grows too large, other solutions should be considered. Looking back, there were also problems in the design of the tables themselves: a log table should only record flow data and should avoid related queries as much as possible. In other words, the data volume could have been estimated in advance and quarterly or monthly tables used to handle logs at this scale, which would have made cleanup and maintenance much easier. The above are the details of this MySQL online log library migration example. For more information about MySQL online log library migration, please pay attention to other related articles on 123WORDPRESS.COM!