Introduction

Have you ever had an interviewer ask, "How do you optimize your database?" How should you answer? I wrote this because I recently saw an article on database tuning being forwarded by various public accounts (I won't post the link). I skimmed it a few times and found that it kept saying the database should be split horizontally. Let me ask all readers: how many of you have actually done a horizontal split? Many articles nowadays have so little practical value that they amount to pure theoretical analysis. This article grew out of a question on Zhihu, which I have expanded on here.

Stage 1: Optimize SQL and Indexes

Why is this the first stage of tuning? Because it costs the least and requires no middleware. If you have not optimized your indexes or SQL at all and are already attempting a horizontal split, aren't you putting the cart before the horse? The rough steps are:

(1) Use the slow query log to locate inefficient SQL statements.
(2) Use EXPLAIN to analyze the execution plan of each statement.
(3) Identify the problem and take the appropriate measure, such as creating an index.

I won't give examples, because there are already more articles on SQL optimization than anyone could comfortably read.

Stage 2: Build a Cache

Consider a cache only when optimizing SQL cannot solve the problem. After all, the purpose of a cache is to store results that are complex and time-consuming to compute and that change infrequently, so as to reduce the load on the database. Note that adding a cache makes the system more complex, and you need to consider many issues, such as: How do you keep the cache and the database consistent?
(For example, should a write update the cache or delete it?) For this, see my article "Analysis of Database and Cache Dual-Write Consistency Scheme".

Stage 3: Read-Write Separation

If caching is not enough, use master-slave replication and read-write separation. Either distinguish read and write requests at the application layer, or use ready-made middleware such as Mycat or Atlas to separate them. Note that as soon as you say you use a master-slave architecture, you need to be prepared for three questions:

(1) What are the benefits of a master-slave setup?
Answer: It provides database backup, load balancing across databases, and higher availability.

(2) How does master-slave replication work?
Answer: As the figure shows (I did not draw it myself, I am lazy):
The master has a log dump thread that sends the binlog to the slave.
The slave has two threads, an I/O thread and an SQL thread. The I/O thread reads the binlog content from the master and writes it to the relay log. The SQL thread reads the content of the relay log and applies it to the slave database.

(3) How do you solve the master-slave consistency problem?
Answer: I do not recommend solving this at the database level. In CAP terms, a master-slave architecture is a high-availability architecture and cannot satisfy consistency requirements. Even with synchronous or semi-synchronous replication, what you get is weak consistency, not strong consistency. I therefore recommend using the cache to solve this problem. The steps are as follows:
1. Measure the master-slave delay through testing. MySQL 5.7 or later is recommended, because multi-threaded replication has been much more complete since 5.7 and can generally keep the delay within 1s.
That said, MySQL is already at version 8.x. Is anyone still using 5.x?
2. For a write, write to the database first and then write to the cache, with a very short expiry time that is only slightly longer than the master-slave delay.
3. For a read, check the cache first. If the key is no longer in the cache (by which time master-slave synchronization has completed), read from the database.

The fourth stage uses the partition table

To be honest, you can skip this stage in an interview. Many Internet companies advise against partition tables, and I do not recommend them either; there are too many pitfalls. Here are some answers taken from other articles:

What is a partition table in MySQL?
Answer: All the data is still in one logical table, but it is physically stored in different files according to certain rules. This is a built-in MySQL feature, and the business code does not need to change. The SQL statements do need to change, however: the query condition has to include the partition column.

Disadvantages
(1) The partition key design is inflexible. A statement that does not use the partition key has to touch every partition, which can easily lock the whole table.
(2) ALTER TABLE ... ORDER BY on a partitioned table only orders rows within each partition.
(3) If you create an index on the partition key of a partitioned table, the index is partitioned as well. There is no such thing as a global index on a partition key.
(4) If you split databases and tables yourself, you control the business scenarios and access patterns, so the result is predictable. With a partition table, a developer writes a SQL statement without being sure which partitions it will hit, which is much harder to control.

Stage 5: Vertical Split

If the previous four stages have not solved the problem, move on to vertical splitting.
Vertical splitting is still less complex than horizontal splitting. Split your table into several smaller tables by module. Everyone should have read "The Evolution of Large Website Architecture"; articles and books of this kind almost always mention this stage.
(1) Put infrequently used fields in a separate table.
(2) Put commonly used fields in a separate table.
(3) Put columns that are frequently queried together in one table (joint index).

Stage 6: Horizontal Split

Horizontal splitting is the most troublesome stage, and many problems appear after the split. I stress again that it must be the last resort. In a sense, I think vertical splitting is the better choice: once you have split into modules vertically and find that a single module is under too much pressure, you can optimize that module on its own, for example by upgrading its machine configuration. With a horizontal split into two tables, the code has to change; then you find that two tables are not enough, so you change the code again and split into three? The pieces of a horizontal split are too tightly coupled and the cost is too high, so I do not particularly recommend it.
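
To make the cost of going from two tables to three concrete, here is a minimal sketch. It assumes the simplest possible routing rule, the row id modulo the number of tables; the article does not say which rule is used, and the names `shard_table`, `moved_ids` and the `orders` table prefix are hypothetical, chosen only for illustration.

```python
def shard_table(row_id: int, table_count: int, base_name: str = "orders") -> str:
    """Return the physical table that holds row_id under modulo routing (assumed rule)."""
    if table_count <= 0:
        raise ValueError("table_count must be positive")
    return f"{base_name}_{row_id % table_count}"


def moved_ids(ids, old_count: int, new_count: int) -> list:
    """Return the ids whose physical table changes when the table count changes."""
    return [
        i for i in ids
        if shard_table(i, old_count) != shard_table(i, new_count)
    ]


if __name__ == "__main__":
    ids = range(12)
    print(shard_table(7, 2))            # orders_1
    print(moved_ids(ids, 2, 3))         # [2, 3, 4, 5, 8, 9, 10, 11]
    print(len(moved_ids(ids, 2, 3)))    # 8
```

Under this assumed rule, 8 of the 12 sample ids land in a different table after the change, so both the routing code and most of the existing rows have to be touched. Other routing schemes behave differently, but some data movement and code change is hard to avoid.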