This article summarizes some common MySQL optimization methods and shares them briefly. It aims to help companies that do not have a full-time MySQL DBA do basic optimization work. As for specific SQL optimization, most cases can be handled by adding appropriate indexes; more complex ones require case-by-case analysis. You can refer to the optimization cases on this site or contact us.

1. Hardware layer related optimization

1.1 CPU related
In the server BIOS settings, you can adjust the following options to maximize CPU performance and avoid classic NUMA problems:

1. Select the Performance Per Watt Optimized (DAPC) mode to maximize CPU performance. DB services usually require high computing power, so power saving should not be a priority;
2. Turn off options such as C1E and C States to improve CPU efficiency;
3. Set Memory Frequency to Maximum Performance;
4. In the memory settings menu, enable Node Interleaving to avoid NUMA issues.
1.2 Disk I/O related
The following measures can be used to optimize disk I/O, sorted by the magnitude of IOPS improvement:

1. Use SSD or PCIe SSD devices, which can increase IOPS by hundreds or even tens of thousands of times;
2. Buy array cards equipped with CACHE and BBU modules, which can significantly increase IOPS (this mainly applies to mechanical disks, not SSD or PCIe SSD; also check the health of the CACHE and BBU modules regularly, to ensure data is not lost in an accident);
3. With an array card, set the write policy to WB, or even FORCE WB (if there is dual power protection, or the data security requirements are not especially high). Never use the WT policy. The array's read-ahead policy can also be disabled, as it is basically useless;
4. Choose RAID-10 instead of RAID-5 whenever possible;
5. If you use mechanical disks, choose high-speed ones, such as 15K RPM instead of 7.2K RPM; the small savings are not worth the performance loss.
2. System-level optimization
2.1. File system layer optimization
At the file system level, the following measures can significantly improve IOPS performance:

1. Use the deadline/noop I/O schedulers, and never use cfq (it is not suitable for running DB services);
2. Use the xfs file system, and never use ext3; ext4 is barely usable, but if the business volume is large, you must use xfs;
3. Add the following options to the file system mount parameters: noatime, nodiratime, nobarrier (nobarrier is specific to the xfs file system).
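The scheduler and mount options above can be sketched as follows; the device name and mount point are illustrative, and note that on recent kernels the scheduler names and the nobarrier option may differ or be removed:

```shell
# Set the I/O scheduler for a block device (sdb is illustrative);
# takes effect immediately but does not persist across reboots.
echo deadline > /sys/block/sdb/queue/scheduler

# Example /etc/fstab entry for an xfs data volume with the
# recommended mount options (device and mount point illustrative):
# /dev/sdb1  /data  xfs  defaults,noatime,nodiratime,nobarrier  0 0
```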
2.2. Other kernel parameter optimization
Setting appropriate values for key kernel parameters reduces the tendency to swap and prevents large fluctuations in memory and disk I/O from causing instantaneous peak loads:

1. Set vm.swappiness to around 5-10, or even 0 (be careful with 0 on RHEL 7 and above, unless you can tolerate OOM kills), to reduce the chance of using SWAP;
2. Set vm.dirty_background_ratio to 5-10 and vm.dirty_ratio to about twice that value, so that dirty pages are flushed to disk continuously rather than in bursts that cause severe I/O waits (similar in spirit to innodb_max_dirty_pages_pct in MySQL);
3. Set net.ipv4.tcp_tw_reuse to 1 to reduce TIME_WAIT connections and improve TCP efficiency. Note that net.ipv4.tcp_tw_recycle, often recommended alongside it, breaks clients behind NAT and was removed in Linux 4.12, so it should be avoided;
4. As for the two parameters read_ahead_kb and nr_requests that circulate online, my tests found they have little impact on an OLTP environment with mixed reads and writes (they should be more effective in read-heavy scenarios). There may also be a problem with my testing method, so adjust them at your own discretion.
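Collected into /etc/sysctl.conf form, the suggestions above look like the following sketch (the values are the ones discussed in the text; apply with sysctl -p):

```ini
# /etc/sysctl.conf -- values taken from the suggestions above
vm.swappiness = 5
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10

# tcp_tw_reuse is generally safe for outgoing connections.
# tcp_tw_recycle is deliberately NOT included: it breaks clients
# behind NAT and was removed in Linux 4.12.
net.ipv4.tcp_tw_reuse = 1
```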
3. MySQL layer related optimization
3.1. About version selection
We call the official version ORACLE MySQL; there is not much to say about it, and most people will choose it. I personally strongly recommend the Percona branch, a relatively mature and excellent MySQL branch with many improvements in performance, reliability, and manageability. It is basically fully compatible with official ORACLE MySQL, and its performance is improved by more than 20%, so I recommend it first; I have been using it since 2008. Another important branch is MariaDB. Strictly speaking, it is not quite right to call MariaDB a branch, because its goal is to replace ORACLE MySQL. It makes many source-level improvements in the MySQL Server layer and is also a very reliable and excellent version. However, this also led to new features, such as GTID, that are incompatible with the official version (MySQL 5.7 and later also supports dynamically enabling or disabling GTID mode online). Considering that most people will still follow the official version, MariaDB is not recommended as a first choice.

3.2. Suggestions on the most important parameter options
It is recommended to adjust the following key parameters for better performance (you can use the my.cnf generator provided by this site to generate a configuration file template):

1. If you choose the Percona or MariaDB version, it is strongly recommended to enable the thread pool feature, so that performance does not drop sharply under high concurrency. There is also the extra_port feature, which is very practical and can save your life at critical moments. Another important feature is QUERY_RESPONSE_TIME, which gives an intuitive view of the overall SQL response time distribution;
2. Set default-storage-engine = InnoDB so the InnoDB engine is used by default. It is strongly recommended not to use the MyISAM engine; InnoDB can meet more than 99% of business scenarios;
3. Adjust innodb_buffer_pool_size. For a single instance where most tables use the InnoDB engine, consider setting it to about 50% to 70% of physical memory;
4. Set innodb_flush_log_at_trx_commit and sync_binlog according to actual needs. If no data loss is acceptable, set both to 1. If a little data loss is acceptable, set them to 2 and 10 respectively. If you do not care about data loss at all (for example, on a slave, where the worst case is redoing the data), set both to 0. These three settings trade performance for safety: (1,1) is the slowest but safest, while (0,0) is the fastest but least safe;
5. Set innodb_file_per_table = 1 to use separate tablespaces; I really cannot think of any benefit of using the shared tablespace;
6. Set innodb_data_file_path = ibdata1:1G:autoextend. Do not use the default 10M, which hurts badly under highly concurrent transactions;
7. Set innodb_log_file_size = 256M and innodb_log_files_in_group = 2, which basically covers more than 90% of scenarios;
8. Set long_query_time = 1. In versions 5.5 and above it can be set below 1; a value of 0.05 (50 milliseconds) is recommended, so that slow SQL statements are recorded for later analysis and troubleshooting;
9. Adjust max_connections (maximum number of connections) and max_connect_errors (maximum number of connection errors; it is recommended to set it to more than 100,000) according to actual business needs. The parameters open_files_limit, innodb_open_files, table_open_cache, and table_definition_cache can be set to about 10 times max_connections;
10. A common mistake is to set tmp_table_size and max_heap_table_size to relatively large values; I have seen them set to 1G. These two options are allocated per connection session, so do not set them too large, or they will easily cause OOM. Other session-level options such as sort_buffer_size, join_buffer_size, read_buffer_size, and read_rnd_buffer_size should also not be set too large;
11. Since MyISAM is no longer recommended, key_buffer_size can be set to about 32M, and it is strongly recommended to turn off the query cache.
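A my.cnf fragment pulling the above together might look like the following sketch; the buffer pool and connection values are illustrative and should be sized to your own hardware and workload:

```ini
[mysqld]
default-storage-engine = InnoDB
innodb_buffer_pool_size = 8G          # ~50-70% of physical RAM (8G is illustrative)
innodb_flush_log_at_trx_commit = 1    # 1 = safest; 2 or 0 trade durability for speed
sync_binlog = 1                       # pair with the option above per your durability needs
innodb_file_per_table = 1
innodb_data_file_path = ibdata1:1G:autoextend
innodb_log_file_size = 256M
innodb_log_files_in_group = 2
long_query_time = 0.05                # log statements slower than 50 ms (5.5+)
max_connections = 1000                # illustrative; size to your workload
max_connect_errors = 100000
open_files_limit = 10000              # ~10x max_connections, as suggested above
table_open_cache = 10000
table_definition_cache = 10000
query_cache_type = 0                  # query cache off, as recommended
query_cache_size = 0
key_buffer_size = 32M                 # small, since MyISAM is avoided
```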
3.3. Schema design specifications and SQL usage suggestions
Here are some common schema design specifications and SQL usage suggestions that help improve MySQL efficiency:

1. Design all InnoDB tables with a non-business auto-increment column as the primary key. This holds for most scenarios; truly read-only InnoDB tables are rare, and for those, TokuDB is more cost-effective;
2. If the field length meets the requirements, choose the smallest length possible. Also, try to add NOT NULL constraints to columns, which can improve performance to some extent;
3. Avoid TEXT/BLOB types as much as possible. If they are necessary, split them into a separate sub-table instead of keeping them in the main table, to avoid poor read performance with SELECT *;
4. When reading data, select only the required columns; do not always use SELECT *, which can cause serious random-read problems, especially when TEXT/BLOB columns are involved;
5. When creating an index on a VARCHAR(N) column, a prefix index of about 50% (or even less) of its length is usually enough to satisfy more than 80% of queries; there is no need to index the full column;
6. Subqueries usually perform poorly; it is recommended to rewrite them as JOINs;
7. In multi-table queries, the joined columns should have consistent types and all be indexed;
8. In multi-table queries, use the table with the smallest result set (note: the filtered result set, not necessarily the table with the least data) as the driving table;
9. When joining multiple tables and sorting, the sort column must come from the driving table, otherwise the sort cannot use an index;
10. Prefer composite indexes over single-column indexes; in particular, do not create single-column indexes on low-cardinality columns (for example, columns with fewer than 255 distinct values);
11. For paging SQL, it is recommended to locate rows via the primary key first and then join back for the full result set, which is much more efficient.
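The "primary key first" paging pattern in item 11 can be sketched as follows; the table and column names are illustrative:

```sql
-- Naive deep paging: the server reads and discards 100000 rows,
-- including all their (possibly wide) columns
SELECT * FROM orders ORDER BY id LIMIT 100000, 20;

-- Deferred join: first locate the 20 primary keys using only the
-- index, then join back to fetch full rows for just those ids
SELECT o.*
FROM orders o
JOIN (SELECT id FROM orders ORDER BY id LIMIT 100000, 20) AS t
  ON o.id = t.id;
```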
3.4 Other suggestions
Other suggestions for managing and maintaining MySQL:

1. In general, keep a single table under 10GB physical size, under 100 million rows, and under 8KB average row length. With adequate hardware, MySQL can fully handle this amount of data without performance worries; the recommendation exists mainly because ONLINE DDL on larger tables is very costly;
2. Do not worry too much about the mysqld process using a lot of memory, as long as OOM kill does not occur and large amounts of SWAP are not used;
3. In the past, the purpose of running multiple instances on one machine was to make full use of computing resources. If a single instance can already consume most of the computing resources, there is no need to run multiple instances;
4. Regularly use pt-duplicate-key-checker to find and remove duplicate indexes, and pt-index-usage to find and remove rarely used indexes;
5. Regularly collect slow query logs and analyze them with pt-query-digest; you can combine this with the Anemometer system to manage slow queries and drive follow-up optimization;
6. You can use pt-kill to kill SQL requests that run too long; in the Percona version, the innodb_kill_idle_transaction option serves a similar purpose;
7. Use pt-online-schema-change for ONLINE DDL on large tables;
8. Regularly use pt-table-checksum and pt-table-sync to check and repair data differences in MySQL master-slave replication.
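Typical invocations of the Percona Toolkit commands mentioned above look like this sketch; the host, credentials, file paths, and table names are illustrative:

```shell
# Find duplicate or redundant indexes
pt-duplicate-key-checker h=localhost,u=dba --ask-pass

# Summarize the slow query log into a ranked report
pt-query-digest /var/lib/mysql/slow.log > slow_report.txt

# Online DDL on a large table: dry-run first, then rerun with --execute
pt-online-schema-change --alter "ADD COLUMN note VARCHAR(64)" \
  D=mydb,t=orders --dry-run
```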
This optimization reference covers applicable scenarios in most cases. If your application scenario differs from those described in this article, it is recommended to adjust based on the actual situation instead of copying blindly.