A mature database architecture is not designed with high availability, high scalability and other features from the beginning. It is only as the number of users increases that the infrastructure is gradually improved. This article mainly discusses the problems and optimization solutions faced by MySQL database in its development cycle. Putting aside the front-end application, it can be roughly divided into the following five stages: Phase 1: Database table design After the project is approved, the development department will develop the project according to the needs of the product department. Phase 2: Database deployment It's time for the operations engineer to come on board and launch the project. Phase 3: Database performance optimization If MySQL is deployed on an ordinary X86 server, without any optimization, MySQL can theoretically handle about 1500 QPS. After optimization, it may be increased to about 2000 QPS. Otherwise, when the number of visits reaches about 1,500 concurrent connections, the database processing performance may respond slowly, and the hardware resources are relatively abundant. At this time, it is time to consider performance optimization issues. So how can we maximize the performance of the database? It mainly focuses on hardware configuration, database configuration, and architecture, which are specifically divided into the following: 3.1 Hardware Configuration If conditions permit, you must use SSD solid-state drives instead of SAS mechanical hard drives and adjust the RAID level to RAID1+0, which has better read and write performance than RAID1 and RAID5. After all, the pressure on the database mainly comes from disk I/O. 3.2 Database Configuration Optimization There are two most widely used storage engines in MySQL: one is MyISAM, which does not support transaction processing, has fast read performance, and uses table-level locks. The other is InnoDB, which supports transaction processing (ACID properties) and is designed for large data processing and row-level locking. max_connections = 151 # Process the maximum number of connections at the same time. It is recommended to set the maximum number of connections to about 80% of the upper limit of connections. sort_buffer_size = 2M # The buffer size when querying sorting, which only works for order by and group by, is recommended to be increased to 16M open_files_limit = 1024 # Limit the number of open files. If the value viewed by show global status like 'open_files' is equal to or greater than the open_files_limit value, the program will be unable to connect to the database or will freeze. MyISAM parameter default values: key_buffer_size = 16M # Index buffer size, generally set to 30-40% of physical memory read_buffer_size = 128K # Read operation buffer size, it is recommended to set 16M or 32M query_cache_type = ON # Enable query cache function query_cache_limit = 1M # Query cache limit, only query results below 1M will be cached to prevent the result data from overwriting the cache pool query_cache_size = 16M # Check the buffer size, which is used to cache SELECT query results. The next time the same SELECT query is run, the results will be returned directly from the cache pool. You can increase this value by multiples. InnoDB parameter default values: innodb_buffer_pool_size = 128M # Index and data buffer size, it is recommended to set it to about 70% of physical memory innodb_buffer_pool_instances = 1 # Number of buffer pool instances. It is recommended to set 4 or 8 innodb_flush_log_at_trx_commit = 1 # Key parameter. 0 means that the log is written and synchronized to the disk approximately every second. A database failure will result in the loss of about 1 second of transaction data. 1. After each SQL statement is executed, it is written to the log and synchronized to the disk. This has a large I/O overhead. After executing the SQL statement, you have to wait for the log to be read and written, which is inefficient. 2 means that the log is only written to the system cache and then synchronized to the disk every second. This is very efficient. If the server fails, the transaction data will be lost. For those who do not have very high requirements for data security, setting 2 is recommended. It has high performance and the effect is obvious after modification. innodb_file_per_table = OFF # Whether to share the tablespace. The default setting for versions 5.7+ is ON. The idbdata file in the shared tablespace keeps growing, affecting the I/O performance to a certain extent. It is recommended to enable independent tablespace mode. The index and data of each table are stored in their own independent tablespace, which allows a single table to be moved between different databases. innodb_log_buffer_size = 8M # Log buffer size. Since the log is refreshed once a second at most, it usually does not need to exceed 16M. 3.3 System kernel parameter optimization Most MySQL servers are deployed on Linux systems, so some parameters of the operating system will also affect MySQL performance. The following is an appropriate optimization of Linux kernel parameters. net.ipv4.tcp_fin_timeout = 30 #TIME_WAIT timeout, the default is 60s net.ipv4.tcp_tw_reuse = 1 # 1 means enabling reuse, allowing TIME_WAIT sockets to be reused for new TCP connections, 0 means disabling net.ipv4.tcp_tw_recycle = 1 # 1 means enabling fast recycling of TIME_WAIT sockets, 0 means disabling net.ipv4.tcp_max_tw_buckets = 4096 # The system maintains the maximum number of TIME_WAIT sockets. If this number is exceeded, the system will randomly clear some TIME_WAIT and print a warning message net.ipv4.tcp_max_syn_backlog = 4096 # Enter the maximum length of the SYN queue. Increasing the queue length can accommodate more waiting connections. In Linux systems, if the number of file handles opened by a process exceeds the system default value of 1024, a "too many files open" message will be prompted, so the open file handle limit needs to be adjusted. Restart to make it permanent: # vi /etc/security/limits.conf * soft nofile 65535 * hard nofile 65535 Effective immediately for the current user: # ulimit -SHn 65535 Phase 4: Database Architecture Expansion As the business volume grows, the performance of a single database server can no longer meet business needs, and it is time to consider adding a server expansion architecture. The main idea is to decompose the load of a single database, break through the disk I/O performance, store hot data in the cache, and reduce the frequency of disk I/O access. 4.1 Increase cache Add a cache system to the database to cache hot data in memory. If the requested data is in the cache, MySQL will no longer be requested, reducing the database load. There are two types of cache implementation: local cache and distributed cache. Local cache caches data in the local server memory or file. Distributed cache can cache massive amounts of data and has good scalability. The mainstream distributed cache systems are: memcached and redis. Memcached has stable performance, data is cached in memory, and the speed is very fast. The theoretical QPS can reach about 80,000. If you want data persistence, choose redis, the performance is not lower than memcached. 4.2 Master-slave replication and read-write separation In a production environment, business systems usually read more and write less, so a master-slave architecture can be deployed. The master database is responsible for write operations and performs dual-machine hot standby, and multiple slave databases perform load balancing and are responsible for read operations. Mainstream load balancers: LVS, HAProxy, Nginx. 4.3 Database Sharding Database sharding is to separate related tables in the database into different databases according to the business, such as web, bbs, blog and other databases. If the business volume is large, the separated database can be used as a master-slave replication architecture to further avoid excessive pressure on a single database. 4.4 Sub-table The amount of data is increasing dramatically. A table in the database has millions of data, which makes query and insertion take too long. How can we solve the pressure of a single table? You should consider splitting this table into multiple small tables to reduce the pressure on a single table and improve processing efficiency. This method is called table sharding. 4.5 Partitioning Partitioning is to divide the data of a table into multiple blocks according to the fields in the table structure (such as range, list, hash, etc.). These blocks can be on one disk or on different disks. After partitioning, it is still a table on the surface, but the data is hashed in multiple locations. In this way, multiple hard disks can process different requests at the same time, thereby improving the disk I/O read and write performance. Phase 5: Database Maintenance Database maintenance is the job of database engineers or operation and maintenance engineers, including system monitoring, performance analysis, performance tuning, database backup and recovery and other major tasks. 5.1 Key indicators of performance status Technical terms: QPS (Queries Per Second) and TPS (Transactions Per Second)
So, here is the calculation method, calculate QPS based on Questions mysql> show global status like 'Questions'; mysql> show global status like 'Uptime'; QPS = Questions / Uptime Calculate TPS based on Com_commit and Com_rollback: mysql> show global status like 'Com_commit'; mysql> show global status like 'Com_rollback'; mysql> show global status like 'Uptime'; TPS = (Com_commit + Com_rollback) / Uptime Another way to calculate: Calculate QPS based on Com_select, Com_insert, Com_delete, and Com_update: mysql> show global status where Variable_name in('com_select','com_insert','com_delete','com_update'); Wait for 1 second and execute again to get the interval difference. The QPS is the value of each variable in the second time minus the corresponding variable value in the first time. TPS calculation method: mysql> show global status where Variable_name in('com_insert','com_delete','com_update'); When calculating TPS, we do not count the query operations, and only need to calculate the four values of insert, delete, and update. After netizens tested these two calculation methods, it was concluded that when there are many MyISAM tables in the database, using Questions calculation is more accurate. When there are many InnoDB tables in the database, it is more accurate to use Com_* for calculation. 5.2 Enable slow query log MySQL enables slow query logs to analyze which SQL statements are slow. Dynamic enabling is supported: mysql> set global slow-query-log=on # Enable slow query logmysql> set global slow_query_log_file='/var/log/mysql/mysql-slow.log'; # Specify the location of the slow query log filemysql> set global log_queries_not_using_indexes=on; # Record queries that do not use indexesmysql> set global long_query_time=1; # Only record slow queries with processing time of more than 1 second. To analyze slow query logs, you can use the mysqldumpslow tool that comes with MySQL. The analyzed logs are relatively simple. mysqldumpslow -t 3 /var/log/mysql/mysql-slow.log # To view the slowest three queries, you can also use Percona's pt-query-digest tool, which has comprehensive log analysis capabilities and can analyze slow log, binlog, and general log. Analyze the slow query log: pt-query-digest /var/log/mysql/mysql-slow.log Analyze binlog log: mysqlbinlog mysql-bin.000001 >mysql-bin.000001.sql pt-query-digest --type=binlog mysql-bin.000001.sql Analyze general logs: pt-query-digest --type=genlog localhost.log 5.3 Database backup Backing up the database is the most basic task and the most important one, otherwise the consequences will be serious, you know! For high-frequency backup strategies, it is crucial to choose a stable and fast tool. If the database size is less than 2G, it is recommended to use the official logical backup tool mysqldump. If the size is over 2G, it is recommended to use Percona's physical backup tool xtrabackup, otherwise it will be as slow as a snail. Both tools support hot standby under the InnoDB storage engine without affecting business read and write operations. 5.4 Database Repair Sometimes the MySQL server suddenly loses power or shuts down abnormally, which can cause table corruption and make it impossible to read table data. At this time, you can use the two tools that come with MySQL to repair it, myisamchk and mysqlcheck. The former can only repair MyISAM tables and stop the database, while the latter can repair both MyISAM and InnoDB tables online. myisamchk common parameters: -f --force Force repair, overwrite old temporary files, generally not used -r --recover Recovery mode -q --quik Quick recovery -a --analyze Analyze table -o --safe-recover Old recovery mode, if -r cannot repair, you can try this parameter -F --fast Only check tables that are not closed normally For example: myisamchk -r -q *.MYI Common parameters of mysqlcheck: -a --all-databases Check all databases -r --repair Repair table -c --check Check table, default option -a --analyze Analyze table -o --optimize Optimize table -q --quik Quickest check or repair table -F --fast Only check tables that are not closed normally For example: mysqlcheck -r -q -uroot -p123456 weibo 5.5 MySQL Server Performance Analysis Focus: KB_read/s, KB_wrtn/s The amount of data read and written per second, mainly evaluated based on the maximum read and write speed of the disk per second. r/s, w/s: The number of read and write requests per second, which can be understood as IOPS (input and output per second), is one of the main indicators for measuring disk performance. summary Due to the original design limitations of relational databases, they are unable to cope with large data processing. Therefore, NoSQL (non-relational database) has become popular. It is inherently inspiring and has the characteristics of distribution, high performance, and high reliability. It makes up for certain inherent deficiencies of relational databases and is very suitable for storing unstructured data. Mainstream NoSQL databases include: MongoDB, HBase, Cassandra, etc. The improvement of optimization effect at the database level alone is not very obvious. It is mainly necessary to choose the appropriate database according to the business scenario! This is the end of this article about simply understanding MySQL database optimization techniques. For more relevant MySQL database optimization techniques, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope everyone will support 123WORDPRESS.COM in the future! You may also be interested in:
|
<<: Summary of Vue's cross-domain problem handling and solutions
>>: VMware virtual machine three connection methods example analysis
Differences between Docker and Docker Machine Doc...
This article uses an example to describe how to u...
The most common mistake made by many website desi...
Prerequisites To run containers on Windows Server...
1. Download Dependency npm install @antv/data-set...
When using Oracle database for fuzzy query, The c...
1. Enter start in the menu bar and click startup ...
1. Demand The backend provides such data for the ...
Jupyter notebook is configured under the docker c...
The MySQL version used in this example is mysql-8...
1.1 What is MySQL multi-instance? Simply put, MyS...
When developing a web project, you need to instal...
1. SSH remote management SSH is a secure channel ...
Preface The database has always been my weak poin...
Table of contents Preface MySQL master-slave repl...