Learn MySQL in a simple way

Learn MySQL in a simple way

Preface

The database has always been my weak point. Combining my own experience (python+sqlalchemy), I will make a record. I prefer to use ORM. I always feel that spelling SQL is a painful thing (mainly because I am not good at SQL). I have also encountered some database pitfalls in previous maintenance projects, such as encoding problems, loss of floating-point precision, etc., to prevent repeating the pitfalls in the future.

Chapter 1: Using Help

Use the mysql built-in help command

  1. msyql> ? data types types
  2. mysql> ? int
  3. mysql> ? create table

Chapter 2: Selection of table type (storage engine)

The two most commonly used engines are:

1. Myisam is the default storage engine of MySQL. When create creates a new table and does not specify the storage engine of the new table, Myisam is used by default. Each MyISAM is stored in three files on disk. The file names are the same as the table names, and the extensions are .frm (to store table definition), .MYD (MYData, to store data), and .MYI (MYIndex, to store indexes). Data files and index files can be placed in different directories to evenly distribute IO and achieve faster speed.

2. The InnoDB storage engine provides transaction security with commit, rollback, and crash recovery capabilities. However, compared with the MyISAM storage engine, InnoDB writes less efficiently and takes up more disk space to retain data and indexes.

Common environment:

1. MyISAM: The default MySQL plug-in storage engine, which is one of the most commonly used storage engines in Web, data warehousing and other application environments

2. InnoDB: Used for transaction processing applications, with many features, including ACID transaction support.

Chapter 3: Choosing the right data type

First, select the appropriate storage engine and determine the appropriate data type based on the specified storage engine.

  • MyISAM: It is best to use fixed-length data columns instead of variable-length data columns.
  • InnoDB: varchar is recommended

Some data types to note:

1. char and varchar: The storage and retrieval methods are different, and the maximum length and whether the trailing space is retained are also different. char has a fixed length. If the length is insufficient, it will be padded with spaces. If PAD_CHAR_TO_FULL_LENGTH is not set when getting, the trailing spaces will be removed by default.
varchar variable-length string, trailing spaces are retained during retrieval. Note that the query is case-insensitive. If you use sqlalchemy to be case-sensitive, do not use func.binary function.

2. Text and blob: text and blob are updated or deleted in large quantities, large "holes" will be left. It is recommended to use the OPTIMIZE TABLE function to defragment such tables regularly. Avoid retrieving large blob or text values. Separate text and blob columns into separate tables.

3. Floating point number float and fixed point number decimal:

Note a few points:

1. Although floating-point numbers can represent a larger data range, they have error problems.

2. For issues that are sensitive to precision, such as currency, fixed-point storage should be used. I encountered some problems in previous projects, and ended up having to solve them by zooming in and out, which was ugly.

3. When programming, if you encounter floating point numbers, pay attention to the error problem and try to avoid floating point comparison (comparing floating point numbers requires the difference to be less than a specific precision). In Python 3.5, you can compare like this: float_eq = partial(math.isclose, rel_tol=1e-09, abs_tol=0.0)

4. Pay attention to the processing of some special values ​​in floating-point numbers.

Chapter 4: Character Sets

It is important to choose the right character set at the beginning, otherwise it will be very costly to change it later. Character set is a long-standing problem in Python 2, which confuses many novices. The project I maintained before used the default latin1 character set of msyql, which resulted in the need to manually encode the string into utf8 every time it was written. Recently, I used python3.5+flask to do projects and used utf8 directly, and I never encountered encoding problems again:

  • Create a database using utf8, CREATE DATABASE IF NOT EXISTS my_db default charset utf8 COLLATE utf8_general_ci;
  • The sqlalchemy connection url uses mysql://root:[email protected]:3306/my_db?charset=utf8. Don't worry about the garbled code.

Chapter 5: Design and use of indexes

All MySQL column types can be indexed, and using indexes on relevant columns is the best way to improve the performance of select operations. Principles of index design:

1. The index column being searched is not necessarily the column to be selected. The most suitable columns for indexing are those that appear in the where clause or are specified in a join clause, rather than those that appear in the select list following the select keyword.

2. Use a unique index. Indexing works well for columns with unique values, but poorly for columns with multiple duplicate values.

3. Use short indexes. If you index a string column, you should specify a prefix length whenever possible.

4. Use the leftmost prefix. When you create an n-column index, you actually create n indexes that MySQL can use. A multicolumn index can function as several indexes because the leftmost set of columns in the index can be used to match rows; such a set of columns becomes a leftmost prefix.

5. Don’t over-index. Indexes waste disk space and reduce write performance.

6. Consider the types of comparisons performed on the columns.

Chapter 6: Locking Mechanism and Transaction Control

The InnoDB engine provides row-level locks, supports two lock modes: shared locks and exclusive locks, and four different isolation levels. MySQL supports local transactions through statements such as AUTOCOMIT, START TRANSACTIONS, COMMIT, and ROLLBACK.

Chapter 7: Security Issues in SQL

SQL injection: Utilize the external interface of some databases to insert user data into the actual database operation language (sql), thereby achieving the purpose of invading the database or even the operating system. The main reason is that the program does not strictly filter the data input by the user, resulting in the execution of illegal database query statements. Preventive measures:

  1. prepareStatement = Bind-variable , do not use concatenated sql
  2. Using application-supplied conversion functions
  3. Custom function validation (form validation, etc.)

Chapter 8: SQL Mode and Related Issues

Change the default MySQL execution mode. For example, if an insert or update is incorrect under strict mode, MySQL will give an error and abandon the operation. set session sql_mode='STRICT_TRANS_TABLES' . Setting sql_mode requires application personnel to weigh various pros and cons and make an appropriate choice.

Chapter 9: Common SQL Skills

  1. Retrieve the row containing the maximum/minimum value: MAX([DISTINCE] expr), MIN([DISTINCE] expr)
  2. How to use rand()/rand(n) to extract random rows
  3. Use group by and with rollup clauses to do statistics
  4. Use bit group functions for statistics

Chapter 10: Other issues that require attention

Case sensitivity of database and table names: Different platforms and systems have different case sensitivity. The recommendation is to always use lowercase names.
Things to note when using foreign keys: InnoDB in MySQL supports checking of external keyword constraints.

Chapter 11: SQL Optimization

General steps to optimize SQL:

1. Use show status and application characteristics to understand the execution frequency of various SQL statements and the approximate execution ratio of various SQL statements. For example, the InnoDB parameter Innode_rows_read queries the number of rows returned, Innodb_rows_inserted executes the number of rows inserted by insert, and Innodb_rows_updated executes the number of rows updated. There are also several parameters: Connections attempts to connect to the MySQL server, Uptime server working time, Slow_queries number of slow queries.

2. Locate SQL statements with low execution efficiency. There are two ways: one is to locate statements with low execution efficiency through the slow query log. When started with the --log-slow-queries[=file_name] option, mysqld writes a log file containing all SQL statements whose execution time exceeds long_query_time seconds. The other is to use show processlist to view the current threads in MySQL, including the thread status, whether the table is locked, etc. You can view the SQL execution status in real time and optimize some table locking operations.

3. Analyze the execution plan of inefficient SQL through EXPLAIN: EXPLAIN can tell you when you must index the table to get a faster SELECT that uses the index to find records. The following is an explanation of the results obtained after EXPLAIN is executed:

  • select_type: select type
  • table: table of the output result set
  • type: Indicates the connection type of the table. When there is only one row in the table and the value of type is system, it is the best connection type; when an index is used to connect tables in a select operation, the value of type is ref; when the select table connection does not use an index, the value of type is often seen as ALL, indicating that the table is fully scanned. In this case, you need to consider improving the efficiency of table connection by creating an index.
  • possible_keys: indicates the index columns that can be used when querying.
  • key: indicates the index to be used
  • key_len: index length
  • rows: scan range
  • Extra: Notes and description of the implementation

4. Identify the problem and take corresponding optimization measures.

Indexing issues

  1. Index storage classification: The data files and index files of the myisam table are automatically separated, and the data and indexes of innodb are placed in the same table space. The index storage type of MyISAM and InnoDB is btree
  2. How MySQL uses indexes: Indexes are used to quickly find rows with a specific value in a column. The most important condition for using an index in a query is to use the index keyword in the query condition. If it is a multi-column index, the index can only be used if the leftmost prefix of the multi-column keyword is used in the query condition. Otherwise, the index cannot be used.
  3. Check the usage of the index: The value of Handler_read_key represents the number of times a row is indexed. A low value means that the index is not used frequently. A high Handler_read_rnd_next value means that queries are running inefficiently and should be remedied by creating indexes. show status like 'Handler_read%';

Two simple and practical optimization methods

  • Periodic analysis tables: ANALYZE TABLE, CHECK TABLE, CHECKSUM TABLE
  • Use OPTIMIZE table;

Optimization from the client (code side) perspective

  1. Use persistent connections to the database to avoid connection overhead. In the code, we generally use connection pool
  2. Check that all interludes use the necessary indexes.
  3. Avoid executing complex select queries on frequently updated tables to avoid problems related to locking tables due to read-write conflicts.
  4. Take advantage of default values ​​and insert values ​​explicitly only if they differ from the default. This reduces the amount of syntax analysis that MySQL needs to do, thereby increasing the insertion speed.
  5. Read-write separation improves performance
  6. Try not to use auto-increment variables in table fields to prevent the auto-increment of the field from affecting efficiency in high concurrency situations. It is recommended to implement auto-increment of fields through applications.

Chapter 12: Optimizing Database Objects

Optimize table data type: PROCEDURE ANALYZE() makes optimization suggestions based on the current table type. In practice, statistical information can be combined with actual application optimization.

Improve table access efficiency by splitting: The splitting here is mainly for MyISAM type tables.

  • Vertical splitting: Split the frequently accessed fields and infrequently accessed fields in the table into two tables according to the frequency of application access. Frequently accessed fields should be of fixed length as much as possible.
  • Horizontal splitting: According to the application situation, the data can be split horizontally into several tables or divided into multiple partitions through partitioning. This can effectively avoid the lock problem caused by reading and updating the MyISAM table.

Denormalization: Normalization design emphasizes independence and minimizes data redundancy. More redundancy means more physical space is occupied and also brings problems to data maintenance and consistency checks. Appropriate redundancy can reduce multi-table access and significantly improve query efficiency. In this case, you can consider improving efficiency through appropriate redundancy.

Use redundant statistics tables: Use create temporary table for statistical analysis

Choose a more appropriate table type: 1. If the application has serious lock conflicts, please consider whether to deliberately change the storage engine to InnoDB. The row lock mechanism can effectively reduce the occurrence of lock conflicts. 2. If the application has a lot of query operations and does not have strict requirements on transaction integrity, you can consider using Myisam.

Chapter 13: Locking Problems

Waiting for locks: table_locks_waited and table_locks_immediate status variables to analyze table lock contention on the system. Check Innode_row_lock to analyze row lock contention.

Chapter 14: Optimizing MySQL Server

View the current parameters of MySQL Server

  1. View the default values ​​of server parameters: mysqld --verbose --help
  2. To view the actual values ​​of server parameters: shell> mysqladmin variables or mysql> SHOW VARIABLES
  3. View the server running status value: mysqladmin extended-status or mysql>SHOW STATUS

Important parameters affecting MySQL performance

  1. key_buffer_size: key cache
  2. table_cache: The number of caches opened in the database
  3. innode_buffer_pool_size: The size of the memory buffer for caching InnoDB data and indexes
  4. innodb_flush_log_at_trx_commit: It is recommended to set it to 1. When each transaction is committed, the log buffer is written to the log file, and the log file is refreshed for disk operations.

Chapter 15: I/O Issues

Disk seeks are a huge performance bottleneck.

  1. Distribute I/O using disk arrays or virtual file volumes
  2. Distributing I/O using Symbolic Links

Chapter 16: Application Optimization

  1. Use connection pool: Establishing a connection is relatively costly, and establishing a connection pool can improve access performance.
  2. Reduce access to MySQL: 1. Avoid repeated retrieval of the same data. 2 Using MySQL query cache
  3. Add cache layer
  4. Load balancing: 1. Use MySQL to replicate and distribute query operations. 2 Distributed Database Architecture

Summarize

The above is the relevant content about MySQL. I hope that the content of this article can be of some help to everyone in learning or using MySQL. If you have any questions, you can leave a message to communicate.

You may also be interested in:
  • MySQL installation diagram MySQL graphic installation tutorial (detailed instructions)
  • Summary of MySQL date data type and time type usage
  • MySQL user creation and authorization method
  • Usage of replace in mySQL
  • Detailed explanation of mysql stored procedure
  • Detailed explanation of MYSQL import and export commands
  • Completely uninstall mysql (stop service, uninstall related programs, delete registry
  • Detailed explanation of MySQL trigger usage
  • MySQL Error Codes
  • MySql query time period method

<<:  How to handle images in Vue forms

>>:  Solution to the problem of slow docker pull image speed

Recommend

MySQL batch adding and storing method examples

When logging in to the stress test, many differen...

Solution to the automatic stop of MySQL service

This article mainly introduces the solution to th...

Using CSS to implement loading animation of Android system

There are two common loading icons on the web, on...

The whole process of developing a Google plug-in with vue+element

Simple function: Click the plug-in icon in the up...

New ideas for time formatting in JavaScript toLocaleString()

Table of contents 1. Conventional ideas for time ...

Basic Implementation of AOP Programming in JavaScript

Introduction to AOP The main function of AOP (Asp...

Summary of some tips on MySQL index knowledge

Table of contents 1. Basic knowledge of indexing ...

Web page image optimization tools and usage tips sharing

As a basic element of a web page, images are one ...

Solution to overflow of html table

If the table is wide, it may overflow. For exampl...

React encapsulates the global bullet box method

This article example shares the specific code of ...

Briefly talk about mysql left join inner join

Preface I have been busy developing a cold chain ...

Detailed explanation of flex layout in CSS

Flex layout is also called elastic layout. Any co...

Detailed explanation of four solutions to floating problems in CSS layout

1. Cause: The effect after the subbox is set to f...