Super detailed MySQL usage specification sharing

Super detailed MySQL usage specification sharing

Recently, there have been many database-related operations involved, and the company's existing specifications are not very comprehensive. Based on the relevant specifications of various experts on the Internet, I have compiled some standard usage for my own use. I hope you can correct me.

Database environment

dev: Development environment

Developers can read, write, and modify the table structure. Developers can modify the table structure and the data in it at will, but they need to ensure that it does not affect other development colleagues.

test: test environment

Development can read and write, and developers can modify the table structure through tools.

online: online environment

Developers are not allowed to perform database operations directly in the online environment. If operations are required, they must find a DBA to perform the operations and make corresponding records. Stress testing is prohibited.

The key issue is that the user permissions corresponding to the MySQL servers in each environment must be clearly divided and recognizable, and able to specifically distinguish business scenarios.

Naming conventions

Basic naming rules

  • Use meaningful English words, with words separated by underscores. (Do not use pinyin)
  • Only English letters, numbers, and underscores can be used, and the name must start with an English letter.
  • Libraries, tables, and fields should all be in lowercase. Do not use camel case.
  • Avoid using ORACLE and MySQL reserved words such as desc and keywords such as index.
  • The name cannot exceed 32 characters. The meaning of the name must be clear. It is recommended to use nouns instead of verbs.
  • Databases and tables all use the prefix
  • Temporary database and table names must be prefixed with tmp and suffixed with the date.
  • The backup database and table must be prefixed with bak and suffixed with the date.

Why are all libraries, tables, and fields in lowercase?

In MySQL, databases and tables correspond to directories and files within those directories. Therefore, the sensitivity of the operating system determines the case sensitivity of database and table names.

  • Windows is not case sensitive.
  • Linux case rules
  • Database names and table names are strictly case-sensitive;
  • Table aliases are strictly case-sensitive;
  • Column names and column aliases are case-insensitive in all cases;
  • Variable names are also strictly case-sensitive;
  • How to solve it if camel case naming has been set? You just need to add lower_case_table_names = 1 in the MySQL configuration file my.ini.

Table naming

Tables in the same module should use the same prefix as much as possible, and the table names should be as meaningful as possible. All log tables begin with log_

Field naming

  • An English word or abbreviation that expresses its actual meaning. Boolean fields are prefixed with is_ followed by the past participle of the verb.
  • Fields with the same meaning in each table should have the same name. Fields with the same meaning across tables are named with the table name_field name minus the module prefix.
  • The foreign key field uses table name_field name to indicate its association relationship.
  • The primary key of a table is usually agreed to be id, which is an auto-increment type, and the foreign keys of other tables are expressed in the form of xxx_id.

Index naming

  • Non-unique indexes must be named as "idx_field name_field name[_field name]"
  • Unique indexes must be named as "uniq_field name_field name[_field name]"

Constraint Naming

  • Primary key constraint: pk_table name.
  • Unique constraint: uk_table name_field name. (Uniqueness check logic is also required in the application.)

Table design specifications

The table engine depends on the actual application scenario. MyISAM is recommended for log and report tables, and InnoDB is recommended for tables related to transactions, audits, and amounts. Unless otherwise specified, the innodb engine is used when creating tables.

The default character set is utf8mb4, and the database collation rule is utf8mb4_general_ci. (Since the database definition uses the default, the data table can be redefined, but for the sake of insurance, it is recommended to write

Why does the character set not select utf8 and the collation not use utf8_general_ci?

MySQL using utf8 encoding cannot save Emoji expressions that use 4 bytes as placeholders. In order to make the backend project fully support Emoji expressions input by the client, upgrading the encoding to utf8mb4 is the best solution. If the characterEncoding of the JDBC connection string is set to utf8 or the emoji data cannot be inserted normally after the above configuration, you need to specify the character set of the connection as utf8mb4 in the code.

All tables and fields should use the comment column attribute to describe the true meaning of the table or field. If it is an enumeration value, it is recommended to define the contents used in the field.

Unless otherwise specified, the first id field in the table must be the primary key and automatically increment. It is prohibited to use it as a context or condition for data transfer outside of a transaction. Do not use varchar as the primary key statement design.

Unless otherwise specified, the table must contain the create_time and modify_time fields, that is, the table must contain fields that record the creation time and modification time.

Unless otherwise specified, the table must contain is_del to indicate whether the data has been deleted. In principle, physical deletion of database data is not allowed.

  • Use as little storage space as possible to store the data of a field
  • Don't use char or varchar when you can use int.
  • Don't use int when you can use tinyint
  • Use UNSIGNED to store non-negative values.
  • It is not recommended to use ENUM and SET types. Use TINYINT instead.
  • Use short data types. For example, when the value range is 0-80, use TINYINT UNSIGNED.
  • To store precise floating point numbers, DECIMAL must be used instead of FLOAT and DOUBLE.
  • Time field, except for special cases, uses int to record unix_timestamp
  • To store years, use the YEAR type.
  • To store dates, use the DATE type.
  • It is recommended to use the TIMESTAMP type to store time (accurate to seconds), because TIMESTAMP uses 4 bytes and DATETIME uses 8 bytes.
  • It is recommended to use INT UNSIGNED to store IPV4.
  • Avoid using TEXT and BLOB types as much as possible.
  • It is prohibited to use VARBINARY and BLOB in the database to store images, files, etc. It is recommended to use other storage methods (TFS/SFS). MySQL only saves pointer information.
  • The size of a single record cannot exceed 8k (column length (Chinese)_3(UTF8)+column length (English)_1)

What is the difference between datetime and timestamp?

Similarities:

The display format of a TIMESTAMP column is the same as that of a DATETIME column. The display width is fixed at 19 characters and the format is YYYY-MM-DD HH:MM:SS.

Differences:

TIMESTAMP

  • 4 bytes are used to store the time range: 1970-01-01 08:00:01 ~ 2038-01-19 11:14:07. The value is saved in UTC format, which involves time zone conversion. The current time zone is converted when storing, and converted back to the current time zone when retrieving.
  • datetime is stored in 8 bytes, and the time range is: 1000-01-01 00:00:00 ~ 9999-12-31 23:59:59
  • The actual format is stored, regardless of the time zone

How to use the automatic assignment property of TIMESTAMP?

Use the current time as the default value for ts: ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP. When a row is updated, update the value of ts: ts TIMESTAMP DEFAULT 0 ON UPDATE CURRENT_TIMESTAMP.

You can combine 1 and 2: ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP.

How to use INT UNSIGNED to store ip?

Use INT UNSIGNED instead of char(15) to store IPv4 addresses, and convert them using the MySQL functions inet_ntoa and inet_aton. There is currently no conversion function for IPv6 addresses, and they need to be stored using DECIMAL or two bigINTs.

  • If there is no remark, all fields are set to NOT NULL and default values ​​are set;
  • Do not store plain text passwords in the database
  • If there are no remarks, all Boolean fields, such as is_hot and is_deleted, must have a default value set to 0;
  • If there is no remark, the sorting field order_id is sorted in descending order by default in the program;
  • Do not add length to the integer definition, for example, use INT instead of INT[4]

INT[M], what does the M value represent?

Note that the number after the brackets of the numeric type only indicates the width and has nothing to do with the storage range. Many people think that the value ranges of INT(4) and INT(10) are (-9999 to 9999) and (-9999999999 to 99999999999 respectively. This understanding is wrong. In fact, when the M value in the integer type is used in combination with the ZEROFILL attribute, the column values ​​can be made equal in width. Regardless of the value of M in INT[M], its value range is (-2147483648 to 2147483647 when signed), (0 to 4294967295 when unsigned).

The display width does not restrict the range of values ​​that can be held in the column, nor does it restrict the display of values ​​that exceed the specified width of the column. When used in conjunction with the optional extended attribute ZEROFILL, by default supplementary spaces are replaced by zeros. For example, for a column declared as INT(5) ZEROFILL, the value 4 is retrieved as 00004. Note that if you store a value that exceeds the display width in an integer column, you may encounter problems when MySQL generates temporary tables for complex joins, because in these cases MySQL believes that the data will fit in the original column width. If you specify ZEROFILL for a numeric column, MySQL automatically adds the UNSIGNED attribute to the column.

Using VARBINARY to store case-sensitive variable-length strings

When to use CHAR and when to use VARCHAR?

The CHAR and VARCHAR types are similar, but they are stored and retrieved differently. They also differ in terms of maximum length and whether trailing spaces are preserved. The length declared for CHAR and VARCHAR types indicates the maximum number of characters you want to store. For example, CHAR(30) can occupy 30 characters.

The length of a CHAR column is fixed to the length declared when the table is created. The length can be any value from 0 to 255. When CHAR values ​​are stored, they are padded on the right with spaces to the specified length. When CHAR values ​​are retrieved, trailing spaces are removed. No case conversion is performed during storage or retrieval.

The values ​​in a VARCHAR column are variable-length strings. The length can be specified as a value from 0 to 65,535. (The maximum effective length of a VARCHAR is determined by the maximum row size and the character set used. The overall maximum length is 65,532 bytes). In contrast to CHAR, VARCHAR values ​​are stored with only the required number of characters, plus an additional byte to record the length (two bytes are used if the column's declared length exceeds 255). VARCHAR values ​​are stored without padding. Trailing spaces are preserved when values ​​are saved and retrieved, in compliance with standard SQL.

char is suitable for storing the MD5 hash of the user's password, and its length is always the same. For values ​​that change frequently, char is also better than varchar because fixed-length rows are less likely to be fragmented. For very short columns, char is also more efficient than varchar. A char(1) string takes up only one byte for a single-byte character set, but a varchar(1) string takes up two bytes because one byte is used to store the length information.

Index design specifications

MySQL query speed depends on good index design, so indexes are critical to high performance. A reasonable index will speed up the query (including the speed of UPDATE and DELETE. MySQL will load the page containing the row into memory and then perform the UPDATE or DELETE operation). An unreasonable index will slow down the query. MySQL index search is similar to the pinyin and radical search of the Xinhua Dictionary. When the pinyin and radical indexes do not exist, they can only be searched by turning the pages one by one. When a MySQL query cannot use an index, MySQL will perform a full table scan, which will consume a lot of IO. The purpose of index: deduplication, accelerated positioning, avoidance of sorting, covering index.

What is a covering index?

In the InnoDB storage engine, the secondary index (non-primary key index) does not directly store the row address, but stores the primary key value. If the user needs to query a data column that is not included in the secondary index, the user needs to first find the primary key value through the secondary index, and then query other data columns through the primary key, so the query needs to be performed twice. The concept of a covering index is that the query can be completed in one index. The efficiency of a covering index will be higher, and the primary key query is a natural covering index. Reasonable creation of indexes and reasonable use of query statements can improve performance when covering indexes are used. For example, SELECT email,uid FROM user_email WHERE uid=xx. If uid is not the primary key, you can add an index as index(uid,email) to improve performance.

Basic index specifications

  • Index quantity control: the number of indexes in a single table shall not exceed 5, and the number of fields in a single index shall not exceed 5.
  • Comprehensive evaluation of data density and distribution
  • Consider the query and update ratio

Why can't there be too many indexes in a table?

InnoDB's secondary index uses b+tree for storage, so b+tree needs to be adjusted during UPDATE, DELETE, and INSERT. Too many indexes will slow down the update process.

Use prefix index for strings. The length of prefix index should not exceed 8 characters. It is recommended to give priority to prefix index. If necessary, add pseudo column and create index.

Do not index blob/text fields, and do not index large fields, as this will cause the index to take up too much storage space.

What is a prefix index?

To put it simply, prefix indexing is to create an index for the first few characters of the text (the specific number of characters is specified when creating the index). This creates a smaller index, so queries are faster. Prefix index can effectively reduce the size of index files and improve indexing speed. But prefix indexes also have their disadvantages: MySQL cannot use prefix indexes in ORDER BY or GROUP BY, nor can they be used as covering indexes.

The syntax for creating a prefix index is: ALTER TABLE table_name ADD KEY(column_name(prefix_length));

Primary key criteria

  • The table must have a primary key
  • Do not use frequently updated columns
  • Try not to select string columns
  • Do not use UUID MD5 HASH
  • Use a non-null unique key by default
  • It is recommended to choose auto-increment or number generator

Important SQL must be indexed, and core SQL should be given priority to covering indexes.

  • WHERE condition column for UPDATE and DELETE statements
  • Fields for ORDER BY, GROUP BY, and DISTINCT
  • Fields of multi-table JOIN

The most distinguishing field is placed first

  • Choose fields with better filtering properties and put them first, such as order number, userid, etc. It is generally not recommended to put fields with better filtering properties such as type and status at the first place.
  • The index follows the left prefix principle. When a joint index (a, b, c) is created, the index can only be used when the query condition contains (a) or (a, b) or (a, b, c). When (a, c) is used as a condition, only the index of column a can be used. Therefore, it is necessary to ensure that the number of columns returned by a is not too large, otherwise the statement design will be unreasonable. (b, c) cannot be used.
  • Create joint indexes appropriately (avoid redundancy), (a,b,c) is equivalent to (a), (a,b), (a,b,c)

Index Taboos

  • Do not create indexes on low cardinality columns, such as "Gender"
  • Do not perform mathematical operations or function operations on index columns
  • Don't index frequently used small tables
  • Avoid using foreign keys
  • Foreign keys are used to protect referential integrity and can be implemented on the business side.
  • Operations on the parent table and child table will affect each other, reducing availability
  • INNODB's own restrictions on online DDL

Limitations of indexes in MYSQL

The total length of the MYISAM storage engine index cannot exceed 1000 bytes
BLOB and TEXT type columns can only create prefix indexes
MYSQL currently does not support the use of functional indexes. When inequality (!= or <>) is used, MYSQL cannot use the index.
After using function operations (such as abs (column)) on the filter field, MYSQL cannot use the index.
When the join condition field types in the join statement are inconsistent, MYSQL cannot use the index. When using the LIKE operation, if the condition starts with a wildcard (such as '%abc...'), MYSQL cannot use the index.
When using non-equal value queries, MYSQL cannot use Hash indexes.

Statement design specifications

Using prepared statements

  • Passing only parameters is more efficient than passing SQL statements
  • Parse once, use many times
  • Reduce the probability of SQL injection

Avoid implicit conversions

This will cause the index to fail

Take advantage of prefix indexes

  • Must be the leftmost prefix
  • It is not possible to use two range conditions at the same time
  • Queries that do not use the % leading character, such as like "%ab"

Do not use negative queries, such as not in/like

  • Unable to use index, resulting in full table scan
  • Full table scan leads to reduced buffer pool utilization

Avoid using stored procedures, triggers, UDFs, events, etc.

  • Let the database do what it does best
  • Reduce business coupling to leave room for sharding and sharding
  • Avoiding BUGs

Avoid JOINs with large tables

MySQL is best at single-table primary key/secondary index queries
JOIN consumes more memory and generates temporary tables

Avoid doing math in the database

  • MySQL is not good at mathematical operations and logical judgments
  • Unable to use index

Reduce the number of interactions with the database

  • INSERT ... ON DUPLICATE KEY UPDATE
  • REPLACE INTO, INSERT IGNORE, INSERT INTO VALUES(),(),()
  • UPDATE ... WHERE ID IN (10,20,50,…)

Use paging appropriately

Limit the number of pages displayed in pagination. Only the previous page and the next page can be clicked. Delayed association is used.

How to use paging correctly?

Suppose there is a paging statement like the following: SELECT * FROM table ORDER BY id LIMIT 10000, 10. The way MySQL handles LIMIT OFFSET is to retrieve all the data of OFFSET+LIMIT, then remove OFFSET and return the bottom LIMIT. Therefore, when the OFFSET value is large, MySQL's query performance will be very low. This can be solved by using id > n:

The method of using id > n has limitations. The problem of discontinuous ids can be solved by passing in the last id at the same time when turning the page.

http://example.com/page.php?last=100 
select * from table where id<100 order by id desc limit 10 
//Previous page http://example.com/page.php?first=110 
select * from table where id>110 order by id desc limit 10

The biggest disadvantage of this method is that if there is an insert/delete operation during browsing, the page will not be updated, and the total number of pages may still be calculated based on the new count(*), which may eventually result in some records being inaccessible. To fix this problem, you can continue to introduce the current page number and whether there have been operations such as insertion/deletion that affect the total number of records since the last page turn and cache them.

select * from table where id >= (select id from table order by id limit #offset#, 1)
  • Reject large SQL and split it into small SQL
  • Take advantage of the query cache
  • Take advantage of multi-core CPUs
  • Use in instead of or. The value of in should not exceed 1000.
  • Do not use order by rand()
  • Use EXPLAIN diagnostics to avoid creating temporary tables

The EXPLAIN statement (executed in the MySQL client) can obtain information about how MySQL executes SELECT statements. By executing EXPLAIN on the SELECT statement, you can know whether MySQL uses indexes, full table scans, temporary tables, sorting, and other information when executing the SELECT statement. Try to avoid MySQL performing full table scans, using temporary tables, sorting, etc. See the official documentation for details.

Use union all instead of union

What is the difference between union all and union?

The union and union all keywords both merge two result sets into one, but the two are different in terms of usage and efficiency.

After the union table is linked, the duplicate records will be filtered out, so after the table is linked, the resulting result set will be sorted, the duplicate records will be deleted, and then the result will be returned. like:

select * from test_union1 
union select * from test_union2

When this SQL is run, it first retrieves the results of the two tables, then uses the sorting space to sort and delete duplicate records, and finally returns the result set. If the amount of data in the table is large, it may result in disk sorting.

Union all simply merges the two results and returns them. In this way, if there is duplicate data in the two returned result sets, the returned result set will contain the duplicate data.

In terms of efficiency, union all is much faster than union, so if you can confirm that the two merged result sets do not contain duplicate data, then use union all, as follows:

select * from test_union1 union all select * from test_union2
  • The program should have a mechanism to capture SQL exceptions
  • Prevent a single SQL statement from updating multiple tables simultaneously
  • Do not use select *. The SELECT statement only retrieves the required fields.
  • Consumes CPU and IO, consumes network bandwidth
  • Covering indexes cannot be used
  • Reduce the impact of table structure changes
  • Because it is large, select/join may generate a temporary table
  • UPDATE and DELETE statements do not use LIMIT
  • The INSERT statement must explicitly specify the field name, and do not use INSERT INTO table()
  • INSERT statements are submitted using batches (INSERT INTO table VALUES(),(),()…), and the number of values ​​does not exceed 500.
  • When counting the number of records in a table, use COUNT(*) instead of COUNT(primary_key) and COUNT(1) Note: This is only applicable to MyISAM.
  • For data update, it is recommended to use secondary index to first query the primary key, and then update the data based on the primary key.
  • Cross-database query is prohibited
  • Subqueries are prohibited. It is recommended to convert subqueries into associated queries.
  • For the program processing of varchar type fields, please verify the user input and do not exceed its preset length;

Table Specifications

If the data volume of a single table exceeds 5 million or the data capacity exceeds 10G within one to two years, consider splitting the table. It is necessary to consider historical data migration or application self-deletion of historical data in advance. You can use equal and balanced splitting or splitting according to business rules. The data tables to be split must discuss the splitting strategy with the DBA

  • Use HASH to scatter the table, use a decimal number as the suffix of the table name, and the subscript starts from 0
  • The table divided by date and time must comply with the format of YYYY[MM][dd][HH]
  • Adopt appropriate database and table sharding strategies. For example, 1000 libraries and 10 tables, 100 libraries and 100 tables, etc.
  • It is forbidden to use partitioned tables. Partitioned tables have strict requirements on partition keys. It becomes more difficult to perform DDL, SHARDING, and single-table recovery on partitioned tables as they become larger.
  • Split large fields and fields with low access frequency to separate hot and cold data

Code of Conduct

  • Importing and exporting data in batches must be notified in advance to the DBA for assistance and observation
  • Prohibit online backend management and statistical queries from the database
  • Application accounts with super permissions are prohibited
  • When a product fails due to a problem not caused by the database, promptly notify the DBA to assist in troubleshooting
  • Promotional activities or new features must be notified to the DBA in advance for traffic evaluation
  • If database data is lost, contact DBA in time for recovery
  • Multiple alter operations on a single table must be combined into one operation
  • Do not store business logic in the MySQL database
  • The database solution selection and design of major projects must be notified to the DBA in advance
  • For particularly important databases and tables, communicate with the DBA in advance to determine maintenance and backup priorities
  • Do not update or query database in batches during peak business hours Other specifications
  • When submitting online table creation and modification requirements, all relevant SQL statements must be specified in detail

Other specifications

It is not recommended to store log data on MySQL. Hbase or OceanBase should be given priority. If storage is required, please ask the DBA to evaluate the use of compressed tables for storage.

The above is the details of the super detailed MySQL usage specifications. For more information about MySQL usage specifications, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • Summary of MySQL usage specifications
  • Summary of MySQL database usage specifications
  • An experienced person will show you how to develop a professional and standardized MySQL startup script
  • Summary of MySQL development standards and usage skills
  • MySQL database development specifications [recommended]
  • MySQL database naming standards and conventions
  • Detailed explanation of Mysql table creation and index usage specifications
  • MYSQL database naming and design specifications
  • Professional MySQL development design specifications and SQL writing specifications

<<:  OpenLayers realizes the method of aggregate display of point feature layers

>>:  Use of Linux xargs command

Recommend

MySQL high concurrency method to generate unique order number

Preface After this blog post was published, some ...

How to solve the error of connecting to the database when ServerManager starts

Servermanager startup connection database error R...

JavaScript Advanced Programming: Variables and Scope

Table of contents 1. Original value and reference...

Solve the cross-domain problem of get and post requests of vue $http

Vue $http get and post request cross-domain probl...

MySQL InnoDB MRR Optimization Guide

Preface MRR is the abbreviation of Multi-Range Re...

How MLSQL Stack makes stream debugging easier

Preface A classmate is investigating MLSQL Stack&...

Use of Linux cal command

1. Command Introduction The cal (calendar) comman...

Share the pitfalls of MySQL's current_timestamp and their solutions

Table of contents MySQL's current_timestamp p...

Solve the problem of no my.cnf file in /etc when installing mysql on Linux

Today I wanted to change the mysql port, but I fo...

Linux system disk formatting and manually adding swap partition

Windows: Support NTFS, FAT Linux supports file fo...

Implementing custom scroll bar with native js

This article example shares the specific code of ...

Detailed analysis of MySQL optimization of like and = performance

introduction Most people who have used databases ...

How to change the CentOS server time to Beijing time

1. I purchased a VPS and CentOS system, and found...