1.1 Data Type Overview The data type is a field constraint that limits what kind of data each field can store, how much data it can store, and the format in which it can be stored. MySQL/MariaDB generally has five types of data types: integer, floating point, string, date and time, and special ENUM and SET types. The meaning, limitations and related descriptions of these five data types are shown in the figure below: For the number of bytes occupied by each data type, see the official MariaDB manual. 1.2 Storage mechanism and operation mode The reason why the data type can limit the data storage length of a field is that the address space is strictly delimited in the memory when the table is created. The length of the address space can store as many bytes of data as required. Of course, this is a very rough concept. For more specific storage methods, see the description below. There are two ways to limit the range of data types: one is to strictly limit the space, and only as much data as the allocated space can be stored, and the data exceeding the limit will be cut off; the other is to use extra byte bits to mark whether the bytes in a certain address space store data. If data is stored, it will be marked, and if not, it will not be marked. 1.2.1 Integer storage Here we mainly explain the storage method of integers. As for the storage method of floating-point data types, there are too many things to consider. For integer data types, it strictly limits the space, but it is different from characters because the 0 and 1 on the bit position of each divided byte can directly calculate the value, so its range is calculated according to the number of bit values. A byte has 8 bits, and these 8 bits can form 2^8=256 values. Similarly, 2 bytes have a total of 2^16=65536 values. A 4-byte int occupies 32 bits and can represent a range of 0-2^32. That is to say, numbers between 0 and 255 only take up one byte, and numbers between 256 and 65535 take up two bytes. Note that the integer data type in MySQL/mariadb can use the parameter M, where M is a positive integer, such as INT(M), tinyint(M). The M represents the display length. For example, int(4) means that a 4-digit integer will be displayed when output. If the number of digits of the actual value is less than the display value width, spaces are used to fill in the left side by default. When the number of digits exceeds the limit, the displayed result will not be affected. Generally, this function will be used with the zerofill attribute to fill in 0 instead of spaces, but after using zerofill, the column will automatically become an unsigned field. For example: CREATE TABLE test3(id INT(2) ZEROFILL NOT NULL); INSERT INTO test3 VALUES(1),(2),(11),(111); SELECT id FROM test3; +-----+ |id| +-----+ | 01 | | 02 | | 11 | | 111 | +-----+ 4 rows in set (0.00 sec) The only thing you need to note is that the display width only affects the display effect and does not affect any operations such as storage, comparison, length calculation, etc. 1.2.2 Storage of character types This mainly explains the storage methods and differences between char and varchar. The char type is often called a "fixed-length string type". It strictly limits the length of the space, but it limits the number of characters rather than the number of bytes. However, in previous versions, the number of bytes was limited. Therefore, char(M) strictly stores M characters, the insufficient part is padded with spaces, and the part exceeding M characters is directly truncated. Since the char type has the ability to "fill with spaces if it is too short", in order to reflect the authenticity of the data, the trailing space part will be automatically deleted when retrieving data from the address space. This is a special feature of char. Even the trailing spaces we store manually will be considered to be automatically filled and thus deleted during retrieval. That is to say, in the where statement, name='gaoxiaofang ' and name='gaoxiaofang' have the same result. For example: create table test2(a char(4) charset utf8mb4); insert into test2 values('Congratulations'),('Congratulations on your successful promotion'),('hello'),('he '); select concat(a,'x') from test2; +---------------+ | concat(a,'x') | +---------------+ | Congratulations x | | Congratulations on becoming x | | hellx | | hex | +---------------+ 4 rows in set From the above results, you can see that char(4) can only store 4 characters and remove the trailing spaces. VARCHAR is often called a "variable-length string type". When storing data, it uses extra byte bits to mark whether a byte stores data. Each byte (not character) stored takes up one bit for recording, so one extra byte can mark a total of 256 bytes, and two extra bytes can mark a total of 65536 bytes. But MySQL/mariadb limits the maximum storage capacity to 65536 bytes. This means that if it is a single-byte character, it can store up to 65536 characters. If it is a multi-byte character, such as UTF8, each character occupies 3 bytes, it can store up to 65536/3=21845 utf8 characters. Therefore, when storing varchar(M), in addition to the actual data space length, the bit length of 1 or 2 bytes must be calculated. That is, for single-byte characters, the actual space occupied is M+1 or M+2 bytes, and for multi-byte characters (such as 3 bytes), the actual space occupied is M*3+1 or M*3+2 bytes. Since varchar storage requires an extra bit to record each byte, shorter data will not be automatically padded, so the explicitly stored trailing spaces will also be stored and marked on the bit, which means that the trailing spaces will not be deleted. Like char(M), when varchar(2) is specified, only two-byte characters can be stored. If the number exceeds the limit, the characters are cut off. Regarding char, varchar, and text string types, they do not consider trailing spaces when comparing, but spaces are considered when doing like matching or regular matching because the characters are accurate when matching. For example: create table test4(a char(4),b varchar(5)); insert into test4 values('ab ','ab '); select a='ab ',b='ab ',a=b from test4; +-----------+--------------+-----+ | a='ab ' | b='ab ' | a=b | +-----------+--------------+-----+ | 1 | 1 | 1 | +-----------+--------------+-----+ 1 row in set select a like 'ab ' from test4; +-------------------+ | a like 'ab ' | +-------------------+ | 0 | +-------------------+ 1 row in set Finally, it should be noted that when storing (or loading into memory), storing values as numeric types saves more space than storing them as character types or date and time types. Because integer values are stored directly through bit calculations, any integer between 0-255 only occupies one byte, and any integer between 256-65535 occupies two bytes. When it occupies four bytes, it can represent any one of billions of integers. This obviously saves much more space than character storage where each character occupies one byte. For example, the value "100" takes up three bytes when stored as a character type, but only one byte when stored as a numeric type. Therefore, the database treats values not enclosed in quotes as numeric values by default. If you explicitly want to store them as character or date and time types, you should use quotes to avoid ambiguity. 1.2.3 Storage of date and time types Date and time data must be stored in quotation marks to avoid ambiguity with numeric data. The input method of date and time is very loose. The following methods are allowed: any allowed separator, and it is recommended to use a 4-digit year. 20110101 2011-01-01 18:40:20 2011/01/01 18-40-20 20110101184020 1.2.4 ENUM Data Type The ENUM data type is an enumeration type. The definition method is ENUM ('value1', 'value2', 'value3', ...). When inserting data into a field of this type, only one of the values or NULL can be inserted. When inserting other values or null (that is, ''), they will be truncated to empty data. When storing, case is ignored (it will be converted to characters in ENUM) and trailing spaces are truncated. mysql> create table test6(id int auto_increment primary key,name char(20),gender enum('Mail','f')); mysql> insert into test6(name,gender) values('malongshuai','Mail'),('gaoxiaofang','F'),('wugui','x'),('tuner',null),('woniu',''); Query OK, 5 rows affected Records: 5 Duplicates: 0 Warnings: 2 mysql> show warnings; +---------+------+---------------------------------------------+ | Level | Code | Message | +---------+------+---------------------------------------------+ | Warning | 1265 | Data truncated for column 'gender' at row 3 | | Warning | 1265 | Data truncated for column 'gender' at row 5 | +---------+------+---------------------------------------------+ 2 rows in set mysql> select * from test6; +----+-------------+--------+ | id | name | gender | +----+-------------+--------+ | 1 | malongshuai | Mail | | 2 | gaoxiaofang | f | | 3 | wugui | | | 4 | tuner | NULL | | 5 | woniu | | +----+-------------+--------+ 5 rows in set ENUM type data is stored through index values. Compared with string type, it only needs 1 or 2 bytes to store. Theoretically, when the number of values is less than 256, only one byte is needed, and when the number exceeds 256 but is less than 65536, 2 bytes are used for storage. MySQL/MariaDB is limited to storing a maximum of 65536 values. Of course, this is a theoretical limitation. There are many factors to consider in actual storage. For example, NULL also occupies bits, so in actual storage, 250 values may require 2 bytes. Each value of ENUM is numbered by an index number. Whether searching or operating the field, the index value is used for operation. value1's index = 1, value2's index = 2, and so on. But please note that there are two special index values: index = NULL for NULL values and index = 0 for empty data. For example, ENUM('a','b','c'), when you insert '','b','a','c',NULL,'xxx' into the field in sequence, since the first and last ones will be truncated to empty data, their indexes are 0, the index of the inserted NULL is NULL, and the index values of the inserted 'b','a','c' are 2,1,3 respectively. So the correspondence between index number and value is:
Use ENUM index to retrieve data: mysql> select * from test6 where gender=2; +----+-------------+--------+ | id | name | gender | +----+-------------+--------+ | 2 | gaoxiaofang | f | +----+-------------+--------+ 1 row in set It is especially recommended not to use ENUM to store numerical values, because whether it is sorting, retrieval or other operations, they are all based on the index value as a condition, which can easily lead to misunderstanding. For example, the following uses ENUM to store two values and then perform retrieval and sorting operations. mysql> create table test7(id enum('3','1','2')); mysql> insert into test7 values('1'),('2'),('3'); # When searching, id=2, but the result is 1, because id=2's 2 is the index value of enum, and the value of index=2 in enum is 1 mysql> select * from test7 where id=2; +----+ |id| +----+ | 1 | +----+ 1 row in set # When sorting by id, it is also sorted by index size mysql> select * from test7 order by id asc; +----+ |id| +----+ | 3 | | 1 | | 2 | +----+ 3 rows in set Therefore, it is strongly recommended not to store values in ENUM, as even floating-point values can easily be ambiguous. 1.2.5 SET Data Type For SET type, similar to enum, it is case-insensitive, trailing spaces are deleted when storing, and null is also a valid value. But the difference is that multiple given values can be combined. For example, set('a','b','c','d') can store 'a, b', 'd, b', etc. Multiple members are separated by commas. Therefore, when using multiple members, commas cannot appear in the values of the members themselves. If the value to be stored is not in the set list, it is truncated to a null value. The space occupied by the SET data type is related to the number of SET members M, which is calculated as (M+7)/8 rounded up. So: 1-8 members occupy 1 byte; Members 9-16 occupy 2 bytes; Members 17-24 occupy 3 bytes; Members 25-32 occupy 4 bytes; Members 33-64 take up 8 bytes. MySQL/MariaDB is limited to a maximum of 64 members. When storing data of the SET data type, duplicate members are ignored and stored in the order in which they are enumerated. For example, set('b','b','a'), the result of storing 'a,b,a' and 'b,a,b' is 'b,a'. mysql> create table test8(a set('d','b','a')); mysql> insert into test8 values('b,b,a'),('b,a,b'),('bab'); Query OK, 3 rows affected Records: 3 Duplicates: 0 Warnings: 1 mysql> select * from test8; +-----+ | a | +-----+ | b,a | | b,a | | | +-----+ 3 rows in set Use find_in_set(set_value,set_column_name) to retrieve the row containing the specified set value set_value. For example, to retrieve rows where field a contains member b: mysql> select * from test8 where find_in_set('b',a); +-----+ | a | +-----+ | b,a | | b,a | +-----+ 2 rows in set 1.3 Data type attributes: unsigned The unsigned attribute makes the numeric data unsigned. Using the unsigned attribute will change the range of the numeric data type. For example, the range of the signed tinyint type is -128 to 127, while when using unsigned the range becomes 0 to 255. At the same time, unsigned will also restrict the column from inserting negative values. create table t(a int unsigned,b int unsigned); insert into t select 1,2; insert into t select -1,-2; In the above statements, when executing the second statement to insert a negative number, an error will be reported, indicating that the range is out of range. Using unsigned does have its uses in some cases. For example, a general ID primary key column does not allow negative numbers, which is equivalent to implementing a check constraint. However, using unsigned may sometimes cause unexpected problems: if a negative number is obtained during numerical calculations, an error will be reported. For example, in the table t above, fields a and b are both unsigned columns, and there is a row with a=1, b=2. mysql> select * from t; +---+---+ | a | b | +---+---+ | 1 | 2 | +---+---+ 1 row in set At this time, if ab is calculated, an error will occur. Not only that, but any unsigned column involved in the calculation will result in a negative number. mysql> select ab from t; 1690 - BIGINT UNSIGNED value is out of range in '(`test`.`t`.`a` - `test`.`t`.`b`)' mysql> select a-2 from t; 1690 - BIGINT UNSIGNED value is out of range in '(`test`.`t`.`a` - 2)' If the result of the calculation is not negative, it has no effect. mysql> select 2-a,a*3 from t; +-----+-----+ | 2-a | a*3 | +-----+-----+ | 1 | 3 | +-----+-----+ 1 row in set This is not a bug in MySQL/MariaDB. Unsigned in C language also has similar problems. This problem can be solved by setting set sql_mode='no_unsigned_subtraction' in MySQL/MariaDB. So I personally recommend not to use the unsigned attribute to modify the field. 1.4 Data type attributes: zerofill After zerofill modifies the field, the insufficient field display part will be filled with 0 instead of spaces. When zerofill is enabled, it will be automatically set to unsigned. Zerofill is generally only used after setting the display width of the column. The display width of the columns has been introduced above. mysql> create table t1(id int(4) zerofill); mysql> select * from t1; +-------+ |id| +-------+ | 0001 | | 0002 | | 0011 | | 83838 | +-------+ 4 rows in set (0.00 sec) Zerofill only modifies the display result and does not affect the stored data value. The above (MariaDB) MySQL data type and storage mechanism comprehensive explanation is all the content that the editor shares with you. I hope it can give you a reference. I also hope that you will support 123WORDPRESS.COM. You may also be interested in:
|
<<: How to monitor array changes in Vue
>>: Java+Tomcat environment deployment and installation process diagram
I heard that there is an interview question: How ...
Table of contents 1. Get the first link first 2. ...
Table of contents 1. What are microtasks? 2. What...
Passive Check With passive health checks, NGINX a...
What is HTML? To put it simply: HTML is used to m...
In the horizontal direction, you can set the alig...
type is the control used for input and output in t...
Table of contents 1. What is event delegation? 2....
Table of contents Code: Replenish: Summarize Requ...
Let's first look at the definition of the pos...
docker-compose-monitor.yml version: '2' n...
1. Background Generally, in a data warehouse envi...
Table of contents Creating Arrays in JavaScript U...
Reasons why the 1px line becomes thicker When wor...
The MySQL built-in date function TIMESTAMPDIFF ca...