Detailed explanation of MySQL Innodb storage structure and storage of Null values

Detailed explanation of MySQL Innodb storage structure and storage of Null values

background:

Tablespace: All INNODB data is stored in the tablespace (shared tablespace). If innodb_file_per_table is turned on, the data of each table will be stored in a separate tablespace (exclusive tablespace).
Exclusive tablespaces include: data, index, insert cache, and data dictionary. The shared tablespace includes: Undo information (will not reclaim <physical space>), double write cache information, transaction information, etc.
Segment: It makes up the table space and consists of zones.
Extent: consists of 64 consecutive pages. Each page is 16K, totaling 1M. For large data segments, 4 zones can be applied for each time.
Page: It is the unit of INNODB disk management and consists of rows.
Row: includes transaction ID, rollback pointer, column information, etc.

Purpose 1:
Understand the information of each page of the table space and the storage of overflow row data. Through the tool written by the author of the book, Jiang Chengyao: http://code.google.com/p/david-mysql-tools/source/browse/trunk/py_innodb_page_type/
3 scripts:

py_innodb_page_info.py

View Code 

#!/usr/bin/env python 
#encoding=utf-8
import mylib
from sys import argv
from mylib import myargv

if __name__ == '__main__':
 myargv = myargv(argv)
 if myargv.parse_cmdline() == 0:
  pass
 else:
  mylib.get_innodb_page_type(myargv)

mylib.py

View Code 

encoding=utf-8
import os
import include
from include import *

TABLESPACE_NAME='D:\\mysql_data\\test\\t.ibd'
VARIABLE_FIELD_COUNT = 1
NULL_FIELD_COUNT = 0

class myargv(object):
 def __init__(self, argv):
  self.argv = argv
  self.parms = {}
  self.tablespace = ''

 def parse_cmdline(self):
  argv = self.argv
  if len(argv) == 1:
   print 'Usage: python py_innodb_page_info.py [OPTIONS] tablespace_file'
   print 'For more options, use python py_innodb_page_info.py -h'
   return 0
  while argv:
   if argv[0][0] == '-':
    if argv[0][1] == 'h':
     self.parms[argv[0]] = ''
     argv = argv[1:]
     break
    if argv[0][1] == 'v':
     self.parms[argv[0]] = ''
     argv = argv[1:]
    else:
     self.parms[argv[0]] = argv[1]
     argv = argv[2:]
   else:
    self.tablespace = argv[0]
    argv = argv[1:]
  if self.parms.has_key('-h'):
   print 'Get InnoDB Page Info'
   print 'Usage: python py_innodb_page_info.py [OPTIONS] tablespace_file\n'
   print 'The following options may be given as the first argument:'
   print '-h help '
   print '-o output put the result to file'
   print '-t number thread to anayle the tablespace file'
   print '-v verbose mode'
   return 0
  return 1

def mach_read_from_n(page,start_offset,length):
 ret = page[start_offset:start_offset+length]
 return ret.encode('hex')

def get_innodb_page_type(myargv):
 f = file(myargv.tablespace,'rb')
 fsize = os.path.getsize(f.name)/INNODB_PAGE_SIZE
 ret = {}
 for i in range(fsize):
  page = f.read(INNODB_PAGE_SIZE)
  page_offset = mach_read_from_n(page,FIL_PAGE_OFFSET,4)
  page_type = mach_read_from_n(page,FIL_PAGE_TYPE,2)
  if myargv.parms.has_key('-v'):
   if page_type == '45bf':
    page_level = mach_read_from_n(page,FIL_PAGE_DATA+PAGE_LEVEL,2)
    print "page offset %s, page type <%s>, page level <%s>"%(page_offset,innodb_page_type[page_type],page_level)
   else:
    print "page offset %s, page type <%s>"%(page_offset,innodb_page_type[page_type])
  if not ret.has_key(page_type):
   ret[page_type] = 1
  else:
   ret[page_type] = ret[page_type] + 1
 print "Total number of pages: %d:"%fsize
 for type in ret:
  print "%s: %s"%(innodb_page_type[type],ret[type])

include.py

View Code 

#encoding=utf-8
INNODB_PAGE_SIZE = 16*1024*1024

# Start of the data on the page
FIL_PAGE_DATA = 38


FIL_PAGE_OFFSET = 4 # page offset inside space
FIL_PAGE_TYPE = 24 # File page type

# Types of an undo log segment */
TRX_UNDO_INSERT = 1
TRX_UNDO_UPDATE = 2

# On a page of any file segment, data may be put starting from this offset
FSEG_PAGE_DATA = FIL_PAGE_DATA

# The offset of the undo log page header on pages of the undo log
TRX_UNDO_PAGE_HDR = FSEG_PAGE_DATA

PAGE_LEVEL = 26 #level of the node in an index tree; the leaf level is the level 0 */

innodb_page_type={
 '0000':u'Freshly Allocated Page',
 '0002':u'Undo Log Page',
 '0003':u'File Segment inode',
 '0004':u'Insert Buffer Free List',
 '0005':u'Insert Buffer Bitmap',
 '0006':u'System Page',
 '0007':u'Transaction system Page',
 '0008':u'File Space Header',
 '0009':u'Extended description page',
 '000a':u'Uncompressed BLOB Page',
 '000b':u'1st compressed BLOB Page',
 '000c':u'Subsequent compressed BLOB Page',
 '45bf':u'B-tree Node'
 }

innodb_page_direction={
 '0000': 'Unknown(0x0000)',
 '0001': 'Page Left',
 '0002': 'Page Right',
 '0003': 'Page Same Rec',
 '0004': 'Page Same Page',
 '0005': 'Page No Direction',
 'ffff': 'Unkown2(0xffff)'
}


INNODB_PAGE_SIZE=1024*16 # InnoDB Page 16K

Test 1:

root@localhost : test 02:26:13>create table tt(id int auto_increment,name varchar(10),age int,address varchar(20),primary key (id))engine=innodb;
Query OK, 0 rows affected (0.17 sec)
root@zhoujy:/var/lib/mysql/test# ls -lh tt.ibd 
-rw-rw---- 1 mysql mysql 96K 2012-10-17 14:26 tt.ibd

View ibd:

root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/tt.ibd -v
page offset 00000000, page type <File Space Header>
page offset 00000001, page type <Insert Buffer Bitmap>
page offset 00000002, page type <File Segment inode>
page offset 00000003, page type <B-tree Node>, page level <0000> ---Leaf node page offset 00000000, page type <Freshly Allocated Page>
page offset 00000000, page type <Freshly Allocated Page>
Total number of pages: 6: 
Freshly Allocated Page: 2
Insert Buffer Bitmap: 1
File Space Header: 1
B-tree Node: 1
File Segment inode: 1

explain:
Total number of page: Total number of pages
Freshly Allocated Page: Available page
Insert Buffer Bitmap: Insert cache bitmap page
Insert Buffer Free List: Insert cache free list page
B-tree Node: data page
Uncompressed BLOB Page: Binary large object page, the page that stores overflow rows, that is, the information obtained on the overflow page is that the table initialization size is 96K, which is obtained by Total number of page * 16. 1 data page, 2 free pages.

root@localhost : test 02:42:58>insert into tt values(name,age,address) values('aaa',23,'HZZZ');

Question: Why is there no application area? A zone is 64 consecutive pages, 1M in size. Then the table size should also be at least 1M. But now it's only 96K (default). The reason is that at the beginning of each segment, there are 32 pages of fragmented pages to store data. After they are used up, 64 pages are applied continuously. A maximum of 4 areas can be applied each time to ensure the order of data. It can be seen here that the table size increases by at least 64 pages of space, that is, 1M increase.
verify:
Fill the 32 fragment pages with data, 32*16 = 512K. See if you can apply for a space larger than 1M.

View Code 

root@zhoujy:/home/zhoujy/jiaoben/read_ibd# ls -lh /var/lib/mysql/test/tt.ibd 
-rw-rw---- 1 mysql mysql 576K 2012-10-17 15:30 /var/lib/mysql/test/tt.ibd
root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/tt.ibd -v
page offset 00000000, page type <File Space Header>
page offset 00000001, page type <Insert Buffer Bitmap>
page offset 00000002, page type <File Segment inode>
page offset 00000003, page type <B-tree Node>, page level <0001>
page offset 00000004, page type <B-tree Node>, page level <0000>
page offset 00000005, page type <B-tree Node>, page level <0000>
page offset 00000006, page type <B-tree Node>, page level <0000>
page offset 00000007, page type <B-tree Node>, page level <0000>
page offset 00000008, page type <B-tree Node>, page level <0000>
page offset 00000009, page type <B-tree Node>, page level <0000>
page offset 0000000a, page type <B-tree Node>, page level <0000>
page offset 0000000b, page type <B-tree Node>, page level <0000>
page offset 0000000c, page type <B-tree Node>, page level <0000>
page offset 0000000d, page type <B-tree Node>, page level <0000>
page offset 0000000e, page type <B-tree Node>, page level <0000>
page offset 0000000f, page type <B-tree Node>, page level <0000>
page offset 00000010, page type <B-tree Node>, page level <0000>
page offset 00000011, page type <B-tree Node>, page level <0000>
page offset 00000012, page type <B-tree Node>, page level <0000>
page offset 00000013, page type <B-tree Node>, page level <0000>
page offset 00000014, page type <B-tree Node>, page level <0000>
page offset 00000015, page type <B-tree Node>, page level <0000>
page offset 00000016, page type <B-tree Node>, page level <0000>
page offset 00000017, page type <B-tree Node>, page level <0000>
page offset 00000018, page type <B-tree Node>, page level <0000>
page offset 00000019, page type <B-tree Node>, page level <0000>
page offset 0000001a, page type <B-tree Node>, page level <0000>
page offset 0000001b, page type <B-tree Node>, page level <0000>
page offset 0000001c, page type <B-tree Node>, page level <0000>
page offset 0000001d, page type <B-tree Node>, page level <0000>
page offset 0000001e, page type <B-tree Node>, page level <0000>
page offset 0000001f, page type <B-tree Node>, page level <0000>
page offset 00000020, page type <B-tree Node>, page level <0000>
page offset 00000021, page type <B-tree Node>, page level <0000>
page offset 00000022, page type <B-tree Node>, page level <0000>
page offset 00000023, page type <B-tree Node>, page level <0000>
Total number of pages: 36:
Insert Buffer Bitmap: 1
File Space Header: 1
B-tree Node: 33
File Segment inode: 1

"Extra" pages: 4
page offset 00000000, page type <File Space Header> : file header space page
page offset 00000001, page type <Insert Buffer Bitmap>: Insert buffer bitmap page
page offset 00000002, page type <File Segment inode>: file segment node
page offset 00000003, page type <B-tree Node>, page level <0001>: root page fragmentation pages: 32
page type <B-tree Node>, page level <0000>
There are 36 pages in total. The origin of the ibd size of 576K is: 32*16=512K (fragmented pages) + 4*16=64 (extra pages). If you want to insert more, you should apply for at least 1M pages:

root@zhoujy:/home/zhoujy/jiaoben/read_ibd# ls -lh /var/lib/mysql/test/tt.ibd 
-rw-rw---- 1 mysql mysql 2.0M 2012-10-17 16:10 /var/lib/mysql/test/tt.ibd
root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/tt.ibd
Total number of pages: 128:
Freshly Allocated Page: 91
Insert Buffer Bitmap: 1
File Space Header: 1
B-tree Node: 34
File Segment inode: 1

The page size jumps from 36 to 128. Because 32 fragmented pages have been used up, new pages will use the zone method to apply for space. The fact that there are many available pages in the information just proves this point.

▲ Overflow row data storage: The INNODB storage engine is index-organized, that is, there are at least two rows of records in each page. Therefore, if only one row of records can be stored in a page, INNODB will automatically put the row data in the overflow page. When an overflow row occurs, the actual data is stored in the BLOB page, and the data page only stores the first 768 bytes of the data (old file format). The new file format (Barracuda) uses a complete row overflow method, in which the data page only stores a 20-byte pointer, and the BLOB also stores all the data. How to check if there is overflow row data in the table?

root@localhost : test 04:52:34>create table t1 (id int,name varchar(10),memo varchar(8000))engine = innodb default charset utf8;
Query OK, 0 rows affected (0.16 sec)

root@localhost : test 04:53:10>insert into t1 values(1,'zjy',repeat('我',8000));
Query OK, 1 row affected (0.00 sec)

View ibd:

root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/t1.ibd -v
page offset 00000000, page type <File Space Header>
page offset 00000001, page type <Insert Buffer Bitmap>
page offset 00000002, page type <File Segment inode>
page offset 00000003, page type <B-tree Node>, page level <0000>
page offset 00000004, page type <Uncompressed BLOB Page>
page offset 00000005, page type <Uncompressed BLOB Page>
Total number of pages: 6:
Insert Buffer Bitmap: 1
Uncompressed BLOB Page: 2
File Space Header: 1
B-tree Node: 1
File Segment inode: 1

From the information, we can see that the row of records just inserted has overflowed and is saved in two BLOB pages (<Uncompressed BLOB Page>). Because 1 page is only 16K and 2 rows of data need to be stored, each row of record should be less than 8K. However, the above is much larger than 8K, so it overflows. Of course, this does not include very large fields. If a table has 5 fields that are all varchar(512) [the sum of multiple varchars is greater than 8K], it will also overflow:

root@localhost : test 05:08:39>create table t2 (id int,name varchar(1000),address varchar(512),company varchar(200),xx varchar(512),memo varchar(512),dem varchar(1000))engine = innodb default charset utf8;
Query OK, 0 rows affected (0.17 sec)
root@localhost : test 05:08:43>insert into t2 values(1,repeat('周',1000),repeat('我',500),repeat('丁',500),repeat('啊',500),repeat('哦',500),repeat('阿a',500));

1000+500+500+500+500+500=3500*3>8000 bytes; the line will overflow:

root@zhoujy:/home/zhoujy/jiaoben/read_ibd# python py_innodb_page_info.py /var/lib/mysql/test/t2.ibd -v
page offset 00000000, page type <File Space Header>
page offset 00000001, page type <Insert Buffer Bitmap>
page offset 00000002, page type <File Segment inode>
page offset 00000003, page type <B-tree Node>, page level <0000>
page offset 00000004, page type <Uncompressed BLOB Page>
page offset 00000000, page type <Freshly Allocated Page>
Total number of pages: 6:
Insert Buffer Bitmap: 1
Freshly Allocated Page: 1
File Segment inode: 1
B-tree Node: 1
File Space Header: 1
Uncompressed BLOB Page: 1

The <Uncompressed BLOB Page> page stores the real data, so what does the data page store? View with hexdump:

root@zhoujy:/home/zhoujy/jiaoben/read_ibd# hexdump -C -v /var/lib/mysql/test/t1.ibd > t1.txt

View ibd:

View Code 

3082 0000c090 00 32 01 10 80 00 00 01 7a 6a 79 e6 88 91 e6 88 |.2......zjy.....|
3083 0000c0a0 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3084 0000c0b0 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3085 0000c0c0 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3086 0000c0d0 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3087 0000c0e0 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3088 0000c0f0 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3089 0000c100 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3090 0000c110 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3091 0000c120 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3092 0000c130 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3093 0000c140 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3094 0000c150 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3095 0000c160 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3096 0000c170 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3097 0000c180 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3098 0000c190 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3099 0000c1a0 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3100 0000c1b0 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3101 0000c1c0 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3102 0000c1d0 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3103 0000c1e0 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3104 0000c1f0 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3105 0000c200 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3106 0000c210 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3107 0000c220 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3108 0000c230 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3109 0000c240 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3110 0000c250 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3111 0000c260 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3112 0000c270 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3113 0000c280 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3114 0000c290 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3115 0000c2a0 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3116 0000c2b0 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3117 0000c2c0 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3118 0000c2d0 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3119 0000c2e0 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3120 0000c2f0 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3121 0000c300 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3122 0000c310 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3123 0000c320 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3124 0000c330 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3125 0000c340 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3126 0000c350 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3127 0000c360 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 |................|
3128 0000c370 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 |................|
3129 0000c380 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 88 91 e6 |................|
3130 0000c390 88 91 e6 88 91 e6 88 91 e6 88 91 00 00 02 1c 00 |................|

There are exactly 48 lines in the text, and each line is 16 bytes. 48*16=768 bytes, which just verifies what I said before: the data page only stores the first 768 bytes of data (old file format).

Summary 1:
The above information can clearly show the distribution and utilization of each page in the ibd tablespace and the step size of the tablespace size increase. Pay special attention to the overflow rows. A page contains at least 2 rows of data. The more rows stored in a page, the better the performance.

************************************
************************************

Purpose 2:
Understand how tablespaces store data and how NULL values ​​are stored.

Test 2:
Understand the storage format (row_format) of INNODB before testing. Old format (Antelope): Compact <default>, Redumdant; New format (Barracuda): Compressed, Dynamic.
This tests the default storage format.
Compact row recording is as follows:

 |Variable length field length list (1~2 bytes) |NULL flag (1 byte) |Record header information (5 bytes) |RowID (6 bytes) |Transaction ID (6 bytes) |Rollback pointer (7 bytes) |

Except for the "NULL flag" [all fields in the table are defined as NOT NULL], "RowID" [there is a primary key in the table], and "Variable-length field length list" [there are no variable-length fields] which may not exist, all other information will appear. Therefore, a row of data requires an additional 18 bytes in addition to the fields occupied by the column data.

1: All fields are NULL

mysql> create table mytest(t1 varchar(10),t2 varchar(10),t3 varchar(10) ,t4 varchar(10))engine=innodb charset=latin1 row_format=compact;
Query OK, 0 rows affected (0.08 sec)

mysql> insert into mytest values('a','bb','bb','ccc');
Query OK, 1 row affected (0.02 sec)

mysql> insert into mytest values('a','ee','ee','fff');
Query OK, 1 row affected (0.01 sec)

mysql> insert into mytest values('a',NULL,NULL,'fff');
Query OK, 1 row affected (0.00 sec)

After the test data is prepared, execute the shell command:

root@zhoujy:/usr/local/mysql/test# hexdump -C -v mytest.ibd > /home/zhoujy/mytest.txt

Open the mytest.txt file and find the line supremum:

0000c070 73 75 70 72 65 6d 75 6d 03 02 02 01 00 00 00 10 |supremum........| -----------> One line, 16 bytes 0000c080 00 25 00 00 00 03 b9 00 00 00 00 02 49 01 82 00 |.%..........I...|
0000c090 00 01 4a 01 10 61 62 62 62 62 63 63 63 03 02 02 |..J..abbbbccc...|
0000c0a0 01 00 00 00 18 00 23 00 00 00 03 b9 01 00 00 00 |......#.........|
0000c0b0 02 49 02 83 00 00 01 4b 01 10 61 65 65 65 65 66 |.I.....K..aeeeef|
0000c0c0 66 66 03 01 06 00 00 20 ff a6 00 00 00 03 b9 02 |ff..... ........|
0000c0d0 00 00 00 02 49 03 84 00 00 01 4c 01 10 61 66 66 |....I.....L..aff|
0000c0e0 66 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |f...............|

explain:
The first row of data:
03 02 02 01/*Variable length field*/ ---- The type of the 4 fields in the table is varchar, there is no NULL data, and each field is less than 255.
00 /*NULL flag, there is no null data in the first row*/
00 00 10 00 25 /*Record header information, fixed 5 bytes*/
00 00 00 03 b9 00/*RowID, fixed 6 bytes, the table has no primary key*/
00 00 00 02 49 01 /*Transaction ID, fixed 6 bytes*/
82 00 00 01 4a 01 10 /*Rollback pointer, fixed 7 bytes*/
61 62 62 62 62 63 63 63/* Column data*/
The second row of data is the same as the first row (the colors match).
The colors of the third row of data (with NULL values) and the first row of explanations are quite different:

03 02 02 01 VS 03 01 ----------When the value is NULL, the variable-length field list does not occupy storage space.
61 62 62 62 62 63 63 63 VS 61 66 66 66 --------- NULL values ​​are not stored and do not take up space

Conclusion: When the value is NULL, the variable-length field list does not occupy storage space. NULL values ​​are not stored and do not take up space, but require a flag bit (one per row).

2: All fields are NOT NULL

mysql> create table mytest(t1 varchar(10) NOT NULL,t2 varchar(10) NOT NULL,t3 varchar(10) NOT NULL,t4 varchar(10) NOT NULL)engine=innodb charset=latin1 row_format=compact;
Query OK, 0 rows affected (0.03 sec)

mysql> insert into mytest values('a','bb','bb','ccc');
Query OK, 1 row affected (0.01 sec)

mysql> insert into mytest values('a','ee','ee','fff');
Query OK, 1 row affected (0.01 sec)

mysql> insert into mytest values('a',NULL,NULL,'fff');
ERROR 1048 (23000): Column 't2' cannot be null

The steps are the same as above, and the ibd result is:

0000c070 73 75 70 72 65 6d 75 6d 03 02 02 01 00 00 10 00 |supremum........|
0000c080 24 00 00 00 03 b9 03 00 00 00 02 49 07 87 00 00 |$..........I....|
0000c090 01 4f 01 10 61 62 62 62 62 63 63 63 03 02 02 01 |.O..abbbbccc....|
0000c0a0 00 00 18 ff cb 00 00 00 03 b9 04 00 00 00 02 49 |...............I|
0000c0b0 08 88 00 00 01 50 01 10 61 65 65 65 65 66 66 66 |.....P..aeeeefff|

Compared with the above, it is found that the NULL flag information is missing.
Conclusion: NULL values ​​will have extra space to store, i.e. 1 byte per row. For tables with the same data, the table with NULL values ​​in the field is larger than the table with NOT NULL values.

Three: 1 NULL and 1 '' data:

mysql> create table mytest(t1 varchar(10) NOT NULL,t2 varchar(10) NOT NULL DEFAULT '',t3 varchar(10) NOT NULL ,t4 varchar(10))engine=innodb charset = latin1 row_format=compact;
Query OK, 0 rows affected (0.02 sec)
mysql> insert into mytest(t1,t2) values('A','BB');
Query OK, 1 row affected, 1 warning (0.01 sec)

The steps are the same as above, and the ibd result is:

0000c070 73 75 70 72 65 6d 75 6d 00 02 01 01 00 00 10 ff |supremum........|
0000c080 ef 00 00 00 43 b9 03 00 00 00 02 4a 15 90 00 00 |....C......J....|
0000c090 01 c2 01 10 41 42 42 00 00 00 00 00 00 00 00 00 |....ABB.........|

The main difference from the above two lies in the variable-length list and column data.

Conclusion: The column data information shows that NULL data and '' data do not occupy any space. For the information of the variable-length field list, a comparison shows that: although '' data does not need to occupy any storage space, it still needs to occupy one byte in the variable-length field list <after all, it is still a '' value>, and the NULL value does not need to occupy ", but NULL will have an additional flag bit, so there is an optimization statement: "Try to set NOT NULL in the database table if it can be set to NOT NULL, unless NULL is really needed. ” is proven here.

The above tests are all for VARCHAR variable-length types, what about CHAR?

CHAR Test:

root@localhost : test 10:33:35>create table mytest(t1 char(10),t2 char(10),t3 char(10) ,t4 char(10))engine=innodb charset = latin1 row_format=compact; Query OK, 0 rows affected (0.16 sec)

root@localhost : test 10:33:59>insert into mytest values('a','bb','bb','ccc');
Query OK, 1 row affected (0.00 sec)

root@localhost : test 10:34:09>insert into mytest values('a','ee','ee','fff');
Query OK, 1 row affected (0.00 sec)

root@localhost : test 10:34:19>insert into mytest values('a',NULL,NULL,'fff');
Query OK, 1 row affected (0.00 sec)

Open the file generated by ibd:

0000c060 02 00 1b 69 6e 66 69 6d 75 6d 00 04 00 0b 00 00 |...infimum......|
0000c070 73 75 70 72 65 6d 75 6d 00 00 00 10 00 41 00 00 |supremum.....A..|
0000c080 00 0a f5 00 00 00 00 81 2d 07 80 00 00 00 32 01 |........-.....2.|
0000c090 10 61 20 20 20 20 20 20 20 20 20 62 62 20 20 20 |.a bb |
0000c0a0 20 20 20 20 20 62 62 20 20 20 20 20 20 20 20 63 | bb c|
0000c0b0 63 63 20 20 20 20 20 20 20 00 00 00 18 00 41 00 |cc .....A.|
0000c0c0 00 00 0a f5 01 00 00 00 81 2d 08 80 00 00 00 32 |.........-.....2|
0000c0d0 01 10 61 20 20 20 20 20 20 20 20 20 65 65 20 20 |..a ee |
0000c0e0 20 20 20 20 20 20 65 65 20 20 20 20 20 20 20 20 | ee |
0000c0f0 66 66 66 20 20 20 20 20 20 20 06 00 00 20 ff 70 |fff ... .p|
0000c100 00 00 00 0a f5 02 00 00 00 81 2d 09 80 00 00 00 |..........-.....|
0000c110 32 01 10 61 20 20 20 20 20 20 20 20 20 66 66 66 |2..a fff|
0000c120 20 20 20 20 20 20 20 00 00 00 00 00 00 00 00 00 | .........|

Compared with varchar, it is found that: there is less variable-length field list, but for char, a fixed length is required to store, and if the fixed length cannot be stored, it will be filled up. For example: 20; and NULL values ​​do not need to occupy storage space.

Mixed (varchar, char):

root@localhost : test 11:21:48>create table mytest(t1 int,t2 char(10),t3 varchar(10) ,t4 char(10))engine=innodb charset = latin1 row_format=compact;
Query OK, 0 rows affected (0.17 sec)

root@localhost : test 11:21:50>insert into mytest values(1,'a','b','c');
Query OK, 1 row affected (0.00 sec)

root@localhost : test 11:22:06>insert into mytest values(11,'aa','bb','cc');
Query OK, 1 row affected (0.00 sec)

From the table structure above, we can see that:
1. Variable length field list length: 1
2. NULL flag: 1
3. Record header information: 5
4, RowID: 6
5. Transaction ID: 6
6. Rollback pointer: 7

IDB information:

0000c070 73 75 70 72 65 6d 75 6d 01 00 00 00 10 00 33 00 |supremum......3.| 
0000c080 00 00 0a f5 07 00 00 00 81 2d 1a 80 00 00 00 32 |.........-.....2|
0000c090 01 10 80 00 00 01 61 20 20 20 20 20 20 20 20 20 |......a |
0000c0a0 62 63 20 20 20 20 20 20 20 20 20 02 00 00 00 18 |bc .....|
0000c0b0 ff be 00 00 00 0a f5 08 00 00 00 81 2d 1b 80 00 |............-...|
0000c0c0 00 00 32 01 10 80 00 00 0b 61 61 20 20 20 20 20 |..2......aa |
0000c0d0 20 20 20 62 62 63 63 20 20 20 20 20 20 20 20 00 | bbcc .|

From the above information, we can conclude that it is the same as expected: because there is only one varchar field in the table, the length of the variable-length list is only: 01
Pay special attention to the following information about the data storage in each column: the t1 field is of int type and occupies 4 bytes. The first line: 80 00 00 01 represents the number 1; the second line: 80 00 00 0b represents the number 11. [select hex(11) == B ], the rest is the same as above.

The above are descriptions of the latin1 single-byte character set, what about the multi-byte character set?

root@localhost : test 11:52:10>create table mytest(id int auto_increment,t2 varchar(10),t3 varchar(10) ,t4 char(10),primary key(id))engine=innodb charset = utf8 row_format=compact;
Query OK, 0 rows affected (0.17 sec)

root@localhost : test 11:52:11>insert into mytest(t2,t3,t4) values('bb','bb','ccc');
Query OK, 1 row affected (0.00 sec)

root@localhost : test 11:55:34>insert into mytest(t2,t3,t4) values('we','they','our');
Query OK, 1 row affected (0.00 sec)

The ibd information is as follows:

0000c070 73 75 70 72 65 6d 75 6d 0a 02 02 00 00 00 10 00 |supremum........|
0000c080 28 80 00 00 01 00 00 00 81 2d 27 80 00 00 00 32 |(........-'....2|
0000c090 01 10 62 62 62 62 63 63 63 20 20 20 20 20 20 20 |..bbbbccc |
0000c0a0 0a 06 06 00 00 00 18 ff c7 80 00 00 02 00 00 00 |................|
0000c0b0 81 2d 28 80 00 00 00 32 01 10 e6 88 91 e4 bb ac |.-(....2........|
0000c0c0 e4 bb 96 e4 bb ac e6 88 91 e4 bb ac e7 9a 84 20 |............... |

Because the table has a primary key, the ROWID (6 bytes) is gone.
Pay special attention to: the variable-length field list is 3? There are only 2 varchar columns in the table. The test shows that under the condition of multi-byte character set, char type is treated as a variable-length type, and there is basically no difference in their row storage, so the variable-length list is 3, because it is the utf8 character set, which occupies three bytes. So one Chinese character takes up 3 bytes of space in one page ("we": e6 88 91 e4 bb ac).
Data column information:
The value 1 of the id column should be 80 00 00 01, why does this display 00 32 01 10, and all ids are 00 32 01 10. The test found that when the id is an auto-increment primary key, the 4-byte length of the id is represented by 00 32 01 10. Otherwise, as in the previous example, use select HEX (X) to represent it.

Summary 2:
The above tests are all based on the COMPACT storage format. Regardless of whether it is varchar or char, NULL values ​​do not occupy storage space. It is particularly important to note that the record header information of Redumdant requires 6 fixed bytes. Under the condition of a multi-byte character set, there is basically no difference between the row storage of CHAR and VARCHAR.

This is the end of this article about MySQL Innodb storage structure and storage of Null values. For more information about MySQL Innodb storage structure and storage of Null values, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Performance comparison test of MySQL's two table storage structures MyISAM and InnoDB
  • InnoDB type MySql restore table structure and data
  • Detailed explanation of the index and storage structure of the MySQL InnoDB engine
  • Detailed explanation of the data page structure of MySQL's InnoDB storage engine
  • MySQL InnoDB memory structure details

<<:  Script to quickly list all host names (computer names) in the LAN under Linux

>>:  Implementing a web player with JavaScript

Recommend

How to use Nexus to add jar packages to private servers

Why do we need to build a nexus private server? T...

When to use Map instead of plain JS objects

Table of contents 1. Map accepts any type of key ...

Let's talk about my understanding and application of React Context

Table of contents Preface First look at React Con...

Docker connects to a container through a port

Docker container connection 1. Network port mappi...

MySql development of automatic synchronization table structure

Development Pain Points During the development pr...

The table tbody in HTML can slide up and down and left and right

When the table header is fixed, it needs to be di...

Design Tips: We think you will like it

<br />Looking at this title, you may find it...

Steps to deploy Spring Boot project using Docker

Table of contents Create a simple springboot proj...

Using JS to implement binary tree traversal algorithm example code

Table of contents Preface 1. Binary Tree 1.1. Tra...

Detailed process of installing Jenkins-2.249.3-1.1 with Docker

Table of contents 1. Install Docker 2. Pull the J...

How to maintain a long connection when using nginx reverse proxy

· 【Scene description】 After HTTP1.1, the HTTP pro...