MySQL data compression performance comparison details

MySQL data compression performance comparison details

The data required by the Data Cube is rarely or never updated once it is written. This kind of data is well suited for compression to reduce disk usage. MySQL itself provides two compression methods - archive engine and myisampack method for MyISAM engine. Today, we tested these two methods separately and compared their advantages and disadvantages in terms of disk usage and query performance. As for why I do this, you should understand, and I will introduce it later. And look at the text:

1. Test environment

1.1 Hardware and Software

A 64-bit 2.6.18-92 kernel Linux development machine, 4G memory, and four 2800Mhz Dual-Core AMD Opteron (tm) Processor 2220 CPUs.

MySQL is placed on a 7200 rpm SAT hard disk, not in raid ;

No optimization was made to MySQL and query cache was turned off to prevent query cache from interfering with the test results.

1.2 Table Structure

2,424,753 records, actual data of a shard in the production environment;

The joint indexes ( partition_by1,idx_rank ) and ( partition_by1,chg_idx ) are established respectively, where partition_by1 is a varchar type with a length of 32 and is used for retrieval; the other two fields are floating point numbers and are mostly used for sorting;

As a sub-column, autokid acts as PRIMARY KEY and is only used to ensure atomicity during data loading. It has no practical significance.

2. Test Purpose

2.1 Comparison of compression space

The greater the compression rate, the smaller the disk space occupied, which directly reduces the data storage cost;

2.2 Query Performance Comparison

There should not be a noticeable degradation in query performance after compression. Archive does not support indexing, so performance degradation is inevitable. We should also have an idea of ​​how much the performance will be reduced and whether it is acceptable.

3. Testing Tools

3.1 mysqlslap

The official tool is of course the best choice. For an introduction to mysqlslap , please refer to the official documentation.

3.2 Test query

A total of 9973 actual SQL statements accessing the topranks_v3 table in the production environment were intercepted, from which 7 with the largest number of visits were extracted, with a concurrency of 50 and repeated 10 times. The command is as follows:

./mysqlslap --defaults-file=../etc/my.cnf -u**** -p**** -c50 -i10 -q ../t.sql --debug-info

4. Test conclusion

Comparison Items Disk space Time consumed (seconds) CPU Idle LOAD concurrent
Base table (MyISAM) 403956004 2.308 30 15 50
ARCHIVE 75630745 >300 75 4 1
PACK 99302109 2.596 30 twenty two 50

Based on the test data given in the table above, we can simply draw the following conclusions:

  • For the test table, the space occupied by Archive table is about 18.7% of the previous space, and the space occupied after myisampack is about 24.6% of the previous space. The difference between the two is not much. From the perspective of space utilization alone, it seems that we need to choose archive table.
  • Let's look at query performance and compare it with the benchmark table. In terms of total time and system load, the query performance of pack table under 50 concurrency is comparable to that of the benchmark table; while archive table takes more than 5 minutes under single concurrency (I can't wait any longer, so I kill it)!

So, we seem to be able to conclude that for tables that require online queries, ARCHIVE engine can basically be ignored.

Why is ARCHIVE engine so slow during this test?

We know that mysql provides the archive storage engine to reduce disk overhead, but there is also a prerequisite, that is, the archived data does not need to be or is rarely queried online, and it does not matter if the query is slow occasionally. Due to the above reasons, archive tables are not allowed to create indexes other than auto-increment columns.

With this consensus, let's take a test SQL to analyze why there is such a big difference in query performance before and after not using the index.

In our test SQL, there is such a line:

SELECT c1,c2,...,cn FROM mysqlslap.rpt_topranks_v3
WHERE ... AND partition_by1 = '50008090'
ORDER BY added_quantity3 DESC
LIMIT 500


As we said before, the test table has an index on the partition_by1 field. So, we preliminarily judge that this query should use the partition_by1 index on the benchmark table and myisampack table. EXPLAIN:

mysql> EXPLAIN
    -> SELECT ... FROM mysqlslap.rpt_topranks_v3
    -> WHERE ... AND partition_by1 = '50008090'
    -> ORDER BY added_quantity3 DESC
    -> LIMIT 500\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        TABLE: rpt_topranks_v3
         type: ref
possible_keys: idx_toprank_pid,idx_toprank_chg
          KEY: idx_toprank_pid
      key_len: 99
          ref: const
         rows: 2477
        Extra: USING WHERE; USING filesort
1 row IN SET (0.00 sec)

As we expected, this query uses the index on the partition_by1 field, matches the target number of rows 2477, and then there is a sort on the added_quantity3 field. Since added_quantity3 has no index, filesort is used.

Let's take a look at the EXPLAIN results of this SQL on the archive table:

mysql> EXPLAIN
    -> SELECT ... FROM mysqlslap.rpt_topranks_v3_<strong>archive</strong>
    -> WHERE ... AND partition_by1 = '50008090'
    -> ORDER BY added_quantity3 DESC
    -> LIMIT 500\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        TABLE: rpt_topranks_v3_archive
         type: ALL
possible_keys: NULL
          KEY: NULL
      key_len: NULL
          ref: NULL
         rows: 2424753
        Extra: USING WHERE; USING filesort
1 row IN SET (0.00 sec)


EXPLAIN says: " I don't have any indexes available, so I can only scan the entire table for 2424753 rows and then do a filesort ." If you want performance, then you are obviously wronging MySQL .

This is the end of this article about the details of MySQL data compression performance comparison. For more information about MySQL data compression performance comparison, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope everyone will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Detailed tutorial on installation and configuration of compressed version of MySQL database
  • Python regularly backs up MySQL data by date and compresses it
  • MySQL database backup command sharing (MySQL compressed database backup)

<<:  How to use Docker to build a development environment (Windows and Mac)

>>:  Double loading issue when the page contains img src

Recommend

Example code for implementing a text marquee with CSS3

Background Here's what happened, Luzhu accide...

Ubuntu Server Installation Tutorial in Vmware

This article shares with you the Ubuntu server ve...

Detailed installation instructions for the cloud server pagoda panel

Table of contents 0x01. Install the Pagoda Panel ...

Vue mobile terminal realizes finger sliding effect

This article example shares the specific code for...

How to use Element in React project

This is my first time using the element framework...

Detailed installation and configuration of hadoop2.7.2 under ubuntu15.10

There are many Hadoop installation tutorials on L...

HTML implements read-only text box and cannot modify the content

Without further ado, I will post the code for you...

Thinking about grid design of web pages

<br />Original address: http://andymao.com/a...

Storage engine and log description based on MySQL (comprehensive explanation)

1.1 Introduction to storage engines 1.1.1 File sy...

Detailed explanation of Vue plugin

Summarize This article ends here. I hope it can b...

The whole process record of vue3 recursive component encapsulation

Table of contents Preface 1. Recursive components...

Full analysis of MySQL INT type

Preface: Integer is one of the most commonly used...

Is it true that the simpler the web design style, the better?

Original address: http://www.webdesignfromscratch...

MySQL 5.7 and above version download and installation graphic tutorial

1. Download 1. MySQL official website download ad...

Detailed explanation of Docker Volume permission management

Volume data volume is an important concept of Doc...