MySQL data compression performance comparison details

MySQL data compression performance comparison details

The data required by the Data Cube is rarely or never updated once it is written. This kind of data is well suited for compression to reduce disk usage. MySQL itself provides two compression methods - archive engine and myisampack method for MyISAM engine. Today, we tested these two methods separately and compared their advantages and disadvantages in terms of disk usage and query performance. As for why I do this, you should understand, and I will introduce it later. And look at the text:

1. Test environment

1.1 Hardware and Software

A 64-bit 2.6.18-92 kernel Linux development machine, 4G memory, and four 2800Mhz Dual-Core AMD Opteron (tm) Processor 2220 CPUs.

MySQL is placed on a 7200 rpm SAT hard disk, not in raid ;

No optimization was made to MySQL and query cache was turned off to prevent query cache from interfering with the test results.

1.2 Table Structure

2,424,753 records, actual data of a shard in the production environment;

The joint indexes ( partition_by1,idx_rank ) and ( partition_by1,chg_idx ) are established respectively, where partition_by1 is a varchar type with a length of 32 and is used for retrieval; the other two fields are floating point numbers and are mostly used for sorting;

As a sub-column, autokid acts as PRIMARY KEY and is only used to ensure atomicity during data loading. It has no practical significance.

2. Test Purpose

2.1 Comparison of compression space

The greater the compression rate, the smaller the disk space occupied, which directly reduces the data storage cost;

2.2 Query Performance Comparison

There should not be a noticeable degradation in query performance after compression. Archive does not support indexing, so performance degradation is inevitable. We should also have an idea of ​​how much the performance will be reduced and whether it is acceptable.

3. Testing Tools

3.1 mysqlslap

The official tool is of course the best choice. For an introduction to mysqlslap , please refer to the official documentation.

3.2 Test query

A total of 9973 actual SQL statements accessing the topranks_v3 table in the production environment were intercepted, from which 7 with the largest number of visits were extracted, with a concurrency of 50 and repeated 10 times. The command is as follows:

./mysqlslap --defaults-file=../etc/my.cnf -u**** -p**** -c50 -i10 -q ../t.sql --debug-info

4. Test conclusion

Comparison Items Disk space Time consumed (seconds) CPU Idle LOAD concurrent
Base table (MyISAM) 403956004 2.308 30 15 50
ARCHIVE 75630745 >300 75 4 1
PACK 99302109 2.596 30 twenty two 50

Based on the test data given in the table above, we can simply draw the following conclusions:

  • For the test table, the space occupied by Archive table is about 18.7% of the previous space, and the space occupied after myisampack is about 24.6% of the previous space. The difference between the two is not much. From the perspective of space utilization alone, it seems that we need to choose archive table.
  • Let's look at query performance and compare it with the benchmark table. In terms of total time and system load, the query performance of pack table under 50 concurrency is comparable to that of the benchmark table; while archive table takes more than 5 minutes under single concurrency (I can't wait any longer, so I kill it)!

So, we seem to be able to conclude that for tables that require online queries, ARCHIVE engine can basically be ignored.

Why is ARCHIVE engine so slow during this test?

We know that mysql provides the archive storage engine to reduce disk overhead, but there is also a prerequisite, that is, the archived data does not need to be or is rarely queried online, and it does not matter if the query is slow occasionally. Due to the above reasons, archive tables are not allowed to create indexes other than auto-increment columns.

With this consensus, let's take a test SQL to analyze why there is such a big difference in query performance before and after not using the index.

In our test SQL, there is such a line:

SELECT c1,c2,...,cn FROM mysqlslap.rpt_topranks_v3
WHERE ... AND partition_by1 = '50008090'
ORDER BY added_quantity3 DESC
LIMIT 500


As we said before, the test table has an index on the partition_by1 field. So, we preliminarily judge that this query should use the partition_by1 index on the benchmark table and myisampack table. EXPLAIN:

mysql> EXPLAIN
    -> SELECT ... FROM mysqlslap.rpt_topranks_v3
    -> WHERE ... AND partition_by1 = '50008090'
    -> ORDER BY added_quantity3 DESC
    -> LIMIT 500\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        TABLE: rpt_topranks_v3
         type: ref
possible_keys: idx_toprank_pid,idx_toprank_chg
          KEY: idx_toprank_pid
      key_len: 99
          ref: const
         rows: 2477
        Extra: USING WHERE; USING filesort
1 row IN SET (0.00 sec)

As we expected, this query uses the index on the partition_by1 field, matches the target number of rows 2477, and then there is a sort on the added_quantity3 field. Since added_quantity3 has no index, filesort is used.

Let's take a look at the EXPLAIN results of this SQL on the archive table:

mysql> EXPLAIN
    -> SELECT ... FROM mysqlslap.rpt_topranks_v3_<strong>archive</strong>
    -> WHERE ... AND partition_by1 = '50008090'
    -> ORDER BY added_quantity3 DESC
    -> LIMIT 500\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        TABLE: rpt_topranks_v3_archive
         type: ALL
possible_keys: NULL
          KEY: NULL
      key_len: NULL
          ref: NULL
         rows: 2424753
        Extra: USING WHERE; USING filesort
1 row IN SET (0.00 sec)


EXPLAIN says: " I don't have any indexes available, so I can only scan the entire table for 2424753 rows and then do a filesort ." If you want performance, then you are obviously wronging MySQL .

This is the end of this article about the details of MySQL data compression performance comparison. For more information about MySQL data compression performance comparison, please search for previous articles on 123WORDPRESS.COM or continue to browse the following related articles. I hope everyone will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Detailed tutorial on installation and configuration of compressed version of MySQL database
  • Python regularly backs up MySQL data by date and compresses it
  • MySQL database backup command sharing (MySQL compressed database backup)

<<:  How to use Docker to build a development environment (Windows and Mac)

>>:  Double loading issue when the page contains img src

Recommend

Several ways to shut down Hyper-V service under Windows 10

When using VMware Workstation to open a virtual m...

Detailed example of IOS database upgrade data migration

Detailed example of IOS database upgrade data mig...

Analysis and Solution of ERROR:2002 Reported When MySQL Starts

Preface This article mainly introduces the analys...

Detailed explanation of SQL injection - security (Part 2)

If there are any errors in this article or you ha...

Super detailed basic JavaScript syntax rules

Table of contents 01 JavaScript (abbreviated as: ...

Three ways to delete a table in MySQL (summary)

drop table Drop directly deletes table informatio...

How to solve the problem of too many open files in Linux

The cause is that the process opens a number of f...

Illustration-style website homepage design New trend in website design

You can see that their visual effects are very bea...

Vue uses rules to implement form field validation

There are many ways to write and validate form fi...

Detailed explanation of viewing and setting file permissions on Mac

Preface To modify file permissions in the termina...

Make your text dance with the marquee attribute in HTML

Syntax: <marquee> …</marquee> Using th...

Detailed explanation of the use of title tags and paragraph tags in XHTML

XHTML Headings Overview When we write Word docume...

Example analysis of mysql stored procedure usage

This article describes the usage of MySQL stored ...

Detailed tutorial on installing JDK1.8 on Linux

1. Cleaning before installation rpm -qa | grep jd...