How to create a test database with tens of millions of test data in MySQL

How to create a test database with tens of millions of test data in MySQL

Sometimes you need to create some test data, based on the test library officially provided by MySQL, and insert hundreds of thousands, millions, or tens of millions of data. This is done using some functions and stored procedures.

Official test library address: https://github.com/datacharmer/test_db

We imported the official database and made some simplifications, leaving three tables: department table, employee table and employment table, and removed foreign key associations. Because it is test data, the correspondence between dates is not accurate.

Necessary functions

Generate a random string

RAND(): Generates a random number between 0 and 1

FLOOR: Downward integer (FLOOR(1.2)=1)

CEILING rounds up (CEILING(1.2)=2)

substring: intercept string

concat: string concatenation

CREATE DEFINER=`root`@`localhost` FUNCTION `rand_string`(n INT) RETURNS varchar(255) CHARSET latin1
BEGIN
	DECLARE chars_str varchar(100) DEFAULT 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
  DECLARE return_str varchar(255) DEFAULT '';
  DECLARE i INT DEFAULT 0;
  WHILE i < n DO
    SET return_str = concat(return_str,substring(chars_str,FLOOR(1 + RAND()*62),1));
    SET i = i +1;
  END WHILE;
  RETURN return_str;
END

View all custom functions

show function status

Test use:

select rand_string(5);

Generate a random year, month, and day string

Generate a random date within a specified time period

SELECT
	date(
	from_unixtime( unix_timestamp( '2000-01-01' ) + floor( rand() * ( unix_timestamp( '2020-12-31' ) - unix_timestamp( '2000-01-01' ) + 1 ) ) ));

Function: Generate a random date within a specified time period

CREATE DEFINER=`root`@`localhost` FUNCTION `rand_date_string`(startDate varchar(255),endDate varchar(255)) RETURNS varchar(255) CHARSET latin1
BEGIN
  DECLARE return_str varchar(255) DEFAULT '';
	
	 SET return_str =date(
		from_unixtime( unix_timestamp( startDate ) 
				+ floor( rand() * ( unix_timestamp( endDate ) - unix_timestamp( startDate ) + 1 ) ) 
			)
	 );
  
  RETURN return_str;
END

Test use:

select rand_date_string('2000-01-01','2020-12-31');
//Result 2001-09-10

Stored procedure generates data

If one million employees are inserted into each department, the employee table will have nine million records.

CREATE DEFINER=`root`@`localhost` PROCEDURE `data`()
BEGIN
	
	DECLARE i INT DEFAULT 1;
	DECLARE j INT DEFAULT 0;
	DECLARE id INT DEFAULT 0;
	WHILE i < 10 DO
			WHILE j < 1000000 DO
				insert into employees_m (emp_no,birth_date,first_name,last_name,gender,hire_date) VALUES(
					id,rand_date_string('1970-01-01','1997-01-01'),rand_string(5),rand_string(5),'M',NOW());
				
				insert into dept_emp_m (emp_no,dept_no,from_date,to_date) values(
					 id,concat('d00',i),rand_date_string('1990-01-01','2020-12-31'),'2020-12-31');
					 
			SET j=j+1;  
			SET id=id+1;  
			END WHILE;
	SET j = 0;
	SET i=i+1;   
	END WHILE;

END

Approximate time to insert 9 million records: 4868s

The above method of inserting data will be very slow. You can insert data into a memory table, that is, change the storage engine of the table to MEMORY. In this way, memory will be used to store data, which will be much faster than directly inserting it into a table with the INNODB engine. It is just that there is no persistence, but the speed is very fast. It takes about 1227.89s to insert 10 million data.

appendix

Modify the table storage engine

ALTER TABLE dept_emp_m ENGINE=MEMORY;

Adjust the size of the memory table and modify the configuration file

[mysqld]
max_heap_table_size = 2048M
tmp_table_size = 2048M

The above is the details of how to create a test library with tens of millions of test data in MySQL. For more information about tens of millions of test data in MySQL, please pay attention to other related articles on 123WORDPRESS.COM!

You may also be interested in:
  • MySQL loop inserts tens of millions of data
  • How to quickly paginate MySQL data volumes of tens of millions
  • Detailed explanation of several practical solutions for quickly deleting large amounts of data (tens of millions) in MySQL
  • Summary of SQL query optimization knowledge points for MySQL tens of millions of big data
  • MySql quick insert tens of millions of large data examples
  • How to quickly create tens of millions of test data in MySQL
  • Detailed explanation of the batch query design pattern for MySQL sharding to achieve distributed storage of millions of records
  • MySQL million-level paging optimization (MySQL ten million-level fast paging)
  • How to optimize MySQL tables with tens of millions of data?
  • Optimize Mysql limit, reference the compound index of fast paging from one million to ten million and apply it to lightweight framework

<<:  Solve the problem of starting two ports that occupy different ports when docker run

>>:  How to implement a simple HTML video player

Recommend

Analysis of the Principles of MySQL Slow Query Related Parameters

MySQL slow query, whose full name is slow query l...

How to purchase and install Alibaba Cloud servers

1. Purchase a server In the example, the server p...

Steps for encapsulating element-ui pop-up components

Encapsulate el-dialog as a component When we use ...

Real-time refresh of long connection on Vue+WebSocket page

Recently, the Vue project needs to refresh the da...

Use vue3 to implement a human-cat communication applet

Table of contents Preface Initialize the project ...

Native js to implement drop-down box selection component

This article example shares the specific code of ...

Idea deploys remote Docker and configures the file

1. Modify the Linux server docker configuration f...

Analysis of two implementation methods for adding static routing in Linux

Command to add a route: 1.Route add route add -ne...

Detailed explanation of MYSQL log and backup and restore issues

This article shares MYSQL logs and backup and res...

jQuery realizes the shuttle box function

This article example shares the specific code of ...

MySQL example of getting today and yesterday's 0:00 timestamp

As shown below: Yesterday: UNIX_TIMESTAMP(CAST(SY...

Vue implements zip file download

This article example shares the specific code of ...