Steps for importing tens of millions of records into MySQL using .NET Core

Preliminary preparation

Order test table

CREATE TABLE `trade` (
  `id` VARCHAR(50) NULL DEFAULT NULL COLLATE 'utf8_unicode_ci',
  `trade_no` VARCHAR(50) NULL DEFAULT NULL COLLATE 'utf8_unicode_ci',
  UNIQUE INDEX `id` (`id`),
  INDEX `trade_no` (`trade_no`)
)
COMMENT = 'Order'
COLLATE='utf8_unicode_ci'
ENGINE=InnoDB;

Test environment

Operating system: Windows 10 Professional

CPU: Intel(R) Core(TM) i7-8650U @ 1.90GHz (2.11GHz)

Memory: 16 GB

MySQL version: 5.7.26
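
All of the code samples below reference a connsql connection string that the original article never shows. A minimal sketch, assuming the MySql.Data (Connector/NET) package and a local server; the host, credentials, and database name are placeholders to adjust for your environment:

using MySql.Data.MySqlClient;

//Hypothetical connection string; replace host, credentials, and database with your own
var connsql = "server=localhost;port=3306;user id=root;password=yourpassword;database=test;charset=utf8;";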

Implementation methods:

1. Single-row inserts

This is the most common approach: loop through the data and insert records one at a time. The obvious drawback is that every record requires a separate command execution and round trip to the database.

Implementation code:

//Start time
var startTime = DateTime.Now;
using (var conn = new MySqlConnection(connsql))
{
    conn.Open();

    //Insert 100,000 records
    for (var i = 0; i < 100000; i++)
    {
        //Insert one record
        var sql = string.Format("insert into trade(id,trade_no) values('{0}','{1}');",
            Guid.NewGuid().ToString(), "trade_" + (i + 1)
            );
        var sqlComm = new MySqlCommand();
        sqlComm.Connection = conn;
        sqlComm.CommandText = sql;
        sqlComm.ExecuteNonQuery();
        sqlComm.Dispose();
    }

    conn.Close();
}

//Completion time
var endTime = DateTime.Now;

//Elapsed time
var spanTime = endTime - startTime;
Console.WriteLine("Loop insertion took: " + spanTime.Minutes + " minutes " + spanTime.Seconds + " seconds " + spanTime.Milliseconds + " milliseconds");

Time taken for the 100,000-record test: (timing screenshot from the original article, not reproduced here)

In the example above, importing 100,000 records means 100,000 separate command executions against the database. If we instead concatenate every 1,000 INSERT statements and send them to the server as a single command, we can sharply reduce the number of round trips. The code is modified as follows:

//Start time
var startTime = DateTime.Now;
using (var conn = new MySqlConnection(connsql))
{
    conn.Open();

    //Insert 100,000 records
    var sql = new StringBuilder();
    for (var i = 0; i < 100000; i++)
    {
        //Append one INSERT statement
        sql.AppendFormat("insert into trade(id,trade_no) values('{0}','{1}');",
            Guid.NewGuid().ToString(), "trade_" + (i + 1)
            );

        //Send 1,000 statements as a single command
        if (i % 1000 == 999)
        {
            var sqlComm = new MySqlCommand();
            sqlComm.Connection = conn;
            sqlComm.CommandText = sql.ToString();
            sqlComm.ExecuteNonQuery();
            sqlComm.Dispose();
            sql.Clear();
        }
    }

    conn.Close();
}

//Completion time
var endTime = DateTime.Now;

//Elapsed time
var spanTime = endTime - startTime;
Console.WriteLine("Loop insertion took: " + spanTime.Minutes + " minutes " + spanTime.Seconds + " seconds " + spanTime.Milliseconds + " milliseconds");

Time taken for the 100,000-record test: (timing screenshot from the original article, not reproduced here)

After this optimization, the 100,000 command executions that were originally required drop to just 100. Judging from the final results, because the database runs on the same machine and no network transfer is involved, the performance improvement is not obvious.
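
A variant not covered in the original article: building SQL with string.Format both risks SQL injection and forces the server to parse every statement from scratch. Below is a hedged sketch of the same single-row loop using a parameterized command instead (with Connector/NET, server-side prepare additionally requires IgnorePrepare=false in the connection string):

using (var conn = new MySqlConnection(connsql))
{
    conn.Open();

    using (var cmd = new MySqlCommand("insert into trade(id,trade_no) values(@id, @no);", conn))
    {
        cmd.Parameters.Add("@id", MySqlDbType.VarChar);
        cmd.Parameters.Add("@no", MySqlDbType.VarChar);
        cmd.Prepare(); //Parse once, execute many times

        for (var i = 0; i < 100000; i++)
        {
            cmd.Parameters["@id"].Value = Guid.NewGuid().ToString();
            cmd.Parameters["@no"].Value = "trade_" + (i + 1);
            cmd.ExecuteNonQuery();
        }
    }
}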

2. Merged data insertion

MySQL also supports importing data in batches by merging many rows into a single INSERT statement. Implementation code:

//Start time
var startTime = DateTime.Now;
using (var conn = new MySqlConnection(connsql))
{
    conn.Open();

    //Insert 100,000 records
    var sql = new StringBuilder();
    for (var i = 0; i < 100000; i++)
    {
        if (i % 1000 == 0)
        {
            sql.Append("insert into trade(id,trade_no) values");
        }

        //Concatenate one row of values
        sql.AppendFormat("('{0}','{1}'),", Guid.NewGuid().ToString(), "trade_" + (i + 1));

        //Insert 1,000 records at a time
        if (i % 1000 == 999)
        {
            var sqlComm = new MySqlCommand();
            sqlComm.Connection = conn;
            sqlComm.CommandText = sql.ToString().TrimEnd(',');
            sqlComm.ExecuteNonQuery();
            sqlComm.Dispose();
            sql.Clear();
        }
    }

    conn.Close();
}

//Completion time
var endTime = DateTime.Now;

//Elapsed time
var spanTime = endTime - startTime;
Console.WriteLine("Merged data insertion took: " + spanTime.Minutes + " minutes " + spanTime.Seconds + " seconds " + spanTime.Milliseconds + " milliseconds");

Time taken for the 100,000-record test: (timing screenshot from the original article, not reproduced here)

Inserting this way significantly improves throughput. Although the optimized version of the first method also reduces the number of round trips, the merged INSERT further reduces the volume of logging (MySQL binlog and InnoDB transaction logs) and the amount and frequency of log flushing, which improves efficiency. It also cuts the number of SQL statements the server has to parse and reduces network transmission I/O: each batch arrives as one multi-row statement of the form insert into trade(id,trade_no) values ('…','trade_1'),('…','trade_2'),… rather than 1,000 separate statements.
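
A related optimization, not part of the original comparison: wrapping a batch of inserts in one explicit transaction lets InnoDB flush its log once per commit instead of once per autocommitted statement. A minimal sketch for a single 1,000-row batch, assuming the same connsql connection string:

using (var conn = new MySqlConnection(connsql))
{
    conn.Open();

    using (var tran = conn.BeginTransaction())
    using (var cmd = new MySqlCommand("insert into trade(id,trade_no) values(@id, @no);", conn, tran))
    {
        cmd.Parameters.Add("@id", MySqlDbType.VarChar);
        cmd.Parameters.Add("@no", MySqlDbType.VarChar);

        for (var i = 0; i < 1000; i++)
        {
            cmd.Parameters["@id"].Value = Guid.NewGuid().ToString();
            cmd.Parameters["@no"].Value = "trade_" + (i + 1);
            cmd.ExecuteNonQuery();
        }

        //One commit (and one log flush) for the whole batch
        tran.Commit();
    }
}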

3. MySqlBulkLoader insert

MySqlBulkLoader is a wrapper around MySQL's LOAD DATA INFILE statement. The principle is that MySQL reads the rows directly from a file, so we first write our data set out to a CSV file and then have MySQL load it from that file.

Implementation code:

//Start time
var startTime = DateTime.Now;
using (var conn = new MySqlConnection(connsql))
{
    conn.Open();
    var table = new DataTable();
    table.Columns.Add("id", typeof(string));
    table.Columns.Add("trade_no", typeof(string));

    //Generate 100,000 records
    for (var i = 0; i < 100000; i++)
    {
        if (i % 500000 == 0)
        {
            table.Rows.Clear();
        }

        //Build a record
        var row = table.NewRow();
        row[0] = Guid.NewGuid().ToString();
        row[1] = "trade_" + (i + 1);
        table.Rows.Add(row);

        //Load in batches of up to 500,000 records (the 100,000-record test fits in a single batch)
        if (i % 500000 != 499999 && i < (100000 - 1))
        {
            continue;
        }
        Console.WriteLine("Start inserting: " + i);

        //Convert the data to CSV format
        var tradeCsv = DataTableToCsv(table);
        var tradeFilePath = System.AppDomain.CurrentDomain.BaseDirectory + "trade.csv";
        File.WriteAllText(tradeFilePath, tradeCsv);

        #region Save to database
        var bulkCopy = new MySqlBulkLoader(conn)
        {
            FieldTerminator = ",",
            FieldQuotationCharacter = '"',
            EscapeCharacter = '"',
            LineTerminator = "\r\n",
            FileName = tradeFilePath,
            NumberOfLinesToSkip = 0,
            TableName = "trade"
        };

        bulkCopy.Columns.AddRange(table.Columns.Cast<DataColumn>().Select(column => column.ColumnName).ToList());
        bulkCopy.Load();
        #endregion
    }

    conn.Close();
}

//Completion time
var endTime = DateTime.Now;

//Elapsed time
var spanTime = endTime - startTime;
Console.WriteLine("MySqlBulkLoader method took: " + spanTime.Minutes + " minutes " + spanTime.Seconds + " seconds " + spanTime.Milliseconds + " milliseconds");

Time taken for the 100,000-record test: (timing screenshot from the original article, not reproduced here)
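
The code above calls a DataTableToCsv helper that the original article does not show. Below is a minimal sketch of what it could look like, matched to the loader settings used above (comma field terminator, double-quote quoting with a doubled quote as the escape, \r\n line endings, no header row); it assumes using directives for System.Data, System.Linq, and System.Text:

private static string DataTableToCsv(DataTable table)
{
    var sb = new StringBuilder();
    foreach (DataRow row in table.Rows)
    {
        //Quote every field and double any embedded quote characters
        var fields = table.Columns.Cast<DataColumn>()
            .Select(col => "\"" + row[col].ToString().Replace("\"", "\"\"") + "\"");
        sb.Append(string.Join(",", fields));
        sb.Append("\r\n");
    }
    return sb.ToString();
}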

Note: the MySQL server must be configured to allow file imports. The relevant option is set under [mysqld] in my.ini / my.cnf; an empty value removes the directory restriction on LOAD DATA INFILE:

secure_file_priv=
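
If the application does not run on the same machine as MySQL, writing the CSV to a server-readable path is not an option. MySql.Data's MySqlBulkLoader also has a Local property that switches to LOAD DATA LOCAL INFILE, so the file is read on the client side; the server must permit local_infile for this to work:

//Read the CSV from the client machine instead of the server
bulkCopy.Local = true;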

Performance test comparison

Each of the three methods above was tested with 100,000, 200,000, 1 million, and 10 million records. The original article summarizes the final results in a comparison chart (not reproduced here).

Conclusion

The test data shows that as the data volume grows, MySqlBulkLoader continues to perform well, while the performance of the other methods drops off significantly. MySqlBulkLoader fully meets our needs.

The above is the detailed content of the steps for importing tens of millions of records into MySQL with .NET Core.
