## Preface

Software applications run in the computer's main memory, known as random access memory (RAM). JavaScript, and in particular Node.js (server-side JavaScript), lets us build anything from small scripts to large software projects for end users. Managing program memory is always tricky, because a careless implementation can starve every other application running on the same server or system. C and C++ programmers have to care about memory management, since nasty memory leaks can hide in every corner of the code. But as a JavaScript developer, have you ever really thought about this? Because JS developers typically write web servers on dedicated, high-capacity machines, they may not notice the effects of multitasking. When building a web server, for example, we usually also run a database server (MySQL), a cache server (Redis), and whatever other applications are required, and we need to remember that they all consume the available main memory. If we write our application carelessly, we are likely to degrade the performance of other processes or even deny them memory allocations altogether.

In this article, we will work through a problem to understand Node.js constructs such as streams, buffers, and pipes, and see how each of them supports writing memory-efficient applications.

## Problem: copying a large file

If someone is asked to write a file-copying program in Node.js, they will quickly come up with something like this:

```js
const fs = require('fs');

let fileName = process.argv[2];
let destPath = process.argv[3];

fs.readFile(fileName, (err, data) => {
    if (err) throw err;

    fs.writeFile(destPath || 'output', data, (err) => {
        if (err) throw err;
        console.log('New file has been created!');
    });
});
```

This code simply reads the entire source file into memory and then writes it out to the destination path, which is fine for small files. Now suppose we have a large file (more than 4 GB) that we need to back up with this program. Take my 7.4 GB ultra-high-definition 4K movie as an example. I used the code above to copy it from the current directory to another directory:

```
$ node basic_copy.js cartoonMovie.mkv ~/Documents/bigMovie.mkv
```

Then I got this error message on Ubuntu (Linux):
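The exact wording of the error varies across Node.js versions, but it boils down to the maximum size of a single Buffer. As a quick check, the minimal sketch below prints that ceiling on your own machine using the documented buffer.constants values; the numbers it reports depend on your Node.js version and platform.

```js
// Print the largest Buffer this Node.js build can allocate.
// On many 64-bit builds this is roughly 2 GiB; newer versions may report more.
const { constants } = require('buffer');

console.log(`Max Buffer size: ${constants.MAX_LENGTH} bytes`);
console.log(`Max string length: ${constants.MAX_STRING_LENGTH} characters`);
```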
As you can see, the error occurs while reading the file, because Node.js only allows a maximum of about 2 GB of data to be held in a single buffer. To get around this, whenever you do I/O-intensive work (copying, processing, compression, and so on), you should take the memory situation into account.

## Streams and buffers in Node.js

To solve the problem above, we need a way to break a large file into many chunks, and we need a data structure to hold those chunks. A buffer is the structure used to store binary data. Next, we need a way to read and write those chunks, and streams provide exactly that capability.

### Buffers

We can easily create a buffer using the Buffer object:

```js
let buffer = new Buffer(10); // 10 is the size of the buffer in bytes
console.log(buffer); // prints <Buffer 00 00 00 00 00 00 00 00 00 00>
```

In newer versions of Node.js (> 8), you can also write it like this:

```js
let buffer = Buffer.alloc(10);
console.log(buffer); // prints <Buffer 00 00 00 00 00 00 00 00 00 00>
```

If we already have some data, such as an array or another data set, we can create a buffer from it:

```js
let name = 'Node JS DEV';
let buffer = Buffer.from(name);
console.log(buffer); // prints <Buffer 4e 6f 64 65 20 4a 53 20 44 45 56>
```

Buffers have some important methods, such as buffer.toString() and buffer.toJSON(), that let you inspect the data they store.

We will not create raw buffers ourselves in the optimized code: Node.js and the V8 engine already do this for us by creating internal buffers (queues) when handling streams and network sockets.

### Streams

In simple terms, streams are like doorways on Node.js objects through which data enters and leaves. In computer networking, ingress is an input action and egress is an output action; we will keep using these terms below. There are four types of streams: Readable, Writable, Duplex, and Transform.
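As an illustration, the minimal sketch below creates one stream of each of the four types. The file names and the uppercase transform are made up for the example and are not part of the original article's code.

```js
const fs = require('fs');
const net = require('net');
const { Transform } = require('stream');

// Readable: data flows out of it into our program (ingress).
const readable = fs.createReadStream('input.txt');

// Writable: data flows from our program into it (egress).
const writable = fs.createWriteStream('output.txt');

// Duplex: both readable and writable, e.g. a TCP socket.
const duplex = new net.Socket();

// Transform: a duplex stream that modifies data as it passes through.
const upperCase = new Transform({
  transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  }
});
```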
The following sentence clearly explains why we should use streams:

> An important goal of the Stream API, and in particular the stream.pipe() method, is to limit data buffering to an acceptable level, so that sources and destinations of different speeds do not clog the available memory.

We need some way to get the job done without overwhelming the system, which is exactly what we mentioned at the beginning of the article. In the diagram above, we have two types of streams, readable streams and writable streams. The .pipe() method is the basic mechanism for connecting a readable stream to a writable stream. If you don't understand the diagram above, don't worry: after you look at the examples, you can come back to it and everything will make sense. Pipes are a fascinating mechanism, and we'll use two examples to illustrate them.

## Solution 1: copying files with plain streams

Let's design a solution to the large-file-copying problem described above. First we create two streams, then we follow these steps:

1. Listen for data chunks from the readable stream.
2. Write each chunk into the writable stream.
3. Track the progress of the file copy.

We name this code streams_copy_basic.js:

```js
/* A file copy with streams and events - Author: Naren Arya */

const stream = require('stream');
const fs = require('fs');

let fileName = process.argv[2];
let destPath = process.argv[3];

const readable = fs.createReadStream(fileName);
const writeable = fs.createWriteStream(destPath || "output");

fs.stat(fileName, (err, stats) => {
    this.fileSize = stats.size;
    this.counter = 1;
    this.fileArray = fileName.split('.');

    try {
        this.duplicate = destPath + "/" + this.fileArray[0] + '_Copy.' + this.fileArray[1];
    } catch(e) {
        console.error('File name is invalid! Please pass a proper one.');
    }

    process.stdout.write(`File: ${this.duplicate} is being created:`);

    readable.on('data', (chunk) => {
        let percentageCopied = ((chunk.length * this.counter) / this.fileSize) * 100;
        process.stdout.clearLine();  // clear current text
        process.stdout.cursorTo(0);
        process.stdout.write(`${Math.round(percentageCopied)}%`);
        writeable.write(chunk);
        this.counter += 1;
    });

    readable.on('end', (e) => {
        process.stdout.clearLine();  // clear current text
        process.stdout.cursorTo(0);
        process.stdout.write("Successfully finished the operation");
        return;
    });

    readable.on('error', (e) => {
        console.log("Some error occurred: ", e);
    });

    writeable.on('finish', () => {
        console.log("Successfully created the file copy!");
    });
});
```

In this program, we take the two file paths (source and destination) passed in by the user and create two streams to move chunks of data from the readable stream to the writable stream. We also define a few variables to track the progress of the copy and print it to the console. Along the way we subscribe to several events:

- data: triggered whenever a chunk of data has been read
- end: triggered when the readable stream has finished reading all the data
- error: triggered when an error occurs while reading

By running this program, we can successfully copy a large file (7.4 GB here):

```
$ time node streams_copy_basic.js cartoonMovie.mkv ~/Documents/4kdemo.mkv
```

However, when we watch the program's memory usage in the task manager while it runs, there is still a problem.

4.6 GB? The amount of memory our running program consumes makes no sense here, and it is very likely to block other applications.

What happened?
If you look closely at the read and write rates in the figure above, you will find a clue.

Disk read: 53.4 MiB/s
Disk write: 14.8 MiB/s

This means the producer is producing data faster than the consumer can consume it. To hold the chunks that have been read but not yet written, the computer keeps the excess data in RAM. That is why there is a spike in RAM usage.

The above code completes the copy in 3 minutes and 16 seconds on my machine...
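The underlying issue is that the code above ignores the boolean returned by writeable.write(chunk): a writable stream returns false when its internal buffer is full and emits a 'drain' event once it is ready for more data. As background for the next solution, here is a minimal sketch of handling this by hand; it is an illustration of the mechanism, not the article's original code.

```js
const fs = require('fs');

const readable = fs.createReadStream(process.argv[2]);
const writable = fs.createWriteStream(process.argv[3] || 'output');

readable.on('data', (chunk) => {
  // write() returns false when the writable's internal buffer is full.
  const canContinue = writable.write(chunk);
  if (!canContinue) {
    // Stop reading until the writable side has drained its buffer.
    readable.pause();
    writable.once('drain', () => readable.resume());
  }
});

readable.on('end', () => writable.end());
```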
## Solution 2: copying files with streams and automatic backpressure

To overcome the problem above, we can modify the program so that the disk read and write speeds adjust to each other automatically. This mechanism is called backpressure. We do not need to do much: we simply pipe the readable stream into the writable stream, and Node.js takes care of the backpressure for us.

Let's name this program streams_copy_efficient.js:

```js
/* A file copy with streams and piping - Author: Naren Arya */

const stream = require('stream');
const fs = require('fs');

let fileName = process.argv[2];
let destPath = process.argv[3];

const readable = fs.createReadStream(fileName);
const writeable = fs.createWriteStream(destPath || "output");

fs.stat(fileName, (err, stats) => {
    this.fileSize = stats.size;
    this.counter = 1;
    this.fileArray = fileName.split('.');

    try {
        this.duplicate = destPath + "/" + this.fileArray[0] + '_Copy.' + this.fileArray[1];
    } catch(e) {
        console.error('File name is invalid! Please pass a proper one.');
    }

    process.stdout.write(`File: ${this.duplicate} is being created:`);

    readable.on('data', (chunk) => {
        let percentageCopied = ((chunk.length * this.counter) / this.fileSize) * 100;
        process.stdout.clearLine();  // clear current text
        process.stdout.cursorTo(0);
        process.stdout.write(`${Math.round(percentageCopied)}%`);
        this.counter += 1;
    });

    readable.pipe(writeable); // Auto pilot ON!

    // In case we hit an interruption while copying
    writeable.on('unpipe', (e) => {
        process.stdout.write("Copy has failed!");
    });
});
```

In this example, we replaced the earlier chunk-writing logic with a single line of code:

```js
readable.pipe(writeable); // Auto pilot ON!
```

The pipe is where all the magic happens. It regulates the disk read and write speeds so that main memory (RAM) does not get clogged.

Run it:

```
$ time node streams_copy_efficient.js cartoonMovie.mkv ~/Documents/4kdemo.mkv
```

We copied the same large file (7.4 GB); now let's look at the memory utilization.

A surprise! The Node process now takes up only 61.9 MiB of memory. And if you observe the read and write rates:

Disk read: 35.5 MiB/s
Disk write: 35.5 MiB/s

At any given time, the read and write rates stay consistent thanks to backpressure. Even better, this optimized version runs 13 seconds faster than the previous one.
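As a side note not covered in the original article, Node.js 10 and later also ship stream.pipeline(), which connects the streams the same way but additionally forwards errors and destroys both streams on failure, so you do not have to rely on the 'unpipe' event. A minimal sketch, assuming the readable and writeable streams from the program above:

```js
const { pipeline } = require('stream');

// pipeline() wires readable -> writeable and reports success or failure once.
pipeline(readable, writeable, (err) => {
  if (err) {
    console.error('Copy has failed!', err);
  } else {
    console.log('Successfully created the file copy!');
  }
});
```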
Thanks to Node.js streams and pipes, the memory load was reduced by 98.68% and the execution time also dropped. That is why piping is such a powerful mechanism.

61.9 MiB is the size of the buffer created by the readable stream. We can also request a custom chunk size using the read method of the readable stream:

```js
const readable = fs.createReadStream(fileName);
readable.read(no_of_bytes_size);
```

Besides copying local files, the same technique can be used to optimize many other I/O-intensive tasks, such as streaming data between network sockets or compressing data on the fly.
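As one example of this (my own illustration, not from the original article), the same pipe-based backpressure applies when a transform stream sits in the middle of the chain. The sketch below gzips a file while copying it, using Node's built-in zlib module; the highWaterMark option shown is the standard way to tune how much data the readable stream buffers per chunk.

```js
const fs = require('fs');
const zlib = require('zlib');

const source = process.argv[2];
const destination = `${process.argv[2]}.gz`;

// highWaterMark controls how much data the readable stream buffers at a time.
const readable = fs.createReadStream(source, { highWaterMark: 1024 * 1024 }); // 1 MiB chunks
const gzip = zlib.createGzip();
const writable = fs.createWriteStream(destination);

// Backpressure is handled across the whole chain automatically.
readable.pipe(gzip).pipe(writable);

writable.on('finish', () => {
  console.log(`Compressed copy written to ${destination}`);
});
```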
## Conclusion

The main motivation for writing this article is to show that even though Node.js provides a good API, we can still accidentally write code that performs poorly. If we pay more attention to the tools it ships with, we can do a much better job of controlling how our programs run.