Node.js Exploration: An in-depth look at how a single thread handles high concurrency

Preface

Ever since Node.js appeared, it has been described with a handful of keywords: event-driven, non-blocking I/O, efficient, and lightweight. This is also how it describes itself on its official website:
Node.js® is a JavaScript runtime built on Chrome's V8 JavaScript engine. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient.

So when we first come into contact with Node.js, we have some questions:

1. Why can JavaScript, a language that runs in the browser, interact with the operating system at such a low level?
2. Is Node.js really single-threaded?
3. If it is single-threaded, how does it handle highly concurrent requests?
4. How is the event-driven model of Node.js implemented?

Are you feeling overwhelmed after seeing these questions? Don’t worry, let’s read this article slowly with these questions in mind.

Architecture at a Glance

The above questions are all very low-level, so let’s start with Node.js itself and take a look at the structure of Node.js.

Node.js standard library: this layer is written in JavaScript and is the API we call directly. You can find it in the lib directory of the source code.

Node bindings: this layer is the bridge between JavaScript and the underlying C/C++ code. JavaScript calls into C/C++ through the bindings, and the two sides exchange data through them. It is implemented in node.cc.

The third layer is what actually keeps Node.js running, and it is implemented in C/C++:
V8: the JavaScript VM released by Google, and the reason Node.js uses JavaScript. It provides an environment for running JavaScript outside the browser, and its efficiency is one of the reasons Node.js is so efficient.
Libuv: provides Node.js with cross-platform support, a thread pool, an event loop, asynchronous I/O and other capabilities. It is the key to the power of Node.js.
C-ares: provides asynchronous handling of DNS-related functions.
http_parser, OpenSSL, zlib, etc.: provide other capabilities, including HTTP parsing, SSL, and data compression.
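As a small, hedged illustration of this layering, the sketch below reads process.versions, which reports the versions of the libraries compiled into the running Node.js binary. Which keys exist depends on the Node.js version and build (newer releases, for example, ship a different HTTP parser), so the code only keeps the keys that are actually present.

```javascript
// Look up the bundled components mentioned above, if this build reports them.
// The key list is an assumption about a typical build, not a guaranteed set.
const wanted = ['v8', 'uv', 'ares', 'openssl', 'zlib'];

const bundled = {};
for (const name of wanted) {
  if (typeof process.versions[name] === 'string') {
    bundled[name] = process.versions[name];
  }
}

console.log(bundled);
```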

Interacting with the operating system

For example, if we want to open a file and perform some operations, we can write the following code:

var fs = require('fs');

fs.open('./test.txt', 'w', function (err, fd) {
  // ...do something
});

The calling process of this code can be roughly described as: lib/fs.js → src/node_file.cc → uv_fs

lib/fs.js

async function open(path, flags, mode) {
  mode = modeNum(mode, 0o666);
  path = getPathFromURL(path);
  validatePath(path);
  validateUint32(mode, 'mode');
  return new FileHandle(
    await binding.openFileHandle(pathModule.toNamespacedPath(path),
             stringToFlags(flags), mode, kUsePromises));
}

src/node_file.cc

static void Open(const FunctionCallbackInfo<Value>& args) {
  Environment* env = Environment::GetCurrent(args);

  const int argc = args.Length();

  // ... (this excerpt omits the code that unpacks path, flags, mode
  //      and req_wrap_async from args)

  if (req_wrap_async != nullptr) {  // open(path, flags, mode, req)
    AsyncCall(env, req_wrap_async, args, "open", UTF8, AfterInteger,
              uv_fs_open, *path, flags, mode);
  } else {  // open(path, flags, mode, undefined, ctx)
    CHECK_EQ(argc, 5);
    FSReqWrapSync req_wrap_sync;
    FS_SYNC_TRACE_BEGIN(open);
    int result = SyncCall(env, args[4], &req_wrap_sync, "open",
                          uv_fs_open, *path, flags, mode);
    FS_SYNC_TRACE_END(open);
    args.GetReturnValue().Set(result);
  }
}

uv_fs

/* Open the destination file. */
dstfd = uv_fs_open(NULL, &fs_req,
                   req->new_path,
                   dst_flags,
                   statsbuf.st_mode, NULL);
uv_fs_req_cleanup(&fs_req);

A picture from Node.js in Simple Terms:

Specifically, when we call fs.open, Node.js calls the Open function at the C/C++ level through process.binding, and then calls the specific method uv_fs_open in Libuv through it. Finally, the execution result is passed back through callback to complete the process.

The methods we call in Javascript will eventually be passed to the C/C++ level through process.binding, and they will ultimately perform the actual operations. This is how Node.js interacts with the operating system.
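To make the round trip concrete, here is a minimal sketch that opens a scratch file and records the order in which things happen. The file name and the steps array are made up for this illustration; the only point is that fs.open() returns immediately and the callback runs later, once the result has travelled back up from libuv.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file, used only for this illustration.
const file = path.join(os.tmpdir(), `open-demo-${process.pid}.txt`);
const steps = [];

fs.open(file, 'w', (err, fd) => {
  if (err) throw err;
  steps.push('open callback');   // runs later, once libuv reports the result
  fs.close(fd, (closeErr) => {
    if (closeErr) throw closeErr;
    fs.unlinkSync(file);         // clean up the scratch file
    steps.push('closed');
    console.log(steps);
  });
});

steps.push('after fs.open');     // runs first: fs.open() has already returned
```

The synchronous line at the bottom is recorded before the callback, even though it appears later in the source.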

Single thread

In traditional web service models, multi-threading is mostly used to solve concurrency problems. Because I/O is blocking, a single thread means that users have to wait, which is obviously unreasonable, so multiple threads are created to respond to user requests.
Node.js model for http services:

The single thread of Node.js refers to the main thread: it executes the program code step by step, in the order it was written. If synchronous code blocks and occupies the main thread, all subsequent code is stuck behind it. Try it with the following test code:

var http = require('http');

function sleep(time) {
  var _exit = Date.now() + time * 1000;
  while (Date.now() < _exit) {}
  return;
}

var server = http.createServer(function (req, res) {
  sleep(10);
  res.end('server sleep 10s');
});

server.listen(8080);

Here is a stack diagram of the code block:

Put this code in index.js, start it, and open http://localhost:8080 in the browser. The response, server sleep 10s, arrives only after 10 seconds.

JavaScript is an interpreted language. Code is pushed onto the call stack in the order in which it is written and executed; when a call finishes it is popped, and the next one is pushed. In the stack diagram above, when the main thread accepts a request, execution enters the sleep block and runs it synchronously (assume this stands for the program's business logic). If a second request arrives within those 10 seconds, it has to wait until the first one has finished before it is processed. Every later request is likewise held up until the synchronous work ahead of it completes.
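The same effect can be seen without a server. The sketch below is a scaled-down variant of the test above (milliseconds instead of seconds): it asks for a timer callback after 10 ms, then keeps the main thread busy for 200 ms. The variable names are illustrative only.

```javascript
// Busy-wait helper in the style of the article's sleep(), in milliseconds.
function sleep(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {}
}

const start = Date.now();
let firedAfter = null;

// Ask for a callback after 10 ms...
setTimeout(() => {
  firedAfter = Date.now() - start;
  console.log(`timer fired after ${firedAfter} ms`);
}, 10);

// ...but occupy the main thread for 200 ms first.
sleep(200);
```

The timer cannot fire until the synchronous loop has finished, so the measured delay is at least 200 ms rather than 10 ms.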

Then we may wonder: how can a single thread be so efficient and handle tens of thousands of concurrent requests without blocking? The answer is the event-driven model described below.

Event-driven/event loop

Event Loop is a programming construct that waits for and dispatches events or messages in a program.

1. Each Node.js process has only one main thread executing program code, forming an execution context stack.
2. In addition to the main thread, Node maintains an "event queue". When a network request or another asynchronous operation arrives, Node puts it in the event queue instead of executing it immediately. The code is not blocked and keeps running until the main-thread code has finished.
3. Once the main-thread code has finished, the event loop starts taking events from the head of the event queue, one after another. Work that would block, such as file I/O, is handed to a thread from the thread pool, while network I/O goes through the operating system's non-blocking interfaces. The loop keeps checking the queue until every event has been dispatched, and whenever a new event is added later it is picked up in turn. When an operation completes, the main thread is notified and executes its callback, and any worker thread that was used is returned to the thread pool.
4. The main thread keeps repeating the third step above.
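The steps above can be sketched with timers standing in for real I/O. In this hypothetical example, handleRequest and its 100 ms delay are invented placeholders for a non-blocking operation such as a database call; three "requests" are started together and none of them holds up the others.

```javascript
// Stand-in for a non-blocking operation. The name and delay are made up.
function handleRequest(id, ms) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`request ${id} done`), ms);
  });
}

async function main() {
  const start = Date.now();
  // All three requests are accepted at once; the main thread never waits in place.
  const results = await Promise.all([
    handleRequest(1, 100),
    handleRequest(2, 100),
    handleRequest(3, 100),
  ]);
  return { results, elapsed: Date.now() - start };
}

const done = main();
done.then(({ results, elapsed }) => {
  console.log(results, `${elapsed} ms`);
});
```

Because the waits overlap, the total elapsed time is close to a single delay rather than the sum of the three.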

The single thread we see in Node.js is only the JavaScript main thread. Blocking operations are completed elsewhere: file system calls, DNS lookups through getaddrinfo and similar work run on libuv's internal thread pool, and network I/O is handled by the operating system's non-blocking mechanisms (epoll, kqueue, IOCP). The main thread is only responsible for scheduling and running callbacks and never waits on the I/O itself, which is how asynchronous non-blocking I/O is achieved. This is the essence of Node's single-threaded, event-driven model.
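A hedged way to observe work leaving the main thread is the asynchronous form of crypto.pbkdf2(), which the Node.js documentation describes as using libuv's thread pool. The password, salt, and iteration count below are arbitrary values chosen for this sketch.

```javascript
const crypto = require('crypto');

const log = [];

// Two CPU-heavy key derivations. They are handed off to worker threads,
// so the main thread is free to continue in the meantime.
for (const id of [1, 2]) {
  crypto.pbkdf2('secret', 'salt', 50000, 32, 'sha256', (err, key) => {
    if (err) throw err;
    log.push(`hash ${id} ready (${key.length} bytes)`);
    console.log(log[log.length - 1]);
  });
}

log.push('main thread keeps going');
console.log(log[0]);
```

The main-thread line is always logged first. The order in which the two hashes finish is not guaranteed, since they may run on different worker threads.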

Implementation of the event loop in Node.js:

Node.js uses V8 as its JavaScript engine and libuv for I/O processing. Libuv is an event-driven, cross-platform abstraction layer that wraps the low-level features of different operating systems and exposes a unified API. The event loop is also implemented in it. In src/node.cc:

Environment* CreateEnvironment(IsolateData* isolate_data,
                               Local<Context> context,
                               int argc,
                               const char* const* argv,
                               int exec_argc,
                               const char* const* exec_argv) {
  Isolate* isolate = context->GetIsolate();
  HandleScope handle_scope(isolate);
  Context::Scope context_scope(context);
  auto env = new Environment(isolate_data, context,
                             v8_platform.GetTracingAgent());
  env->Start(argc, argv, exec_argc, exec_argv, v8_is_profiling);
  return env;
}

This code sets up a Node execution environment. The loop it runs on comes from uv_default_loop(), a libuv function that initializes the library's default loop structure and returns a pointer to it (the call is not part of the excerpt above). After that, Node loads the execution environment, completes some setup, and then starts the event loop.

{
    SealHandleScope seal(isolate);
    bool more;
    env.performance_state()->Mark(
        node::performance::NODE_PERFORMANCE_MILESTONE_LOOP_START);
    do {
      uv_run(env.event_loop(), UV_RUN_DEFAULT);

      v8_platform.DrainVMTasks(isolate);

      more = uv_loop_alive(env.event_loop());
      if (more)
        continue;

      RunBeforeExit(&env);

      // Emit `beforeExit` if the loop became alive either after emitting
      // event, or after running some callbacks.
      more = uv_loop_alive(env.event_loop());
    } while (more == true);
    env.performance_state()->Mark(
        node::performance::NODE_PERFORMANCE_MILESTONE_LOOP_EXIT);
  }

  env.set_trace_sync_io(false);

  const int exit_code = EmitExit(&env);
  RunAtExit(&env);

more indicates whether to run another cycle. env.event_loop() returns the loop pointer saved in env earlier, and uv_run starts libuv's event loop in UV_RUN_DEFAULT mode. uv_loop_alive returns false when the loop has no active handles or requests left, for example when there are no pending I/O events and no timers.
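The do/while logic above is visible from JavaScript through the beforeExit event. In this minimal sketch, the revivals counter is an illustrative name; each time the loop runs out of work, the handler schedules one more timer, which makes the loop alive again, until the counter reaches 2.

```javascript
let revivals = 0;

process.on('beforeExit', () => {
  // Emitted when the loop has no more work. Scheduling a timer here
  // makes the loop alive again, so it runs for another round.
  if (revivals < 2) {
    revivals += 1;
    setTimeout(() => console.log(`revived, round ${revivals}`), 10);
  }
});

process.on('exit', () => {
  console.log(`exiting after ${revivals} revivals`);
});
```

Once the handler stops adding work, the loop stays empty after beforeExit and the process exits.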

Event Loop Execution Order

According to the official Node.js documentation, each iteration of the event loop contains 6 phases, which correspond to the implementation in the libuv source code:

  • Timers phase: executes callbacks scheduled by setTimeout() and setInterval().
  • I/O callbacks phase: executes callbacks for some system operations, such as network communication errors.
  • Idle, prepare phase: used only internally by Node.
  • Poll phase: retrieves new I/O events. Under appropriate conditions, Node blocks here.
  • Check phase: executes setImmediate() callbacks.
  • Close callbacks phase: executes close event callbacks, such as a socket's close event.
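The phase order can be observed from inside an I/O callback. The sketch below uses fs.stat('.') purely as a convenient asynchronous call; from its callback it schedules work with setTimeout, setImmediate, and process.nextTick and records the order in which they run.

```javascript
const fs = require('fs');

const order = [];

// The fs.stat() callback runs during the poll phase, so it is a convenient
// place to schedule work for the phases that follow.
fs.stat('.', () => {
  setTimeout(() => {
    order.push('setTimeout');                       // timers phase, next iteration
    console.log(order.join(' -> '));
  }, 0);
  setImmediate(() => order.push('setImmediate'));   // check phase, this iteration
  process.nextTick(() => order.push('nextTick'));   // before the loop moves on
  order.push('I/O callback');
});
```

Inside an I/O callback, setImmediate always runs before a zero-delay setTimeout, because the check phase comes right after poll while the timers phase has to wait for the next iteration.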

Core function uv_run (core source code):

int uv_run(uv_loop_t* loop, uv_run_mode mode) {
  int timeout;
  int r;
  int ran_pending;

  // First check whether the loop is still alive.
  // Alive means there are still asynchronous tasks in the loop.
  // If not, just finish.
  r = uv__loop_alive(loop);
  if (!r)
    uv__update_time(loop);

  // The legendary event loop. You read it right: it's one big while.
  while (r != 0 && loop->stop_flag == 0) {
    // Update the loop's current time
    uv__update_time(loop);
    // Process timer callbacks
    uv__run_timers(loop);
    // Process pending asynchronous task callbacks
    ran_pending = uv__run_pending(loop);
    // Idle and prepare phases (internal use only)
    uv__run_idle(loop);
    uv__run_prepare(loop);

    // Pay attention here: from this point to uv__io_poll is the hard part.
    // Remember that timeout is a duration. uv_backend_timeout computes it,
    // and it is then passed to uv__io_poll.
    // If timeout is 0, uv__io_poll returns without waiting.
    timeout = 0;
    if ((mode == UV_RUN_ONCE && !ran_pending) || mode == UV_RUN_DEFAULT)
      timeout = uv_backend_timeout(loop);

    uv__io_poll(loop, timeout);
    // Check phase: this is where setImmediate callbacks run
    uv__run_check(loop);
    // Close file descriptors and perform other closing operations
    uv__run_closing_handles(loop);

    if (mode == UV_RUN_ONCE) {
      /* UV_RUN_ONCE implies forward progress: at least one callback must have
       * been invoked when it returns. uv__io_poll() can return without doing
       * I/O (meaning: no callbacks) when its timeout expires - which means we
       * have pending timers that satisfy the forward progress constraint.
       *
       * UV_RUN_NOWAIT makes no guarantees about progress so it's omitted from
       * the check.
       */
      uv__update_time(loop);
      uv__run_timers(loop);
    }

    r = uv__loop_alive(loop);
    if (mode == UV_RUN_ONCE || mode == UV_RUN_NOWAIT)
      break;
  }

  /* The if statement lets gcc compile it to a conditional store. Avoids
   * dirtying a cache line.
   */
  if (loop->stop_flag != 0)
    loop->stop_flag = 0;

  return r;
}

I have commented the code in detail, so even readers who are not familiar with C should be able to follow it. Yes, the event loop is just one big while loop! The veil of mystery is lifted.
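To see the shape of that while loop without the C details, here is a toy model in JavaScript. It is only a sketch under heavy simplification: ToyLoop and its methods are invented for this illustration, there is no real I/O, and the "poll" step merely advances a fake clock to the next timer.

```javascript
// A toy, single-threaded model of the loop above. Not how libuv is written.
class ToyLoop {
  constructor() {
    this.now = 0;        // fake clock, in milliseconds
    this.timers = [];    // entries of the form { due, cb }
    this.checks = [];    // setImmediate-style callbacks
    this.closing = [];   // close callbacks
  }

  setTimeout(cb, delay) { this.timers.push({ due: this.now + delay, cb }); }
  setImmediate(cb) { this.checks.push(cb); }
  onClose(cb) { this.closing.push(cb); }

  alive() {
    return this.timers.length + this.checks.length + this.closing.length > 0;
  }

  run() {
    while (this.alive()) {
      // Timers phase: run every timer whose due time has passed.
      const due = this.timers.filter((t) => t.due <= this.now);
      this.timers = this.timers.filter((t) => t.due > this.now);
      due.sort((a, b) => a.due - b.due).forEach((t) => t.cb());

      // Poll phase: "block" until the next timer, unless other work is waiting.
      if (this.checks.length === 0 && this.closing.length === 0 &&
          this.timers.length > 0) {
        this.now = Math.min(...this.timers.map((t) => t.due));
      }

      // Check phase, then closing phase.
      this.checks.splice(0).forEach((cb) => cb());
      this.closing.splice(0).forEach((cb) => cb());
    }
  }
}

const loop = new ToyLoop();
const log = [];
loop.setTimeout(() => log.push('timer'), 5);
loop.setImmediate(() => log.push('immediate'));
loop.onClose(() => log.push('close'));
loop.run();
console.log(log); // [ 'immediate', 'close', 'timer' ]
```

The timer is registered first but runs last: in the first iteration it is not yet due, so the loop skips waiting because check and close callbacks are pending, runs those, and only then advances the clock to the timer.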

uv__io_poll stage

This stage is designed very cleverly. The second parameter of this function is a timeout parameter, and this timeout comes from the uv_backend_timeout function. Let's take a look!

Source code

int uv_backend_timeout(const uv_loop_t* loop) {
  if (loop->stop_flag != 0)
    return 0;

  if (!uv__has_active_handles(loop) && !uv__has_active_reqs(loop))
    return 0;

  if (!QUEUE_EMPTY(&loop->idle_handles))
    return 0;

  if (!QUEUE_EMPTY(&loop->pending_queue))
    return 0;

  if (loop->closing_handles)
    return 0;

  return uv__next_timeout(loop);
}

It turns out to be a chain of if checks. Let's go through them one by one.

1. stop_flag: when this flag is not 0, the event loop will exit after this round, so the timeout is 0.

2. !uv__has_active_handles and !uv__has_active_reqs: as the names indicate, if there are no asynchronous tasks (including timers and asynchronous I/O), the timeout must be 0.

3. !QUEUE_EMPTY(idle_handles) and !QUEUE_EMPTY(pending_queue): idle handles run on every iteration, and pending_queue holds callbacks that were deferred to the next iteration. If either queue is not empty, there is still work waiting, so the loop must not wait.

4. closing_handles: handles are waiting to be closed, so there is no need to wait.

All of these checks lead up to the final statement, return uv__next_timeout(loop);, which tells uv__io_poll how long it may block. Next, let's see how uv__next_timeout computes that time.

int uv__next_timeout(const uv_loop_t* loop) {
  const struct heap_node* heap_node;
  const uv_timer_t* handle;
  uint64_t diff;

  heap_node = heap_min((const struct heap*) &loop->timer_heap);
  if (heap_node == NULL)
    return -1; /* block indefinitely */

  handle = container_of(heap_node, uv_timer_t, heap_node);
  if (handle->timeout <= loop->time)
    return 0;

  // This is the key line: wait until the nearest timer is due
  diff = handle->timeout - loop->time;
  // The result cannot be greater than INT_MAX
  if (diff > INT_MAX)
    diff = INT_MAX;

  return diff;
}
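The two functions above boil down to a small decision table. The following JavaScript sketch mirrors their logic under stated assumptions: the timer heap is replaced by a plain array of due times, and the field names of the loop object (stopping, idleHandles, and so on) are invented for this illustration.

```javascript
const INT_MAX = 2 ** 31 - 1;

// Mirrors uv__next_timeout. `timers` is an array of due times that stands in
// for libuv's timer heap.
function nextTimeout(now, timers) {
  if (timers.length === 0) return -1;          // no timers: block indefinitely
  const earliest = Math.min(...timers);
  if (earliest <= now) return 0;               // a timer is already due
  return Math.min(earliest - now, INT_MAX);    // wait until the nearest timer
}

// Mirrors uv_backend_timeout. The field names are hypothetical.
function backendTimeout(loop) {
  if (loop.stopping) return 0;
  if (!loop.hasActiveHandles && !loop.hasActiveReqs) return 0;
  if (loop.idleHandles > 0) return 0;
  if (loop.pendingCallbacks > 0) return 0;
  if (loop.closingHandles > 0) return 0;
  return nextTimeout(loop.now, loop.timers);
}

const base = {
  stopping: false, hasActiveHandles: true, hasActiveReqs: false,
  idleHandles: 0, pendingCallbacks: 0, closingHandles: 0,
  now: 1000, timers: [1250, 1100],
};

console.log(backendTimeout(base));                              // 100
console.log(backendTimeout({ ...base, timers: [] }));           // -1
console.log(backendTimeout({ ...base, pendingCallbacks: 1 }));  // 0
```

With the nearest timer due at 1100 and the clock at 1000, the poll step may block for 100 ms. With no timers it may block indefinitely, and with any queued work it must not block at all.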

Once the wait is over, the loop enters the check phase, then the closing_handles phase, and one iteration of the event loop is complete. Since this is a source code analysis, I won't go into more detail here; see the official documentation for the rest.

Summary

1. Node.js interacts with the operating system. The methods we call in JavaScript are eventually passed to the C/C++ level through process.binding, where the actual operations are performed.

2. The so-called single thread of Node.js is only the main thread. Network requests and other asynchronous tasks are handed over to libuv, which carries them out using the operating system's non-blocking I/O and an internal thread pool. The main thread is only responsible for scheduling, and the event loop keeps driving event execution.

3. Node.js can handle high concurrency with a single thread because of the event loop mechanism in the libuv layer and the underlying thread pool implementation.

4. The event loop runs on the main thread and continuously reads events from the event queue, driving the execution of all asynchronous callbacks. The event loop has 6 phases in total, and each phase has its own task queue. When all phases have run once in order, the event loop has completed one tick.
