Overview

The emergence of Node.js lets front-end engineers move beyond the client and work on the server. A new runtime environment naturally brings new modules, new capabilities, and even new ways of thinking. This article walks through the module design ideas of Node.js (hereinafter referred to as Node) and analyzes some of the core source code.

CommonJS Specification

Node initially followed the CommonJS specification to implement its module system, while making some customizations that differ from the specification. The CommonJS specification is a module format defined to solve JavaScript's scope problem: it lets each module execute in its own namespace. The specification requires that modules export variables or functions through module.exports, import the output of other modules into the current module scope through require(), and follow these conventions:
Node's implementation of the CommonJS specification

Node defines the module.require function inside each module, plus the global require function, to load modules. In the Node module system, each file is treated as a separate module. When a module is loaded, it is initialized as an instance of the Module object. The basic implementation and properties of the Module object are as follows:

    function Module(id = "", parent) {
      // Module id, usually the absolute path of the module
      this.id = id;
      this.path = path.dirname(id);
      this.exports = {};
      // The caller of the current module
      this.parent = parent;
      updateChildren(parent, this, false);
      this.filename = null;
      // Whether the module has finished loading
      this.loaded = false;
      // Modules referenced by the current module
      this.children = [];
    }

Each module exposes its exports property as its public interface.

Module exports and imports

In Node, you can use the module.exports object to export a variable or function as a whole, or you can attach the variable or function to be exported as a property of the exports object. The code is as follows:

    // 1. Use exports: typically used to export utility functions or constants
    exports.name = 'xiaoxiang';
    exports.add = (a, b) => a + b;

    // 2. Use module.exports: export an entire object or a single function
    ...
    module.exports = { add, minus }

A module is imported through the global require function, which accepts a module name, a relative path, or an absolute path. When the module file suffix is js / json / node, the suffix can be omitted, as shown in the following code:

    // Import modules
    const { add, minus } = require('./module');
    const a = require('/usr/app/module');
    const http = require('http');

Note: The exports variable is available in the module's file-level scope and is assigned the value of module.exports before the module is executed.

    exports.name = 'test';
    console.log(module.exports.name); // test

    module.exports.name = 'test';
    console.log(exports.name); // test

If exports is assigned a new value, it is no longer bound to module.exports, and vice versa:

    exports = { name: 'test' };
    console.log(module.exports.name, exports.name); // undefined, test

When the module.exports property is completely replaced by a new object, it is usually necessary to reassign exports as well:

    module.exports = exports = { name: 'test' };
    console.log(module.exports.name, exports.name) // test, test

Module system implementation analysis

Module positioning

The following is the implementation of the require function:

    // require entry function
    Module.prototype.require = function(id) {
      //...
      requireDepth++;
      try {
        return Module._load(id, this, /* isMain */ false); // Load the module
      } finally {
        requireDepth--;
      }
    };

The code above receives the given module path; requireDepth records the depth of module loading. The Module class method _load implements the main logic of loading a module in Node. Let's walk through the source of Module._load. For convenience, comments have been added to the code.

    Module._load = function(request, parent, isMain) {
      // Step 1: Resolve the full path of the module
      const filename = Module._resolveFilename(request, parent, isMain);

      // Step 2: Load the module, which is divided into three cases.
      // Case 1: If the module is cached, return its exports property directly
      const cachedModule = Module._cache[filename];
      if (cachedModule !== undefined) return cachedModule.exports;

      // Case 2: Load a built-in module
      const mod = loadNativeModule(filename, request);
      if (mod && mod.canBeRequiredByUsers) return mod.exports;

      // Case 3: Create a new module instance
      const module = new Module(filename, parent);
      // Cache the module instance before its file is loaded
      Module._cache[filename] = module;

      // Step 3: Load the module file
      module.load(filename);

      // Step 4: Return the exported object
      return module.exports;
    };

Loading strategy

The code above contains a lot of information. We focus on the following questions.

What is the module's caching strategy? From the code above, we can see that the _load function uses a different loading strategy for each of three situations, namely: a module that is already cached (its exports are returned directly), a built-in module, and a module that has to be created, cached, and loaded from its file.
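The caching behavior described above can be observed from user code. The following is a minimal sketch, assuming a throwaway module file (the hypothetical counter.js, written to a temporary directory purely for this demonstration) and using only the public require.cache and require.resolve APIs:

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical throwaway module, created only for this demonstration.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "cache-demo-"));
const file = path.join(dir, "counter.js");
fs.writeFileSync(file, "exports.hits = 0;");

const first = require(file);  // executes the file and caches the Module instance
first.hits += 1;
const second = require(file); // served from the cache, the file is not executed again

const sameObject = first === second;

// The cache is keyed by the fully resolved filename.
const cacheKey = require.resolve(file);
const isCached = cacheKey in require.cache;

// Deleting the cache entry forces a fresh load on the next require().
delete require.cache[cacheKey];
const third = require(file);
const reloaded = third !== first && third.hits === 0;

console.log({ sameObject, isCached, reloaded });
```

Because the second require() returns the very same exports object, the mutation made through first is visible through second. Only after the cache entry is deleted does the file run again.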
How does Module._resolveFilename(request, parent, isMain) resolve the file name? Let's look at how this class method is defined:

    Module._resolveFilename = function(request, parent, isMain, options) {
      if (NativeModule.canBeRequiredByUsers(request)) {
        // Built-in modules take priority
        return request;
      }
      let paths;
      // options is used by require.resolve; options.paths specifies the search paths
      if (typeof options === "object" && options !== null) {
        if (ArrayIsArray(options.paths)) {
          const isRelative =
            request.startsWith("./") ||
            request.startsWith("../") ||
            (isWindows && request.startsWith(".\\")) ||
            request.startsWith("..\\");
          if (isRelative) {
            paths = options.paths;
          } else {
            const fakeParent = new Module("", null);
            paths = [];
            for (let i = 0; i < options.paths.length; i++) {
              const path = options.paths[i];
              fakeParent.paths = Module._nodeModulePaths(path);
              const lookupPaths = Module._resolveLookupPaths(request, fakeParent);
              for (let j = 0; j < lookupPaths.length; j++) {
                if (!paths.includes(lookupPaths[j])) paths.push(lookupPaths[j]);
              }
            }
          }
        } else if (options.paths === undefined) {
          paths = Module._resolveLookupPaths(request, parent);
        } else {
          //...
        }
      } else {
        // Compute the directories in which the module may exist
        paths = Module._resolveLookupPaths(request, parent);
      }
      // Find the module path from the request, the array of lookup directories,
      // and whether this is the entry module
      const filename = Module._findPath(request, paths, isMain);
      if (!filename) {
        const requireStack = [];
        for (let cursor = parent; cursor; cursor = cursor.parent) {
          requireStack.push(cursor.filename || cursor.id);
        }
        // Module not found, throw an exception (is this a familiar error?)
        let message = `Cannot find module '${request}'`;
        if (requireStack.length > 0) {
          message = message + "\nRequire stack:\n- " + requireStack.join("\n- ");
        }
        const err = new Error(message);
        err.code = "MODULE_NOT_FOUND";
        err.requireStack = requireStack;
        throw err;
      }
      // Finally return the full path including the file name
      return filename;
    };

The most notable feature of the code above is its use of the _resolveLookupPaths and _findPath methods.

_resolveLookupPaths: Takes a module name and the calling module, and returns the array of directories that _findPath will search.

    // Build the array of directories to search for the module file
    Module._resolveLookupPaths = function(request, parent) {
      if (NativeModule.canBeRequiredByUsers(request)) {
        debug("looking for %j in []", request);
        return null;
      }
      // If it is not a relative path
      if (
        request.charAt(0) !== "." ||
        (request.length > 1 &&
          request.charAt(1) !== "." &&
          request.charAt(1) !== "/" &&
          (!isWindows || request.charAt(1) !== "\\"))
      ) {
        /**
         * Check the node_modules folders.
         * modulePaths contains the user directories, the directories specified by the
         * NODE_PATH environment variable, and the global Node installation directory.
         */
        let paths = modulePaths;
        if (parent != null && parent.paths && parent.paths.length) {
          // The parent module's paths are searched first, walking up the directory tree,
          // before falling back to modulePaths
          paths = parent.paths.concat(paths);
        }
        return paths.length > 0 ? paths : null;
      }
      // In the REPL, search ./, ./node_modules, and modulePaths in turn
      if (!parent || !parent.id || !parent.filename) {
        const mainPaths = ["."].concat(Module._nodeModulePaths("."), modulePaths);
        return mainPaths;
      }
      // For a relative path, the search path is the parent module's directory
      const parentDir = [path.dirname(parent.filename)];
      return parentDir;
    };

_findPath: Finds and returns the filename of the target module, searching within the directories returned by the function above.
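The lookup directories computed by _resolveLookupPaths can be inspected from user code through the public require.resolve.paths function. The following is a small sketch of the three branches discussed above; the package name some-hypothetical-package and the path ./sibling are made up for illustration and do not need to exist:

```javascript
const path = require("path");

// Bare name: node_modules directories walking up from the caller,
// followed by the global folders.
const barePaths = require.resolve.paths("some-hypothetical-package");

// Relative request: the lookup starts from the caller's own directory.
const relativePaths = require.resolve.paths("./sibling");

// Built-in module: no lookup is needed, so null is returned.
const builtinPaths = require.resolve.paths("http");

console.log(barePaths);
console.log(relativePaths);
console.log(builtinPaths);
```

The first array mirrors the parent.paths.concat(modulePaths) branch, which is why a package installed in an ancestor directory's node_modules folder can still be found.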

    // Find the real path of the module from the request, the array of lookup
    // directories, and whether this is the top-level module
    Module._findPath = function(request, paths, isMain) {
      const absoluteRequest = path.isAbsolute(request);
      if (absoluteRequest) {
        // Absolute path: locate the specific module directly
        paths = [""];
      } else if (!paths || paths.length === 0) {
        return false;
      }
      const cacheKey =
        request + "\x00" + (paths.length === 1 ? paths[0] : paths.join("\x00"));
      // Check the path cache
      const entry = Module._pathCache[cacheKey];
      if (entry) return entry;
      let exts;
      let trailingSlash =
        request.length > 0 &&
        request.charCodeAt(request.length - 1) === CHAR_FORWARD_SLASH; // '/'
      if (!trailingSlash) {
        trailingSlash = /(?:^|\/)\.?\.$/.test(request);
      }
      // For each path
      for (let i = 0; i < paths.length; i++) {
        const curPath = paths[i];
        if (curPath && stat(curPath) < 1) continue;
        const basePath = resolveExports(curPath, request, absoluteRequest);
        let filename;
        const rc = stat(basePath);
        if (!trailingSlash) {
          if (rc === 0) {
            // stat returned 0, so this is a file
            if (!isMain) {
              if (preserveSymlinks) {
                // Tell the module loader to preserve symbolic links
                // when resolving and caching modules
                filename = path.resolve(basePath);
              } else {
                // Do not preserve symbolic links
                filename = toRealPath(basePath);
              }
            } else if (preserveSymlinksMain) {
              filename = path.resolve(basePath);
            } else {
              filename = toRealPath(basePath);
            }
          }
          if (!filename) {
            if (exts === undefined) exts = ObjectKeys(Module._extensions);
            // Try each registered suffix
            filename = tryExtensions(basePath, exts, isMain);
          }
        }
        if (!filename && rc === 1) {
          /**
           * stat returned 1 and no file was matched, so this is a directory.
           * Try to load the file specified by the main field of the package.json
           * in that directory. If that fails, try index[.js, .node, .json].
           */
          if (exts === undefined) exts = ObjectKeys(Module._extensions);
          filename = tryPackage(basePath, exts, isMain, request);
        }
        if (filename) {
          // The file exists: add the file name to the cache
          Module._pathCache[cacheKey] = filename;
          return filename;
        }
      }
      const selfFilename = trySelf(paths, exts, isMain, trailingSlash, request);
      if (selfFilename) {
        // Set the path cache
        Module._pathCache[cacheKey] = selfFilename;
        return selfFilename;
      }
      return false;
    };

Module loading

Standard module processing

From the code above, we can see that when the module is a directory, the tryPackage function is executed. The following is a brief analysis of its implementation.

    // Try to load a standard module
    function tryPackage(requestPath, exts, isMain, originalPath) {
      const pkg = readPackageMain(requestPath);
      if (!pkg) {
        // If package.json has no main entry, index is used as the default entry file
        return tryExtensions(path.resolve(requestPath, "index"), exts, isMain);
      }
      const filename = path.resolve(requestPath, pkg);
      let actual =
        tryFile(filename, isMain) ||
        tryExtensions(filename, exts, isMain) ||
        tryExtensions(path.resolve(filename, "index"), exts, isMain);
      //...
      return actual;
    }

    // Read the main field in package.json
    function readPackageMain(requestPath) {
      const pkg = readPackage(requestPath);
      return pkg ? pkg.main : undefined;
    }

The readPackage function is responsible for reading and parsing the contents of the package.json file:

    function readPackage(requestPath) {
      const jsonPath = path.resolve(requestPath, "package.json");
      const existing = packageJsonCache.get(jsonPath);
      if (existing !== undefined) return existing;
      // Read the package.json file (through libuv's uv_fs_open) and cache the result
      const json = internalModuleReadJSON(path.toNamespacedPath(jsonPath));
      if (json === undefined) {
        // No package.json found: cache the negative result
        packageJsonCache.set(jsonPath, false);
        return false;
      }
      //...
      try {
        const parsed = JSONParse(json);
        const filtered = {
          name: parsed.name,
          main: parsed.main,
          exports: parsed.exports,
          type: parsed.type
        };
        packageJsonCache.set(jsonPath, filtered);
        return filtered;
      } catch (e) {
        //...
      }
    }

Together, these two code snippets explain the role of the package.json file, how a module's entry point is configured (the main field in package.json), and why index is the default file of a module. The specific process is shown in the figure below:

Module file processing

After the corresponding module has been located, how is it loaded and parsed? The following is the relevant code:

    Module.prototype.load = function(filename) {
      // Ensure that the module has not been loaded yet
      assert(!this.loaded);
      this.filename = filename;
      // Compute the node_modules lookup paths starting from the module's directory
      this.paths = Module._nodeModulePaths(path.dirname(filename));
      const extension = findLongestRegisteredExtension(filename);
      //...
      // Run the handler for the specific file extension, such as js / json / node
      Module._extensions[extension](this, filename);
      // Mark the module as loaded
      this.loaded = true;
      // ... ESM support omitted
    };

Suffix processing

As we can see, Node.js loads a file differently depending on its suffix.
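The per-suffix dispatch can be observed from user code. The following is a minimal sketch, assuming a hypothetical settings.json file written to a temporary directory; it relies on require.extensions, which Node documents as deprecated but which still exposes the registered suffix handlers:

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// The registered suffix handlers (require.extensions is deprecated but still available).
const registered = Object.keys(require.extensions);

// Hypothetical JSON file, created only for this demonstration.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "ext-demo-"));
fs.writeFileSync(path.join(dir, "settings.json"), JSON.stringify({ port: 8080 }));

// The suffix is omitted on purpose: the resolver tries each registered
// suffix until a file is found, then the .json handler parses the content.
const settings = require(path.join(dir, "settings"));

console.log(registered);
console.log(settings.port);
```

Since no settings.js exists in that directory, the lookup falls through to settings.json, and the value returned by require() is the parsed object rather than the raw text.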
The following is a brief analysis of .js, .json, and .node.

Files with the .js suffix are read through Node's built-in API fs.readFileSync.

    Module._extensions[".js"] = function(module, filename) {
      // Read the file content
      const content = fs.readFileSync(filename, "utf8");
      // Compile and execute the code
      module._compile(content, filename);
    };

The processing logic for files with the .json suffix is relatively simple. After reading the file content, JSONParse is executed to get the result.

    Module._extensions[".json"] = function(module, filename) {
      // Read the file directly as utf-8
      const content = fs.readFileSync(filename, "utf8");
      //...
      try {
        // Export the file content as a parsed JSON object
        module.exports = JSONParse(stripBOM(content));
      } catch (err) {
        //...
      }
    };

A file with the .node suffix is a native module implemented in C/C++ and is read by the process.dlopen function. process.dlopen calls the DLOpen function in the C++ code, and DLOpen calls uv_dlopen, which loads the .node file in much the same way that the OS loads system library files.

    Module._extensions[".node"] = function(module, filename) {
      //...
      return process.dlopen(module, path.toNamespacedPath(filename));
    };

From the three snippets above, we can see that only the .js suffix ends up executing the instance method _compile. Let's strip out some experimental features and debugging-related logic and briefly analyze this code.

Compile and execute

After the module is loaded, Node uses methods provided by the V8 engine to build a sandbox and execute the function code inside it.
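The core idea of this step is that the module's source text is wrapped in a function whose parameters are the five injected names. The following is a simplified sketch of that idea using the public vm module; the source string, the path /virtual/demo.js, and the fakeModule object are hypothetical stand-ins, and the real implementation does considerably more:

```javascript
const vm = require("vm");
const path = require("path");

// Source text of a hypothetical module.
const source = "exports.where = __filename; module.exports.sum = 1 + 2;";

// Wrap the source so that exports, require, module, __filename and __dirname
// become ordinary function parameters.
const wrapped =
  "(function (exports, require, module, __filename, __dirname) { " + source + "\n})";

// Compile the wrapper in the current context; the result is a plain function.
const fakeFilename = "/virtual/demo.js";
const compiled = vm.runInThisContext(wrapped, { filename: fakeFilename });

// Stand-in for a Module instance: only the exports property is needed here.
const fakeModule = { exports: {} };
compiled.call(
  fakeModule.exports, // value of `this` inside the module
  fakeModule.exports,
  require,
  fakeModule,
  fakeFilename,
  path.dirname(fakeFilename)
);

console.log(fakeModule.exports);
```

This also shows why exports and module.exports start out as the same object: both arguments passed to the wrapper point at fakeModule.exports.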
The _compile code is as follows:

    Module.prototype._compile = function(content, filename) {
      let moduleURL;
      let redirects;
      // Inject the public variables __dirname / __filename / module / exports / require
      // into the module, and compile the wrapper function
      const compiledWrapper = wrapSafe(filename, content, this);
      const dirname = path.dirname(filename);
      const require = makeRequireFunction(this, redirects);
      let result;
      const exports = this.exports;
      const thisValue = exports;
      const module = this;
      if (requireDepth === 0) statCache = new Map();
      //...
      // Execute the function that wraps the module code
      result = compiledWrapper.call(
        thisValue,
        exports,
        require,
        module,
        filename,
        dirname
      );
      hasLoadedAnyUserCJSModule = true;
      if (requireDepth === 0) statCache = null;
      return result;
    };

    // Core logic of injecting variables
    function wrapSafe(filename, content, cjsModuleInstance) {
      if (patched) {
        const wrapper = Module.wrap(content);
        // Run in the vm sandbox and return the result directly
        // (C++ side: env->SetProtoMethod(script_tmpl, "runInThisContext", RunInThisContext);)
        return vm.runInThisContext(wrapper, {
          filename,
          lineOffset: 0,
          displayErrors: true,
          // Dynamic import support
          importModuleDynamically: async specifier => {
            const loader = asyncESM.ESMLoader;
            return loader.import(specifier, normalizeReferrerURL(filename));
          }
        });
      }
      let compiled;
      try {
        compiled = compileFunction(
          content,
          filename,
          0,
          0,
          undefined,
          false,
          undefined,
          [],
          ["exports", "require", "module", "__filename", "__dirname"]
        );
      } catch (err) {
        //...
      }
      const { callbackMap } = internalBinding("module_wrap");
      callbackMap.set(compiled.cacheKey, {
        importModuleDynamically: async specifier => {
          const loader = asyncESM.ESMLoader;
          return loader.import(specifier, normalizeReferrerURL(filename));
        }
      });
      return compiled.function;
    }

In the code above, we can see that _compile calls the wrapSafe function, which injects the __dirname / __filename / module / exports / require public variables and calls the C++ runInThisContext method (located in the src/node_contextify.cc file) to build a sandbox environment for the module code to run in, returning the compiledWrapper function. Finally, the module is run through the compiledWrapper.call method.
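To tie the stages together (resolve, cache, load by suffix, compile and execute), the following is a deliberately simplified sketch of a loader. The name miniRequire and the files math.js and main.js are hypothetical; the sketch handles only relative or absolute paths with .js and .json suffixes and omits node_modules lookup, package.json handling, and native modules, so it should be read as an illustration of the flow rather than as Node's implementation:

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const vm = require("vm");

const cache = Object.create(null);

function miniRequire(request, parentDir) {
  // Step 1: resolve the request to a full filename, trying suffixes in order.
  const base = path.resolve(parentDir, request);
  const filename = [base, base + ".js", base + ".json"].find(
    (candidate) => fs.existsSync(candidate) && fs.statSync(candidate).isFile()
  );
  if (!filename) {
    const err = new Error(`Cannot find module '${request}'`);
    err.code = "MODULE_NOT_FOUND";
    throw err;
  }

  // Step 2: return the cached exports if this file was loaded before.
  if (cache[filename]) return cache[filename].exports;

  // The module is cached before its code runs, as in Module._load.
  const module = { id: filename, exports: {}, loaded: false };
  cache[filename] = module;

  // Step 3: load according to the suffix.
  const content = fs.readFileSync(filename, "utf8");
  if (filename.endsWith(".json")) {
    module.exports = JSON.parse(content);
  } else {
    // Step 4: wrap, compile and execute the module code.
    const wrapper = vm.runInThisContext(
      "(function (exports, require, module, __filename, __dirname) { " + content + "\n})",
      { filename }
    );
    const dirname = path.dirname(filename);
    const localRequire = (id) => miniRequire(id, dirname);
    wrapper.call(module.exports, module.exports, localRequire, module, filename, dirname);
  }
  module.loaded = true;
  return module.exports;
}

// Usage with two hypothetical files in a temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "mini-require-"));
fs.writeFileSync(path.join(dir, "math.js"), "exports.add = (a, b) => a + b;");
fs.writeFileSync(
  path.join(dir, "main.js"),
  "const { add } = require('./math'); module.exports = add(2, 3);"
);

const result = miniRequire("./main", dir);
console.log(result); // 5
```

Each module receives its own localRequire bound to its directory, which is what makes relative requests resolve against the requiring file rather than against the process working directory.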