Let’s talk in detail about how JavaScript affects DOM tree construction

Document Object Model (DOM)

The Document Object Model (DOM) connects a web page to a script or programming language. The DOM model represents a document with a logical tree. Each branch of the tree ends at a node, and each node contains objects. DOM methods allow programmatic access to the tree, thereby changing the structure, style, and content of the document. Nodes can be associated with event handlers, which are executed once an event is triggered.
The HTML byte stream that the network delivers cannot be understood directly by the rendering engine, so it must be converted into an internal structure that the engine can work with. This structure is the DOM, which provides a structured representation of the HTML document. In the rendering engine, the DOM serves three purposes:

  • From the perspective of the page, the DOM is the basic data structure from which the page is generated.
  • From the perspective of JavaScript, the DOM provides the interface that scripts operate on. Through this interface, JavaScript can access the DOM structure and change the structure, style, and content of the document.
  • From a security perspective, the DOM is a line of defense: some unsafe content is filtered out during the DOM parsing phase.

In short, the DOM is the internal data structure that represents HTML. It connects web pages to JavaScript and filters out some unsafe content.

DOM and JavaScript

The DOM is not a programming language, but without it JavaScript would have no concept or model of web pages, XML pages, and the elements they contain. Every element in a document (the document as a whole, the head, tables within the document, table headers, and text within table cells) is part of that document's DOM, so all of them can be accessed and manipulated through the DOM with a scripting language such as JavaScript.

Initially, JavaScript and DOM were intertwined, but they eventually evolved into two separate entities. JavaScript can access and manipulate content stored in the DOM, so we can write this approximate equation:

API (web or XML page) = DOM + JS (scripting language)

How the DOM tree is generated

Inside the rendering engine, a module called the HTML parser is responsible for converting the HTML byte stream into the DOM structure.
The HTML parser does not wait for the entire document to load before it starts. It parses the document as it arrives: however much data the network process has loaded, the parser parses that much.

Process: After the network process receives the response headers, it determines the file type from the Content-Type field. For example, if the value of Content-Type is "text/html", the browser treats the response as an HTML file, selects the corresponding parsing engine, and then selects or creates a rendering process for the request. Once the rendering process is ready, a shared data pipeline is established between the network process and the rendering process. As the network process receives data, it puts the data into this pipeline, and the rendering process continuously reads from the other end and feeds what it reads to the HTML parser.
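The first step of this process can be sketched in a few lines. This is only an illustration: the function name chooseHandler and the labels it returns are made up for this example, and a real browser considers more than this single header.

```javascript
// Hypothetical helper: decides how a response should be handled,
// looking only at the Content-Type response header.
function chooseHandler(headers) {
  // Header names are case-insensitive, so normalize them first.
  const normalized = {};
  for (const [name, value] of Object.entries(headers)) {
    normalized[name.toLowerCase()] = value;
  }
  // Drop parameters such as "; charset=utf-8" and normalize the case.
  const contentType = (normalized["content-type"] || "")
    .split(";")[0]
    .trim()
    .toLowerCase();

  if (contentType === "text/html") {
    return "html-parser"; // the byte stream will be fed to the HTML parser
  }
  return "other"; // any other handling is not modeled in this sketch
}

console.log(chooseHandler({ "Content-Type": "text/html; charset=utf-8" }));
console.log(chooseHandler({ "content-type": "application/octet-stream" }));
```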

You can imagine this pipeline as a "water pipe". The byte stream received by the network process pours into this "water pipe" like water, and the other end of the "water pipe" is the HTML parser of the rendering process, which dynamically receives the byte stream and parses it into DOM.

As can be seen from the figure, the conversion of byte stream to DOM requires three stages.

Three stages of parsing HTML

In the first stage, the byte stream is converted into Tokens through a tokenizer.

As with lexical analysis in a compiler, parsing HTML starts by converting the byte stream into tokens through a tokenizer. The tokens are divided into tag tokens and text tokens. The tokens generated by lexical analysis of HTML code are shown in the figure below:

As can be seen from the figure, Tag Token is divided into StartTag and EndTag.
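To make the token types concrete, here is a minimal tokenizer sketch. It assumes heavily simplified markup (no attributes, comments, doctype, or entities), and the function name tokenize and the sample markup are made up for this example. A real HTML tokenizer is a much larger state machine.

```javascript
// Simplified tokenizer: splits markup into StartTag, EndTag and Text tokens.
// Assumption: tags have no attributes and whitespace-only text is ignored.
function tokenize(html) {
  const tokens = [];
  // Matches either a tag such as <div> or </div>, or a run of text up to the next "<".
  const pattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)\s*>|([^<]+)/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    const [, slash, tagName, text] = match;
    if (tagName !== undefined) {
      tokens.push({ type: slash ? "EndTag" : "StartTag", name: tagName.toLowerCase() });
    } else if (text.trim() !== "") {
      tokens.push({ type: "Text", value: text.trim() });
    }
  }
  return tokens;
}

// Illustrative input, not taken from a real page.
const sampleTokens = tokenize("<html><body><div>1</div><div>test</div></body></html>");
for (const token of sampleTokens) {
  console.log(token.type, token.name || token.value);
}
```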

The second stage is to parse the Token into a DOM node

The HTML parser maintains a Token stack structure, which is mainly used to calculate the parent-child relationship between nodes. The Tokens generated in the first stage will be pushed into this stack in order. The specific processing rules are as follows:

  • If the tokenizer produces a StartTag token, the token is pushed onto the stack, and the HTML parser creates a DOM node for it and adds the node to the DOM tree. Its parent is the node that corresponds to the element directly beneath it on the stack.
  • If the tokenizer produces a text token, a text node is created and added to the DOM tree. The text token is not pushed onto the stack, and its parent is the DOM node that corresponds to the token currently at the top of the stack.
  • If the tokenizer produces an EndTag token, such as EndTag div, the HTML parser checks whether the element at the top of the token stack is StartTag div. If so, it pops StartTag div from the stack, indicating that parsing of the div element is complete.

The new tokens generated by the tokenizer are pushed and popped continuously, and the whole parsing process continues until the tokenizer completes tokenization of all byte streams.
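The three rules above can be sketched as a small tree builder. This is a simplified model, not how a browser is implemented: the stack here holds the created nodes rather than the tokens, the input is assumed to be well nested, and the names buildTree and exampleTokens are made up for this example.

```javascript
// Builds a tree from a list of tokens using a stack of open elements.
// Assumption: tokens are properly nested; real parsers also recover from
// mismatched tags, which is not modeled here.
function buildTree(tokens) {
  const documentNode = { name: "document", children: [] };
  const stack = [documentNode]; // the document sits at the bottom of the stack

  for (const token of tokens) {
    const parent = stack[stack.length - 1]; // node for the current stack top
    if (token.type === "StartTag") {
      const element = { name: token.name, children: [] };
      parent.children.push(element); // added to the tree immediately
      stack.push(element); // stays open until its EndTag arrives
    } else if (token.type === "Text") {
      // Text nodes are attached to the tree but never pushed onto the stack.
      parent.children.push({ name: "#text", value: token.value });
    } else if (token.type === "EndTag") {
      if (parent.name !== token.name) {
        throw new Error(`Unexpected </${token.name}> while <${parent.name}> is open`);
      }
      stack.pop(); // the element is complete
    }
  }
  return documentNode;
}

// Illustrative token list for <html><body><div>1</div><div>test</div></body></html>.
const exampleTokens = [
  { type: "StartTag", name: "html" },
  { type: "StartTag", name: "body" },
  { type: "StartTag", name: "div" },
  { type: "Text", value: "1" },
  { type: "EndTag", name: "div" },
  { type: "StartTag", name: "div" },
  { type: "Text", value: "test" },
  { type: "EndTag", name: "div" },
  { type: "EndTag", name: "body" },
  { type: "EndTag", name: "html" },
];
const tree = buildTree(exampleTokens);
console.log(JSON.stringify(tree, null, 2));
```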

The third stage is to add DOM nodes to the DOM tree

Add the created DOM node to the document to form a DOM tree.

Detailed explanation of HTML parsing process

When the HTML parser starts working, it creates an empty DOM structure with the document as the root by default, and pushes a StartTag document token to the bottom of the stack. Then the first StartTag html token produced by the tokenizer is pushed onto the stack, and an html DOM node is created and added to the document, as shown in the following figure.

Then, StartTag body and StartTag div are processed in the same way. The state of the token stack and the DOM is shown in the following figure:

The next thing parsed is the text token of the first div. The rendering engine creates a text node for the token and adds that node to the DOM. Its parent is the node corresponding to the element at the top of the token stack, as shown in the following figure:

Next, the tokenizer produces the first EndTag div. At this point, the HTML parser checks whether the element at the top of the stack is StartTag div. If it is, StartTag div is popped from the top of the stack, as shown in the following figure.

Following the same rules, the final result is shown in the figure below:
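The stack states described in this walkthrough can also be printed as text. The sketch below only tracks the stack, not the DOM, and the text value "1" is a placeholder chosen for this example. The function name traceStack is also made up.

```javascript
// Records the contents of the token stack after each token is processed.
function traceStack(tokens) {
  const stack = ["document"]; // StartTag document sits at the bottom
  const snapshots = [];
  for (const token of tokens) {
    if (token.type === "StartTag") {
      stack.push(token.name);
    } else if (token.type === "EndTag" && stack[stack.length - 1] === token.name) {
      stack.pop();
    }
    // Text tokens leave the stack unchanged.
    const label = token.type === "Text" ? `Text "${token.value}"` : `${token.type} ${token.name}`;
    snapshots.push(`${label} -> [${stack.join(", ")}]`);
  }
  return snapshots;
}

const walkthrough = traceStack([
  { type: "StartTag", name: "html" },
  { type: "StartTag", name: "body" },
  { type: "StartTag", name: "div" },
  { type: "Text", value: "1" }, // placeholder text
  { type: "EndTag", name: "div" },
]);
console.log(walkthrough.join("\n"));
```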

With the walkthrough above, you should now understand how the DOM is generated. In a real production environment, however, an HTML file references CSS and JavaScript as well as images, audio, video, and other resources, so processing is considerably more complicated than in this demo. Still, once this simple case is understood, more complex scenarios become easier to analyze.

How JavaScript affects DOM generation

If the page contains an inline JavaScript script or references an external script file, the parsing process differs slightly from the one described above.
Before the script tag, all the parsing processes are the same as described before. However, when parsing to the script tag, the rendering engine determines that this is a script. At this time, the HTML parser will suspend the DOM parsing and the JavaScript engine will intervene because the JavaScript script may need to modify the currently generated DOM structure.

If the script is loaded from a JavaScript file, the file must be downloaded first. The download deserves special attention, because downloading a JavaScript file blocks DOM parsing, and the download is usually time-consuming and depends on factors such as network conditions and the size of the file.

If the script is a directly embedded JavaScript script, it is executed directly.

If the JavaScript script modifies the content of the div in the DOM, the content of the parsed div node will also be modified after the script is executed. After the script is executed, the HTML parser resumes the parsing process and continues to parse the subsequent content until the final DOM is generated.
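The pause-and-resume behavior can be sketched with a toy model. Here the "DOM" is just a flat list of parsed elements, and an inline script is represented by a function. The names parseWithScripts, page, and seenByScript are made up for this example.

```javascript
// Toy model: each source item is either an element or an inline script.
function parseWithScripts(sourceItems) {
  const dom = [];
  for (const item of sourceItems) {
    if (item.kind === "script") {
      // Parsing is paused here: the script runs against the DOM built so far.
      item.run(dom);
    } else {
      dom.push({ tag: item.tag, text: item.text });
    }
    // Once the script returns, the loop continues with the next item.
  }
  return dom;
}

const seenByScript = [];
const page = [
  { kind: "element", tag: "div", text: "1" },
  {
    kind: "script",
    run(dom) {
      seenByScript.push(dom.length); // only the first div exists at this point
      dom[0].text = "changed by script";
    },
  },
  { kind: "element", tag: "div", text: "test" },
];

const finalDom = parseWithScripts(page);
console.log(seenByScript); // number of elements visible to the script
console.log(finalDom.map((node) => node.text));
```

In this model the script sees one element, because the second div has not been parsed yet when the script runs.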

Another case arises when the JavaScript code contains statements that modify the CSS styles of the page. Such statements operate on the CSSOM, so all CSS that appears before the script must be parsed before the script runs. Therefore, if the page references an external CSS file, the browser must wait for that file to be downloaded and parsed into a CSSOM before it executes the JavaScript.

Until the JavaScript code has been parsed, the engine cannot know whether the script manipulates the CSSOM. Therefore, when the rendering engine encounters a JavaScript script, it waits for the preceding CSS files to be downloaded and parsed before executing the script, regardless of whether the script actually touches the CSSOM. So JavaScript scripts are dependent on style sheets.

Through the above analysis, we know that JavaScript will block DOM generation, and style files will block JavaScript execution, so in actual projects, we need to pay attention to JavaScript files and style sheet files. Improper use will affect page performance.
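The two dependencies can be expressed as a small timeline calculation. This is a hypothetical model with made-up numbers: it assumes a single blocking external script preceded by external stylesheets, with all times in milliseconds, and the function name blockingTimeline is invented for this example.

```javascript
// Hypothetical timeline model for one blocking script.
function blockingTimeline({ scriptReachedAt, scriptDownloadedAt, stylesheetsReadyAt, scriptRunTime }) {
  // The script cannot start before the parser reaches it, before it is
  // downloaded, or before every earlier stylesheet is parsed into the CSSOM.
  const scriptStartsAt = Math.max(scriptReachedAt, scriptDownloadedAt, ...stylesheetsReadyAt);
  const parsingResumesAt = scriptStartsAt + scriptRunTime;
  return {
    scriptStartsAt,
    parsingResumesAt,
    parserBlockedFor: parsingResumesAt - scriptReachedAt,
  };
}

const timeline = blockingTimeline({
  scriptReachedAt: 50, // parser hits the script tag
  scriptDownloadedAt: 120, // script file has arrived
  stylesheetsReadyAt: [300], // a slow stylesheet finishes parsing
  scriptRunTime: 20,
});
console.log(timeline);
```

With these numbers the script is ready at 120 ms but cannot start until the stylesheet is ready at 300 ms, so the parser stays blocked for 270 ms in total.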

Optimization during parsing

To reduce page blocking, Chrome has made many optimizations, one of the most important being pre-parsing. When the rendering engine receives the byte stream, it starts a pre-parsing thread that scans the HTML for references to JavaScript, CSS, and other related files. When it finds such references, the pre-parsing thread starts downloading those files in advance.
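The idea behind pre-parsing can be illustrated with a rough scan of the raw markup. This is not how Chrome implements it. The sketch assumes quoted attribute values, looks only at script src and stylesheet link href, and the function name findPreloadCandidates and the file name theme.css are made up for this example.

```javascript
// Rough scan for resources that could be fetched before the parser reaches them.
function findPreloadCandidates(html) {
  const urls = [];
  const tagPattern = /<(script|link)\b([^>]*)>/gi;
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const tagName = match[1].toLowerCase();
    const attributes = match[2];
    if (tagName === "link" && !/\brel\s*=\s*["']stylesheet["']/i.test(attributes)) {
      continue; // other <link> types are ignored in this sketch
    }
    const attributeName = tagName === "script" ? "src" : "href";
    const value = new RegExp(`\\b${attributeName}\\s*=\\s*["']([^"']+)["']`, "i").exec(attributes);
    if (value) {
      urls.push(value[1]); // inline scripts have no src and are skipped
    }
  }
  return urls;
}

const markup = `
<link rel="stylesheet" href="theme.css">
<div>1</div>
<script src='foo.js'></script>
<script>console.log("inline")</script>
<div>test</div>`;
const candidates = findPreloadCandidates(markup);
console.log(candidates);
```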

Back to DOM parsing: we know that JavaScript blocks DOM parsing, but several strategies reduce the impact, such as using a CDN to speed up the loading of JavaScript files and minifying or compressing the files to reduce their size. In addition, if a JavaScript file contains no DOM-related code, you can have it loaded asynchronously by marking the script tag with async or defer. The usage is as follows:

<script async type="text/javascript" src="foo.js"></script>
<script defer type="text/javascript" src="foo.js"></script>

Although both async and defer load the script without blocking the parser, they differ in when the script runs. A script marked with async is executed as soon as it finishes loading, regardless of how far parsing has progressed. A script marked with defer is executed only after the document has been parsed, in document order, and before the DOMContentLoaded event fires.
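The difference in timing can be sketched with a small ordering model. It is a simplification: script execution is assumed to take no time, and the function name executionOrder, the file names, and the timings are made up for this example.

```javascript
// Hypothetical model: each script has a mode and the time its download finishes.
// parsingDoneAt is the time at which the HTML parser reaches the end of the document.
function executionOrder(scripts, parsingDoneAt) {
  const events = [];
  let deferClock = parsingDoneAt;
  scripts.forEach((script, index) => {
    if (script.mode === "async") {
      // Runs as soon as its own download completes.
      events.push({ name: script.name, at: script.downloadedAt, rank: index });
    } else if (script.mode === "defer") {
      // Runs in document order, once parsing is done and the file has arrived.
      deferClock = Math.max(deferClock, script.downloadedAt);
      events.push({ name: script.name, at: deferClock, rank: index });
    }
  });
  // DOMContentLoaded fires after the last deferred script.
  events.push({ name: "DOMContentLoaded", at: deferClock, rank: scripts.length });
  events.sort((a, b) => a.at - b.at || a.rank - b.rank);
  return events.map((event) => event.name);
}

const fastAsync = executionOrder(
  [
    { name: "a.js", mode: "defer", downloadedAt: 30 },
    { name: "b.js", mode: "async", downloadedAt: 80 },
    { name: "c.js", mode: "defer", downloadedAt: 20 },
  ],
  100
);
const slowAsync = executionOrder(
  [
    { name: "a.js", mode: "defer", downloadedAt: 30 },
    { name: "b.js", mode: "async", downloadedAt: 150 },
    { name: "c.js", mode: "defer", downloadedAt: 20 },
  ],
  100
);
console.log(fastAsync.join(" -> "));
console.log(slowAsync.join(" -> "));
```

In both runs the deferred scripts keep their document order and finish before DOMContentLoaded, while the position of the async script depends only on when its download completes.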

Summary

First, we introduced how the DOM is generated. Then, based on that process, we analyzed how JavaScript affects DOM generation, and saw that both CSS and JavaScript can delay it.

DOM generation process

Parsing HTML requires converting the byte stream into Tokens through a tokenizer.

If the tokenizer produces a StartTag token, the token is pushed onto the stack, and the HTML parser creates a DOM node for it and adds the node to the DOM tree. If the tokenizer produces a text token, a text node is created and added to the DOM tree. If the tokenizer produces an EndTag token, such as EndTag div, the HTML parser checks whether the element at the top of the token stack is StartTag div. If it is, StartTag div is popped from the stack, indicating that parsing of the div element is complete.

The new tokens generated by the tokenizer are pushed and popped continuously, and the whole parsing process continues until the tokenizer completes tokenization of all byte streams.

If JavaScript is encountered during parsing, HTML parsing is paused. If the script is loaded from an external file, it is downloaded first and then executed. Before execution, any preceding CSS is also parsed to generate the CSSOM. Parsing then resumes, and this continues until the entire DOM is constructed.
