The best explanation of HTTPS

Good morning everyone. It has been a while since I last posted an article.

In fact, I stayed at home for more than a month during the epidemic, so I had plenty of time and used it to write several new articles. However, most of them overlap with the content of my new book, and the epidemic has delayed the book's publication, so I cannot publish those articles yet, which makes me quite anxious. I hope the epidemic will end soon and everyone can return to normal life as soon as possible.

So today I will publish a technical article that has nothing to do with Android.

HTTPS is now used almost everywhere. Since major Internet companies such as Apple and Google began requiring HTTPS in their operating systems, browsers and other mainstream products, the countdown to the end of plain HTTP has officially begun.

In fact, client developers rarely need to do anything special for HTTPS, because the code is no different from writing HTTP requests. But precisely for this reason, many client developers do not really understand HTTPS. They only know that it is a secure, encrypted way of transmitting data over the network, and have no idea how it actually works.

So do client developers really need to understand HTTPS? I think they do. Knowing how HTTPS works helps you understand and solve some of the problems you run into at work more effectively. In addition, many companies like to ask HTTPS-related questions in interviews, and if you know nothing about it, you can easily be screened out at that point.

When I was learning about HTTPS, I read a lot of material online, but most of the articles were hard to follow, which makes many people afraid of HTTPS. I think that to understand how HTTPS works, you don't necessarily have to know every detail (many articles online are hard to understand precisely because they go into too much detail). You only need to grasp the overall workflow and figure out why it keeps network communication secure. So today I will bring you the easiest-to-understand explanation of HTTPS I can.

Before we officially start explaining https, we must first clarify two concepts: what is symmetric encryption and what is asymmetric encryption? These two concepts are basic knowledge in cryptography and are actually very easy to understand.

Symmetric encryption is relatively simple: the client and the server share the same key, which is used both to encrypt content and to decrypt it. Its advantage is that encryption and decryption are fast. Its weakness is security: both sides must hold the same secret, and the copy stored on the client is at risk of being stolen. Representative symmetric algorithms include AES and DES (DES is now considered insecure and survives mainly in legacy systems).
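To make the idea concrete, here is a minimal Python sketch. It is not AES or DES: toy_cipher is a made-up helper that XORs the data with a keystream derived from the key, which is only meant to show the defining property of symmetric encryption, namely that one shared key both encrypts and decrypts. The message and variable names are my own assumptions.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a keystream derived from the key (toy construction, not AES).

    XOR is its own inverse, so the same call both encrypts and decrypts.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


shared_key = secrets.token_bytes(16)  # the one key both sides must hold

ciphertext = toy_cipher(shared_key, b"hello server")  # client encrypts
plaintext = toy_cipher(shared_key, ciphertext)        # server decrypts with the same key

# Anyone without the shared key only gets garbage out of the ciphertext.
wrong_key_result = toy_cipher(secrets.token_bytes(16), ciphertext)

print(plaintext)
```

In real code you would use a vetted library implementation of AES rather than anything hand-rolled like this.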

Asymmetric encryption is a bit more complicated. It splits the key into a pair: a public key and a private key. The public key can be handed out to clients, while the private key is kept only on the server. Data encrypted with the public key can only be decrypted with the private key, and conversely, data encrypted with the private key can only be decrypted with the public key. The advantage is better security: whatever the client encrypts for the server can only be decrypted with the server's private key, so an eavesdropper who holds only the public key cannot read it. The disadvantage is that encryption and decryption are much slower than with symmetric encryption. Representative asymmetric algorithms include RSA and ElGamal.
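The two-way relationship between the keys is easiest to see with numbers. Below is a small sketch of textbook RSA using tiny primes that I picked for readability (61 and 53). Real keys are thousands of bits long and real systems add padding, so please treat this purely as an illustration; apply_key is a hypothetical helper name.

```python
# Textbook RSA with tiny primes. Illustration only: real keys are
# 2048 bits or more, and real systems add padding.
p, q = 61, 53
n = p * q                            # modulus shared by both keys
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (needs Python 3.8+)

public_key = (e, n)    # can be handed to anyone
private_key = (d, n)   # stays on the server


def apply_key(key, number):
    """Raise `number` to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


message = 65  # must be smaller than n in this toy setup

# Direction 1: encrypt with the public key, decrypt with the private key.
ciphertext = apply_key(public_key, message)
recovered = apply_key(private_key, ciphertext)

# Direction 2: encrypt with the private key, decrypt with the public key.
sealed = apply_key(private_key, message)
opened = apply_key(public_key, sealed)

print(ciphertext, recovered, opened)
```

The second direction looks useless for secrecy, since everyone has the public key, but it is exactly what certificates rely on later in this article.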

With these two concepts in hand, we can start learning HTTPS. Let me pose a question up front, one that also comes up often in interviews: to keep data transmission secure, does HTTPS use symmetric encryption or asymmetric encryption?

You will know the answer by the end of this article.

First, let's take a look at the problems that exist in the traditional http method during network transmission.

Because HTTP transmits data in plain text, it is easy for the data to be monitored and stolen. The schematic diagram is as follows:

In addition, the transmitted data may be tampered with by people with ulterior motives, resulting in inconsistency between the content sent and received by the browser and the website. The schematic diagram is as follows:

In other words, using http to transmit data has at least two major risks: data being monitored and data being tampered with. Therefore, http is an insecure transmission protocol.

Of course, everyone knows that the solution is HTTPS. But let's first try to work out for ourselves how HTTP transmission could be made secure; that way we will arrive at how HTTPS works step by step.

Since it is not safe to transmit data in plain text, we obviously have to encrypt it. As mentioned earlier, there are two main kinds of encryption: symmetric and asymmetric. Symmetric encryption is fast, and efficiency matters a great deal when transmitting data over the Internet, so symmetric encryption is the obvious first choice. The schematic diagram is as follows:

As you can see, everything we transmit is now encrypted, so even if an eavesdropper captures it, they cannot tell what the original text was. After the browser receives the ciphertext, it only needs to decrypt it with the same key the website used.

This mechanism seems to keep the data secure, but it has a huge hole: how do the browser and the website agree on which key to use?

This is a genuinely hard problem. The browser and the website must use the same key for encryption and decryption to work, but how can that key be known only to the two of them and not to an eavesdropper? You will find that however they go about agreeing on it, the first exchange between the browser and the website has to be in plain text. So with the workflow above, we still cannot establish a symmetric key securely.
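A few lines of Python show why this is fatal. In the sketch below, wire is a made-up list standing in for everything an eavesdropper can observe, and toy_cipher is a toy XOR cipher of my own rather than a real algorithm. The point is only that if the key itself has to travel in the clear, encrypting what follows protects nothing.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream (not AES)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


# Everything appended to `wire` is visible to the eavesdropper.
wire = []

key = secrets.token_bytes(16)
wire.append(("key", key))  # the first message has to go out in plain text
wire.append(("data", toy_cipher(key, b"card number 1234")))

# The eavesdropper just reads the key off the wire and decrypts.
seen = dict(wire)
stolen = toy_cipher(seen["key"], seen["data"])
print(stolen)
```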

So symmetric encryption alone can never solve this problem. This is where asymmetric encryption comes in: it solves the problem of establishing the symmetric key securely.

So why can asymmetric encryption solve this problem? Let's understand it through a schematic diagram:

As you can see, to establish the symmetric key securely, we let the browser generate it randomly. The generated key must not be sent over the network as is; instead, the browser encrypts it with the public key provided by the website. Since data encrypted with a public key can only be decrypted with the matching private key, an eavesdropper who captures the message cannot recover the key. When the website receives the message, it simply decrypts it with its private key to obtain the key the browser generated.

In addition, asymmetric encryption is needed only once, when the browser and the website first agree on the key. Once the website has received the randomly generated key, both sides switch to symmetric encryption, so the slow asymmetric step hardly affects overall efficiency.
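The whole exchange fits in a short sketch. As before, this is textbook RSA with tiny primes plus a toy XOR cipher, both chosen by me for illustration; the request text is made up. The session key is wrapped one byte at a time only because the toy modulus is so small, which would be hopelessly insecure in practice.

```python
import hashlib
import secrets

# The website's toy RSA key pair (tiny primes, illustration only).
P, Q, E = 61, 53, 17
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, known only to the website


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream (not AES)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


# 1. Browser: generate a random symmetric key and encrypt it with the
#    website's PUBLIC key (E, N). Byte by byte only because N is tiny.
session_key = secrets.token_bytes(16)
wrapped_key = [pow(byte, E, N) for byte in session_key]

# 2. Website: recover the symmetric key with its PRIVATE key (D, N).
unwrapped_key = bytes(pow(value, D, N) for value in wrapped_key)

# 3. Both sides now use fast symmetric encryption for the real traffic.
request = toy_cipher(session_key, b"GET /account")
decrypted_request = toy_cipher(unwrapped_key, request)
print(decrypted_request)
```

Only step 1 and step 2 touch the slow asymmetric math; every later message goes through the symmetric cipher.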

So, is this mechanism perfect now? Not quite, because a critical step is still missing: how does the browser obtain the website's public key? The public key is public data, so it can travel over the Internet without any fear of eavesdropping, but what if someone tampers with it on the way? The schematic diagram is as follows:

In other words, whenever we fetch a website's public key over the Internet, there is a risk that it has been replaced. If you encrypt data with a fake public key, whoever holds the matching fake private key can decrypt it, with disastrous consequences.
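Here is a hedged sketch of that substitution attack, again with textbook RSA and tiny primes of my own choosing. The names make_keys, apply_key and the secret value 1234 are assumptions for illustration. Notice that the attacker can even re-encrypt the stolen secret for the real website, so neither side sees anything wrong.

```python
def make_keys(p, q, e):
    """Return (public, private) textbook-RSA keys from two small primes."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)


def apply_key(key, number):
    exponent, modulus = key
    return pow(number, exponent, modulus)


site_public, site_private = make_keys(61, 53, 17)   # the real website
fake_public, fake_private = make_keys(89, 97, 5)    # the attacker's own pair

# The website sends its public key, but the attacker swaps it in transit.
key_seen_by_browser = fake_public

# The browser encrypts its secret with what it believes is the site's key.
session_secret = 1234
sent_by_browser = apply_key(key_seen_by_browser, session_secret)

# The attacker decrypts with the fake private key...
stolen_secret = apply_key(fake_private, sent_by_browser)

# ...then re-encrypts with the real public key and forwards it.
forwarded = apply_key(site_public, stolen_secret)
received_by_site = apply_key(site_private, forwarded)

print(stolen_secret, received_by_site)
```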

The design seems to have hit a dead end here: there appears to be no safe way to obtain a website's public key, and it is obviously impossible to preinstall the public keys of every website in the world in the operating system.

To break the deadlock, we need to introduce a new concept: the certificate authority (CA).

CAs exist specifically to issue digital certificates to websites, so that browsers can safely obtain each website's public key. So how does a CA accomplish this difficult task? Let's analyze it step by step.

First, as the website administrator, we apply to a CA and submit our public key. The CA combines the public key we submitted with other information, such as the website's domain name and the validity period, to create a certificate.

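Once the certificate has been created, the CA encrypts it with its own private key and returns the encrypted data to us, and we simply configure that data on our web server. (Strictly speaking, the CA signs the certificate rather than hiding it: it encrypts a hash of the certificate with its private key and attaches the result as a digital signature, while the certificate itself stays readable. For this walkthrough, "encrypted with the CA's private key" is a close enough mental model.)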

Then, whenever a browser requests our website, this encrypted data is returned to the browser first, and the browser decrypts it with the CA's public key.

If decryption succeeds, the browser obtains the certificate the CA issued for our website, which of course includes our website's public key. You can view the details of a certificate by clicking the small lock icon to the left of the URL in the browser's address bar, as shown in the figure below.

After obtaining the public key, the rest of the process is the same as in the earlier diagram.

If decryption fails, the data was not encrypted with the private key of a legitimate CA and may have been tampered with. The browser then displays the familiar warning page, as shown in the figure below.
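The success and failure cases above can be sketched in Python. One caveat about my sketch: in practice the "encrypt with the CA's private key" step is done as a digital signature, where the CA encrypts a hash of the certificate and the browser compares hashes, so that is what the code does. The key is textbook RSA built from two known Mersenne primes, far too small for real use, and the certificate fields and the names ca_sign and browser_verify are made up for illustration.

```python
import hashlib
import json

# Toy CA key pair built from two known Mersenne primes (illustration only).
E = 65537
P, Q = 2**31 - 1, 2**61 - 1
CA_N = P * Q
CA_PUBLIC = (E, CA_N)                                  # preinstalled in the browser
CA_PRIVATE = (pow(E, -1, (P - 1) * (Q - 1)), CA_N)     # known only to the CA


def digest(cert: dict, modulus: int) -> int:
    """Hash the certificate contents down to a number below the modulus."""
    encoded = json.dumps(cert, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(encoded).digest(), "big") % modulus


def ca_sign(cert: dict) -> int:
    """CA side: encrypt the certificate's hash with the CA's private key."""
    d, n = CA_PRIVATE
    return pow(digest(cert, n), d, n)


def browser_verify(cert: dict, signature: int) -> bool:
    """Browser side: decrypt with the CA's public key and compare hashes."""
    e, n = CA_PUBLIC
    return pow(signature, e, n) == digest(cert, n)


# Hypothetical certificate contents for the website.
certificate = {"domain": "abc.com", "public_key": [17, 3233], "valid_until": "2030-01-01"}
signature = ca_sign(certificate)

genuine_ok = browser_verify(certificate, signature)

# An attacker swaps in their own public key but cannot produce a new signature.
forged = dict(certificate, public_key=[5, 8633])
forged_ok = browser_verify(forged, signature)

print(genuine_ok, forged_ok)
```

When browser_verify returns False, that corresponds to the warning page; when it returns True, the browser can trust the public key inside the certificate.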

Then you may ask: are things really safe once we have CAs? The browser needs the CA's public key to decrypt the data, so how do we obtain the CA's public key securely?

This problem is easy to solve, because while there are countless websites in the world, there are only a limited number of CAs. Any genuine operating system ships with the public keys of all mainstream CAs built in, so we never need to fetch them over the network. To decrypt, we only need to go through the built-in CA public keys one by one; if any of them decrypts the data correctly, the data is legitimate.
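The "try every built-in key" idea looks roughly like this. The CAs here are hypothetical toy ones built from known Mersenne primes, trust_store stands in for what the operating system ships with, and the certificate is just a made-up byte string. Real browsers are smarter than this loop, since a certificate names its issuer, but the simplified version matches the description above.

```python
import hashlib

E = 65537  # a commonly used RSA public exponent


def make_ca(p, q):
    """Build a toy CA key pair from two primes (textbook RSA, illustration only)."""
    n = p * q
    return {"public": (E, n), "private": (pow(E, -1, (p - 1) * (q - 1)), n)}


def digest(cert: bytes, modulus: int) -> int:
    return int.from_bytes(hashlib.sha256(cert).digest(), "big") % modulus


def sign(cert: bytes, private_key) -> int:
    d, n = private_key
    return pow(digest(cert, n), d, n)


def verify(cert: bytes, signature: int, public_key) -> bool:
    e, n = public_key
    return pow(signature, e, n) == digest(cert, n)


# Hypothetical CAs; real CA keys are far larger than these.
ca_a = make_ca(2**31 - 1, 2**61 - 1)
ca_b = make_ca(2**89 - 1, 2**107 - 1)
unknown_ca = make_ca(2**61 - 1, 2**127 - 1)   # NOT preinstalled anywhere

# What the operating system ships with: public keys only.
trust_store = {"Toy CA A": ca_a["public"], "Toy CA B": ca_b["public"]}


def find_trusted_issuer(cert: bytes, signature: int):
    """Try each built-in CA public key; return the first one that verifies."""
    for name, public_key in trust_store.items():
        if verify(cert, signature, public_key):
            return name
    return None


cert = b"domain=abc.com;public_key=(17, 3233)"
issuer = find_trusted_issuer(cert, sign(cert, ca_b["private"]))
rejected = find_trusted_issuer(cert, sign(cert, unknown_ca["private"]))

print(issuer, rejected)
```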

The certificates built into Windows look like this:

However, even if the data decrypts correctly with a CA's public key, the process still has a problem. Each CA issues certificates for thousands of websites, so an attacker who knows that abc.com uses a certificate from a certain CA can apply to the same CA for a legitimate certificate of his own, and then swap it in for the certificate data returned when the browser requests abc.com. The schematic diagram is as follows:

Because the attacker's certificate was also issued by a legitimate CA, the data of course decrypts successfully.

This is why every CA includes, in addition to the website's public key, a good deal of other data in the certificates it creates, to assist verification. The website's domain name is one of the most important of these.

In the same example, once the domain name is part of the certificate, the attacker comes away empty-handed. Even though the data decrypts successfully, the domain name in the decrypted certificate does not match the domain name the browser requested, so the browser still shows the warning page. The schematic diagram is as follows:
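The domain check can be sketched in code as well. The toy CA key below is textbook RSA built from two known Mersenne primes, and the domain evil.example, the certificate fields and the function names are all assumptions of mine. The sketch also covers the obvious follow-up trick: if the attacker edits the domain field, the CA's signature no longer matches.

```python
import hashlib
import json

# Toy CA key pair (textbook RSA from two known Mersenne primes, illustration only).
E = 65537
P, Q = 2**31 - 1, 2**61 - 1
CA_N = P * Q
CA_D = pow(E, -1, (P - 1) * (Q - 1))


def digest(cert: dict) -> int:
    encoded = json.dumps(cert, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(encoded).digest(), "big") % CA_N


def ca_sign(cert: dict) -> int:
    return pow(digest(cert), CA_D, CA_N)


def signature_valid(cert: dict, signature: int) -> bool:
    return pow(signature, E, CA_N) == digest(cert)


def browser_accepts(requested_domain: str, cert: dict, signature: int) -> bool:
    """Accept only if the CA signature is valid AND the domain matches."""
    if not signature_valid(cert, signature):
        return False
    return cert["domain"] == requested_domain


site_cert = {"domain": "abc.com", "public_key": [17, 3233]}
site_sig = ca_sign(site_cert)

# The attacker legitimately obtains a certificate for their own domain.
attacker_cert = {"domain": "evil.example", "public_key": [5, 8633]}
attacker_sig = ca_sign(attacker_cert)

normal_visit = browser_accepts("abc.com", site_cert, site_sig)
swapped_cert = browser_accepts("abc.com", attacker_cert, attacker_sig)

# Editing the domain field does not help: the signature no longer matches.
edited_cert = dict(attacker_cert, domain="abc.com")
edited_attempt = browser_accepts("abc.com", edited_cert, attacker_sig)

print(normal_visit, swapped_cert, edited_attempt)
```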

With the design at this point, our network transmission is secure enough. And this is, in essence, how HTTPS works.

So let's return to the original question: does HTTPS use symmetric encryption or asymmetric encryption? The answer is now obvious: HTTPS uses a combination of both.

Of course, if you want to dig deeper, HTTPS has many more details worth exploring. But if I kept writing, this article might no longer be the easiest explanation of HTTPS, so I think this is the right place to stop.

If, like me, you mainly do client-side development, then knowing this much about HTTPS is enough to handle common interview questions and the problems you run into at work.
