Comparative Analysis of High Availability Solutions of Oracle and MySQL

Comparative Analysis of High Availability Solutions of Oracle and MySQL

Regarding the high availability solutions for Oracle and MySQL, I have always wanted to summarize them, so I will briefly talk about them in several series. Through such a comparison, we will have a basic understanding of the detailed differences in the design of the two database architectures. Oracle has a very mature solution. Judging from the PPT I posted on OOW, it is the MAA program, and this year is the 16th anniversary of this program.

Due to the open source nature of MySQL, the community has launched more solutions. In my personal opinion, InnoDB Cluster will be the standard high-availability solution for MySQL in the future.

At present, MGR is good, there are also MySQL Cluster solutions, PXC, Galera and other solutions, but I personally prefer MHA.

Therefore, this article will be divided into several parts for interpretation. First, we will make a basic comparison between RAC and MHA.

Oracle's solutions supported Alibaba's core business needs during its rapid development period. This is probably the architectural system, and it looks huge. The RAC in it is considered a noble one, using expensive commercial storage, with extremely high network bandwidth requirements, and a large number of small machine businesses at the front end with hefty license fees. A very typical IOE classic architecture.

If you want to consider off-site disaster recovery, then the resource allocation must be doubled and the budget must be doubled.

MySQL's architecture is relatively more popular. It can be installed on an ordinary PC, but the scale is high. By splitting the business and horizontally splitting, a large number of nodes can be expanded horizontally. The scale of MySQL clusters of many large Internet companies is hundreds, and thousands is not uncommon. With so many service resources, there is still a probability of failure. Ensuring sustainable access to business services is the key to the technical solution. According to the MHA architecture, the MHA Manager node is basically responsible for the status of the entire cluster, just like a neighborhood committee auntie who knows all the big and small things about the residents.

Of course, the above statement is too general, so let’s start with some details. For example, let’s talk about the Internet first.

Oracle has very strict requirements for the network. Generally, two physical network cards are required. Each server needs at least three IPs, including a public IP, a private IP, and a VIP. In addition to shared storage, at least two computing nodes are required.

Private IP is mutually trusted between nodes. Public IP and VIP are in the same network segment. Simply put, VIP is external and is a drifting IP of the network where public IP is located. In 10g, load balancing is done through VIP. 11g began to have scan-IP, and the original VIP is still retained, so the network configuration requirements in Oracle are still very high. Putting aside the shared storage, the core of the construction is the network configuration, and if the network is connected, it will be connected.

Scan-IP can be further expanded to support up to 3 scan-ips, as shown in the following figure

Of course, there is more to the network layer than just this, and Oracle is very professional in this area. We need to understand TAF. In my book Oracle DBA Work Notes, I wrote:

TAF (Transparent Application Failover) is a failover function in Oracle that is transparent to applications and is particularly widely used in RAC environments. In RAC, there have indeed been major improvements in Load Balancing, from the Load Balancing of multiple VIP addresses in version 10g to SCAN in version 11g, which has been greatly simplified.

However, there are still certain usage restrictions in the implementation of failover. For example, the default SCAN-IP implementation in 11g does not have a failover option by default. If one of the two nodes fails, continuing to query in the original connection will prompt that the session has been disconnected and needs to be reconnected. Client TAF mainly discusses some simple contents of Failover Method and Failover Type.

(1) Failover Method

The main idea of ​​the Failover Method is to achieve it in exchange for failover time or resources.

It can be understood in this way. Suppose we have two nodes. If a session is connected to node 2, but node 2 suddenly hangs up, in order to handle the failover situation faster, there are two failover methods: preconnect and basic.

— Preconnect This preconnection method still takes up more resources. It will pre-occupy some additional resources on each node, which will make the switching smoother and faster.

— In this basic method, when a failover occurs, the corresponding resources will be switched. There will be some lag in the middle, but the resource consumption is relatively much smaller.

In simple terms, the basic method will only make a judgment when a failure occurs, while preconnect is a precautionary measure. From the perspective of actual application, the basic method is more common and is the default failover method.

(2) Failover Type

Failover Type is more powerful and flexible. At this time, the control granularity can be based on the execution of user SQL, which can be either select or session. Let's take a small example to illustrate this.

For example, we have a large query running on node 2, and node 2 suddenly crashes. For the query being executed, for example, there are 10,000 data records, and 8,000 records are found when the failure occurs, then how to deal with the remaining 2,000 records?

The first way is to use select; that is, the failover will be completed and the remaining 2,000 records will continue to be returned. Of course, there will be some context switching in the middle, which is transparent to the user.

The second method is session; that is, directly disconnect and require re-query.

In the 10g version, the configuration of Load Balance + Failover is achieved with the help of VIP configuration as follows:

racdb=
(DESCRIPTION =
(ADDRESS= (PROTOCOL= TCP)(HOST=192.168.3.101)(PORT= 1521))
(ADDRESS= (PROTOCOL= TCP)(HOST=192.168.3.201)(PORT= 1521))
(LOAD_BALANCE = yes)
(FAILOVER = ON)
(CONNECT_DATA =
(SERVER=DEDICATED)
(SERVICE_NAME = racdb)
(FAILOVER_MODE =
(TYPE=SELECT)
(METHOD = BASIC)
(RETRIES = 30)
(DELAY = 5))))
If you want to further extend the failover of SCAN-IP 11g, you also need to set failover_mode and the corresponding type.
RACDB =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = RACDB)
)
)

From this perspective, Oracle's solution is really sophisticated. Let’s take a look at MySQL’s solution.

The distributed solution makes MySQL look like a Swiss Army knife. As for the requirements at the network level, it can be said that MySQL has almost no requirements. If you apply for one master and one slave, you only need 4 IP addresses (master, slave, VIP, MHA_Manager (consider a manager node)), and one master and two slaves means 5 IP addresses.

In this regard, MySQL does not natively support so-called load balancing. It can be diverted through the front-end business, such as using a middleware proxy, or continuous splitting. After reaching a certain granularity, the needs can be met through architectural design. Because of logic-based replication, it is easy to expand. One master and multiple slaves are very common, and the cost is not high. There is latency, but it is very low, which can meet most Internet business needs.

When it comes to the conditions that trigger MHA switching, from a network perspective, the following red dots are potential hidden dangers. Some are network interruptions, and some are network delays. When a failure occurs, whether to preserve data or to ensure stable performance, it can be customized based on your own needs. From this point of view, there is a possibility of data loss. It is definitely not a lossless replication with strong consistency.

Overall, RAC is a centralized sharing solution. In addition to sharing at the storage level, multicast at the network level will actually increase the cost of communication between nodes. Therefore, RAC has great requirements for the network. If there is a delay, it is very dangerous, and it will be very embarrassing if a brain split occurs. The MySQL MHA solution is distributed. In an environment that supports large volumes, the cost of communication between nodes is relatively much lower. However, from the perspective of data architecture, because it is a replicated data distribution method, although the storage is not shared, the storage cost is still higher than RAC (not the storage price, but the amount of data stored).

You may also be interested in:
  • MySQL high availability cluster deployment and failover implementation
  • Detailed deployment steps for MySQL MHA high availability configuration and failover
  • MySQL database implements MMM high availability cluster architecture
  • Build a stable and highly available cluster based on mysql+mycat, load balancing, master-slave replication, read-write separation operation
  • MySQL high availability solution MMM (MySQL multi-master replication manager)
  • MySQL Series 14 MySQL High Availability Implementation

<<:  JavaScript implementation of the Game of Life

>>:  Detailed explanation of JavaScript error capture

Recommend

CentOS8 network card configuration file

1. Introduction CentOS8 system update, the new ve...

How to use Docker containers to implement proxy forwarding and data backup

Preface When we deploy applications to servers as...

Detailed steps to install Docker 1.8 on CentOS 7

Docker supports running on the following CentOS v...

Docker deployment RabbitMQ container implementation process analysis

1. Pull the image First, execute the following co...

CentOS 7.6 installation of MySQL 5.7 GA version tutorial diagram

Table of contents Environment Preparation Environ...

About converting textarea text to html, that is, carriage return and line break

Description: Change the carriage return in the tex...

Centos7 installation of FFmpeg audio/video tool simple document

ffmpeg is a very powerful audio and video process...

A brief analysis of the difference between static and self in PHP classes

Use self:: or __CLASS__ to get a static reference...

Detailed explanation of how to reduce memory usage in MySql

Preface By default, MySQL will initialize a large...

8 ways to manually and automatically backup your MySQL database

As a popular open source database management syst...

MySQL Index Detailed Explanation

Table of contents 1. Index Basics 1.1 Introductio...

5 commonly used objects in JavaScript

Table of contents 1. JavaScript Objects 1).Array ...

Alibaba Cloud ESC Server Docker Deployment of Single Node Mysql

1. Download the accelerated version of msyql dock...