1. Purpose
In this post, we compare Hadoop 2.x and Hadoop 3.x: which features are new in Hadoop 3, which Hadoop 2 programs remain compatible with Hadoop 3, and how the two versions differ.

2. Comparison between Hadoop 2.x and Hadoop 3.x
This section describes 22 differences between Hadoop 2.x and Hadoop 3.x. Let's discuss each one.

2.1 License
Hadoop 2.x - Apache 2.0, open source.

2.2 Minimum Supported Java Version
Hadoop 2.x - The minimum supported version of Java is Java 7.

2.3 Fault Tolerance
Hadoop 2.x - Fault tolerance is handled by replication, which costs extra storage space.

2.4 Data Balancing
Hadoop 2.x - Uses the HDFS balancer for data balancing.

2.5 Storage Scheme
Hadoop 2.x - Uses a 3x replication scheme.

2.6 Storage Overhead
Hadoop 2.x - HDFS has 200% overhead in storage space.

2.7 Storage Overhead Example
Hadoop 2.x - If there are 6 blocks, the replication scheme stores 18 blocks in total (6 x 3).

2.8 YARN Timeline Service
Hadoop 2.x - Uses the old Timeline Service, which has scalability issues.

2.9 Default Port Range
Hadoop 2.x - Some of the default ports fall in the Linux ephemeral port range, so a service can fail to bind at startup when another process already holds its port.

2.10 Tools
Hadoop 2.x - Hive, Pig, Tez, Hama, Giraph, and other Hadoop tools can be used.

2.11 Compatible File Systems
Hadoop 2.x - HDFS (the default file system); the FTP file system, which stores all its data on a remotely accessible FTP server; the Amazon S3 (Simple Storage Service) file system; and the Windows Azure Storage Blob (WASB) file system.

2.12 DataNode Resources
Hadoop 2.x - DataNode resources are not dedicated to MapReduce; other applications can use them as well.

2.13 MR API Compatibility
Hadoop 2.x - The MR API is compatible with Hadoop 1.x, so Hadoop 1.x programs can run on Hadoop 2.x.

2.14 Support for Microsoft Windows
Hadoop 2.x - It can be deployed on Windows.
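The replication arithmetic from sections 2.6 and 2.7 can be checked with a few lines of Python. This is only an illustrative sketch: the function name `replicated_storage` is made up for this example and is not part of any Hadoop API, and it assumes plain N-way replication with the default factor of 3.

```python
from typing import Tuple


def replicated_storage(logical_blocks: int, replication_factor: int = 3) -> Tuple[int, float]:
    """Return (physical_blocks, overhead_percent) under plain replication.

    Hypothetical helper for illustration only; not a Hadoop function.
    """
    if logical_blocks <= 0 or replication_factor <= 0:
        raise ValueError("block count and replication factor must be positive")
    # Every logical block is stored replication_factor times.
    physical_blocks = logical_blocks * replication_factor
    # Overhead is the extra space relative to the original data.
    extra_blocks = physical_blocks - logical_blocks
    overhead_percent = 100.0 * extra_blocks / logical_blocks
    return physical_blocks, overhead_percent


physical, overhead = replicated_storage(6)
print(physical, overhead)  # prints: 18 200.0
```

With 6 logical blocks and 3 copies of each, 18 blocks occupy the disks, and the 12 extra blocks are twice the size of the original data, which is where the 200% figure comes from.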
2.15 Slots/Containers
Hadoop 2.x - Hadoop 1 works on the concept of slots, whereas Hadoop 2.x works on the concept of containers. Containers can run generic tasks.

2.16 Single Point of Failure
Hadoop 2.x - Has features to overcome the single point of failure (SPOF), so whenever the NameNode fails, it recovers automatically.

2.17 HDFS Federation
Hadoop 2.x - In Hadoop 1.0 a single NameNode managed all namespaces, but in Hadoop 2.0 multiple NameNodes manage multiple namespaces.

2.18 Scalability
Hadoop 2.x - A cluster can scale up to 10,000 nodes.

2.19 Faster Access to Data
Hadoop 2.x - Data can be accessed quickly thanks to DataNode caching.

2.20 HDFS Snapshot
Hadoop 2.x - Hadoop 2 added support for snapshots, which provide disaster recovery and protection against user errors.

2.21 Platform
Hadoop 2.x - Can serve as a platform for a wide variety of data analytics, including event processing, streaming, and real-time operations.

2.22 Cluster Resource Management
Hadoop 2.x - Uses YARN for cluster resource management, which improves scalability, high availability, and multi-tenancy.

Improvements of Hadoop 3.x over Hadoop 2.x
Common major improvements:
HDFS improvements:
YARN improvements:
MapReduce improvements:
Other new features:

Conclusion
We have discussed 22 important differences between Hadoop 2.x and Hadoop 3.x, along with the improvements in 3.x. With these in hand, we can judge which one is the better fit, Hadoop 2 or Hadoop 3.