This article introduces how to build a high-availability Hadoop 2.10 cluster in CentOS 7. First, prepare 6 machines: 2 NN (namenode); 4 DN (datanode); 3 JNS (journalnodes)
Jps process of each machine: Since I use vmware virtual machines, after configuring one machine, I use clone to clone the remaining machines and modify the hostname and IP, so that the configuration of each machine is unified. Add hdfs users and user groups to each machine configuration, configure the jdk environment, and install hadoop. This time, a high-availability cluster is built under the hdfs user. You can refer to: CentOS 7 builds hadoop 2.10 pseudo-distributed mode Here are some steps and details to install a high availability cluster: 1. Set the hostname and hosts for each machine Modify the hosts file. After the hosts are set, you can use the hostname to access the machine. This is more convenient. Modify as follows: 127.0.0.1 locahost 192.168.30.141 s141 192.168.30.142 s142 192.168.30.143 s143 192.168.30.144 s144 192.168.30.145 s145 192.168.30.146 s146 2. Set up ssh password-free login. Since s141 and s146 are both namenodes, you need to log in to all machines without passwords from these two machines. It is best to set up password-free login for both hdfs users and root users. We set s141 to nn1 and s146 to nn2. We need s141 and s146 to be able to log in to other machines through ssh without a password. To do this, we need to generate a key pair under the hdfs user on the s141 and s146 machines, and send the s141 and s146 public keys to other machines and put them in the ~/.ssh/authorized_keys file. More precisely, we need to add the public keys to all machines (including ourselves). Generate a key pair on the s141 and s146 machines: ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa Append the contents of the id_rsa.pub file to /home/hdfs/.ssh/authorized_keys on the s141-s146 machines. Since other machines do not have authorized_keys files, we can rename id_rsa.pub to authorized_keys. If other machines already have authorized_keys files, we can append the contents of id_rsa.pub to the file. For remote copying, we can use the scp command: Copy the s141 machine public key to other machines scp id_rsa.pub hdfs@s141:/home/hdfs/.ssh/id_rsa_141.pub scp id_rsa.pub hdfs@s142:/home/hdfs/.ssh/id_rsa_141.pub scp id_rsa.pub hdfs@s143:/home/hdfs/.ssh/id_rsa_141.pub scp id_rsa.pub hdfs@s144:/home/hdfs/.ssh/id_rsa_141.pub scp id_rsa.pub hdfs@s145:/home/hdfs/.ssh/id_rsa_141.pub scp id_rsa.pub hdfs@s146:/home/hdfs/.ssh/id_rsa_141.pub Copy the s146 machine public key to other machines scp id_rsa.pub hdfs@s141:/home/hdfs/.ssh/id_rsa_146.pub scp id_rsa.pub hdfs@s142:/home/hdfs/.ssh/id_rsa_146.pub scp id_rsa.pub hdfs@s143:/home/hdfs/.ssh/id_rsa_146.pub scp id_rsa.pub hdfs@s144:/home/hdfs/.ssh/id_rsa_146.pub scp id_rsa.pub hdfs@s145:/home/hdfs/.ssh/id_rsa_146.pub scp id_rsa.pub hdfs@s146:/home/hdfs/.ssh/id_rsa_146.pub On each machine, you can use cat to append the key to the authorized_keys file cat id_rsa_141.pub >> authorized_keys cat id_rsa_146.pub >> authorized_keys At this time, the permissions of the authorized_keys file need to be changed to 644 (note that ssh passwordless login often fails due to this permission problem) chmod 644 authorized_keys 3. Configure the hadoop configuration file (${hadoop_home}/etc/hadoop/) Configuration details: Note: s141 and s146 have exactly the same configuration, especially ssh. 1) Configure nameservice [hdfs-site.xml] <property> <name>dfs.nameservices</name> <value>mycluster</value> </property>
[hdfs-site.xml] <!-- Two ids of the name node under myucluster --> <property> <name>dfs.ha.namenodes.mycluster</name> <value>nn1,nn2</value> </property> 3) dfs.namenode.rpc-address.[nameservice ID].[name node ID] [hdfs-site.xml] Configure the rpc address of each nn. <property> <name>dfs.namenode.rpc-address.mycluster.nn1</name> <value>s141:8020</value> </property> <property> <name>dfs.namenode.rpc-address.mycluster.nn2</name> <value>s146:8020</value> </property> 4) dfs.namenode.http-address.[nameservice ID].[name node ID] [hdfs-site.xml] <property> <name>dfs.namenode.http-address.mycluster.nn1</name> <value>s141:50070</value> </property> <property> <name>dfs.namenode.http-address.mycluster.nn2</name> <value>s146:50070</value> </property> 5) dfs.namenode.shared.edits.dir [hdfs-site.xml] <property> <name>dfs.namenode.shared.edits.dir</name> <value>qjournal://s142:8485;s143:8485;s144:8485/mycluster</value> </property> 6) dfs.client.failover.proxy.provider.[nameservice ID] [hdfs-site.xml] <property> <name>dfs.client.failover.proxy.provider.mycluster</name> <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value> </property> 7) dfs.ha.fencing.methods [hdfs-site.xml] <property> <name>dfs.ha.fencing.methods</name> <value>sshfence</value> </property> <property> <name>dfs.ha.fencing.ssh.private-key-files</name> <value>/home/hdfs/.ssh/id_rsa</value> </property> 8) fs.defaultFS [core-site.xml] <property> <name>fs.defaultFS</name> <value>hdfs://mycluster</value> </property> 9) dfs.journalnode.edits.dir [hdfs-site.xml] <property> <name>dfs.journalnode.edits.dir</name> <value>/home/hdfs/hadoop/journal</value> </property> Complete configuration file: core-site.xml <?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <configuration> <property> <name>fs.defaultFS</name> <value>hdfs://mycluster/</value> </property> <property> <name>hadoop.tmp.dir</name> <value>/home/hdfs/hadoop</value> </property> </configuration> hdfs-site.xml <?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <configuration> <property> <name>dfs.replication</name> <value>3</value> </property> <property> <name>dfs.hosts</name> <value>/opt/soft/hadoop/etc/dfs.include.txt</value> </property> <property> <name>dfs.hosts.exclude</name> <value>/opt/soft/hadoop/etc/dfs.hosts.exclude.txt</value> </property> <property> <name>dfs.nameservices</name> <value>mycluster</value> </property> <property> <name>dfs.ha.namenodes.mycluster</name> <value>nn1,nn2</value> </property> <property> <name>dfs.namenode.rpc-address.mycluster.nn1</name> <value>s141:8020</value> </property> <property> <name>dfs.namenode.rpc-address.mycluster.nn2</name> <value>s146:8020</value> </property> <property> <name>dfs.namenode.http-address.mycluster.nn1</name> <value>s141:50070</value> </property> <property> <name>dfs.namenode.http-address.mycluster.nn2</name> <value>s146:50070</value> </property> <property> <name>dfs.namenode.shared.edits.dir</name> <value>qjournal://s142:8485;s143:8485;s144:8485/mycluster</value> </property> <property> <name>dfs.client.failover.proxy.provider.mycluster</name> <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value> </property> <property> <name>dfs.ha.fencing.methods</name> <value>sshfence</value> </property> <property> <name>dfs.ha.fencing.ssh.private-key-files</name> <value>/home/hdfs/.ssh/id_rsa</value> </property> <property> <name>dfs.journalnode.edits.dir</name> <value>/home/hdfs/hadoop/journal</value> </property> </configuration> mapred-site.xml <?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <configuration> <property> <name>mapreduce.framework.name</name> <value>yarn</value> </property> </configuration> yarn-site.xml <?xml version="1.0"?> <configuration> <!-- Site specific YARN configuration properties --> <property> <name>yarn.resourcemanager.hostname</name> <value>s141</value> </property> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property> </configuration> 4. Deployment Details 1) Start the jn processes (s142, s143, s144) on the jn nodes respectively hadoop-daemon.sh start journalnode 2) After starting jn, synchronize disk metadata between the two NNs a) If it is a brand new cluster, format the file system first. This only needs to be executed on one NN. hadoop namenode -format b) If you convert a non-HA cluster to a HA cluster, copy the metadata of the original NN to another NN. 1. Step 1 On the s141 machine, copy the hadoop data to the directory corresponding to s146 scp -r /home/hdfs/hadoop/dfs hdfs@s146:/home/hdfs/hadoop/ 2. Step 2: Run the following command on the new nn (unformatted nn, mine is s146) to boot in standby mode. Note: s141namenode needs to be started (you can execute: hadoop-daemon.sh start namenode ). hdfs namenode -bootstrapStandby If the s141 name node is not started, it will fail, as shown in the figure: After starting the s141 name node, execute the command on s141 hadoop-daemon.sh start namenode Then execute the standby boot command, note: if prompted whether to format, select N, as shown in the figure: 3. Step 3 Execute the following command on one of the NNs to complete the transmission of the edit log to the jn node. hdfs namenode -initializeSharedEdits If the java.nio.channels.OverlappingFileLockException error is reported during execution: Indicates that namenode is starting and needs to be stopped (hadoop-daemon.sh stop namenode) After execution, check whether s142, s143, and s144 have edit data. Here, check that the mycluster directory has been produced, which contains edit log data, as follows: 4. Step 4 Start all nodes. Start the name node and all data nodes on s141: hadoop-daemon.sh start namenode hadoop-daemons.sh start datanode Start the name node on s146 hadoop-daemon.sh start namenode At this time, visit http://192.168.30.141:50070/ and http://192.168.30.146:50070/ in the browser and you will find that both namenodes are standby. At this time, you need to manually use the command to switch one of them to the active state. Here, set s141 (nn1) to active hdfs haadmin -transitionToActive nn1 At this time, s141 is active Common commands of hdfs haadmin: At this point, the manual disaster recovery high availability configuration is complete, but this method is not intelligent and cannot automatically sense disaster recovery, so the following introduces the automatic disaster recovery configuration 5. Automatic disaster recovery configuration Need to introduce two components: zookeeperquarum and zk disaster recovery controller (ZKFC) Build a zookeeper cluster, select three machines s141, s142, and s143, and download zookeeper: http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.5.6 1) Unzip zookeeper: tar -xzvf apache-zookeeper-3.5.6-bin.tar.gz -C /opt/soft/zookeeper-3.5.6 2) Configure environment variables, add zk environment variables in /etc/profile, and recompile the /etc/profile file Copy the code as follows: source /etc/profile 3) Configure the zk configuration file, and unify the configuration files of the three machines # The number of milliseconds of each tick tickTime=2000 # The number of ticks that the initial # synchronization phase can take initLimit=10 # The number of ticks that can pass between # sending a request and getting an acknowledgement syncLimit=5 # the directory where the snapshot is stored. # do not use /tmp for storage, /tmp here is just # example sakes. dataDir=/home/hdfs/zookeeper # the port at which the clients will connect clientPort=2181 # the maximum number of client connections. # increase this if you need to handle more clients #maxClientCnxns=60 # # Be sure to read the maintenance section of the # administrator guide before turning on autopurge. # # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance # # The number of snapshots to retain in dataDir #autopurge.snapRetainCount=3 # Purge task interval in hours # Set to "0" to disable auto purge feature #autopurge.purgeInterval=1 server.1=s141:2888:3888 server.2=s142:2888:3888 server.3=s143:2888:3888 4) Respectively Create a myid file in the /home/hdfs/zookeeper directory of s141 (the dataDir path configured in the zoo.cfg configuration file) with a value of 1 (corresponding to server.1 in the zoo.cfg configuration file) Create a myid file in the /home/hdfs/zookeeper directory of s142 (the dataDir path configured in the zoo.cfg configuration file) with a value of 2 (corresponding to server.2 in the zoo.cfg configuration file) Create a myid file in the /home/hdfs/zookeeper directory of s143 (the dataDir path configured in the zoo.cfg configuration file) with a value of 3 (corresponding to server.3 in the zoo.cfg configuration file) 5) Start zk on each machine separately zkServer.sh start If the startup is successful, the zk process will appear: Configure hdfs related configuration: 1) Stop all hdfs processes stop-all.sh 2) Configure hdfs-site.xml and enable automatic disaster recovery. [hdfs-site.xml] <property> <name>dfs.ha.automatic-failover.enabled</name> <value>true</value> </property> 3) Configure core-site.xml and specify the connection address of zk. <property> <name>ha.zookeeper.quorum</name> <value>s141:2181,s142:2181,s143:2181</value> </property> 4) Distribute the above two files to all nodes. 5) In one of the NNs (s141), initialize the HA state in ZK hdfs zkfc -formatZK The following result indicates success: You can also check it in zk: 6) Start the HDFS cluster start-dfs.sh View the processes of each machine: Startup successful, take a look at the webui s146 is activated s141 is in standby mode At this point, the Hadoop automatic disaster recovery HA is built Summarize The above is what I introduced to you about building hadoop2.10 high availability (HA) on centos7. I hope it will be helpful to you! You may also be interested in:
|
<<: About the problem of writing plugins for mounting DOM in vue3
>>: Analysis of the principle of Mybatis mapper dynamic proxy
Monitoring method in Vue watch Notice Name: You s...
This article shares the specific code for JavaScr...
Table of contents principle Network environment p...
Unix/Linux Services systemd services Operation pr...
I haven't worked with servers for a while. No...
Table of contents Project Background start Create...
Install Filebeat has completely replaced Logstash...
This article uses examples to describe the basic ...
The img tag in XHTML should be written like this:...
I searched a lot online and found that many of th...
JSON (JavaScript Object Notation, JS Object Notat...
vertical-align attribute is mainly used to change...
The task of concurrency control in a database man...
Table of contents Preface What to use if not jQue...
Data Sheet: Column to row: using max(case when th...