Detailed illustrated guide to installing and configuring Hadoop on Linux (CentOS 7)

Prepare the installation packages. (Note: hadoop-3.1.2-src has been replaced with hadoop-3.1.2. The src package contains the source code rather than the prebuilt binaries, so I switched to the binary release. Keep this in mind when reading the screenshots below. If there are any mistakes, I will correct them when I have time.)

Install CentOS 7

Right-click the desktop and open a terminal, run ifconfig, note the IP address of ens33, then open Xftp 6.

Click New.

Select all the installation packages and right-click to transfer them. The transfer speed over the internal network is neither fast nor slow, which is fine.

Unpack the Hadoop installation package: tar -zxvf hadoop-3.1.2.tar.gz

I reinstalled CentOS 7 and unpacked each package into its own folder.

Enter everything as shown in the screenshots.

Open Xshell and create a new session.

Enter your host IP, then enter your username and password under User Authentication.

That's it. Next, all three machines need to be renamed (master, slave1, and slave2 in this guide), for example with hostnamectl set-hostname master on the first machine.

Time zone and time synchronization: for the host clocks to be set accurately, every machine must use the same time zone. This setup synchronizes time over the network, so select the same time zone on every machine first; otherwise the clocks will still differ by the time zone offset after synchronization. Use the date command to check the machine time, and tzselect to pick a time zone. Note that tzselect only prints the chosen zone; on CentOS 7, timedatectl set-timezone <zone> applies it.

1. Turn off the firewall

Stop the firewall: systemctl stop firewalld
View the status: systemctl status firewalld
When the status shows inactive (dead), the firewall is stopped.
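
Stopping the service only lasts until the next reboot. The sketch below, which assumes root access on each node, also disables firewalld at boot. The run helper and the DRY_RUN variable are illustrative names made up for this example; by default the script only prints the commands it would execute.

```shell
#!/usr/bin/env bash
# Sketch: stop firewalld now and keep it off after reboots.
# DRY_RUN and run() are hypothetical helpers for this example only.
DRY_RUN="${DRY_RUN:-1}"

# Print a command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

disable_firewall() {
    run systemctl stop firewalld       # stop the running service
    run systemctl disable firewalld    # do not start it again at boot
    # is-active exits non-zero when the service is inactive, which is the
    # desired state here, so the exit status is ignored.
    run systemctl is-active firewalld || true
}

disable_firewall
```

Run it as DRY_RUN=0 on each of the three machines once the printed commands look right.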

2. Configure the hosts file (all three machines): add the IP address and hostname of each node to /etc/hosts.
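
As a rough sketch of the hosts step, the function below appends one "IP hostname" line per node and skips hostnames that are already present. The addresses are made up for illustration, and add_host_entry and HOSTS_FILE are hypothetical names; by default the script writes to a scratch file rather than /etc/hosts.

```shell
#!/usr/bin/env bash
# Sketch: add "IP hostname" entries to a hosts file, once per hostname.
# add_host_entry is a hypothetical helper, not a system command.
# Usage: add_host_entry <hosts_file> <ip> <hostname>
add_host_entry() {
    local hosts_file="$1" ip="$2" name="$3"
    # -w matches the hostname as a whole word, so "slave1" does not match "slave10"
    if ! grep -qw -- "$name" "$hosts_file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Assumed addresses; replace them with the ens33 addresses of your machines.
# Set HOSTS_FILE=/etc/hosts (as root) to apply the entries for real.
HOSTS_FILE="${HOSTS_FILE:-./hosts.example}"
add_host_entry "$HOSTS_FILE" 192.168.1.100 master
add_host_entry "$HOSTS_FILE" 192.168.1.101 slave1
add_host_entry "$HOSTS_FILE" 192.168.1.102 slave2
```

Because existing hostnames are skipped, the script can be run again on the same machine without duplicating lines.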

3. Use master as the NTP server and modify the NTP configuration file (run on master):

vi /etc/ntp.conf
	server 127.127.1.0 # local clock
	fudge 127.127.1.0 stratum 10 # stratum may be set to another value in the range 0-15

Restart the NTP service:
	/bin/systemctl restart ntpd.service

Synchronize the other machines (slave1, slave2). Wait about five minutes, then synchronize each of them with the master's time:
	ntpdate master

If the machines have no external network connection, you can instead set all three to the same time manually:
	date -s 10:00    # replace 10:00 with the current time

Are we finally getting to the point? Don't panic.

1. Passwordless SSH

(1) Generate a public/private key pair on each node (all three machines):

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
The keys are generated in the .ssh directory under the user's home directory. Change into it to take a look:
cd .ssh/

(2) id_dsa.pub is the public key and id_dsa is the private key. Append the public key to the authorized_keys file (master only):

cat id_dsa.pub >> authorized_keys (run this inside the .ssh/ directory)

Then connect from the host to itself, also known as an SSH loopback, to test:
ssh master

(3) Allow the master node to log in to the two slave nodes via SSH without a password (run on each slave)

For this to work, the authorized_keys file on each slave node must contain the master node's public key, so that the master can access both slaves securely.
On slave1, use scp to copy the master's public key file from the master node into the current directory (still .ssh/), renaming it to master_dsa.pub. This step asks for the master's password.

scp master:~/.ssh/id_dsa.pub ./master_dsa.pub

Append the master node's public key to the authorized_keys file:

cat master_dsa.pub >> authorized_keys

At this point, the master can connect to slave1.

The first time the master connects to slave1, it asks you to confirm the connection by typing yes, so the very first connection is interactive rather than automatic. After you enter yes, the connection succeeds; then log out to return to the master node.

Repeat the same steps on slave2.
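
Two things commonly go wrong with the steps above. OpenSSH 7.0 and later disable DSA keys by default, so an RSA or Ed25519 key (ssh-keygen -t rsa) may be needed instead of -t dsa, and sshd normally ignores an authorized_keys file whose permissions are too open. The sketch below appends a public key only once and then tightens the permissions; authorize_key is a hypothetical helper and the key in the demo is a placeholder, not a real key.

```shell
#!/usr/bin/env bash
# Sketch: append a public key to authorized_keys once and fix permissions.
# authorize_key is a hypothetical helper for this example.
# Usage: authorize_key <public_key_file> [ssh_dir]
authorize_key() {
    local pub_file="$1" ssh_dir="${2:-$HOME/.ssh}" key
    key="$(cat "$pub_file")"
    mkdir -p "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    # -x: whole-line match, -F: fixed string, so a key is never added twice
    if ! grep -qxF -- "$key" "$ssh_dir/authorized_keys"; then
        printf '%s\n' "$key" >> "$ssh_dir/authorized_keys"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$ssh_dir/authorized_keys"
}

# Demo in a scratch directory so nothing under ~/.ssh is touched.
demo_dir="$(mktemp -d)"
echo "ssh-rsa PLACEHOLDER-KEY-DATA root@master" > "$demo_dir/master_rsa.pub"
authorize_key "$demo_dir/master_rsa.pub" "$demo_dir/.ssh"
ls -l "$demo_dir/.ssh/authorized_keys"
```

On a slave, the real call would be authorize_key ./master_dsa.pub after the scp step. Where it is installed, ssh-copy-id root@slave1 run from the master is a standard tool for the same job.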

The JDK was installed earlier, so we only need to configure the environment, much like setting environment variables in Windows (all three machines).

Modify the environment variables: vi /etc/profile
> Add the following content:
> export JAVA_HOME=/usr/java/jdk1.8.0_241
> export CLASSPATH=$JAVA_HOME/lib/
> export PATH=$PATH:$JAVA_HOME/bin
> export PATH JAVA_HOME CLASSPATH

Apply the environment variables: source /etc/profile

A handy trick: scp

scp /etc/profile slave1:/etc/profile    # copies the file to slave1; run it again with slave2 as the target

Finally made it to Hadoop??? Congratulations!

Configure environment variables:
vi /etc/profile
export HADOOP_HOME=/usr/hadoop/hadoop-3.1.2
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib
export PATH=$PATH:$HADOOP_HOME/bin

Now say it loudly: which step do I always forget?

Run the following command to make the profile take effect: source /etc/profile
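
A quick way to confirm that the profile really took effect is to check that each variable is set and points at a directory containing the expected binary. The sketch below assumes the JAVA_HOME and HADOOP_HOME values used in this article; check_home is a hypothetical helper, not part of the JDK or Hadoop.

```shell
#!/usr/bin/env bash
# Sketch: confirm that a *_HOME variable points at a directory with the expected binary.
# check_home is a hypothetical helper for this example.
# Usage: check_home <variable_name> <value> <binary>
check_home() {
    local var_name="$1" home="$2" binary="$3"
    if [ -z "$home" ]; then
        echo "$var_name is not set"
        return 1
    fi
    if [ ! -x "$home/bin/$binary" ]; then
        echo "$var_name=$home but $home/bin/$binary is missing or not executable"
        return 1
    fi
    echo "$var_name OK: $home"
}

check_home JAVA_HOME "${JAVA_HOME:-}" java \
    || echo "Check /etc/profile, then run: source /etc/profile"
check_home HADOOP_HOME "${HADOOP_HOME:-}" hadoop \
    || echo "Check /etc/profile, then run: source /etc/profile"
```

Running it on each of the three machines catches a forgotten source /etc/profile before Hadoop is started.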

Friendly reminder: the configuration files come next. This article does not explain their contents for now, but I have prepared a standard set of configuration files for you.

Edit the Hadoop environment configuration file hadoop-env.sh (in $HADOOP_HOME/etc/hadoop):

export JAVA_HOME=/usr/java/jdk1.8.0_241
This file contains many commented-out lines. Find the template line for the setting you want to configure and delete its leading pound sign (#).

Now for the part where I get lazy! I have uploaded several configuration files. Copy them into this folder, and when the system asks whether to overwrite, just enter y.

core-site.xml yarn-site.xml hdfs-site.xml mapred-site.xml 
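
The prepared files themselves are not reproduced in this article. Purely as an illustration of what the smallest of them might contain, the sketch below writes a core-site.xml that sets only fs.defaultFS. The port 9000, the write_core_site helper, and the scratch output directory are assumptions, not the author's actual configuration.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal core-site.xml. The author's real files are not shown,
# so the port (9000) and the output directory below are assumptions.
# Usage: write_core_site <conf_dir> <namenode_host> [port]
write_core_site() {
    local conf_dir="$1" namenode_host="$2" port="${3:-9000}"
    mkdir -p "$conf_dir"
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- URI that clients and DataNodes use to reach the NameNode -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://${namenode_host}:${port}</value>
  </property>
</configuration>
EOF
}

# Writes to a scratch directory; on a real node the target would be
# $HADOOP_HOME/etc/hadoop.
CONF_DIR="${CONF_DIR:-./hadoop-conf-example}"
write_core_site "$CONF_DIR" master
cat "$CONF_DIR/core-site.xml"
```

The other three files follow the same <configuration>/<property> layout with their own property names.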

You also need to edit the worker list file and add slave1 and slave2, one hostname per line. In Hadoop 3.x this file is named workers; it was called slaves in Hadoop 2.x.

There is also a master file.

Distribute Hadoop to the slave nodes:
scp -r /usr/hadoop root@slave1:/usr/
scp -r /usr/hadoop root@slave2:/usr/
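
With more than two slaves, typing one scp line per host gets tedious. The sketch below generates the same commands from a host list such as the workers file. print_scp_commands and workers.example are hypothetical names, and the script only prints the commands, so nothing is copied until they are executed.

```shell
#!/usr/bin/env bash
# Sketch: print one scp command per host listed in a workers-style file.
# print_scp_commands is a hypothetical helper; it copies nothing by itself.
# Usage: print_scp_commands <hosts_file> <source_dir> <dest_dir>
print_scp_commands() {
    local hosts_file="$1" src="$2" dest="$3" host
    while read -r host; do
        # Skip blank lines and comment lines
        case "$host" in
            ''|'#'*) continue ;;
        esac
        echo "scp -r $src root@$host:$dest"
    done < "$hosts_file"
}

# Example host list; on a real cluster this could be the workers file.
printf 'slave1\nslave2\n' > ./workers.example
print_scp_commands ./workers.example /usr/hadoop /usr/
```

Piping the output to sh (print_scp_commands ./workers.example /usr/hadoop /usr/ | sh) runs the copies once the printed commands look right.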

Format HDFS on the master: hadoop namenode -format (Hadoop 3.x accepts this form but prints a deprecation warning; hdfs namenode -format is the current equivalent). If an error is reported, check whether a solution for that error is listed in the following link.

Summary

That concludes this illustrated guide to installing and configuring Hadoop on CentOS 7. I hope it is helpful to everyone!
