Detailed steps to install a Hadoop cluster on Linux

1. Create a hadoop directory under /usr, copy the installation package into it, and unpack it
2. Open /etc/profile with vim and add the Hadoop environment variables
3. Apply the changes
4. Go to the Hadoop configuration directory
5. Edit the configuration files
6. Edit the slaves file to list the master and slave nodes
7. Copy the files to the other virtual machines
8. Format the NameNode (on the master node only)
9. Return to the Hadoop directory and start Hadoop (on the master node only)

1. Create a hadoop directory under /usr, copy the installation package into it, and unpack it
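
The article gives no commands for this step. A minimal sketch follows, assuming the package is a gzipped tarball named hadoop-2.6.0.tar.gz (the file name is an assumption, inferred from the HADOOP_HOME path used later). The helper name unpack_hadoop is hypothetical.

```shell
# Hypothetical helper: create the install directory and unpack the tarball into it.
#   $1 = path to the Hadoop tarball (file name assumed, e.g. hadoop-2.6.0.tar.gz)
#   $2 = install directory (the article uses /usr/hadoop)
unpack_hadoop() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Example call, run as root on the master node:
#   unpack_hadoop /root/hadoop-2.6.0.tar.gz /usr/hadoop
#   ls /usr/hadoop/hadoop-2.6.0
```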

2. Open /etc/profile with vim and add the Hadoop environment variables

#hadoop
export HADOOP_HOME=/usr/hadoop/hadoop-2.6.0
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib
export PATH=$PATH:$HADOOP_HOME/bin

3. Apply the changes

source /etc/profile
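
To confirm that the new variables took effect, one option is to check that the Hadoop bin directory is now on the PATH. The sketch below uses a hypothetical helper named path_contains. The commented commands assume the profile has been edited as described above.

```shell
# Hypothetical helper: succeed if directory $2 appears in the colon-separated list $1.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# After `source /etc/profile`, on a node where the profile has been edited:
#   path_contains "$PATH" "$HADOOP_HOME/bin" && echo "hadoop is on PATH"
#   hadoop version
```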

4. Go to the Hadoop configuration directory

cd /usr/hadoop/hadoop-2.6.0/etc/hadoop

5. Edit the configuration files

(1) Open hadoop-env.sh with vim and set JAVA_HOME to the location of your JDK

export JAVA_HOME=/usr/java/jdk1.8.0_181

(2) Open core-site.xml with vim (z1 is the host name or IP address of the master node; replace it with your own)

<configuration>
<property>
        <name>hadoop.tmp.dir</name>
        <value>file:/root/hadoop/tmp</value>
</property>
<!--Default file system URI: the NameNode on z1, port 9000 (fs.defaultFS replaces the deprecated fs.default.name)-->
<property>
        <name>fs.defaultFS</name>
        <value>hdfs://z1:9000</value>
</property>
<!--Enable the trash feature; the retention interval is in minutes (10080 = 7 days)-->
<property>
    <name>fs.trash.interval</name>
    <value>10080</value>
</property>
<!--Buffer size in bytes; the best value in practice depends on server performance-->
<property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
</property>
</configuration>
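
Hadoop does not reject a misspelled property name, so a typo in a name may go unnoticed while the default value stays in force. As a rough check, the sketch below prints every property name found in a configuration file so the names can be compared against the documentation. The helper name list_property_names is hypothetical, and the sed pattern assumes each name element sits on a single line, as in the files shown here.

```shell
# Hypothetical helper: print every <name>...</name> value found in a Hadoop *-site.xml file.
list_property_names() {
    sed -n 's:.*<name>\([^<]*\)</name>.*:\1:p' "$1"
}

# Example, run from /usr/hadoop/hadoop-2.6.0/etc/hadoop:
#   list_property_names core-site.xml
```

Once Hadoop is on the PATH, `hdfs getconf -confKey <property name>` can also be used to print the value Hadoop actually resolved for a given key.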

(3) Hadoop ships only a template for mapred-site.xml, not the file itself. Copy the template, then open mapred-site.xml

cp mapred-site.xml.template mapred-site.xml
vim mapred-site.xml

(z1 is the host name or IP address of the master node; replace it with your own)

<configuration>
<property>
<!--Specify Mapreduce to run on yarn-->
   <name>mapreduce.framework.name</name>
   <value>yarn</value>
 </property>
<!--Enable uber mode, which runs small MapReduce jobs inside a single JVM-->
<property>
      <name>mapreduce.job.ubertask.enable</name>
      <value>true</value>
</property>
<property>
      <name>mapred.job.tracker</name>
      <value>z1:9001</value>
</property>
 
<property>
<name>mapreduce.jobhistory.address</name>
<value>z1:10020</value>
</property>
</configuration>

(4) Open yarn-site.xml

vim yarn-site.xml

(z1 is the host name or IP address of the master node; replace it with your own)

<configuration>
 
<!-- Site specific YARN configuration properties -->
 
<!--Configure the location of the yarn master node-->
<property>
        <name>yarn.resourcemanager.hostname</name>
        <value>z1</value>
</property>
<property>
<description>The address of the applications manager interface in the RM.</description>
     <name>yarn.resourcemanager.address</name>
     <value>z1:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>z1:8030</value>
</property>
 
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>z1:8088</value>
</property>
 
<property>
  <name>yarn.resourcemanager.webapp.https.address</name>
  <value>z1:8090</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>z1:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>z1:8033</value>
</property>
<property><!--Auxiliary service that MapReduce uses to fetch data during the shuffle phase-->
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
<!--Maximum memory, in MB, that a single container request may ask for-->
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>2024</value>
  <description>Maximum memory allocation per container request, unit: MB, default: 8192</description>
</property>
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>2.1</value>
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
 
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
 
 
</configuration>
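
The two memory settings above interact: the scheduler accepts container requests of up to 2024 MB, but each NodeManager offers only 1024 MB in total, so a request larger than 1024 MB could not be placed on any node configured this way. The sketch below illustrates the arithmetic only; the helper name largest_placeable_container_mb is hypothetical and is not part of Hadoop.

```shell
# Hypothetical helper: the largest container a single node can actually host is
# the smaller of the scheduler's per-request ceiling and the node's total memory.
#   $1 = yarn.scheduler.maximum-allocation-mb
#   $2 = yarn.nodemanager.resource.memory-mb
largest_placeable_container_mb() {
    if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
}

largest_placeable_container_mb 2024 1024   # prints 1024
```

If larger containers are needed, raising yarn.nodemanager.resource.memory-mb to match the memory available on your machines is the usual adjustment.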

(5) Open hdfs-site.xml

vim hdfs-site.xml

<configuration>
<property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/hadoop/hadoop-2.6.0/hadoopDesk/namenodeDatas</value>
</property>
 <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/hadoop/hadoop-2.6.0/hadoopDatas/namenodeDatas</value>
    </property>
<property>
<!--Number of replicas kept for each block-->
<name>dfs.replication</name>
<value>3</value>
</property>
<!--Disable HDFS permission checking (dfs.permissions.enabled replaces the deprecated dfs.permissions)-->
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<!--HDFS block size in bytes: 128 MB-->
<property>
<name>dfs.blocksize</name>
<value>134217728</value>
</property>
</configuration>
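
The block size value in hdfs-site.xml is given in bytes. The short sketch below shows where 134217728 comes from and how many blocks a file of a given size would occupy. The helper name blocks_for_file is hypothetical.

```shell
# 128 MB expressed in bytes, the block size value used in hdfs-site.xml.
echo $((128 * 1024 * 1024))    # prints 134217728

# Hypothetical helper: number of HDFS blocks needed for a file (ceiling division).
#   $1 = file size in bytes, $2 = block size in bytes
blocks_for_file() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

blocks_for_file $((300 * 1024 * 1024)) 134217728   # a 300 MB file: prints 3
```

With a replication factor of 3, each of those blocks is stored three times across the cluster.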

6. Edit the slaves file to list the master and slave nodes

vim slaves

List the host names of your own master and slave nodes, one per line (mine are z1, z2, and z3)
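
As an alternative to editing the file by hand, the slaves file can be written from the shell. This is a minimal sketch with a hypothetical helper named write_slaves. The host names z1, z2, and z3 are the ones used in this article.

```shell
# Hypothetical helper: write one host name per line to a slaves file.
#   $1 = path to the slaves file, remaining arguments = host names
write_slaves() {
    file=$1
    shift
    printf '%s\n' "$@" > "$file"
}

# Example, run from /usr/hadoop/hadoop-2.6.0/etc/hadoop on the master node:
#   write_slaves slaves z1 z2 z3
#   cat slaves
```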

7. Copy the files to the other virtual machines

scp -r /etc/profile root@z2:/etc/profile    # copy the profile with the environment variables to node z2
scp -r /etc/profile root@z3:/etc/profile    # copy the profile with the environment variables to node z3
scp -r /usr/hadoop root@z2:/usr/            # copy the Hadoop directory to node z2
scp -r /usr/hadoop root@z3:/usr/            # copy the Hadoop directory to node z3
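
With more than two slave nodes, a loop saves typing. The sketch below only prints the scp commands (a dry run) so they can be reviewed before anything is copied. The helper name print_distribution_commands is hypothetical.

```shell
# Hypothetical helper: print the scp commands needed to copy the profile and the
# Hadoop directory to each node given as an argument (a dry run; nothing is copied).
print_distribution_commands() {
    for node in "$@"; do
        echo "scp /etc/profile root@$node:/etc/profile"
        echo "scp -r /usr/hadoop root@$node:/usr/"
    done
}

print_distribution_commands z2 z3
# To execute the commands instead of printing them:
#   print_distribution_commands z2 z3 | sh
```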

Then run the following on each of the two slave nodes so that the environment variables take effect there

source /etc/profile

8. Format the NameNode (on the master node only)

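First run jps to confirm that no Hadoop processes are already running, then format the NameNode (in Hadoop 2.x, hdfs namenode -format replaces the deprecated hadoop namenode -format)

hdfs namenode -format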

When you see Exiting with status 0, it means the formatting is successful.

9. Return to the Hadoop directory and start Hadoop (on the master node only)

cd /usr/hadoop/hadoop-2.6.0
sbin/start-all.sh    # start Hadoop; run this only on the master node

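Running jps on the master node should list the NameNode, SecondaryNameNode, and ResourceManager processes (plus DataNode and NodeManager if the master is also listed in the slaves file).

Running jps on a slave node should list the DataNode and NodeManager processes.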
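
The jps check can be scripted. The sketch below reads jps output and reports any expected daemon that is not running. The helper name missing_daemons is hypothetical, and it assumes the usual jps output format of a process ID followed by the class name.

```shell
# Hypothetical helper: read `jps` output on stdin and print each expected daemon
# (given as arguments) that is missing. Prints nothing when all are present.
missing_daemons() {
    running=$(awk '{print $2}')
    for daemon in "$@"; do
        echo "$running" | grep -qx "$daemon" || echo "$daemon"
    done
}

# On the master node:
#   jps | missing_daemons NameNode SecondaryNameNode ResourceManager
# On a slave node:
#   jps | missing_daemons DataNode NodeManager
```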
