Detailed steps to install Hadoop cluster under Linux

1. Create a hadoop directory under /usr, move the installation package into it, and extract it.
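
For example, assuming the package is named hadoop-2.6.0.tar.gz and sits in the current directory (adjust the file name to match your download):

mkdir -p /usr/hadoop
mv hadoop-2.6.0.tar.gz /usr/hadoop/
cd /usr/hadoop
tar -zxvf hadoop-2.6.0.tar.gz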

2. Open the /etc/profile file with vim and add the following configuration

#hadoop
export HADOOP_HOME=/usr/hadoop/hadoop-2.6.0
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib
export PATH=$PATH:$HADOOP_HOME/bin 

3. Make the configuration take effect

source /etc/profile 
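
To confirm the variables were picked up, you can run the following; it should print the Hadoop 2.6.0 version banner:

hadoop version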

4. Enter the Hadoop directory

cd /usr/hadoop/hadoop-2.6.0/etc/hadoop 

5. Edit the configuration files

(1) Open hadoop-env.sh with vim and add the location of your Java JDK:

export JAVA_HOME=/usr/java/jdk1.8.0_181 

(2) Open core-site.xml with vim (z1 is the IP or mapped hostname of the master node; change it to your own):

<configuration>
<property>
        <name>hadoop.tmp.dir</name>
        <value>file:/root/hadoop/tmp</value>
</property>
<!--Port number 9000-->
<property>
        <name>fs.default.name</name>
        <value>hdfs://z1:9000</value>
</property>
<!--Enable the trash mechanism; retention interval in minutes-->
<property>
    <name>fs.trash.interval</name>
    <value>10080</value>
</property>
<!--I/O buffer size; tune according to server performance-->
<property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
</property>
</configuration>
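
Since hadoop.tmp.dir above points at /root/hadoop/tmp, it can help to create that directory up front (assuming you keep this path; otherwise adjust it):

mkdir -p /root/hadoop/tmp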

(3) Hadoop ships without a mapred-site.xml file, so copy it from the template and then open it:

cp mapred-site.xml.template mapred-site.xml
vim mapred-site.xml

(z1 is the IP or mapped hostname of the master node; change it to your own)

<configuration>
<property>
<!--Run MapReduce on YARN-->
   <name>mapreduce.framework.name</name>
   <value>yarn</value>
 </property>
<!--Enable MapReduce uber (small-job) mode-->
<property>
      <name>mapreduce.job.ubertask.enable</name>
      <value>true</value>
</property>
<property>
<!--MRv1 JobTracker address; ignored when MapReduce runs on YARN-->
      <name>mapred.job.tracker</name>
      <value>z1:9001</value>
</property>
 
<property>
<name>mapreduce.jobhistory.address</name>
<value>z1:10020</value>
</property>
</configuration> 
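
The mapreduce.jobhistory.address entry only has an effect while the JobHistory Server is running; after starting the cluster you can launch it from the Hadoop directory (assuming the standard sbin layout):

sbin/mr-jobhistory-daemon.sh start historyserver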

(4) Open yarn-site.xml:

vim yarn-site.xml

(z1 is the IP or mapped hostname of the master node; change it to your own)

<configuration>
 
<!-- Site specific YARN configuration properties -->
 
<!--Configure the location of the yarn master node-->
<property>
        <name>yarn.resourcemanager.hostname</name>
        <value>z1</value>
</property>
<property>
<description>The address of the applications manager interface in the RM.</description>
     <name>yarn.resourcemanager.address</name>
     <value>z1:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>z1:8030</value>
</property>
 
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>z1:8088</value>
</property>
 
<property>
  <name>yarn.resourcemanager.webapp.https.address</name>
  <value>z1:8090</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>z1:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>z1:8033</value>
</property>
<property><!--How MapReduce obtains data during the shuffle phase-->
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
<!--YARN memory allocation settings-->
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>2024</value>
  <description>Maximum allocation per container request, in MB (default: 8192)</description>
</property>
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>2.1</value>
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
 
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
 
 
</configuration>

(5) Open hdfs-site.xml:

vim hdfs-site.xml 

<configuration>
<property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/hadoop/hadoop-2.6.0/hadoopDatas/namenodeDatas</value>
</property>
 <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/hadoop/hadoop-2.6.0/hadoopDatas/datanodeDatas</value>
    </property>
<property>
<!--Number of replicas-->
<name>dfs.replication</name>
<value>3</value>
</property>
<!--Disable HDFS permission checking-->
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<!--HDFS block size: 128 MB-->
<property>
<name>dfs.blocksize</name>
<value>134217728</value>
</property>
</configuration>

6. Open the slaves file and add the master and slave nodes

vim slaves

Add your own master and slave node hostnames (mine are z1, z2, and z3), one per line:
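
With the hostnames used in this tutorial, the file looks like:

z1
z2
z3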

7. Copy the files to the other virtual machines

scp -r /etc/profile root@z2:/etc/profile   # Distribute the environment variable file to the z2 node
scp -r /etc/profile root@z3:/etc/profile   # Distribute the environment variable file to the z3 node
scp -r /usr/hadoop root@z2:/usr/           # Distribute the hadoop directory to the z2 node
scp -r /usr/hadoop root@z3:/usr/           # Distribute the hadoop directory to the z3 node

Make the environment variables take effect on the two slave nodes (run on each of them):

source /etc/profile

8. Format HDFS (on the master node only)

First use jps to check whether Hadoop is already running; if it is, stop it before formatting.
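
A quick check (the stop command assumes you are in /usr/hadoop/hadoop-2.6.0):

jps                 # Lists running Java processes
sbin/stop-all.sh    # Only needed if Hadoop processes are still running

Then format the NameNode: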

hadoop namenode -format

When you see "Exiting with status 0" in the output, the formatting succeeded.

9. Return to the Hadoop directory (operate only on the master node)

cd /usr/hadoop/hadoop-2.6.0
sbin/start-all.sh   # Start Hadoop; run on the master node only

Running jps on the master node should show output like the following:
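
The original screenshot is not reproduced here. Since z1 is also listed in slaves, the master typically runs both the master and worker daemons; as a sketch of typical output (process IDs will differ on your machine):

2481 NameNode
2584 DataNode
2662 SecondaryNameNode
2815 ResourceManager
2920 NodeManager
3388 Jps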

Running jps on a slave node should show:
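
Again as a sketch of typical output (process IDs will differ):

2305 DataNode
2410 NodeManager
2550 Jps

You can also confirm the cluster is up by opening the ResourceManager web UI configured above at http://z1:8088.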

This concludes the detailed steps for installing a Hadoop cluster under Linux. For more on installing Hadoop clusters under Linux, please search previous articles on 123WORDPRESS.COM. I hope you will continue to support 123WORDPRESS.COM!
