Reinventing the wheel a little, here we use repackaging to build a Docker-based Hadoop image. The software a Hadoop cluster depends on amounts to a JDK and SSH, so as long as those two items plus the Hadoop package itself are baked into the image, it can serve as any node of the cluster.

Configuration file preparation

1. Hadoop-related configuration files: core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, slaves, hadoop-env.sh (minimal sketches of these files are given near the end of this article)

Making the image

1. Install dependencies

RUN apt-get update && \
  apt-get install -y openssh-server openjdk-8-jdk wget

2. Download the Hadoop package

RUN wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.10.0/hadoop-2.10.0.tar.gz && \
  tar -xzvf hadoop-2.10.0.tar.gz && \
  mv hadoop-2.10.0 /usr/local/hadoop && \
  rm hadoop-2.10.0.tar.gz && \
  rm -rf /usr/local/hadoop/share/doc

3. Configure environment variables

ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
ENV HADOOP_HOME=/usr/local/hadoop
ENV PATH=$PATH:/usr/local/hadoop/bin:/usr/local/hadoop/sbin

4. Generate an SSH key for password-free login between nodes

RUN ssh-keygen -t rsa -f ~/.ssh/id_rsa -P '' && \
  cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

5. Create the Hadoop-related directories, copy the configuration files into place, add execute permission to the scripts, and finally format the namenode. Each node also needs to start the ssh service when it boots (a CMD sketch for that step is given near the end of this article).

RUN mkdir -p ~/hdfs/namenode && \
  mkdir -p ~/hdfs/datanode && \
  mkdir $HADOOP_HOME/logs

COPY config/* /tmp/

# Copy the ssh and hadoop configuration files into place
RUN mv /tmp/ssh_config ~/.ssh/config && \
  mv /tmp/hadoop-env.sh /usr/local/hadoop/etc/hadoop/hadoop-env.sh && \
  mv /tmp/hdfs-site.xml $HADOOP_HOME/etc/hadoop/hdfs-site.xml && \
  mv /tmp/core-site.xml $HADOOP_HOME/etc/hadoop/core-site.xml && \
  mv /tmp/mapred-site.xml $HADOOP_HOME/etc/hadoop/mapred-site.xml && \
  mv /tmp/yarn-site.xml $HADOOP_HOME/etc/hadoop/yarn-site.xml && \
  mv /tmp/slaves $HADOOP_HOME/etc/hadoop/slaves && \
  mv /tmp/start-hadoop.sh ~/start-hadoop.sh && \
  mv /tmp/run-wordcount.sh ~/run-wordcount.sh

# Add execute permission
RUN chmod +x ~/start-hadoop.sh && \
  chmod +x ~/run-wordcount.sh && \
  chmod +x $HADOOP_HOME/sbin/start-dfs.sh && \
  chmod +x $HADOOP_HOME/sbin/start-yarn.sh

# Format the namenode
RUN /usr/local/hadoop/bin/hdfs namenode -format

Running the Hadoop cluster in Docker

After the image is built from the Dockerfile above, it can be used to assemble a Hadoop cluster. Here we start one master and two slave nodes.

Add a bridge network:

docker network create --driver=bridge solinx-hadoop

Start the Master node (container port 50070 is the HDFS web UI and 8088 is the YARN web UI; 50070 is mapped to 10070 on the host):

docker run -itd --net=solinx-hadoop -p 10070:50070 -p 8088:8088 --name solinx-hadoop-master --hostname solinx-hadoop-master solinx/hadoop:0.1

Start the Slave1 node:

docker run -itd --net=solinx-hadoop --name solinx-hadoop-slave1 --hostname solinx-hadoop-slave1 solinx/hadoop:0.1

Start the Slave2 node:

docker run -itd --net=solinx-hadoop --name solinx-hadoop-slave2 --hostname solinx-hadoop-slave2 solinx/hadoop:0.1

Finally, enter the Master node and execute the script to start the Hadoop cluster:
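The original does not show the command itself; here is a minimal sketch, assuming the start script sits at ~/start-hadoop.sh as in the Dockerfile above:

# attach a shell to the running master container
docker exec -it solinx-hadoop-master bash
# inside the container: launch HDFS and YARN across all nodes
~/start-hadoop.sh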
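The contents of start-hadoop.sh are not shown in the original either. Since the Dockerfile marks start-dfs.sh and start-yarn.sh executable, a plausible sketch is simply:

#!/bin/bash
# bring up HDFS: the namenode on this host, datanodes on every host in the slaves file
$HADOOP_HOME/sbin/start-dfs.sh
# bring up YARN: the resourcemanager here, nodemanagers on the slaves
$HADOOP_HOME/sbin/start-yarn.sh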
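For reference, here are minimal sketches of the configuration files listed at the start of the article; the original does not show their contents, so treat these as illustrative assumptions rather than the author's exact files. core-site.xml points every node at the namenode on the master (the hostname matches the --hostname flag used above; port 9000 is a conventional choice):

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- the default filesystem: the namenode running on the master container -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://solinx-hadoop-master:9000</value>
  </property>
</configuration>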
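hdfs-site.xml sets the replication factor (two datanodes, so at most 2) and points HDFS at the directories the Dockerfile creates; the image runs as root, so ~ resolves to /root:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- two datanodes in this cluster -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <!-- match the directories created by the Dockerfile -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///root/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///root/hdfs/datanode</value>
  </property>
</configuration>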
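mapred-site.xml only needs to declare YARN as the execution framework, and yarn-site.xml tells the nodemanagers where the resourcemanager lives and enables the shuffle service MapReduce requires:

mapred-site.xml:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

yarn-site.xml:

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>solinx-hadoop-master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>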
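The slaves file simply lists the datanode hostnames, matching the container hostnames chosen above:

solinx-hadoop-slave1
solinx-hadoop-slave2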
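hadoop-env.sh only needs to export JAVA_HOME, and ssh_config disables host-key prompts so the start scripts can ssh between nodes non-interactively; again, sketches:

hadoop-env.sh:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

ssh_config:

Host *
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null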
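Finally, the article notes that each node should start the ssh service on boot, but the Dockerfile fragment above stops before that step. One common way to finish the Dockerfile (an assumption, not necessarily the author's exact line) is:

# start sshd when the container boots, then keep an interactive shell alive
CMD [ "sh", "-c", "service ssh start; bash" ]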
Summary

The above is what I introduced to you about creating a Hadoop image and running a Hadoop cluster in Docker. I hope it will be helpful to you. If you have any questions, please leave me a message and I will reply to you in time.