Building a Hadoop Yarn cluster without Kerberos authentication
SuperMap iServer distributed analysis supports Hadoop Yarn clusters, and you can build one yourself by following the process below. This chapter describes how to set up a Hadoop Yarn cluster without Kerberos authentication.
To set up a Hadoop Yarn cluster environment, you need to configure the Java environment (JDK 8 or later is recommended; JDK download address: http://www.oracle.com/technetwork/java/javase/downloads/index-jsp-138363.html#javasejdk), SSH, and Hadoop.
The software used in this example is:
Hadoop installation package: hadoop-2.7.3.tar.gz, stored in /home/iserver
JDK installation package: jdk-8u131-linux-x64.tar.gz
In this example, a Hadoop Yarn cluster with one master and one worker is built on two Ubuntu machines (each with 12 GB of memory). The steps are as follows:
Configure the Java environment by adding the following lines to /etc/profile:
export JAVA_HOME=/home/supermap/java/jdk1.8.0_131
export PATH=${JAVA_HOME}/bin:$PATH
Run source /etc/profile to make the environment variables take effect.
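The Java environment step can be scripted so that repeated runs do not add the same lines twice. The following is a minimal sketch, not part of the original procedure: the function name add_java_env and the PROFILE variable are hypothetical, and PROFILE defaults to a scratch file so that nothing is written to /etc/profile unless you choose to.

```shell
#!/bin/sh
# Append the JDK variables to a profile file only if they are not already there.
# PROFILE defaults to a scratch file so the sketch is safe to try;
# set PROFILE=/etc/profile (as root) on a real node.
PROFILE="${PROFILE:-$(mktemp)}"
JDK_DIR="/home/supermap/java/jdk1.8.0_131"   # JDK path used in this example

add_java_env() {
    # $1 = profile file, $2 = JDK directory (hypothetical helper)
    if ! grep -q '^export JAVA_HOME=' "$1" 2>/dev/null; then
        {
            printf 'export JAVA_HOME=%s\n' "$2"
            printf 'export PATH=${JAVA_HOME}/bin:$PATH\n'
        } >> "$1"
    fi
}

add_java_env "$PROFILE" "$JDK_DIR"
add_java_env "$PROFILE" "$JDK_DIR"   # second call changes nothing
cat "$PROFILE"
```

After sourcing the real profile, java -version is a quick way to confirm that the JDK on the PATH is the intended one.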
Run ssh-keygen -t rsa -P '' on the master and on the worker (-P sets the passphrase; if you omit it, press Enter three times to accept the defaults). The command generates the private key file (id_rsa) and the public key file (id_rsa.pub) in the /home/hdfs/.ssh directory. Then run the following command on the master and on the worker:
ssh-copy-id -i /home/hdfs/.ssh/id_rsa.pub ip
When running the command on the master, replace ip with the IP of the worker node; when running it on the worker, replace ip with the IP of the master node. Then run ssh worker on the master and ssh master on the worker to verify that the configuration is successful.
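Because each node copies its key to the other one, it is easy to mix up the addresses. The sketch below only prints the commands for each role so they can be reviewed before running them. The worker address 192.168.112.132 and the helper names peer_ip and print_ssh_setup are hypothetical; the master address is the one used in the configuration files of this example.

```shell
#!/bin/sh
MASTER_IP="192.168.112.131"   # master address used in this example's configuration
WORKER_IP="192.168.112.132"   # hypothetical worker address, replace with your own
KEY="/home/hdfs/.ssh/id_rsa"  # key location named in the step above

# Print the address of the other node for a given role.
peer_ip() {
    case "$1" in
        master) echo "$WORKER_IP" ;;
        worker) echo "$MASTER_IP" ;;
        *) echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the commands for one node.
print_ssh_setup() {
    peer=$(peer_ip "$1") || return 1
    echo "# on the $1:"
    echo "ssh-keygen -t rsa -P '' -f $KEY"
    echo "ssh-copy-id -i $KEY.pub $peer"
    echo "ssh -o BatchMode=yes $peer true"
}

print_ssh_setup master
print_ssh_setup worker
```

The last printed command uses BatchMode=yes so that ssh fails instead of prompting when key-based login is not working, which makes it usable as a check.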
Place a complete Hadoop package on the master node and configure it as follows:
Enter the /home/supermap/hadoop-2.7.6/etc/hadoop directory:
Set JAVA_HOME in hadoop-env.sh:
export JAVA_HOME=/home/supermap/java/jdk1.8.0_131
Set JAVA_HOME in yarn-env.sh:
export JAVA_HOME=/home/supermap/java/jdk1.8.0_131
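The sketch below assumes that the two JAVA_HOME lines above are meant for hadoop-env.sh and yarn-env.sh, the two environment scripts in a Hadoop 2.x etc/hadoop directory. It replaces an active JAVA_HOME export if one exists and appends one otherwise. The helper name set_java_home is hypothetical, and when CONF_DIR is not set the script works on stand-in files in a scratch directory.

```shell
#!/bin/sh
# On a real node: CONF_DIR=/home/supermap/hadoop-2.7.6/etc/hadoop
if [ -z "$CONF_DIR" ]; then
    CONF_DIR=$(mktemp -d)
    # Stand-in files for trying the sketch; real env scripts are much longer.
    echo 'export JAVA_HOME=${JAVA_HOME}' > "$CONF_DIR/hadoop-env.sh"
    echo '# export JAVA_HOME=/some/old/jdk' > "$CONF_DIR/yarn-env.sh"
fi
JDK_DIR="/home/supermap/java/jdk1.8.0_131"

set_java_home() {
    # $1 = env script; replace an active JAVA_HOME export, else append one
    if grep -q '^export JAVA_HOME=' "$1"; then
        tmp=$(mktemp)
        sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$JDK_DIR|" "$1" > "$tmp" \
            && cat "$tmp" > "$1"
        rm -f "$tmp"
    else
        echo "export JAVA_HOME=$JDK_DIR" >> "$1"
    fi
}

for f in hadoop-env.sh yarn-env.sh; do
    set_java_home "$CONF_DIR/$f"
    grep '^export JAVA_HOME=' "$CONF_DIR/$f"
done
```

Setting the path explicitly in these scripts matters because daemons started over SSH do not necessarily source /etc/profile.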
Edit yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>192.168.112.131</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>192.168.112.131:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>192.168.112.131:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>192.168.112.131:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>192.168.112.131:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>192.168.112.131:8088</value>
</property>
<!--<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>8192</value>
</property>-->
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>4096</value>
</property>
</configuration>
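The listing above holds YARN properties, which Hadoop reads from yarn-site.xml. To see how the memory settings relate, the sketch below reads values from such a file and computes how many containers of the maximum size fit on one node. The helper get_prop is hypothetical and is not a general XML parser: it assumes one tag per line, as in this example. The fallback of 8192 MB is Hadoop 2.x's default for yarn.nodemanager.resource.memory-mb when the property is left commented out.

```shell
#!/bin/sh
# get_prop FILE NAME: print the <value> following <name>NAME</name>,
# skipping anything inside <!-- ... --> comments.
get_prop() {
    awk -v key="$2" '
        /<!--/ { c = 1 }
        c { if (/-->/) c = 0; next }
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            gsub(/.*<value>|<\/value>.*/, "")
            print
            exit
        }' "$1"
}

# A cut-down copy of the listing, written to a scratch file for the demo.
SITE=$(mktemp)
cat > "$SITE" <<'EOF'
<configuration>
<!--<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>8192</value>
</property>-->
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>4096</value>
</property>
</configuration>
EOF

max_alloc=$(get_prop "$SITE" yarn.scheduler.maximum-allocation-mb)
node_mem=$(get_prop "$SITE" yarn.nodemanager.resource.memory-mb)
node_mem=${node_mem:-8192}   # default applies because the property is commented out
echo "largest container: ${max_alloc} MB, NodeManager memory: ${node_mem} MB"
echo "full-size containers per node: $((node_mem / max_alloc))"
```

With the values of this example, a single request can ask for at most 4096 MB, and a NodeManager offering 8192 MB can host two such containers, which leaves headroom on a 12 GB machine for the operating system and the Hadoop daemons.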
Edit core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.112.131:9000</value>
</property>
<!-- <property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>-->
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/supermap/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
</configuration>
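The two values in the listing above (core-site.xml properties) are URIs rather than plain host names or paths. As a small illustration, the sketch below splits fs.defaultFS into the NameNode host and port and strips the file: prefix from hadoop.tmp.dir using plain shell parameter expansion. The variable names are chosen for this sketch only.

```shell
#!/bin/sh
DEFAULT_FS="hdfs://192.168.112.131:9000"   # fs.defaultFS from this example
TMP_URI="file:/home/supermap/hadoop/tmp"   # hadoop.tmp.dir from this example

hostport=${DEFAULT_FS#hdfs://}   # strip the scheme
host=${hostport%%:*}             # text before the first colon
port=${hostport##*:}             # text after the last colon
tmp_path=${TMP_URI#file:}        # local path behind the file: prefix

echo "NameNode RPC endpoint: host=$host port=$port"
echo "local temp directory:  $tmp_path"
```

Once the cluster is running, clients address HDFS through this endpoint, for example with hdfs dfs -ls hdfs://192.168.112.131:9000/.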
Edit hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.http-address</name>
<value>192.168.112.131:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>192.168.112.131:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/supermap/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/supermap/hadoop/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
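Two points in the HDFS listing above are worth checking before the first start: the local directories should exist and be writable by the Hadoop user, and dfs.replication should not exceed the number of DataNodes. The sketch below illustrates both under stated assumptions. PREFIX is a scratch directory so the script is safe to try, the helper check_replication is hypothetical, and the DataNode counts passed to it are examples, since the text does not say whether the master also runs a DataNode.

```shell
#!/bin/sh
PREFIX=$(mktemp -d)   # scratch prefix for the demo; use PREFIX="" on a real node

# Create the NameNode and DataNode directories named in the listing.
for uri in file:/home/supermap/hadoop/hdfs/name file:/home/supermap/hadoop/hdfs/data; do
    dir="$PREFIX${uri#file:}"   # drop the file: prefix to get a local path
    mkdir -p "$dir"
    echo "created $dir"
done

# check_replication REPLICATION DATANODES
check_replication() {
    if [ "$1" -gt "$2" ]; then
        echo "dfs.replication=$1 exceeds $2 DataNode(s); blocks would stay under-replicated"
        return 1
    fi
    echo "dfs.replication=$1 fits $2 DataNode(s)"
}

check_replication 2 2           # master and worker both run a DataNode
check_replication 2 1 || true   # only the worker runs a DataNode
```

If only the worker stores data, lowering dfs.replication to 1 avoids permanent under-replication warnings.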
Edit mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>192.168.112.131:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>192.168.112.131:19888</value>
</property>
</configuration>
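The MapReduce properties above belong in mapred-site.xml. Hadoop 2.7.x packages ship only mapred-site.xml.template in etc/hadoop, so the file usually has to be created from the template first. The sketch below does this without overwriting an existing file. The helper name ensure_mapred_site is hypothetical, and when CONF_DIR is not set a scratch directory with a stand-in template is used.

```shell
#!/bin/sh
# On a real node: CONF_DIR=/home/supermap/hadoop-2.7.6/etc/hadoop
if [ -z "$CONF_DIR" ]; then
    CONF_DIR=$(mktemp -d)
    # Stand-in template so the sketch can be tried anywhere.
    printf '<configuration>\n</configuration>\n' > "$CONF_DIR/mapred-site.xml.template"
fi

ensure_mapred_site() {
    # $1 = configuration directory
    if [ ! -f "$1/mapred-site.xml" ]; then
        cp "$1/mapred-site.xml.template" "$1/mapred-site.xml"
        echo "created $1/mapred-site.xml from the template"
    else
        echo "$1/mapred-site.xml already exists, left untouched"
    fi
}

ensure_mapred_site "$CONF_DIR"
ensure_mapred_site "$CONF_DIR"   # second call leaves the file alone
```

The two jobhistory addresses only take effect when the JobHistory server is running; in Hadoop 2.x it is commonly started with sbin/mr-jobhistory-daemon.sh start historyserver.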
Copy the configured directory to the worker node:
scp -r /home/supermap/hadoop-2.7.6/etc/hadoop root@worker:/home/supermap/hadoop-2.7.6/etc/
The worker is now a child node of the Yarn cluster. The master and slaves files must also be modified on the child nodes. At this point, a simple Yarn cluster has been set up.
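The text does not show the contents of the slaves file. In Hadoop 2.x it lists one worker host name per line, and the start scripts use it to decide where to launch DataNodes and NodeManagers. The sketch below writes such a file for the single host name worker used in this example. The helper write_slaves is hypothetical, and CONF_DIR defaults to a scratch directory.

```shell
#!/bin/sh
# On a real node: CONF_DIR=/home/supermap/hadoop-2.7.6/etc/hadoop
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

write_slaves() {
    # $1 = configuration directory, remaining arguments = worker host names
    dir=$1
    shift
    : > "$dir/slaves"              # start from an empty file
    for h in "$@"; do
        echo "$h" >> "$dir/slaves"
    done
}

write_slaves "$CONF_DIR" worker
cat "$CONF_DIR/slaves"
```

Once the files are in place on every node, a Hadoop 2.x cluster is typically initialized with bin/hdfs namenode -format and started with sbin/start-dfs.sh and sbin/start-yarn.sh on the master. These commands are not part of the walkthrough above, so treat them as a pointer rather than as the documented procedure.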