Hadoop 3.1.0 fully distributed cluster deployment; the layout across the three servers is as follows
# After deployment is complete
root@servera:/opt/hadoop/hadoop-3.1.0# jps
14056 SecondaryNameNode
14633 Jps
13706 NameNode
14317 ResourceManager
root@serverb:~# jps
5288 NodeManager
5162 DataNode
5421 Jps
root@serverc:~# jps
4545 NodeManager
4371 DataNode
4678 Jps
As the listing above shows, the cluster consists of three machines: servera acts as the master and the other two as workers.
2. Deployment - preparation (perform the following on all three machines)
- 2.1. Configure host name resolution
vim /etc/hosts
10.80.80.110 servera
10.80.80.111 serverb
10.80.80.112 serverc
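A quick check that the host names resolve on each machine after saving /etc/hosts:
ping -c 1 serverb
ping -c 1 serverc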
- 2.2. Install the JDK [all three machines]
tar -zxf jdk-8u172-linux-x64.tar.gz
mkdir -p /opt/java
mv jdk1.8.0_172/ /opt/java/
vim /etc/profile.d/jdk-1.8.sh
export JAVA_HOME=/opt/java/jdk1.8.0_172
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
# Make the environment variables take effect
source /etc/profile
# Check the Java version
java -version
- 2.3. Passwordless SSH to localhost [all three machines; the first ssh localhost asks for confirmation, answer yes]
ssh localhost
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
- 2.4. Passwordless login from servera to the other machines (master logs into the workers without a password) [run only on servera]
ssh-copy-id -i ~/.ssh/id_rsa.pub servera
ssh-copy-id -i ~/.ssh/id_rsa.pub serverb
ssh-copy-id -i ~/.ssh/id_rsa.pub serverc
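A quick verification from servera that the key copy worked; each command should print the worker's hostname without prompting for a password:
ssh serverb hostname
ssh serverc hostname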
3. Hadoop 3 configuration files
Six files under /opt/hadoop/hadoop-3.1.0/etc/hadoop/ need to be edited: hadoop-env.sh, core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml, and workers. (This assumes the Hadoop 3.1.0 distribution has already been extracted to /opt/hadoop/hadoop-3.1.0.)
- 3.1. hadoop-env.sh: add the following
export JAVA_HOME=/opt/java/jdk1.8.0_172/
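Since the daemons here run as root (see the jps listing at the top), the Hadoop 3 start scripts may also refuse to run unless the daemon users are declared. A minimal sketch of the extra hadoop-env.sh lines, assuming everything runs as root:
# declare which user runs each daemon (assumption: root, matching the prompts above)
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root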
- 3.2. core-site.xml
<configuration>
<!-- Set the default file system to the NameNode on servera -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://servera:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
- 3.3. hdfs-site.xml
<configuration>
<!-- Configurations for NameNode: -->
<property>
<name>dfs.namenode.name.dir</name>
<value>/var/lib/hadoop/hdfs/name/</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>268435456</value>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>100</value>
</property>
<!-- Configurations for DataNode: -->
<property>
<name>dfs.datanode.data.dir</name>
<value>/var/lib/hadoop/hdfs/data/</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
- 3.4. yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>servera</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>800</value>
</property>
<!-- Configurations for History Server (Needs to be moved elsewhere): -->
</configuration>
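Note that yarn.nodemanager.resource.memory-mb is set to 800 MB here, below YARN's default yarn.scheduler.minimum-allocation-mb of 1024 MB, so container requests may never fit on a node and jobs can hang in the ACCEPTED state. If that happens, one option is to lower the minimum allocation inside the same <configuration> block; a sketch with an illustrative value (256 is an assumption, not from the original setup):
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>256</value>
</property>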
- 3.5. mapred-site.xml
Add mapreduce.application.classpath, otherwise jobs fail with: Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster.
<configuration>
<!-- Configurations for MapReduce Applications: -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>
/opt/hadoop/hadoop-3.1.0/etc/hadoop,
/opt/hadoop/hadoop-3.1.0/share/hadoop/common/*,
/opt/hadoop/hadoop-3.1.0/share/hadoop/common/lib/*,
/opt/hadoop/hadoop-3.1.0/share/hadoop/hdfs/*,
/opt/hadoop/hadoop-3.1.0/share/hadoop/hdfs/lib/*,
/opt/hadoop/hadoop-3.1.0/share/hadoop/mapreduce/*,
/opt/hadoop/hadoop-3.1.0/share/hadoop/mapreduce/lib/*,
/opt/hadoop/hadoop-3.1.0/share/hadoop/yarn/*,
/opt/hadoop/hadoop-3.1.0/share/hadoop/yarn/lib/*
</value>
</property>
</configuration>
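- 3.6. workers
Based on the cluster layout at the top (serverb and serverc run the DataNode and NodeManager daemons), the workers file lists one worker hostname per line:
serverb
serverc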
4. Copy the Hadoop files to the other machines, configure the Hadoop environment variables, format HDFS, and start, check, stop, and reset the cluster
- 4.1. Copy the Hadoop directory configured in step 3 (/opt/hadoop/hadoop-3.1.0) to the same location on the other machines, for example with scp as sketched below.
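A sketch, assuming /opt/hadoop already exists on serverb and serverc and the root SSH access from step 2.4 is in place:
scp -r /opt/hadoop/hadoop-3.1.0 serverb:/opt/hadoop/
scp -r /opt/hadoop/hadoop-3.1.0 serverc:/opt/hadoop/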
- 4.2. Configure the Hadoop environment variables [all three machines]
vim /etc/profile.d/hadoop-3.1.0.sh
export HADOOP_HOME="/opt/hadoop/hadoop-3.1.0"
export PATH="$HADOOP_HOME/bin:$PATH"
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
source /etc/profile
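A quick sanity check that the new PATH is picked up:
hadoop version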
- 4.3. Format HDFS [only for the first deployment] [operate with caution; run only on servera]
/opt/hadoop/hadoop-3.1.0/bin/hdfs namenode -format myClusterName
- 4.4. Start the cluster [run on servera]
/opt/hadoop/hadoop-3.1.0/sbin/start-dfs.sh
/opt/hadoop/hadoop-3.1.0/sbin/start-yarn.sh
- 4.5. Check the running processes with jps
jps
- 4.6. Check the web UI at localhost:8088 [localhost here means servera's localhost; it can also be replaced with the external IP, see step 3.4, yarn-site.xml]
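Beyond jps and the web UI, the HDFS and YARN command-line tools can confirm that both workers have registered (run on servera once the cluster is up):
/opt/hadoop/hadoop-3.1.0/bin/hdfs dfsadmin -report
/opt/hadoop/hadoop-3.1.0/bin/yarn node -list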
- 4.7. Stop the cluster
/opt/hadoop/hadoop-3.1.0/sbin/stop-dfs.sh
/opt/hadoop/hadoop-3.1.0/sbin/stop-yarn.sh
- 4.8. Reset the Hadoop environment [removes the Hadoop HDFS and log files] [operate with caution; only on servera]
rm -rf /opt/hadoop/hadoop-3.1.0/logs/*
rm -rf /var/lib/hadoop/
5. Pitfalls encountered
pdsh@servera: servera: connect: Connection refused
root@servera:/opt/hadoop/hadoop-3.1.0# sbin/start-dfs.sh
Starting namenodes on [servera]
pdsh@servera: servera: connect: Connection refused
Starting datanodes
pdsh@servera: serverc: connect: Connection refused
pdsh@servera: serverb: connect: Connection refused
Starting secondary namenodes [servera]
pdsh@servera: servera: connect: Connection refused
Fix: make pdsh default to ssh for its remote command:
echo ssh > /etc/pdsh/rcmd_default
[Reposted from] https://blog.csdn.net/sade1231/article/details/80921142
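An alternative workaround is to point the start scripts' pdsh invocation at ssh via the PDSH_RCMD_TYPE environment variable (an assumption: stock pdsh honors this variable), for example in /etc/profile.d/hadoop-3.1.0.sh:
export PDSH_RCMD_TYPE=ssh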