

1. Prepare the virtual machines and install Linux: VMware 12, CentOS 6.5.
2. Give each machine a static IP (see the sketch below).
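A minimal sketch of a static-IP setup on CentOS 6 for reference; the interface name eth0 and the netmask/gateway/DNS values below are assumptions to replace with your own, and master's address comes from the hosts entry used later:

  • vim /etc/sysconfig/network-scripts/ifcfg-eth0

# on master (use each node's own address on the other machines)
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.130.160
NETMASK=255.255.255.0
GATEWAY=192.168.130.2
DNS1=192.168.130.2

  • service network restart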

3. Add the zkpk account to sudoers (skip this step if you install as root).
Make sudoers writable and edit it:

  • chmod u+w /etc/sudoers
  • vim /etc/sudoers

Add the zkpk entry to the file:
zkpk    ALL=(ALL)   ALL
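Optionally, put the write permission back and check the syntax afterwards:

  • chmod u-w /etc/sudoers
  • visudo -c    (validates the sudoers file)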

4. Edit the network file: sudo vim /etc/sysconfig/network
Change the following and save:

  • NETWORKING=yes
  • HOSTNAME=master



5. Set the hostname and the hosts entry

  • sudo vim /etc/hosts
  • 192.168.130.160 master
  • hostname master



6. Turn off the firewall on each virtual machine

  • service iptables stop
  • chkconfig iptables off



Carry out the six steps above on each of the five nodes in turn. Once all nodes are ready, the /etc/hosts file should look like the sketch below.
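A sketch of the finished /etc/hosts on every node; only master's address appears in the original, so the slave addresses below are placeholders to replace with your own:

  • 192.168.130.160 master
  • 192.168.130.161 slave1
  • 192.168.130.162 slave2
  • 192.168.130.163 slave3
  • 192.168.130.164 slave4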

7. Verify the network: from every node, ping all of the other nodes.
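For example, on master (run the analogous checks from each of the other nodes):

  • ping -c 3 slave1
  • ping -c 3 slave2
  • ping -c 3 slave3
  • ping -c 3 slave4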
8. Passwordless login: generate and install the key on every node (master shown as the example)

  • ssh-keygen -t rsa    (press Enter through all the prompts)
  • cd ~/.ssh
  • cp id_rsa.pub authorized_keys



Exchange keys between the nodes (master shown as the example):

  • cd ~/.ssh
  • ssh-copy-id -i slave1
  • ssh-copy-id -i slave2
  • ssh-copy-id -i slave3
  • ssh-copy-id -i slave4



Verify the result:

  • [root@master hadoop]# ssh slave1
  • Last login: Wed Sep 26 15:03:15 2018 from 192.168.23.1



9. Install and distribute the JDK. Do this on the master node (as root).
Upload the JDK with the rz command in Xshell (install it with yum -y install lrzsz if rz is missing):

  • mv jdk-8u111-linux-x64.tar.gz /usr/local
  • cd /usr/local
  • tar -zxvf jdk-8u111-linux-x64.tar.gz
  • rm jdk-8u111-linux-x64.tar.gz
  • vim ~/.bashrc

Add to ~/.bashrc:

  • export JAVA_HOME=/usr/local/jdk1.8.0_111/
  • export PATH=$PATH:$JAVA_HOME/bin



source ~/.bashrc
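A quick optional check that the JDK is now on the PATH; it should report version 1.8.0_111:

  • java -version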
Distribute the JDK and .bashrc:


  • scp -r /usr/local/jdk1.8.0_111 root@slave1:/usr/local
  • scp -r /usr/local/jdk1.8.0_111 root@slave2:/usr/local
  • scp -r /usr/local/jdk1.8.0_111 root@slave3:/usr/local
  • scp -r /usr/local/jdk1.8.0_111 root@slave4:/usr/local
  • scp ~/.bashrc root@slave1:~/
  • scp ~/.bashrc root@slave2:~/
  • scp ~/.bashrc root@slave3:~/
  • scp ~/.bashrc root@slave4:~/



Run source ~/.bashrc on each node so the environment variables take effect:
source ~/.bashrc
10. Zookeeper distributed install. The ZooKeeper quorum will run on slave2, slave3 and slave4; do the steps below on slave2 (as root), then copy the result to slave3 and slave4 as shown further down.
Upload zookeeper with the rz command in Xshell (install it with yum -y install lrzsz if rz is missing):


  • mv zookeeper-3.4.9.tar.gz /usr/local
  • cd /usr/local
  • tar -zxvf zookeeper-3.4.9.tar.gz
  • mv zookeeper-3.4.9 zookeeper
  • rm zookeeper-3.4.9.tar.gz

In the conf directory, rename zoo_sample.cfg to zoo.cfg:
mv zoo_sample.cfg zoo.cfg
Change dataDir in zoo.cfg to your own directory for zookeeper's data files:
dataDir=/usr/local/zookeeper/data/zData
Add the server entries for the quorum nodes:

  • server.1=slave2:2888:3888
  • server.2=slave3:2888:3888
  • server.3=slave4:2888:3888
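Putting it together, the finished zoo.cfg should look roughly like the sketch below (tickTime, initLimit, syncLimit and clientPort are the zoo_sample.cfg defaults, left unchanged):

tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/usr/local/zookeeper/data/zData
server.1=slave2:2888:3888
server.2=slave3:2888:3888
server.3=slave4:2888:3888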




On slave2, create the directory /usr/local/zookeeper/data/zData and put a myid file in it whose content is 1 (echo 1 >> myid).
Distribute the install to the other quorum machines:


  • scp -r /usr/local/zookeeper root@slave3:/usr/local
  • scp -r /usr/local/zookeeper root@slave4:/usr/local



On the other nodes, edit the myid file so its content matches the server entry in zoo.cfg: 2 on slave3 and 3 on slave4.
Start zookeeper (on all three quorum nodes):
zkServer.sh start
Check with jps; a QuorumPeerMain process means zookeeper started successfully.
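You can also ask each node for its role; one should report Mode: leader and the other two Mode: follower:

  • zkServer.sh status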
11. HDFS distributed install. Install on master first.
Upload hadoop-2.7.3.tar.gz with the rz command in Xshell (install it with yum -y install lrzsz if rz is missing):


  • mv hadoop-2.7.3.tar.gz /usr/local
  • cd /usr/local
  • tar -zxvf hadoop-2.7.3.tar.gz
  • mv hadoop-2.7.3 hadoop
  • rm hadoop-2.7.3.tar.gz

Edit ~/.bashrc and add:

  • export HADOOP_HOME=/usr/local/hadoop/
  • export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin



Make the change take effect:
source ~/.bashrc
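A quick optional check that the Hadoop binaries are on the PATH; it should report Hadoop 2.7.3:

  • hadoop version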
Configure hdfs-site.xml (in /usr/local/hadoop/etc/hadoop):


<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
<configuration>
    <!-- The HDFS nameservice is ns1; it must match core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
    <!-- ns1 has two NameNodes, nn1 and nn2 -->
    <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>master:9000</value>
    </property>
    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>master:50070</value>
    </property>
    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>slave1:9000</value>
    </property>
    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>slave1:50070</value>
    </property>
    <!-- Where the NameNode metadata is stored on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://slave2:8485;slave3:8485;slave4:8485/ns1</value>
    </property>
    <!-- Where the JournalNodes keep their data on local disk -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/usr/local/hadoop/journaldata</value>
    </property>
    <!-- Enable automatic NameNode failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- Failover proxy provider used by clients -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Fencing methods; multiple methods are separated by newlines, one per line -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>
    <!-- sshfence requires passwordless SSH -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <!-- Timeout for sshfence -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>



Configure core-site.xml:


<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
<configuration>
    <!-- Use the ns1 nameservice as the default file system -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1/</value>
    </property>
    <!-- Hadoop temporary directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop/data/tmp</value>
    </property>
    <!-- Zookeeper quorum -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>slave2:2181,slave3:2181,slave4:2181</value>
    </property>
</configuration>



Configure yarn-site.xml:


<?xml version="1.0"?>
<configuration>
    <!-- Retry interval for reconnecting after losing contact with the RM -->
    <property>
        <name>yarn.resourcemanager.connect.retry-interval.ms</name>
        <value>2000</value>
    </property>

    <!-- Enable ResourceManager HA (default is false) -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>

    <!-- ResourceManager ids -->
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>

    <property>
        <name>ha.zookeeper.quorum</name>
        <value>slave2:2181,slave3:2181,slave4:2181</value>
    </property>

    <!-- Enable automatic failover -->
    <property>
        <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>master</value>
    </property>

    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>slave1</value>
    </property>

    <!--
      Set this to rm1 on the rm1 host (master) and to rm2 on the rm2 host (slave1).
      Note: it is common to copy the finished config files to the other machines,
      but this particular value must be changed on the other ResourceManager host.
    -->
    <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>rm1</value>
        <description>If we want to launch more than one RM in single node, we need this configuration</description>
    </property>

    <!-- Enable recovery after an RM restart -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>

    <!-- Zookeeper addresses used by the state store -->
    <property>
        <name>yarn.resourcemanager.zk-state-store.address</name>
        <value>slave2:2181,slave3:2181,slave4:2181</value>
    </property>

    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>

    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>slave2:2181,slave3:2181,slave4:2181</value>
    </property>

    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>appcluster-yarn</value>
    </property>

    <!-- How long the AM waits before reconnecting to the scheduler -->
    <property>
        <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
        <value>5000</value>
    </property>

    <!-- rm1 addresses -->
    <property>
        <name>yarn.resourcemanager.address.rm1</name>
        <value>master:8032</value>
    </property>

    <property>
        <name>yarn.resourcemanager.scheduler.address.rm1</name>
        <value>master:8030</value>
    </property>

    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>master:8088</value>
    </property>

    <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
        <value>master:8031</value>
    </property>

    <property>
        <name>yarn.resourcemanager.admin.address.rm1</name>
        <value>master:8033</value>
    </property>

    <property>
        <name>yarn.resourcemanager.ha.admin.address.rm1</name>
        <value>master:23142</value>
    </property>

    <!-- rm2 addresses -->
    <property>
        <name>yarn.resourcemanager.address.rm2</name>
        <value>slave1:8032</value>
    </property>

    <property>
        <name>yarn.resourcemanager.scheduler.address.rm2</name>
        <value>slave1:8030</value>
    </property>

    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>slave1:8088</value>
    </property>

    <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
        <value>slave1:8031</value>
    </property>

    <property>
        <name>yarn.resourcemanager.admin.address.rm2</name>
        <value>slave1:8033</value>
    </property>

    <property>
        <name>yarn.resourcemanager.ha.admin.address.rm2</name>
        <value>slave1:23142</value>
    </property>

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>

    <property>
        <name>mapreduce.shuffle.port</name>
        <value>23080</value>
    </property>

    <!-- Failover proxy class used by clients -->
    <property>
        <name>yarn.client.failover-proxy-provider</name>
        <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
    </property>

    <property>
        <name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name>
        <value>/yarn-leader-election</value>
        <description>Optional setting. The default value is /yarn-leader-election</description>
    </property>
</configuration>



Configure mapred-site.xml:


<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>



Edit the slaves file:

  • slave2
  • slave3
  • slave4



Distribute the whole Hadoop directory to the other nodes:

  • scp -r /usr/local/hadoop root@slave1:/usr/local
  • scp -r /usr/local/hadoop root@slave2:/usr/local
  • scp -r /usr/local/hadoop root@slave3:/usr/local
  • scp -r /usr/local/hadoop root@slave4:/usr/local



Distribute ~/.bashrc to the other nodes, then run source ~/.bashrc on each of them to make it take effect:

  • scp -r ~/.bashrc root@slave1:~
  • scp -r ~/.bashrc root@slave2:~
  • scp -r ~/.bashrc root@slave3:~
  • scp -r ~/.bashrc root@slave4:~



12. Start the cluster. Start Zookeeper on slave2, slave3 and slave4:
zkServer.sh start
Start the journalnodes on slave2, slave3 and slave4:
hadoop-daemon.sh start journalnode
After these two steps the processes look like this (the same on slave2, slave3 and slave4):


  • 11216 Jps
  • 9616 JournalNode
  • 7902 QuorumPeerMain



Format HDFS (on master):
hdfs namenode -format
Format ZKFC (on master):
hdfs zkfc -formatZK
Start the namenodes (on master and slave1):
hadoop-daemon.sh start namenode
Sync the NameNode metadata on slave1:
hdfs namenode -bootstrapStandby
Start HDFS and YARN from master:


  • start-dfs.sh
  • start-yarn.sh



Start the ResourceManager on slave1:
yarn-daemon.sh start resourcemanager
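With everything started, jps on each node should show roughly the following daemons (a sketch based on the roles assigned in this guide; process ids will differ):

  • master, slave1:  NameNode, DFSZKFailoverController, ResourceManager
  • slave2, slave3, slave4:  DataNode, NodeManager, JournalNode, QuorumPeerMain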
Note: once everything is prepared you can also just run start-all.sh; it starts the appropriate processes on each node according to the configuration files:


  • [root@master hadoop]# start-all.sh
  • This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
  • Starting namenodes on [master slave1]
  • master: starting namenode, logging to /usr/local/hadoop/logs/hadoop-root-namenode-master.out
  • slave1: starting namenode, logging to /usr/local/hadoop/logs/hadoop-root-namenode-slave1.out
  • slave4: starting datanode, logging to /usr/local/hadoop/logs/hadoop-root-datanode-slave4.out
  • slave3: starting datanode, logging to /usr/local/hadoop/logs/hadoop-root-datanode-slave3.out
  • slave2: starting datanode, logging to /usr/local/hadoop/logs/hadoop-root-datanode-slave2.out
  • Starting journal nodes [slave2 slave3 slave4]
  • slave3: starting journalnode, logging to /usr/local/hadoop/logs/hadoop-root-journalnode-slave3.out
  • slave2: starting journalnode, logging to /usr/local/hadoop/logs/hadoop-root-journalnode-slave2.out
  • slave4: starting journalnode, logging to /usr/local/hadoop/logs/hadoop-root-journalnode-slave4.out
  • Starting ZK Failover Controllers on NN hosts [master slave1]
  • slave1: starting zkfc, logging to /usr/local/hadoop/logs/hadoop-root-zkfc-slave1.out
  • master: starting zkfc, logging to /usr/local/hadoop/logs/hadoop-root-zkfc-master.out
  • starting yarn daemons
  • starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-root-resourcemanager-master.out
  • slave3: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-slave3.out
  • slave2: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-slave2.out
  • slave4: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-slave4.out



13. Verification. For HDFS, check the HA state on the NameNode web UIs (master:50070 and slave1:50070 as configured above); one NameNode should show as active and the other as standby.
For YARN, check the HA state on the ResourceManager web UIs (master:8088 and slave1:8088).
Now kill the NameNode process on master and watch whether the standby NameNode switches to active automatically.
You can also run Hadoop's built-in pi example and kill the active ResourceManager while it is running, to check whether the job carries on under the other ResourceManager.
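These checks can also be done from the command line; a sketch (the example jar path matches the stock Hadoop 2.7.3 layout, and the pid passed to kill is whatever jps reports for the NameNode on master):

  • hdfs haadmin -getServiceState nn1    (reports active or standby)
  • hdfs haadmin -getServiceState nn2
  • yarn rmadmin -getServiceState rm1
  • yarn rmadmin -getServiceState rm2
  • jps    (note the NameNode pid on master)
  • kill -9 <NameNode pid>
  • hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 10 100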


[Reposted from] https://blog.csdn.net/trans_1010 ... 655?utm_source=copy
