First, the fully distributed setup; the pseudo-distributed notes come at the end.
Hadoop 2.6.1 fully distributed mode
Three virtual machines are used for the fully distributed deployment:
master: ip 192.168.0.162  hostname lin162
slave:  ip 192.168.0.163  hostname lin163
slave:  ip 192.168.0.164  hostname lin164
Installation steps:
1. Install the JDK and configure the environment variables
The installation process is covered here: http://blog.csdn.net/linlinv3/article/details/45060705
2. Configure SSH
If SSH is not installed on the system, install it with:
$ sudo apt-get install openssh-server
$ ps -e | grep ssh
Output like the following indicates a successful installation:
2228 ?  00:00:00 ssh-agent
5027 ?  00:00:00 sshd
Create the key pair:
$ ssh-keygen -t rsa
$ cd .ssh
$ cp id_rsa.pub authorized_keys
Run these commands on all three machines, then concatenate every node's authorized_keys into one combined file and copy that file back over authorized_keys on each node.
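The key-merging step above can be sketched as a small helper run on the master. The `merge_keys` function and the `key_lin16*.pub` filenames are hypothetical; they stand for each node's `id_rsa.pub` after it has been collected locally:

```shell
# merge_keys: concatenate the public keys gathered from every node and
# drop duplicate lines; the result overwrites each node's authorized_keys.
merge_keys() {
  cat "$@" | sort -u
}
# Example usage (filenames are hypothetical; copy each node's id_rsa.pub first):
#   merge_keys key_lin162.pub key_lin163.pub key_lin164.pub > authorized_keys
#   scp authorized_keys lin@lin163:~/.ssh/   # repeat for lin164
```

Deduplicating with `sort -u` keeps the file clean if a node's key is collected twice.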
3. Configure hosts and hostname
$ sudo vim /etc/hostname
Set the hostname to lin162, lin163, and lin164 on the respective machines (adjust to your own setup).
Configure hosts to map each IP address to its hostname:
$ sudo vim /etc/hosts
Add the following on all three machines:
127.0.0.1 localhost
192.168.0.162 lin162
192.168.0.163 lin163
192.168.0.164 lin164
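A quick way to sanity-check the mapping on each machine is a grep against the hosts file. This is a sketch; the `check_host` helper is not part of the tutorial:

```shell
# check_host: verify that an IP/hostname pair is present in the hosts file.
HOSTS_FILE="${HOSTS_FILE:-/etc/hosts}"
check_host() {
  grep -qE "^$1[[:space:]]+$2([[:space:]]|\$)" "$HOSTS_FILE" \
    && echo "OK: $2 -> $1" \
    || echo "MISSING: $2"
}
check_host 192.168.0.162 lin162
check_host 192.168.0.163 lin163
check_host 192.168.0.164 lin164
```

Any `MISSING` line means the corresponding entry still needs to be added before the nodes can reach each other by hostname.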
4. Configure the Hadoop files
Hadoop releases can be downloaded from: http://hadoop.apache.org/releases.html
Pick the version you need.
The files to modify are:
(1) hadoop-env.sh: set the Java environment
export JAVA_HOME=/usr/soft/jdk1.7.0_79
(2) core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://lin162:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:///home/lin/hadoop/hadoop-2.6.1/data/tmp</value>
</property>
</configuration>
(3) hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/lin/hadoop/hadoop-2.6.1/data/nn</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/lin/hadoop/hadoop-2.6.1/data/dn</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
(4) mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>lin162:10020</value>
<description>MapReduce JobHistory Server IPC host:port</description>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>lin162:19888</value>
<description>MapReduce JobHistory Server Web UI host:port</description>
</property>
</configuration>
(5) yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<description>The hostname of the RM.</description>
<name>yarn.resourcemanager.hostname</name>
<value>lin162</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>${yarn.resourcemanager.hostname}:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>${yarn.resourcemanager.hostname}:8030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>${yarn.resourcemanager.hostname}:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.https.address</name>
<value>${yarn.resourcemanager.hostname}:8090</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>${yarn.resourcemanager.hostname}:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>${yarn.resourcemanager.hostname}:8033</value>
</property>
</configuration>
(6) The slaves file (listing the slave nodes):
lin163
lin164
5. Configure the Hadoop environment variables
$ sudo vim /etc/profile
Append the following (sbin is included so the start/stop scripts used below are on the PATH):
export HADOOP_HOME=/home/lin/hadoop/hadoop-2.6.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
$ source /etc/profile
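A quick sanity check that the variables took effect after sourcing the profile. This is a sketch; `check_hadoop_env` is not part of the tutorial, and the install path matches the tutorial's layout (adjust to your own):

```shell
# Sketch: verify HADOOP_HOME is set and that PATH picks up its bin directory.
HADOOP_HOME=/home/lin/hadoop/hadoop-2.6.1
PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
check_hadoop_env() {
  [ -n "$HADOOP_HOME" ] || { echo "HADOOP_HOME not set"; return 1; }
  case ":$PATH:" in
    *":$HADOOP_HOME/bin:"*) echo "PATH OK" ;;
    *) echo "PATH is missing bin" ;;
  esac
}
check_hadoop_env
```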
6. Copy hadoop-2.6.1 to lin163 and lin164
$ scp -r /home/lin/hadoop/hadoop-2.6.1 lin@192.168.0.163:/home/lin/hadoop/hadoop-2.6.1
$ scp -r /home/lin/hadoop/hadoop-2.6.1 lin@192.168.0.164:/home/lin/hadoop/hadoop-2.6.1
7. Format the NameNode on 162
$ hdfs namenode -format
If the output contains
15/10/22 09:35:01 INFO common.Storage: Storage directory /home/lin/hadoop/hadoop-2.6.1/data/nn has been successfully formatted.
the format succeeded.
8. Start Hadoop on 162
$ $HADOOP_HOME/sbin/start-all.sh
Run jps on each machine:
lin162 should show NameNode, SecondaryNameNode, and ResourceManager;
lin163 and lin164 should show DataNode and NodeManager.
If those daemons appear, the cluster started successfully.
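The jps check can be scripted. This is a sketch; `expect_daemons` is not a Hadoop tool, it simply matches expected daemon names against a jps listing:

```shell
# expect_daemons: given a jps listing (first argument) and a set of expected
# daemon names, report which daemons are present and which are missing.
expect_daemons() {
  local listing="$1"; shift
  for d in "$@"; do
    echo "$listing" | grep -qw "$d" && echo "OK: $d" || echo "MISSING: $d"
  done
}
# On lin162:          expect_daemons "$(jps)" NameNode SecondaryNameNode ResourceManager
# On lin163 / lin164: expect_daemons "$(jps)" DataNode NodeManager
```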
Web Interfaces
Daemon                      | Web Interface         | Notes
NameNode                    | http://nn_host:port/  | Default HTTP port is 50070.
ResourceManager             | http://rm_host:port/  | Default HTTP port is 8088.
MapReduce JobHistory Server | http://jhs_host:port/ | Default HTTP port is 19888.
Inspect the cluster through these web interfaces.
Hadoop 2.x pseudo-distributed installation
Start the history server and the WebAppProxyServer:
./mr-jobhistory-daemon.sh start historyserver
./yarn-daemon.sh --config $HADOOP_CONF_DIR start proxyserver