
Hadoop Single-Node Deployment

Preparation

Hardware environment
CPU: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz, 2 cores / 4 threads
MEM: 8 GB

System environment
CentOS Linux release 7.3.1611 (Core)

Software environment
hadoop-3.0.0-beta1
jdk8
On Linux, the axel download accelerator is recommended for fetching these packages.
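For example, axel can pull the tarball over several parallel connections. The Apache archive URL below is one possible source, given here for illustration; any mirror from the Apache download page works:

```shell
# Fetch the Hadoop tarball with 8 parallel connections (-n 8);
# the URL is an example -- substitute your preferred Apache mirror
axel -n 8 https://archive.apache.org/dist/hadoop/common/hadoop-3.0.0-beta1/hadoop-3.0.0-beta1.tar.gz
```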

Install dependencies

yum install openssh
yum install pdsh

Unpack and install

tar xvf hadoop-3.0.0-beta1.tar.gz -C /usr/local/
cd /usr/local/hadoop-3.0.0-beta1

Next, verify the installation:

./bin/hadoop

If the installation succeeded, the command prints Hadoop's usage documentation.
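Note: if `./bin/hadoop` instead aborts complaining that JAVA_HOME is not set, point Hadoop at the JDK in etc/hadoop/hadoop-env.sh before retrying. The path below is an assumption; use wherever your jdk8 is actually installed:

```shell
# In etc/hadoop/hadoop-env.sh -- example path, adjust to your JDK location
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
```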

Standalone mode

After installation Hadoop runs in standalone (local) mode by default. Run the bundled example job to test it:

mkdir input
cp etc/hadoop/*.xml input
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar grep input output 'dfs[a-z.]+'
cat output/*

If the job succeeds, the output below indicates that standalone mode is working:

1       dfsadmin
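The example job uses the regular expression `dfs[a-z.]+` to pull matching strings out of the copied XML files. What that pattern matches can be previewed with ordinary `grep` on a made-up input line:

```shell
# -o prints only the matched parts of each line; the sample input is illustrative
printf 'dfsadmin and dfs.replication live here\n' | grep -oE 'dfs[a-z.]+'
# prints:
# dfsadmin
# dfs.replication
```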

Pseudo-distributed mode

Hadoop can also run on a single machine in pseudo-distributed mode, where each Hadoop daemon executes in its own Java process.

Configuration

Back up the original configuration files first:

cp etc/hadoop/core-site.xml etc/hadoop/core-site.xml_bak
cp etc/hadoop/hdfs-site.xml etc/hadoop/hdfs-site.xml_bak

Edit etc/hadoop/core-site.xml:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

Edit etc/hadoop/hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Set up passwordless SSH
First, check whether you can already ssh to localhost without a password:

ssh localhost

If passwordless login fails, generate a new passphrase-less key pair and test again:

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys

Run MapReduce

  1. Format the filesystem
    bin/hdfs namenode -format
  2. Start the NameNode and DataNode daemons
    sbin/start-dfs.sh
    Verify that the daemons started
    # jps
    28375 DataNode
    28759 Jps
    28619 SecondaryNameNode
    28237 NameNode
    Then browse to http://localhost:9870/ to check the NameNode web UI
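With HDFS up, running a MapReduce job against it continues roughly as in the official single-node guide. The commands below are a sketch of those remaining steps; they assume you are running as root (as elsewhere in this post) and reuse the same example jar:

```shell
# Create the HDFS home directory for the current user (root here)
bin/hdfs dfs -mkdir -p /user/root
# Copy the sample input into HDFS
bin/hdfs dfs -mkdir input
bin/hdfs dfs -put etc/hadoop/*.xml input
# Run the same grep example, this time reading from and writing to HDFS
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar grep input output 'dfs[a-z.]+'
# Print the job output stored in HDFS
bin/hdfs dfs -cat output/*
```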

Troubleshooting

  • After running sbin/start-dfs.sh, the pseudo-distributed daemons fail to start:
    Starting namenodes on [localhost]
    ERROR: Attempting to operate on hdfs namenode as root
    ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
    Starting datanodes
    ERROR: Attempting to operate on hdfs datanode as root
    ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
    Starting secondary namenodes [mowo]
    ERROR: Attempting to operate on hdfs secondarynamenode as root
    ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
  • Fix
    Edit etc/hadoop/hadoop-env.sh and define which user each HDFS daemon runs as (root here, matching the error messages above):
    export HDFS_NAMENODE_USER=root
    export HDFS_DATANODE_USER=root
    export HDFS_JOURNALNODE_USER=root
    export HDFS_SECONDARYNAMENODE_USER=root
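If the start scripts still fail afterwards with remote-command connection errors, one commonly reported cause on Hadoop 3 is pdsh (installed as a dependency earlier), which defaults to rsh rather than ssh. Forcing ssh is a widely used workaround; the export below can also go into etc/hadoop/hadoop-env.sh:

```shell
# pdsh defaults to rsh; tell the start scripts' pdsh calls to use ssh instead
export PDSH_RCMD_TYPE=ssh
```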