
Hadoop Learning Notes

2015-06-24 16:29
1. Adding a New DataNode

1.1 Configure all DataNode-related settings on the new node (slaves, masters, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, etc.)

1.2 Update /etc/hosts so that all nodes can resolve one another

1.3 Set /etc/hostname on the new node

1.4 Start the DataNode: bin/hadoop-daemon.sh start datanode

1.5 Start the TaskTracker: bin/hadoop-daemon.sh start tasktracker (on a YARN cluster, start the NodeManager with yarn-daemon.sh start nodemanager instead)

Check the relevant logs to confirm that the node started and registered successfully. A minimal way to verify is sketched below (adjust the log path to your installation).
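
# On the new node: check the DataNode log for errors / successful registration
tail -n 50 logs/hadoop-*-datanode-*.log

# On the NameNode: confirm the new node appears among the live datanodes
bin/hadoop dfsadmin -report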

2. Load Balancing

Run bin/start-balancer.sh -threshold 15
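
The threshold is a percentage: the balancer keeps moving blocks until each DataNode's disk utilization is within 15 percentage points of the cluster-wide average (a lower threshold gives a more even distribution but takes longer). It runs until it finishes or until you stop it:

bin/stop-balancer.sh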

3. Decommissioning a DataNode

Perform the following steps on the NameNode.

Add the following to conf/hdfs-site.xml:

<property>
<name>dfs.hosts.exclude</name>
<value>[FULL_PATH_TO_THE_EXCLUDE_FILE]</value>
<description>Names a file that contains a list of hosts that are
not permitted to connect to the namenode. The full pathname of
the file must be specified. If the value is empty, no hosts are
excluded.</description>
</property>


Run bin/hadoop dfsadmin -refreshNodes

Monitor the decommissioning progress: in the NameNode web UI or in the dfsadmin report, the node's status moves from "Decommission In Progress" to "Decommissioned", after which it can safely be taken offline.
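
A concrete sketch of the flow (dn3.example.com and the excludes path are placeholders; the exclude file simply lists one hostname per line):

# contents of the file referenced by dfs.hosts.exclude, e.g. conf/excludes
dn3.example.com

# tell the NameNode to re-read the include/exclude lists
bin/hadoop dfsadmin -refreshNodes

# watch the per-node status until it reads "Decommissioned"
bin/hadoop dfsadmin -report | grep "Decommission Status"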

4. Using multiple disks/volumes and limiting HDFS disk usage

4.1 Specifying multiple data directories

conf/hdfs-site.xml

<property>
<name>dfs.data.dir</name>
<value>/u1/hadoop/data,/u2/hadoop/data</value>
</property>
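
Each directory in the comma-separated list should normally live on a separate physical disk, since the DataNode spreads block writes across the configured volumes. The directories are usually created up front and owned by the user that runs the DataNode; a sketch, assuming the paths above and a service user named hadoop (on Hadoop 2.x the property is named dfs.datanode.data.dir, though the old name is still accepted):

mkdir -p /u1/hadoop/data /u2/hadoop/data
chown -R hadoop:hadoop /u1/hadoop/data /u2/hadoop/data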


4.2 Limiting HDFS disk usage (reserving space per volume)

<property>
<name>dfs.datanode.du.reserved</name>
<value>6000000000</value>
<description>Reserved space in bytes per volume. Always leave
this much space free for non dfs use.
</description>
</property>
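
The value is in bytes per volume, so 6000000000 keeps roughly 6 GB free on every data disk for non-HDFS use. After restarting the DataNode, the per-node capacity and remaining figures in the report should reflect the reservation:

bin/hadoop dfsadmin -report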


5. Setting HDFS block size

Method 1: Set it in conf/hdfs-site.xml (134217728 bytes = 128 MB):

<property>
<name>dfs.block.size</name>
<value>134217728</value>
</property>


Method 2: Set it per operation where needed, for example when uploading a file (dfs.blocksize is the Hadoop 2.x property name; on 1.x the equivalent is dfs.block.size):

bin/hadoop fs -Ddfs.blocksize=134217728 -put data.in /user/foo
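
To confirm which block size an uploaded file actually got, fsck can list its blocks (assuming /user/foo is a directory, so the file above lands at /user/foo/data.in):

bin/hadoop fsck /user/foo/data.in -files -blocks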

6. Setting the file replication factor

Method 1: Set the default in conf/hdfs-site.xml:

<property>
<name>dfs.replication</name>
<value>2</value>
</property>


Method 2: Set it when uploading a file:

bin/hadoop fs -D dfs.replication=1 -copyFromLocal non-critical-file.txt /user/foo

Method 3: Change the replication factor of a file already in HDFS:

bin/hadoop fs -setrep 2 non-critical-file.txt
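
setrep only affects files that already exist; adding -w makes the command wait until the new replication factor is actually reached, and the second column of hadoop fs -ls shows the current factor:

bin/hadoop fs -setrep -w 2 non-critical-file.txt
bin/hadoop fs -ls non-critical-file.txt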