
How to reuse old PCs for Solr Search Platform?

2015-06-05 11:32
Are the old PCs at home or at the office too weak? So slow you want to smash them? Do friends or colleagues have machines they are about to throw away? Here is a way to put that junk back to work: I collected four old PCs, built a Hadoop cluster in Fully Distributed Mode, put HBase on top of Hadoop, ran Nutch, and stored the data behind Solr in HBase.

PC Specs

Name                 CPU              RAM
pigpigpig-client2    T2400 1.82GHz    2GB
pigpigpig-client4    E7500 2.93GHz    4GB
pigpigpig-client5    E2160 1.80GHz    4GB
pigpigpig-client6    T7300 2.00GHz    2GB

Roles

Name                 Roles
pigpigpig-client2    HQuorumPeer, SecondaryNameNode, ResourceManager, Solr
pigpigpig-client4    NodeManager, HRegionServer, DataNode
pigpigpig-client5    NodeManager, HRegionServer, DataNode
pigpigpig-client6    NameNode, HMaster, Nutch

Version

Hadoop 2.7.0

Hbase 0.98.8-hadoop2

Gora 0.6.1-SNAPSHOT (note: not officially released at the time of writing; see Build Apache Gora With Solr 5.1.0)

Nutch 2.4-SNAPSHOT (note: not officially released at the time of writing; see Build Apache Nutch with Solr 5.1.0)

Configuration

When I first ran Nutch I did not touch the default configuration, and after roughly 10 hours a RegionServer would invariably crash at random, almost always with some Out Of Memory error. Our constraint is limited resources: these old PCs cannot be upgraded any further, unlike EC2 where you can simply scale up when resources run short, so performance tuning is a crucial topic for us.

in hadoop-env.sh

Memory is precious here. With only two DataNodes there is no need for the default 512MB, so everything is cut in half.

export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS -Xmx256m"

export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS -Xmx256m"

export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS -Xmx256m"

export HADOOP_PORTMAP_OPTS="-Xmx256m $HADOOP_PORTMAP_OPTS"

export HADOOP_CLIENT_OPTS="-Xmx256m $HADOOP_CLIENT_OPTS"
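
One quick way to confirm the smaller heaps actually took effect after restarting the daemons (a generic JDK/Linux check, not specific to this setup; <pid> is a placeholder for the daemon's process id):

# list the Hadoop daemon PIDs and their main classes
jps -l
# print the daemon's command line and look for the heap flag; it should show -Xmx256m
ps -o args= -p <pid> | tr ' ' '\n' | grep Xmx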

in hdfs-site.xml

To avoid HDFS timeout errors, extend the timeouts (1200000 ms = 20 minutes).

<property>
<name>dfs.datanode.socket.write.timeout</name>
<value>1200000</value>
</property>

<property>
<name>dfs.socket.timeout</name>
<value>1200000</value>
</property>

<property>
<name>dfs.client.socket-timeout</name>
<value>1200000</value>
</property>


in mapred-env.sh

export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=256

in mapred-site.xml

The CPUs are slow and there are not many nodes, so mapred.task.timeout is raised a bit so that MapReduce jobs have time to finish. In particular, after a few rounds of Nutch inject, generate, fetch, parse, and updatedb, each pass handles millions of records, and with too low a timeout the jobs never complete.

<property>
<name>mapred.task.timeout</name>
<value>216000000</value> <!-- 60 hours -->
</property>

<property>
<name>mapreduce.map.output.compress</name>
<value>true</value>
</property>

<property>
<name>mapreduce.map.output.compress.codec</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

<property>
<name>mapreduce.map.memory.mb</name>
<value>1024</value>
</property>

<property>
<name>mapreduce.reduce.memory.mb</name>
<value>1024</value>
</property>

<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx200M</value>
</property>

<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx200M</value>
</property>

<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>1024</value>
</property>

<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx200M</value>
</property>


in yarn-env.sh

JAVA_HEAP_MAX=-Xmx256m

YARN_HEAPSIZE=256

in yarn-site.xml

4GB of RAM has to be shared by the OS, NodeManager, HRegionServer, and DataNode, so resources are really tight. Half of the memory goes to YARN, so yarn.nodemanager.resource.memory-mb is set to 2048; each CPU has 2 cores, so mapreduce.map.memory.mb, mapreduce.reduce.memory.mb, and yarn.scheduler.maximum-allocation-mb are set to 1024. yarn.nodemanager.vmem-pmem-ratio is raised a bit to avoid errors like "running beyond virtual memory limits. Killing container".

<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>

<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>128</value>
</property>

<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>1024</value>
</property>

<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>2</value>
</property>

<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>true</value>
</property>

<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>3.15</value>
</property>
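
A rough back-of-the-envelope check of these numbers (only a sketch of the arithmetic, not exact YARN accounting):

# per worker node: 4096 MB total RAM, roughly half reserved for the OS,
# DataNode and HRegionServer, and the other half handed to YARN:
#   yarn.nodemanager.resource.memory-mb = 2048 MB
# with 1024 MB containers (map, reduce and AM), each node can run at most
#   2048 / 1024 = 2 containers at a time, matching the 2 vcores per node
# the -Xmx200M child heap stays well below the 1024 MB container size,
# leaving room for JVM overhead, and the virtual memory ceiling becomes
#   1024 MB * 3.15 (vmem-pmem-ratio) ~= 3226 MB per container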


in hbase-env.sh

# export HBASE_HEAPSIZE=1000

export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Xmx192m -Xms192m -Xmn72m"

export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Xmx1024m -Xms1024m -verbose:gc -Xloggc:/mnt/hadoop-2.4.1/hbase/logs/hbaseRgc.log -XX:+PrintAdaptiveSizePolicy -XX:+PrintGC -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/mnt/hadoop-2.4.1/hbase/logs/java_pid{$$}.hprof"

export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Xmx192m -Xms72m"

in hbase-site.xml

The random RegionServer out-of-memory crashes are related to hbase.hregion.max.filesize, hbase.hregion.memstore.flush.size, and hbase.hregion.memstore.block.multiplier.

Drawbacks when hbase.hregion.max.filesize is too small

Each RegionServer ends up with too many regions (note: each ColumnFamily of each region occupies 2MB of MSLAB)

Splits and compactions happen too frequently

Too many store files are kept open (note: potential number of open files = (StoreFiles per ColumnFamily) x (regions per RegionServer))

Drawbacks when hbase.hregion.max.filesize is too large

Too few regions, so the benefit of Distributed Mode is lost

Pauses during splits and compactions become too long

The server-side memory used for write buffers is (hbase.client.write.buffer) * (hbase.regionserver.handler.count), so setting hbase.client.write.buffer and hbase.regionserver.handler.count too high eats too much memory, while setting them too low increases the number of RPCs.

If hbase.zookeeper.property.tickTime and zookeeper.session.timeout are too short, ZooKeeper SessionExpired errors occur. Setting hbase.ipc.warn.response.time higher suppresses the responseTooSlow warnings.

hbase.hregion.memstore.flush.size and hbase.hregion.memstore.block.multiplier also affect how often splits and compactions happen.

<property>
<name>hbase.client.scanner.timeout.period</name>
<value>1200000</value>
</property>

<property>
<name>hbase.zookeeper.property.tickTime</name>
<value>60000</value>
</property>

<property>
<name>zookeeper.session.timeout</name>
<value>1200000</value>
</property>

<property>
<name>hbase.rpc.timeout</name>
<value>1800000</value>
</property>

<property>
<name>hbase.ipc.warn.response.time</name>
<value>1200000</value>
</property>

<property>
<name>hbase.regionserver.handler.count</name>
<value>15</value>
</property>

<property>
<name>hbase.hregion.max.filesize</name>
<value>10737418240</value>
</property>

<property>
<name>hbase.hregion.memstore.flush.size</name>
<value>67108864</value>
</property>

<property>
<name>hbase.hregion.memstore.block.multiplier</name>
<value>8</value>
</property>
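
Plugging the configured handler count into the write-buffer formula above gives a feel for the server-side cost (assuming the default hbase.client.write.buffer of 2MB, since it is not overridden here; the region count below is a purely hypothetical illustration):

# server-side write-buffer memory
#   = hbase.client.write.buffer * hbase.regionserver.handler.count
#   = 2 MB * 15 handlers
#   = 30 MB per RegionServer in the worst case
# MSLAB overhead grows with the region count, e.g. a hypothetical
# 100 regions with one ColumnFamily each would already pin
#   100 * 2 MB = 200 MB of MSLAB chunks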


Start Servers

run hdfs namenode -format on pigpigpig-client6

run start-dfs.sh on pigpigpig-client6

run start-yarn.sh on pigpigpig-client2

run start-yarn.sh on pigpigpig-client4

run start-hbase.sh on pigpigpig-client6

run java -Xmx1024m -Xms1024m -XX:+UseConcMarkSweepGC -jar start.jar in solr folder on pigpigpig-client2

run hadoop fs -mkdir /user;hadoop fs -mkdir /user/pigpigpig;hadoop fs -put urls /user/pigpigpig in nutch folder on pigpigpig-client6

run hadoop jar apache-nutch-2.4-SNAPSHOT.job org.apache.nutch.crawl.InjectorJob urls -crawlId webcrawl in nutch folder on pigpigpig-client6

run hadoop jar apache-nutch-2.4-SNAPSHOT.job org.apache.nutch.crawl.GeneratorJob -crawlId webcrawl in nutch folder on pigpigpig-client6

run hadoop jar apache-nutch-2.4-SNAPSHOT.job org.apache.nutch.fetcher.FetcherJob -all -crawlId webcrawl in nutch folder on pigpigpig-client6

run hadoop jar apache-nutch-2.4-SNAPSHOT.job org.apache.nutch.parse.ParserJob -all -crawlId webcrawl in nutch folder on pigpigpig-client6

run hadoop jar apache-nutch-2.4-SNAPSHOT.job org.apache.nutch.crawl.DbUpdaterJob -all -crawlId webcrawl in nutch folder on pigpigpig-client6

run hadoop jar apache-nutch-2.4-SNAPSHOT.job org.apache.nutch.indexer.IndexingJob -D solr.server.url=http://pigpigpig-client2/solr/nutch/ -all -crawlId webcrawl in nutch folder on pigpigpig-client6
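
For repeated crawl rounds, the Nutch jobs above can be chained into a small wrapper script. This is only a sketch built from the exact commands listed above; the number of rounds is an arbitrary choice, not something prescribed by Nutch:

#!/bin/bash
# Run a few rounds of the Nutch 2.x crawl cycle for the webcrawl crawlId,
# then index the parsed pages into Solr. Execute from the nutch folder
# on pigpigpig-client6 after the urls directory has been uploaded to HDFS.
set -e

ROUNDS=3                                   # arbitrary; adjust as needed
JOB=apache-nutch-2.4-SNAPSHOT.job
CRAWL_ID=webcrawl
SOLR_URL=http://pigpigpig-client2/solr/nutch/

hadoop jar "$JOB" org.apache.nutch.crawl.InjectorJob urls -crawlId "$CRAWL_ID"

for round in $(seq 1 "$ROUNDS"); do
  echo "=== crawl round $round ==="
  hadoop jar "$JOB" org.apache.nutch.crawl.GeneratorJob -crawlId "$CRAWL_ID"
  hadoop jar "$JOB" org.apache.nutch.fetcher.FetcherJob -all -crawlId "$CRAWL_ID"
  hadoop jar "$JOB" org.apache.nutch.parse.ParserJob -all -crawlId "$CRAWL_ID"
  hadoop jar "$JOB" org.apache.nutch.crawl.DbUpdaterJob -all -crawlId "$CRAWL_ID"
done

hadoop jar "$JOB" org.apache.nutch.indexer.IndexingJob -D solr.server.url="$SOLR_URL" -all -crawlId "$CRAWL_ID"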

Stop Servers

run stop-hbase.sh on pigpigpig-client6

run stop-yarn.sh on pigpigpig-client2

run stop-yarn.sh on pigpigpig-client4

run stop-dfs.sh on pigpigpig-client6

Screenshots







Resources

The complete configuration files can be downloaded from https://github.com/EugenePig/Experiment1

https://github.com/EugenePig/Gora/tree/Gora-0.6.1-SNAPSHOT-Hadoop27-Solr5

https://github.com/EugenePig/nutch/tree/2.4-SNAPSHOT-Hadoop27-Solr5

https://github.com/EugenePig/ik-analyzer-solr5