Enabling HDFS HA by Using Cloudera Manager
2016-11-24 11:47
Enabling HDFS HA Using Cloudera Manager
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
You can use Cloudera Manager to configure your CDH 4 or CDH 5 cluster for HDFS HA and automatic failover. In Cloudera Manager 5, HA is implemented using quorum-based storage, which relies on a set of JournalNodes, each of which maintains a local edits directory that logs the modifications to the namespace metadata. Enabling HA also enables automatic failover as part of the same command.
Important:
Enabling or disabling HA causes the previous monitoring history to become unavailable.
The following parameters are set automatically once you enable JobTracker HA. If you want to change any of these values from their defaults, use an advanced configuration snippet:
- mapred.jobtracker.restart.recover: true
- mapred.job.tracker.persist.jobstatus.active: true
- mapred.ha.automatic-failover.enabled: true
- mapred.ha.fencing.methods: shell(/bin/true)
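For example, to override one of these defaults, you could add an entry like the following to the Advanced Configuration Snippet (Safety Valve) for mapred-site.xml. The property name comes from the list above; the value shown is only an illustration of the override mechanism, not a recommendation:

```xml
<property>
  <name>mapred.jobtracker.restart.recover</name>
  <value>false</value>
</property>
```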
Enabling High Availability and Automatic Failover
The Enable High Availability workflow leads you through adding a second (standby) NameNode and configuring JournalNodes. During the workflow, Cloudera Manager creates a federated namespace.
Perform all the configuration and setup tasks described under Configuring Hardware for HDFS HA.
Ensure that you have a ZooKeeper service.
Go to the HDFS service.
Select Actions > Enable High Availability. A screen showing the hosts that are eligible to run a standby NameNode and the JournalNodes displays.
Specify a name for the nameservice, or accept the default name nameservice1, and click Continue.
In the NameNode Hosts field, click Select a host. The host selection dialog box displays.
Check the checkbox next to the host where you want the standby NameNode to be set up and click OK. The standby NameNode cannot be on the same host as the active NameNode, and the host you choose should have the same hardware configuration (RAM, disk space, number of cores, and so on) as the active NameNode.
In the JournalNode Hosts field, click Select hosts. The host selection dialog box displays.
Check the checkboxes next to an odd number of hosts (a minimum of three) to act as JournalNodes and click OK. JournalNodes should run on hosts with hardware similar to the NameNodes'. Cloudera recommends that you put one JournalNode on each of the hosts running the active and standby NameNodes, and the third JournalNode on a host with similar hardware, such as the JobTracker host.
Click Continue.
In the JournalNode Edits Directory property, enter a directory location for the JournalNode edits directory into the field for each JournalNode host. You may enter only one directory for each JournalNode; the paths do not need to be the same on every JournalNode. The directories you specify should be empty and must have the appropriate permissions.
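As a sketch, preparing an empty edits directory with restrictive permissions might look like the following. The path used here is illustrative (a scratch location so the commands run anywhere); on a real JournalNode host you would use a dedicated disk location owned by the hdfs user:

```shell
# Illustrative path; on a real JournalNode host this would be a dedicated
# disk location such as /data/1/dfs/jn.
JN_EDITS_DIR=/tmp/demo-jn-edits
mkdir -p "$JN_EDITS_DIR"
chmod 700 "$JN_EDITS_DIR"
# On a real host, additionally (as root): chown -R hdfs:hadoop "$JN_EDITS_DIR"
ls -ld "$JN_EDITS_DIR"
```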
Extra Options: Decide whether Cloudera Manager should clear existing data in ZooKeeper, the standby NameNode, and the JournalNodes. If the directories are not empty (for example, you are re-enabling a previous HA configuration), Cloudera Manager does not automatically delete the contents; you can choose to delete them by keeping the default checkbox selection. The recommended default is to clear the directories. If you choose not to, the data must be in sync across the JournalNodes' edits directories and must have the same version data as the NameNodes.
Click Continue. Cloudera Manager executes a set of commands that stop the dependent services; delete, create, and configure roles and directories as appropriate; create a nameservice and failover controller; restart the dependent services; and deploy the new client configuration.
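After the workflow completes, the deployed client configuration contains the standard quorum-journal HA properties. The following hdfs-site.xml fragment is a sketch of what that looks like; the host names are placeholders, and the exact set of properties Cloudera Manager writes may differ:

```xml
<property>
  <name>dfs.nameservices</name>
  <value>nameservice1</value>
</property>
<property>
  <name>dfs.ha.namenodes.nameservice1</name>
  <value>namenode1,namenode2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nameservice1.namenode1</name>
  <value>nn1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nameservice1.namenode2</name>
  <value>nn2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.nameservice1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```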
If you want to use other services in a cluster with HA configured, follow the procedures in Configuring Other CDH Components to Use HDFS HA.
If you are running CDH 4.0 or 4.1, the standby NameNode may fail at the bootstrapStandby command with the error Unable to read transaction ids 1-7 from the configured shared edits storage. Use rsync or a similar tool to copy the contents of the dfs.name.dir directory from the active NameNode to the standby NameNode, and then start the standby NameNode.
Important: If you change the NameNode Service RPC Port (dfs.namenode.servicerpc-address) while automatic failover is enabled, this causes a mismatch between the NameNode address saved in the ZooKeeper /hadoop-ha znode and the NameNode address that the Failover Controller is configured with, which prevents the Failover Controllers from restarting. If you need to change the NameNode Service RPC Port after automatic failover has been enabled, you must do the following to re-initialize the znode:
Stop the HDFS service.
Configure the service RPC port:
  Go to the HDFS service.
  Click the Configuration tab.
  Select Scope > NameNode.
  Select Category > Ports and Addresses.
  Locate the NameNode Service RPC Port property, or search for it by typing its name in the Search box.
  Change the port value as needed. If more than one role group applies to this configuration, edit the value for the appropriate role group. See Modifying Configuration Properties Using Cloudera Manager.
On a ZooKeeper server host, run zookeeper-client.
Execute the following to remove the configured nameservice. This example assumes the name of the nameservice is nameservice1; you can identify the nameservice from the Federation and High Availability section on the HDFS Instances tab:
rmr /hadoop-ha/nameservice1
Click the Instances tab.
Select Actions > Initialize High Availability State in ZooKeeper.
Start the HDFS service.
Fencing Methods
To ensure that only one NameNode is active at a time, a fencing method is required for the shared edits directory. During a failover, the fencing method is responsible for ensuring that the previously active NameNode no longer has access to the shared edits directory, so that the new active NameNode can safely proceed to write to it.
By default, Cloudera Manager configures HDFS to use a shell fencing method (shell(./cloudera_manager_agent_fencer.py)) that takes advantage of the Cloudera Manager Agent. However, you can configure HDFS to use the sshfence method, or you can add your own shell fencing scripts, instead of or in addition to the one Cloudera Manager provides.
The fencing parameters are found in the Service-Wide > High Availability category under the configuration properties for your HDFS service.
For details of the fencing methods supplied with CDH 5, and how fencing is configured, see Fencing Configuration.
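For reference, fencing is expressed through the dfs.ha.fencing.methods property in hdfs-site.xml. The fragment below sketches how sshfence could be combined with a shell fallback; the private-key path is a placeholder, and you should consult Fencing Configuration before changing the Cloudera Manager default:

```xml
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence
shell(/bin/true)</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/hdfs/.ssh/id_rsa</value>
</property>
```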