Experience with NameNode backup and restore: checkpoint
2017-08-08 08:18
Hadoop version: Hadoop 2.2.0.2.0.6.0-0009
You can back up and restore NameNode metadata by running a Secondary NameNode, a Checkpoint node, or a Backup node.
Example:
This walkthrough assumes you have a Secondary NameNode.
1. Check the Secondary NameNode checkpoint status, using these properties in %HADOOP_CONF_DIR%/hdfs-site.xml:
dfs.namenode.secondary.http-address (address of the Secondary NameNode web UI)
dfs.namenode.checkpoint.dir (where the Secondary NameNode stores checkpoint images)
dfs.namenode.checkpoint.edits.dir (where the Secondary NameNode stores checkpoint edits)
dfs.namenode.checkpoint.period (number of seconds between two checkpoints)
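Looking up the four properties by hand is easy to get wrong, so here is a minimal Python sketch that pulls them out of the text of hdfs-site.xml. The function name read_checkpoint_settings is hypothetical, not part of Hadoop. It only reads what is set explicitly in the file; a property that is missing falls back to the Hadoop default, which this sketch does not know.

```python
import xml.etree.ElementTree as ET

# The four properties listed in step 1.
CHECKPOINT_KEYS = [
    "dfs.namenode.secondary.http-address",
    "dfs.namenode.checkpoint.dir",
    "dfs.namenode.checkpoint.edits.dir",
    "dfs.namenode.checkpoint.period",
]


def read_checkpoint_settings(xml_text):
    """Return {property name: value} for the checkpoint properties.

    A property that is not set in the file maps to None.
    """
    root = ET.fromstring(xml_text)
    found = {}
    # hdfs-site.xml is a flat list of <property><name/><value/></property>.
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in CHECKPOINT_KEYS:
            found[name] = (prop.findtext("value") or "").strip()
    return {key: found.get(key) for key in CHECKPOINT_KEYS}
```

To use it, read %HADOOP_CONF_DIR%/hdfs-site.xml into a string, pass it to read_checkpoint_settings, and print the resulting dictionary.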
2. Back up the current checkpoint by hand:
On the Secondary NameNode host, stop the Hadoop Secondary NameNode service.
Run cmd.exe as user hadoop (or as another user with full permissions):
Runas /user:hadoop cmd.exe
You need the password of user hadoop.
Force a checkpoint:
cmd>%hadoop_home%/bin/hadoop secondarynamenode -checkpoint force
Start the Hadoop Secondary NameNode service again and check the checkpoint status (see step 1).
3. Stop the NameNode services, or reboot the NameNode host (if the Hadoop services are set to start manually, they all stay stopped after the reboot).
For this test, I first backed up my dfs.namenode.name.dir (e.g. C:\hdpdata\hdfs\nn) so that I could use it in my next test (restoring from a backup of the NameNode directory).
Delete all files in C:\hdpdata\hdfs\nn.
Open dfs.namenode.checkpoint.dir (see %HADOOP_CONF_DIR%/hdfs-site.xml) on the Secondary NameNode (e.g. c:\hdpdata\hdfs\snn).
Copy all the checkpoint files (except the lock file) from this folder to the checkpoint dir on the NameNode host (dfs.namenode.checkpoint.dir, the same path as on the Secondary NameNode).
Make sure the NameNode's checkpoint dir is empty before you copy!
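The copy in step 3 can be sketched in Python as below. This is an illustration under assumptions, not a tested recovery tool: copy_checkpoint is a hypothetical helper name, the lock file is assumed to be called in_use.lock (the name Hadoop uses in its storage directories), and both paths are assumed to be reachable from the machine running the script, for example over a network share. It needs Python 3.8 or later because of dirs_exist_ok.

```python
import os
import shutil

# Assumed lock file name inside a Hadoop storage directory.
LOCK_FILE = "in_use.lock"


def copy_checkpoint(src_dir, dst_dir):
    """Copy the checkpoint tree from src_dir to dst_dir, skipping the lock file.

    Refuses to run when dst_dir already contains anything, matching the
    "make sure the checkpoint dir is empty" rule from step 3.
    """
    if os.path.isdir(dst_dir) and os.listdir(dst_dir):
        raise RuntimeError(f"{dst_dir} is not empty; clear it before copying")
    shutil.copytree(
        src_dir,
        dst_dir,
        ignore=shutil.ignore_patterns(LOCK_FILE),  # skip the lock file at any depth
        dirs_exist_ok=True,  # allow an existing but empty target directory
    )
    return dst_dir
```

With the example paths from this post, the source would be the Secondary NameNode's c:\hdpdata\hdfs\snn and the target the same path on the NameNode host.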
4. Restore from the checkpoint dir
Run cmd.exe as user hadoop (or as another user with full permissions):
Runas /user:hadoop cmd.exe
You need the password of user hadoop.
Use this command to start the NameNode and import the checkpoint from the checkpoint dir:
cmd>%hadoop_home%/bin/hdfs namenode -importCheckpoint
When the import has completed, press Ctrl+C to stop the service, then delete the NameNode's checkpoint dir (dfs.namenode.checkpoint.dir, the same path as on the Secondary NameNode).
Start the services with this command:
cmd>start_local_hdp_services.cmd
Leave safe mode:
cmd>%hadoop_home%/bin/hdfs dfsadmin -safemode leave
Balance your HDFS:
cmd>%hadoop_home%/bin/hdfs balancer -threshold 5
5. Confirm that your Hadoop service was restored successfully.
Open http://namenode:50070/ and check whether any blocks are missing. If there are, find out which files they belong to and where they are.
Restoring from the Secondary NameNode is not a real-time restore, so anything written after the last checkpoint, such as the most recent work done through the JobTracker, may be lost. That is expected; just delete the affected files.
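The missing-block check can also be scripted instead of read from the web page. The sketch below assumes that the NameNode web server exposes its JMX servlet at /jmx and that the FSNamesystem bean carries a MissingBlocks attribute; verify both on your Hadoop version before relying on it. The function names are hypothetical, and http://namenode:50070 is the same placeholder address used above.

```python
import json
from urllib.request import urlopen

# Assumed name of the bean that carries the block counters.
FS_BEAN = "Hadoop:service=NameNode,name=FSNamesystem"


def missing_blocks(jmx_json_text):
    """Return the MissingBlocks count from a /jmx JSON reply, or None if absent."""
    data = json.loads(jmx_json_text)
    for bean in data.get("beans", []):
        if bean.get("name") == FS_BEAN and "MissingBlocks" in bean:
            return int(bean["MissingBlocks"])
    return None


def fetch_missing_blocks(base_url="http://namenode:50070"):
    """Query the NameNode and return its MissingBlocks count (needs a live cluster)."""
    with urlopen(f"{base_url}/jmx?qry={FS_BEAN}", timeout=10) as response:
        return missing_blocks(response.read().decode("utf-8"))
```

A result of 0 means no missing blocks were reported; None means the attribute was not found, in which case fall back to the web page.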
Tip: If you want to restore from a real-time backup, configure multiple NameNode directories. See the next post.