Saturday, November 15, 2014
HADOOP ADMINISTRATION - Backup and Restoration methodologies
Data Backup:
hadoop distcp hdfs://nn1.cluster1.com:9000/ hdfs://nn1.cluster2.com:9000/
distcp runs as a MapReduce job, so the copy is spread across the cluster's map tasks.
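As a rough illustration, the copy above could be scripted from Python. The helper name build_distcp_command is hypothetical. -update is a standard distcp option that copies only files that are missing or differ at the destination. The sketch only assembles the argument list and prints it; it does not start the job.

```python
import shlex

def build_distcp_command(src, dst, update=False):
    """Assemble the argument list for a distcp run (hypothetical helper)."""
    cmd = ["hadoop", "distcp"]
    if update:
        # -update copies only files that are missing or differ at the destination.
        cmd.append("-update")
    cmd += [src, dst]
    return cmd

cmd = build_distcp_command(
    "hdfs://nn1.cluster1.com:9000/",
    "hdfs://nn1.cluster2.com:9000/",
)
# Print the command as it would be typed in a shell.
print(" ".join(shlex.quote(part) for part in cmd))
```

On a host with the Hadoop client installed, the list could be handed to subprocess.run(cmd, check=True) to launch the copy.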
Application Backup: scripts, MapReduce programs, etc.
Metadata Backup:
hadoop dfsadmin -saveNamespace
hdfs dfsadmin -metasave filename.txt
hdfs oiv -i /data/namenode/current/fsimage -o fsimage.txt
HADOOP ADMINISTRATION - Check point node /Secondary Name node configuration
Secondary Name Node or Check point Node :
1. The secondary asks the primary to roll its edits file, so new edits go to a new file.
2. The secondary retrieves fsimage and edits from the primary (using HTTP GET).
3. The secondary loads fsimage into memory, applies each operation from edits, then creates a new consolidated fsimage file.
4. The secondary sends the new fsimage back to the primary (using HTTP POST).
5. The primary replaces the old fsimage with the new one from the secondary, and the old edits file with the new one it started in step 1. It also updates the fstime file to record the time that the checkpoint was taken.
At the end of the process, the primary has an up-to-date fsimage file and a shorter edits file (it is not necessarily empty, as it may have received some edits while the checkpoint was being taken). It is possible for an administrator to run this process manually while the namenode is in safe mode, using the hadoop dfsadmin -saveNamespace command.
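The five steps above can be sketched as a toy in-memory model. This is only an illustration under simplifying assumptions: the class Primary, the function apply_edits, and the tuple format of the logged operations are all hypothetical, and the HTTP transfers of steps 2 and 4 are reduced to plain copies.

```python
def apply_edits(image, edits):
    """Return a new namespace image with each logged operation applied in order."""
    merged = dict(image)
    for op, path in edits:
        if op == "create":
            merged[path] = {}
        elif op == "delete":
            merged.pop(path, None)
    return merged

class Primary:
    """Hypothetical stand-in for the primary namenode's on-disk state."""

    def __init__(self):
        self.fsimage = {}      # last persisted checkpoint
        self.edits = []        # operations logged since that checkpoint
        self.new_edits = None  # file opened in step 1 while a checkpoint runs

    def log(self, op, path):
        # While a checkpoint is in progress, new operations go to the new file.
        target = self.edits if self.new_edits is None else self.new_edits
        target.append((op, path))

    def roll_edits(self):
        self.new_edits = []

    def install_checkpoint(self, new_image):
        self.fsimage = new_image
        self.edits, self.new_edits = self.new_edits, None

primary = Primary()
primary.log("create", "/a")
primary.log("create", "/b")
primary.log("delete", "/a")

primary.roll_edits()                                        # step 1
image, edits = dict(primary.fsimage), list(primary.edits)   # step 2
merged = apply_edits(image, edits)                          # step 3
primary.log("create", "/c")     # an edit that arrives during the checkpoint
primary.install_checkpoint(merged)                          # steps 4 and 5

print(sorted(primary.fsimage), primary.edits)
```

After the run, the model's fsimage holds only /b, and the edits list holds the single operation that arrived during the checkpoint, which matches the "shorter but not necessarily empty" edits file described above.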
This procedure makes clear why the secondary namenode needs the same memory capacity as the namenode: it loads the whole fsimage into memory to build the checkpoint.
Secondary namenode directory structure :
The secondary namenode's checkpoint directory is laid out as follows:
${fs.checkpoint.dir}/current/VERSION
                            /edits
                            /fsimage
                            /fstime
                    /previous.checkpoint/VERSION
                                        /edits
                                        /fsimage
                                        /fstime
The layout of the previous.checkpoint directory and of the secondary’s current directory is identical to the namenode’s. This is by design, since in the event of total namenode failure (when there are no recoverable backups, even from NFS), it allows recovery from a secondary.
hdfs-site.xml
<property>
<name>fs.checkpoint.dir</name>
<value></value>
</property>
Making multiple copies of name space
<property>
<name>dfs.name.dir</name>
<value>/disk1/copy1,/disk2/copy2</value>
</property>
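As a small illustration of how the comma-separated value is read, the sketch below parses the property with Python's standard XML parser. The helper name get_dirs is hypothetical, and the property is wrapped in a configuration root element as in a complete Hadoop site file.

```python
import xml.etree.ElementTree as ET

# The property from above, wrapped in the usual <configuration> root element.
CONFIG = """
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/disk1/copy1,/disk2/copy2</value>
  </property>
</configuration>
"""

def get_dirs(xml_text, key):
    """Return the comma-separated directory list configured under `key`."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == key:
            value = prop.findtext("value") or ""
            return [d.strip() for d in value.split(",") if d.strip()]
    return []

name_dirs = get_dirs(CONFIG, "dfs.name.dir")
print(name_dirs)
```

Each directory in the resulting list receives its own copy of the namespace, which is why placing the entries on different disks protects against a single disk failure.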
If the name node goes down because one of these disks has failed, the surviving copy can be used to restore it.
Add the following property, pointing at the surviving copy:
<property>
<name>fs.checkpoint.dir</name>
<value>/disk2/copy2</value>
</property>
Then, instead of the namenode format command, use the command below to copy the namespace back to disk1.
hadoop namenode -importCheckpoint
To force a checkpoint manually:
hadoop secondarynamenode -checkpoint force
Secondary Name Node related configuration parameters :
fs.checkpoint.period - 3600 (seconds between checkpoints)
fs.checkpoint.dir - /data/sec
fs.checkpoint.edits.dir - /data/scedits
dfs.secondary.http.address - ip:port
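To make the effect of fs.checkpoint.period concrete, here is a minimal sketch of a time-based trigger. The function name checkpoint_due is hypothetical, and the sketch considers only the elapsed time since the last checkpoint; it is not the secondary namenode's actual scheduling code.

```python
CHECKPOINT_PERIOD = 3600  # fs.checkpoint.period from the list above, in seconds

def checkpoint_due(last_checkpoint, now, period=CHECKPOINT_PERIOD):
    """True once at least `period` seconds have passed since the last checkpoint."""
    return now - last_checkpoint >= period

last = 0
for now in (1800, 3599, 3600, 5400):
    # Timestamps are plain seconds here; a real caller might use time.time().
    print(now, checkpoint_due(last, now))
```

Lowering the period shortens the edits file that must be replayed after a restart, at the cost of more frequent transfers between the two nodes.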
- The Secondary Name Node takes a backup of the name node's metadata (namespace) in Hadoop 1.
- It is not a hot standby for the Name Node.
- It connects to the Name Node every hour (the interval is configurable).
- The saved metadata can be used to rebuild a failed name node.
A newly formatted namenode creates the following directory structure:
${dfs.name.dir}/current/VERSION
                       /edits
                       /fsimage
                       /fstime
Filesystem image and edit log :
When a filesystem client performs a write operation (such as creating or moving a file), it is first recorded in the edit log. The namenode also has an in-memory representation of the filesystem metadata, which it updates after the edit log has been modified. The in-memory metadata is used to serve read requests.
The edit log is flushed and synced after every write before a success code is returned to the client. For namenodes that write to multiple directories, the write must be flushed and synced to every copy before returning successfully. This ensures that no operation is lost due to machine failure.
The fsimage file is a persistent checkpoint of the filesystem metadata. However, it is not updated for every filesystem write operation, since writing out the fsimage file, which can grow to be gigabytes in size, would be very slow. This does not compromise resilience, however, because if the namenode fails, then the latest state of its metadata can be reconstructed by loading the fsimage from disk into memory, then applying each of the operations in the edit log. In fact, this is precisely what the namenode does when it starts up.
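The write path and the startup replay described above can be sketched with a toy model. Everything here is an assumption made for illustration: the class NameNodeModel, the tuple format of the operations, and the replication value of 3 are hypothetical, and the disk sync is only marked by a comment.

```python
class NameNodeModel:
    """Toy model: log each write first, then update the in-memory namespace."""

    def __init__(self, fsimage=None, edit_log=None):
        self.edit_log = list(edit_log or [])
        # Startup: load the checkpoint, then replay every logged operation.
        self.namespace = dict(fsimage or {})
        for op in self.edit_log:
            self._apply(op)

    def _apply(self, op):
        kind, path = op[0], op[1]
        if kind == "create":
            self.namespace[path] = {"replication": 3}  # illustrative value
        elif kind == "rename":
            self.namespace[op[2]] = self.namespace.pop(path)
        elif kind == "delete":
            self.namespace.pop(path, None)

    def write(self, *op):
        self.edit_log.append(op)  # 1. record in the edit log (flushed and synced in HDFS)
        self._apply(op)           # 2. then update the in-memory metadata
        return True               # 3. only now report success to the client

nn = NameNodeModel()
nn.write("create", "/tmp/a")
nn.write("rename", "/tmp/a", "/tmp/b")

# Simulate a restart: rebuild from the old (empty) fsimage plus the edit log.
recovered = NameNodeModel(fsimage={}, edit_log=nn.edit_log)
print(recovered.namespace == nn.namespace)
```

Because every operation reaches the log before the client sees success, replaying the log over the last fsimage reproduces the in-memory state the namenode had before it stopped.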
The fsimage file contains a serialized form of all the directory and file inodes in the filesystem. Each inode is an internal representation of a file or directory’s metadata and contains such information as the file’s replication level, modification and access times, access permissions, block size, and the blocks a file is made up of. For directories, the modification time, permissions, and quota metadata are stored.
The fsimage file does not record the datanodes on which the blocks are stored. Instead the namenode keeps this mapping in memory, which it constructs by asking the datanodes for their block lists when they join the cluster and periodically afterward to ensure the namenode’s block mapping is up-to-date.
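A minimal sketch of how such a mapping could be rebuilt from block reports follows. The class BlockMap, its method names, and the datanode and block identifiers are all hypothetical; the real namenode data structures are considerably more involved.

```python
from collections import defaultdict

class BlockMap:
    """In-memory map from block id to the datanodes that reported holding it."""

    def __init__(self):
        self.locations = defaultdict(set)

    def process_block_report(self, datanode, block_ids):
        """Replace everything previously known about `datanode` with its report."""
        for holders in self.locations.values():
            holders.discard(datanode)
        for block_id in block_ids:
            self.locations[block_id].add(datanode)

    def holders(self, block_id):
        return sorted(self.locations.get(block_id, ()))

bm = BlockMap()
bm.process_block_report("dn1", ["blk_1", "blk_2"])  # dn1 joins the cluster
bm.process_block_report("dn2", ["blk_2"])           # dn2 joins the cluster
bm.process_block_report("dn1", ["blk_2"])           # periodic report: blk_1 is gone from dn1
print(bm.holders("blk_1"), bm.holders("blk_2"))
```

Since the map is derived entirely from the reports, nothing about block locations has to be persisted in the fsimage.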
As described, the edits file would grow without bound. Though this state of affairs would have no impact on the system while the namenode is running, if the namenode were restarted, it would take a long time to apply each of the operations in its (very long) edit log. During this time, the filesystem would be offline, which is generally undesirable.
The solution is to run the secondary namenode, whose purpose is to produce checkpoints of the primary’s in-memory filesystem metadata.
The checkpointing process proceeds in the five numbered steps listed earlier under "Secondary Name Node or Check point Node".