dfsadmin

dfsadmin performs DFS administrative commands.

Usage: hdfs dfsadmin COMMAND [GENERIC-OPTIONS] [COMMAND-OPTIONS]
Generic options:

-conf <configuration file>           specify an application configuration file
-D <property=value>                  define a value for a given property
-fs <file:///|hdfs://namenode:port>  specify the default filesystem URL to use; overrides the 'fs.defaultFS' property from the configuration
-jt <local|resourcemanager:port>     specify a ResourceManager
-files <file1,...>                   specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>                  specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>             specify a comma-separated list of archives to be unarchived on the compute machines
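For example, the generic -fs option can point dfsadmin at a specific namenode without editing the configuration (the hostname and port below are illustrative):

$ hdfs dfsadmin -fs hdfs://nn1.example.com:8020 -report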
hdfs dfsadmin prints help when invoked without parameters or with the "-help" parameter:

$ hdfs dfsadmin -help
$ hdfs dfsadmin -help COMMAND
$ hdfs dfsadmin -help help
-help [cmd]
    Displays help for the given command or all commands if none is specified.
$ hdfs dfsadmin -help report
-report [-live] [-dead] [-decommissioning] [-enteringmaintenance] [-inmaintenance]
    Reports basic filesystem information and statistics. The dfs usage can be different from "du" usage, because it measures raw space used by replication, checksums, snapshots, etc. on all the DNs. The optional flags (live, dead, decommissioning, enteringmaintenance, inmaintenance) may be used to filter the list of displayed DNs.
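For example, to show only live and decommissioning datanodes (output omitted here):

$ hdfs dfsadmin -report -live -decommissioning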
$ hdfs dfsadmin -help safemode
-safemode <enter|leave|get|wait|forceExit>
    Safe mode maintenance command. Safe mode is a Namenode state in which it:
        1. does not accept changes to the name space (read-only), and
        2. does not replicate or delete blocks.
    Safe mode is entered automatically at Namenode startup, and is left automatically when the configured minimum percentage of blocks satisfies the minimum replication condition. Safe mode can also be entered manually, but then it can only be turned off manually as well.
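A typical manual safe mode cycle using the subcommands above:

$ hdfs dfsadmin -safemode enter
$ hdfs dfsadmin -safemode get
$ hdfs dfsadmin -safemode leave

Scripts that must block until the Namenode leaves safe mode can use 'wait' instead of polling 'get'.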
$ hdfs dfsadmin -help saveNamespace
-saveNamespace [-beforeShutdown]
    Save the current namespace into the storage directories and reset the edit log. Requires safe mode. If the "beforeShutdown" option is given, the NameNode performs a checkpoint if and only if no checkpoint has been done during a time window (a configurable number of checkpoint periods). This is usually used before shutting down the NameNode to prevent potential fsimage/editlog corruption.
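Because -saveNamespace requires safe mode, a manual checkpoint is typically wrapped like this:

$ hdfs dfsadmin -safemode enter
$ hdfs dfsadmin -saveNamespace
$ hdfs dfsadmin -safemode leave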
$ hdfs dfsadmin -help rollEdits
-rollEdits
    Rolls the edit log.
$ hdfs dfsadmin -help restoreFailedStorage
-restoreFailedStorage
    Set/Unset/Check the option to attempt restore of failed storage replicas if they become available.
$ hdfs dfsadmin -help refreshNodes
-refreshNodes
    Updates the namenode with the set of datanodes allowed to connect to the namenode. The Namenode re-reads datanode hostnames from the files defined by the dfs.hosts and dfs.hosts.exclude configuration parameters. Hosts defined in dfs.hosts are the datanodes that are part of the cluster. If there are entries in dfs.hosts, only the hosts in it are allowed to register with the namenode. Entries in dfs.hosts.exclude are datanodes that need to be decommissioned. Datanodes complete decommissioning when all the replicas from them are replicated to other datanodes. Decommissioned nodes are not automatically shut down and are not chosen for writing new replicas.
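A sketch of a typical decommissioning workflow, assuming dfs.hosts.exclude points at /etc/hadoop/conf/dfs.exclude (the path and hostname are illustrative):

$ echo 'dn3.example.com' >> /etc/hadoop/conf/dfs.exclude
$ hdfs dfsadmin -refreshNodes
$ hdfs dfsadmin -report -decommissioning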
$ hdfs dfsadmin -help finalizeUpgrade
-finalizeUpgrade
    Finalize the upgrade of HDFS. Datanodes delete their previous-version working directories, followed by the Namenode doing the same. This completes the upgrade process.
$ hdfs dfsadmin -help rollingUpgrade
-rollingUpgrade [<query|prepare|finalize>]
    query:    query the current rolling upgrade status.
    prepare:  prepare a new rolling upgrade.
    finalize: finalize the current rolling upgrade.
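The subcommands are normally issued in order around the software upgrade itself:

$ hdfs dfsadmin -rollingUpgrade prepare
$ hdfs dfsadmin -rollingUpgrade query
# upgrade and restart the daemons, then:
$ hdfs dfsadmin -rollingUpgrade finalize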
$ hdfs dfsadmin -help upgrade
-upgrade <query|finalize>
    query:    query the current upgrade status.
    finalize: finalize the upgrade of HDFS (equivalent to -finalizeUpgrade).
$ hdfs dfsadmin -help metasave
-metasave <filename>
    Save the Namenode's primary data structures to <filename> in the directory specified by the hadoop.log.dir property. <filename> is overwritten if it exists. <filename> will contain one line for each of the following:
        1. Datanodes heart beating with the Namenode,
        2. Blocks waiting to be replicated,
        3. Blocks currently being replicated,
        4. Blocks waiting to be deleted.
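For example (the filename is illustrative; note the file is written under hadoop.log.dir, not the current working directory):

$ hdfs dfsadmin -metasave metasave-before-maintenance.txt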
$ hdfs dfsadmin -help setQuota
-setQuota <quota> <dirname>...<dirname>
    Set the quota <quota> for each directory <dirname>. The directory quota is a long integer that puts a hard limit on the number of names in the directory tree. For each directory, attempt to set the quota. An error will be reported if:
        1. the quota is not a positive integer,
        2. the user is not an administrator, or
        3. the directory does not exist or is a file.
    Note: a quota of 1 would force the directory to remain empty.
$ hdfs dfsadmin -help clrQuota
-clrQuota <dirname>...<dirname>
    Clear the quota for each directory <dirname>. For each directory, attempt to clear the quota. An error will be reported if:
        1. the directory does not exist or is a file, or
        2. the user is not an administrator.
    It does not fault if the directory has no quota.
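For example, to cap a user's tree at 100000 names and later remove the limit (the path is illustrative):

$ hdfs dfsadmin -setQuota 100000 /user/alice
$ hdfs dfsadmin -clrQuota /user/alice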
$ hdfs dfsadmin -help setSpaceQuota
-setSpaceQuota <quota> [-storageType <storagetype>] <dirname>...<dirname>
    Set the space quota <quota> for each directory <dirname>. The space quota is a long integer that puts a hard limit on the total size of all the files under the directory tree. The extra space required for replication is also counted; e.g. a 1GB file with replication of 3 consumes 3GB of the quota. The quota can also be specified with a binary prefix for terabytes, petabytes, etc. (e.g. 50t is 50TB, 5m is 5MB, 3p is 3PB). For each directory, attempt to set the quota. An error will be reported if:
        1. the quota is not a positive integer or zero,
        2. the user is not an administrator, or
        3. the directory does not exist or is a file.
    The storage-type-specific quota is set when the -storageType option is specified. Available storage types are: RAM_DISK, DISK, SSD, ARCHIVE.
$ hdfs dfsadmin -help clrSpaceQuota
-clrSpaceQuota [-storageType <storagetype>] <dirname>...<dirname>
    Clear the space quota for each directory <dirname>. For each directory, attempt to clear the quota. An error will be reported if:
        1. the directory does not exist or is a file, or
        2. the user is not an administrator.
    It does not fault if the directory has no quota. The storage-type-specific quota is cleared when the -storageType option is specified. Available storage types are: RAM_DISK, DISK, SSD, ARCHIVE.
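For example, to limit a project tree to 50TB overall and 1TB on SSD, then clear the SSD limit (the path is illustrative):

$ hdfs dfsadmin -setSpaceQuota 50t /data/project
$ hdfs dfsadmin -setSpaceQuota 1t -storageType SSD /data/project
$ hdfs dfsadmin -clrSpaceQuota -storageType SSD /data/project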
$ hdfs dfsadmin -help refreshServiceAcl
-refreshServiceAcl
    Reload the service-level authorization policy file. The Namenode will reload the authorization policy file.
$ hdfs dfsadmin -help refreshUserToGroupsMappings
-refreshUserToGroupsMappings
    Refresh user-to-groups mappings.
$ hdfs dfsadmin -help refreshSuperUserGroupsConfiguration
-refreshSuperUserGroupsConfiguration
    Refresh superuser proxy groups mappings.
$ hdfs dfsadmin -help refreshCallQueue
-refreshCallQueue
    Reload the call queue from config.
$ hdfs dfsadmin -help refresh
-refresh <hostname:ipc_port> <resource_identifier> [arg1..argn]
    Triggers a runtime refresh of the resource specified by <resource_identifier> on <hostname:ipc_port>. All other arguments after that are sent to the host. The ipc_port is determined by 'dfs.datanode.ipc.address'; the default is DFS_DATANODE_IPC_DEFAULT_PORT.
$ hdfs dfsadmin -help reconfig
-reconfig <namenode|datanode> <host:ipc_port> <start|status|properties>
    Starts or gets the status of a reconfiguration operation, or gets a list of reconfigurable properties. The second parameter specifies the node type.
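A typical datanode reconfiguration, assuming the Hadoop 3 default datanode IPC port 9867 and an illustrative hostname:

$ hdfs dfsadmin -reconfig datanode dn1.example.com:9867 properties
$ hdfs dfsadmin -reconfig datanode dn1.example.com:9867 start
$ hdfs dfsadmin -reconfig datanode dn1.example.com:9867 status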
$ hdfs dfsadmin -help printTopology
-printTopology
    Print a tree of the racks and their nodes as reported by the Namenode.
$ hdfs dfsadmin -help refreshNamenodes
-refreshNamenodes datanode_host:ipc_port
    For the given datanode: reloads the configuration files, stops serving the removed block-pools, and starts serving new block-pools. The ipc_port is determined by 'dfs.datanode.ipc.address'; the default is DFS_DATANODE_IPC_DEFAULT_PORT.
$ hdfs dfsadmin -help getVolumeReport
-getVolumeReport datanode_host:ipc_port
    For the given datanode, get the volume report. The ipc_port is determined by 'dfs.datanode.ipc.address'; the default is DFS_DATANODE_IPC_DEFAULT_PORT.
$ hdfs dfsadmin -help deleteBlockPool
-deleteBlockPool datanode_host:ipc_port blockpoolId [force]
    If force is passed, the block pool directory for the given blockpool id on the given datanode is deleted along with its contents; otherwise the directory is deleted only if it is empty. The command will fail if the datanode is still serving the block pool. Refer to refreshNamenodes to shut down a block pool service on a datanode. The ipc_port is determined by 'dfs.datanode.ipc.address'; the default is DFS_DATANODE_IPC_DEFAULT_PORT.
$ hdfs dfsadmin -help setBalancerBandwidth
-setBalancerBandwidth <bandwidth>
    Changes the network bandwidth used by each datanode during HDFS block balancing. <bandwidth> is the maximum number of bytes per second that will be used by each datanode. This value overrides the dfs.datanode.balance.bandwidthPerSec parameter. NOTE: The new value is not persistent on the DataNode.
$ hdfs dfsadmin -help getBalancerBandwidth
-getBalancerBandwidth <datanode_host:ipc_port>
    Get the network bandwidth for the given datanode. This is the maximum network bandwidth used by the datanode during HDFS block balancing. NOTE: This value is not persistent on the DataNode.
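For example, to raise the per-datanode balancing bandwidth to 100 MB/s (104857600 bytes per second) cluster-wide and verify it on one node (the hostname and Hadoop 3 default port 9867 are illustrative):

$ hdfs dfsadmin -setBalancerBandwidth 104857600
$ hdfs dfsadmin -getBalancerBandwidth dn1.example.com:9867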
$ hdfs dfsadmin -help fetchImage
-fetchImage <local directory>
    Downloads the most recent fsimage from the NameNode and saves it in the specified local directory.
$ hdfs dfsadmin -help allowSnapshot
-allowSnapshot <snapshotDir>
    Allow snapshots to be taken on a directory.
$ hdfs dfsadmin -help disallowSnapshot
-disallowSnapshot <snapshotDir>
    Do not allow snapshots to be taken on a directory any more.
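For example (the path is illustrative; snapshots themselves are taken with the separate hdfs dfs -createSnapshot command):

$ hdfs dfsadmin -allowSnapshot /data/project
$ hdfs dfsadmin -disallowSnapshot /data/project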
$ hdfs dfsadmin -help shutdownDatanode
-shutdownDatanode <datanode_host:ipc_port> [upgrade]
    Submit a shutdown request for the given datanode. If the optional "upgrade" argument is specified, clients accessing the datanode will be advised to wait for it to restart, and the fast start-up mode will be enabled. When the restart does not happen in time, clients will time out and ignore the datanode. In that case, the fast start-up mode will also be disabled.
$ hdfs dfsadmin -help evictWriters
-evictWriters <datanode_host:ipc_port>
    Make the datanode evict all clients that are writing a block. This is useful if decommissioning is hung due to slow writers.
$ hdfs dfsadmin -help getDatanodeInfo
-getDatanodeInfo <datanode_host:ipc_port>
    Get the information about the given datanode. This command can be used for checking if a datanode is alive.
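For a rolling datanode restart, the shutdown request and the liveness check are often paired (the hostname and Hadoop 3 default port 9867 are illustrative):

$ hdfs dfsadmin -shutdownDatanode dn1.example.com:9867 upgrade
$ hdfs dfsadmin -getDatanodeInfo dn1.example.com:9867
# repeat getDatanodeInfo after the restart until the datanode answers again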
$ hdfs dfsadmin -help triggerBlockReport
-triggerBlockReport [-incremental] <datanode_host:ipc_port>
    Trigger a block report for the datanode. If 'incremental' is specified, it will be an incremental block report; otherwise, it will be a full block report.
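For example (the hostname and port are illustrative):

$ hdfs dfsadmin -triggerBlockReport -incremental dn1.example.com:9867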
$ hdfs dfsadmin -help listOpenFiles
-listOpenFiles [-blockingDecommission] [-path <path>]
    List all open files currently managed by the NameNode, along with the client name and client machine accessing them. If the 'blockingDecommission' option is specified, it will list only the open files that are blocking the ongoing decommission.
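For example, to see which open files under a given path are holding up a decommission (the path is illustrative):

$ hdfs dfsadmin -listOpenFiles -blockingDecommission -path /user/alice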