Wednesday, December 1, 2010

How to see the HDFS statistics of a Hadoop cluster

When you maintain a cluster then often you want to know the stats of HDFS, things like total capacity of HDFS, percentage of HDFS used and free, list of all data nodes that are active, list of all datanodes that are not active etc.

The following command provides this information
hadoop dfsadmin -report


If your hadoop cluster has a lot of datanodes (which most likely will be the case in a production environment) then this will be a huge report. You will have to create some type of summary report on your own.

No comments:

Post a Comment