Define your choice of ports by setting the properties dfs.http.address for the Namenode and mapred.job.tracker.http.address for the Jobtracker in conf/core-site.xml. Each value is a host:port pair; 0.0.0.0 binds the web UI to all interfaces:

<configuration>
  <property>
    <name>dfs.http.address</name>
    <value>0.0.0.0:50070</value>
  </property>
  <property>
    <name>mapred.job.tracker.http.address</name>
    <value>0.0.0.0:50030</value>
  </property>
</configuration>
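To check which ports a given configuration file actually sets, the property list can be read back programmatically. The following is a minimal sketch using only the Python standard library; the helper names `read_properties` and `split_address` and the `SAMPLE` document are hypothetical, and values are assumed to be either a host:port pair or a bare port.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the same layout as a Hadoop configuration file.
SAMPLE = """<configuration>
  <property>
    <name>dfs.http.address</name>
    <value>0.0.0.0:50070</value>
  </property>
  <property>
    <name>mapred.job.tracker.http.address</name>
    <value>0.0.0.0:50030</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Return {name: value} for every <property> element in the document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def split_address(address):
    """Split 'host:port' into (host, port); a bare port yields host None."""
    host, sep, port = address.rpartition(":")
    return (host if sep else None), int(port)


if __name__ == "__main__":
    for name, value in read_properties(SAMPLE).items():
        print(name, split_address(value))
```

Splitting on the last colon keeps the sketch tolerant of both value forms, so it can be pointed at an existing file without first normalizing it.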
Web UIs for the Common User
The default Hadoop ports are as follows:

| | Daemon | Default Port | Configuration Parameter |
|---|---|---|---|
| HDFS | Namenode | 50070 | dfs.http.address |
| | Datanodes | 50075 | dfs.datanode.http.address |
| | Secondarynamenode | 50090 | dfs.secondary.http.address |
| | Backup/Checkpoint node† | 50105 | dfs.backup.http.address |
| MR | Jobtracker | 50030 | mapred.job.tracker.http.address |
| | Tasktrackers | 50060 | mapred.task.tracker.http.address |

† Replaces secondarynamenode in 0.21.
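The defaults listed here can be captured as a small lookup that falls back to the default port when a parameter is not overridden. This is a sketch only: the daemon keys, the `DEFAULTS` name, and the `http_port` helper are hypothetical, and an override is assumed to be given as host:port or as a bare port.

```python
# Daemon key -> (configuration parameter, default HTTP port), as listed above.
DEFAULTS = {
    "namenode": ("dfs.http.address", 50070),
    "datanode": ("dfs.datanode.http.address", 50075),
    "secondarynamenode": ("dfs.secondary.http.address", 50090),
    "backupnode": ("dfs.backup.http.address", 50105),
    "jobtracker": ("mapred.job.tracker.http.address", 50030),
    "tasktracker": ("mapred.task.tracker.http.address", 50060),
}


def http_port(daemon, props=None):
    """Return the web UI port for a daemon, honoring an override in props."""
    param, default = DEFAULTS[daemon]
    value = (props or {}).get(param)
    if value is None:
        return default
    # Keep only the part after the last colon, so 'host:port' and 'port' both work.
    return int(value.rsplit(":", 1)[-1])


if __name__ == "__main__":
    print(http_port("jobtracker"))
    print(http_port("namenode", {"dfs.http.address": "0.0.0.0:8070"}))
```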
All daemons expose the following:
- /logs
- Exposes, for downloading, log files in the Java system property hadoop.log.dir.
- /logLevel
- Allows you to dial up or down log4j logging levels. This is similar to hadoop daemonlog on the command line.
- /stacks
- Stack traces for all threads. Useful for debugging.
- /metrics
- Metrics for the server. Use /metrics?format=json to retrieve the data in a structured form. Available in 0.21.
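As an illustration of polling these shared endpoints from a script, the sketch below builds the URL for any of them and fetches /metrics?format=json. The function names and the host in the usage example are hypothetical, and the structure of the returned JSON is not assumed beyond its being valid JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def endpoint_url(host, port, path, **params):
    """Build http://host:port/path, appending a query string if params are given."""
    query = "?" + urlencode(params) if params else ""
    return f"http://{host}:{port}{path}{query}"


def fetch_metrics(host, port, timeout=10):
    """Fetch /metrics?format=json from a daemon and decode the response body."""
    url = endpoint_url(host, port, "/metrics", format="json")
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical host name; 50070 is the default Namenode port.
    print(endpoint_url("namenode.example.com", 50070, "/stacks"))
    print(endpoint_url("namenode.example.com", 50070, "/metrics", format="json"))
```

Calling `fetch_metrics` requires a reachable daemon running a release that serves /metrics, so the usage example only prints the URLs.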
The Namenode exposes:
- /
- Shows information about the namenode as well as the HDFS. There’s a link from here to browse the filesystem, as well.
- /dfsnodelist.jsp?whatNodes=(DEAD|LIVE)
- Shows lists of nodes that are disconnected from (DEAD) or connected to (LIVE) the namenode.
- /fsck
- Runs the “fsck” command. Not recommended on a busy cluster.
- /listPaths
- Returns an XML-formatted directory listing. This is useful if you wish (for example) to poll HDFS to see if a file exists. The URL can include a path (e.g., /listPaths/user/philip) and can take optional GET arguments: /listPaths?recursive=yes will return all files on the file system; /listPaths/user/philip?filter=s.* will return all files in the home directory that start with s; and /listPaths/user/philip?exclude=.txt will return all files except text files in the home directory. Beware that filter and exclude operate on the directory listed in the URL, and they ignore the recursive flag.
- /data and /fileChecksum
- These forward your HTTP request to an appropriate datanode, which in turn returns the data or the checksum.
Datanodes expose the following:
- /browseBlock.jsp, /browseDirectory.jsp, /tail.jsp, /streamFile, /getFileChecksum
- These are the endpoints that the namenode redirects to when you are browsing filesystem content. You probably wouldn’t use these directly, but this is what’s going on underneath.
- /blockScannerReport
- Every datanode verifies its blocks at configurable intervals. This endpoint provides a listing of that check.
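The /listPaths arguments described earlier (an optional path plus the recursive, filter, and exclude GET parameters) can be assembled with a small helper. This is a sketch under stated assumptions: the function name `list_paths_url`, its parameter names, and the host in the usage example are hypothetical, and parsing the returned XML is left out because its schema is not described here.

```python
from urllib.parse import quote, urlencode


def list_paths_url(namenode, path="/", recursive=False,
                   name_filter=None, exclude=None, port=50070):
    """Build a Namenode /listPaths URL for the given HDFS path and options."""
    params = {}
    if recursive:
        params["recursive"] = "yes"
    if name_filter:
        params["filter"] = name_filter
    if exclude:
        params["exclude"] = exclude
    # The root listing is plain /listPaths; other paths are appended to it.
    suffix = "" if path in ("", "/") else quote(path)
    query = "?" + urlencode(params) if params else ""
    return f"http://{namenode}:{port}/listPaths{suffix}{query}"


if __name__ == "__main__":
    # Hypothetical Namenode host name.
    print(list_paths_url("namenode.example.com", recursive=True))
    print(list_paths_url("namenode.example.com", "/user/philip", name_filter="s.*"))
```

Note that `urlencode` percent-encodes the `*` in the filter expression; the servlet container decodes it again before the pattern is applied.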
The jobtracker's UI is commonly used to look at running jobs and, especially, to find the causes of failed jobs. The UI is best browsed starting at /jobtracker.jsp. There are over a dozen related pages providing details on tasks, history, scheduling queues, jobs, etc.
Tasktrackers have a simple page (/tasktracker.jsp), which shows running tasks. They also expose /taskLog?taskid=<id> to query logs for a specific task. They use /mapOutput to serve the output of map tasks to reducers, but this is an internal API.