When copying between two clusters that run different versions of Hadoop, it is
generally recommended to use HftpFileSystem as the source. HftpFileSystem is
a read-only filesystem, so the distcp command has to be run on the destination cluster:

hadoop distcp hftp://namenodeA:port/data/weblogs hdfs://namenodeB/data/weblogs


In the preceding command, port is defined by the dfs.http.address property in the
hdfs-site.xml configuration file.
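
As a rough sketch of how the port fits into the command, the snippet below takes a dfs.http.address value, strips the host part, and assembles the distcp command line. The address 0.0.0.0:50070 is only an assumed example value (check the hdfs-site.xml of your source cluster), and the variable names are illustrative rather than anything Hadoop defines.

```shell
# Assumed example values; replace them with the ones for your clusters.
SRC_NN="namenodeA"          # source NameNode host
DST_NN="namenodeB"          # destination NameNode host
HTTP_ADDR="0.0.0.0:50070"   # assumed value of dfs.http.address on the source cluster

# Keep only the text after the last colon, i.e. the port number.
PORT="${HTTP_ADDR##*:}"

# Assemble the command; this only prints it and does not run distcp.
DISTCP_CMD="hadoop distcp hftp://${SRC_NN}:${PORT}/data/weblogs hdfs://${DST_NN}/data/weblogs"
echo "$DISTCP_CMD"
```

Printing the command first makes it easy to check the source and destination URIs before launching the copy on the destination cluster.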