mapred.tasktracker.map.tasks.maximum 2
mapred.tasktracker.reduce.tasks.maximum 2
If you want to change them, edit the file
${HADOOP_HOME}/conf/mapred-site.xml
, where ${HADOOP_HOME}
is the path of your Hadoop installation.
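As a rough sketch, the two properties could be overridden in mapred-site.xml as shown here. The value 4 is only an illustrative assumption, not a recommendation; pick values that match the cores and memory of your nodes.

```xml
<configuration>
  <!-- Illustrative values only: 4 is an assumption, the default is 2 -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```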
For example, suppose you decide that you want 8 reducers (this can be done by setting
conf.setNumReduceTasks(8);
in your code) and you keep these default values. Assuming that you have
2 nodes in the cluster, each node will run 2 reduce tasks at the
beginning, so, overall, 2x2 = 4 reduce tasks will be running in your
cluster. When any of these reduce tasks finishes, its node will run the
next reduce task in the queue. At any point, at most 4 reduce tasks will be
running in your cluster. The same per-node limit applies to map tasks
through mapred.tasktracker.map.tasks.maximum.
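The arithmetic in this example can be sketched in plain Java. This is only an illustration: the class and method names (SlotMath, concurrentSlots, waves) are hypothetical and not part of the Hadoop API, and the "waves" figure assumes all tasks take roughly the same time.

```java
// Hypothetical helper, not part of Hadoop: models the slot arithmetic above.
class SlotMath {
    // Tasks that can run at the same time in the whole cluster:
    // nodes * per-node maximum (e.g. mapred.tasktracker.reduce.tasks.maximum).
    static int concurrentSlots(int nodes, int maxTasksPerNode) {
        return nodes * maxTasksPerNode;
    }

    // Rounds of execution needed to finish all tasks (ceiling division),
    // assuming every task takes about the same time.
    static int waves(int totalTasks, int slots) {
        return (totalTasks + slots - 1) / slots;
    }

    public static void main(String[] args) {
        int slots = concurrentSlots(2, 2);   // 2 nodes, default maximum of 2
        System.out.println(slots);           // prints 4
        System.out.println(waves(8, slots)); // prints 2
    }
}
```

With 8 tasks and 4 slots, the first 4 tasks start immediately and the remaining 4 wait in the queue until slots free up.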
EDIT: I found the mistake. In the first link it says:
The right number of reduces seems to be 0.95 or 1.75 * (nodes * mapred.tasktracker.tasks.maximum).
It should be:
The right number of reduces seems to be 0.95 or 1.75 * (nodes * mapred.tasktracker.reduce.tasks.maximum).
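A minimal sketch of the corrected rule of thumb, in plain Java. The class and method names are hypothetical, and rounding down to a whole number of reducers is an assumption of this sketch; the quoted sentence does not say how to round.

```java
// Hypothetical helper, not part of Hadoop: applies the rule of thumb
// factor * (nodes * mapred.tasktracker.reduce.tasks.maximum).
class ReducerCount {
    // factor is 0.95 or 1.75 in the quoted guideline.
    // Assumption: the result is rounded down to a whole number of reducers.
    static int recommended(double factor, int nodes, int maxReduceTasksPerNode) {
        return (int) Math.floor(factor * nodes * maxReduceTasksPerNode);
    }

    public static void main(String[] args) {
        // 2 nodes with the default maximum of 2 reduce tasks per node
        System.out.println(recommended(0.95, 2, 2)); // prints 3
        System.out.println(recommended(1.75, 2, 2)); // prints 7
    }
}
```

The chosen number would then be passed to conf.setNumReduceTasks(...) in the job setup.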
Increasing io.sort.mb
According to the article here, io.sort.mb should be 10 * io.sort.factor, provided you have enough RAM.
"core-site.xml"