HiBench data generation with randomtextwriter has an issue on YARN (Hadoop 2.x)
I had problems using HiBench 2.2 on a new Hadoop 2.x (YARN) cluster.
When I ran prepare.sh to generate the input data for wordcount and sort, the DATASIZE and NUM_MAPS values from configure.sh were not picked up correctly.
The script passes property names that look like they come from older Hadoop code, so I edited those lines. After that, prepare.sh ran correctly.
This is the original source code:
#line: 39
# generate data
$HADOOP_EXECUTABLE jar $HADOOP_EXAMPLES_JAR randomtextwriter \
-D test.randomtextwrite.bytes_per_map=$((${DATASIZE} / ${NUM_MAPS})) \
-D test.randomtextwrite.maps_per_host=${NUM_MAPS} \
$COMPRESS_OPT \
$INPUT_HDFS
(file: wordcount/bin/prepare.sh)
I changed it to:
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-examples*.jar randomtextwriter \
$COMPRESS_OPT \
-D mapreduce.randomtextwriter.totalbytes=${DATASIZE} \
-D mapreduce.randomtextwriter.bytespermap=$((${DATASIZE} / ${NUM_MAPS})) \
$INPUT_HDFS
Likewise, test.randomtextwrite.maps_per_host can be replaced with mapreduce.randomtextwriter.mapsperhost.
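To see why setting both totalbytes and bytespermap restores the intended number of map tasks, here is a small sketch of the arithmetic. It assumes, as I understand the Hadoop 2.x randomtextwriter example, that the number of maps is derived as totalbytes divided by bytespermap. The DATASIZE and NUM_MAPS values below are made up for illustration. In HiBench they come from configure.sh.

```shell
#!/bin/sh
# Hypothetical values for illustration only; HiBench sets these in configure.sh.
DATASIZE=3200000000
NUM_MAPS=16

# Value passed as mapreduce.randomtextwriter.bytespermap
BYTES_PER_MAP=$(( DATASIZE / NUM_MAPS ))

# Assumption: randomtextwriter computes its map count as totalbytes / bytespermap,
# so passing totalbytes=DATASIZE should give back NUM_MAPS map tasks.
EXPECTED_MAPS=$(( DATASIZE / BYTES_PER_MAP ))

echo "bytespermap=${BYTES_PER_MAP} maps=${EXPECTED_MAPS}"
```

If DATASIZE is not an exact multiple of NUM_MAPS, the integer division rounds down, so the resulting map count can differ from NUM_MAPS.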
I hope this post helps others who are running HiBench on a YARN cluster.