Hadoop - my first MapReduce (M/R) code
  1. install VMware on the local machine
  2. import Hadoop server from http://www.cloudera.com/hadoop-training-virtual-machine
  3. fire VM up
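  4. To use Hadoop Streaming, set an environment variable pointing to the streaming jar:
    export SJAR=/usr/lib/hadoop/contrib/streaming/hadoop-0.20.1+133-streaming.jar
    Now you can write the M/R mapper and reducer in any language that reads stdin and writes stdout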
  5. upload Shakespeare text to HDFS (Hadoop Distributed File System)
    cd ~/git/data
    tar vzxf shakespeare.tar.gz
    check that nothing is in HDFS yet
    hadoop fs -ls /user/training
    add the unpacked text files to HDFS (the arguments are source, then target) and check again
    hadoop fs -put input /user/training/inputShak
    hadoop fs -ls /user/training
  6. Execute M/R job using 'cat' & 'wc'
    hadoop jar $SJAR \
    -mapper cat \
    -reducer wc \
    -input inputShak \
    -output outputShak
    1. inspect the output files:
      hadoop fs -cat outputShak/p*
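
Before submitting to the cluster, the mapper/reducer pair above can be tried locally with a plain pipeline. This is only a rough sketch of what Streaming does (mapper, then a sort by key, then reducer). The helper run_local, the functions wc_mapper and wc_reducer, and the two-line sample text are illustrative assumptions, not part of Hadoop or the training VM. On the cluster, Hadoop handles the sort and the key/value splitting itself, so the counts there may not match a local run exactly.

```shell
#!/bin/sh
# Local dry run of a streaming job: mapper | sort | reducer.
# run_local is a hypothetical helper, not a Hadoop command.
run_local() {
    mapper=$1
    reducer=$2
    # LC_ALL=C gives a stable byte-wise sort order
    $mapper | LC_ALL=C sort | $reducer
}

# Tiny stand-in for the Shakespeare text (made-up sample data)
sample=$(mktemp)
printf 'to be or not to be\nthat is the question\n' > "$sample"

# Same pair as the job above: 'cat' as mapper, 'wc' as reducer.
# Prints the line, word and byte counts of the input.
result=$(run_local cat wc < "$sample")
echo "$result"

# A word count written only with shell tools (illustrative):
# the mapper emits one word per line, the reducer counts adjacent duplicates.
wc_mapper() { tr -s ' ' '\n'; }
wc_reducer() { uniq -c; }
counts=$(run_local wc_mapper wc_reducer < "$sample")
echo "$counts"

rm -f "$sample"
```

If the local pipeline behaves as expected, the same programs can be passed to the streaming jar with -mapper and -reducer, as in the job above.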