This was a 4-hour workshop, which I attended remotely.
Instructions for the hands-on exercises are here: http://tinyurl.com/nerschadoopoct
The Hadoop admin page is: http://maghdp01.nersc.gov:50030/jobtracker.jsp
My notes:
- my shell was not bash; I changed it by typing bash -l (check with echo $SHELL)
- module load tig hadoop
- generic hadoop command form: hadoop command [genericOptions] [commandOptions]
- create my Hadoop FS home directory: hadoop fs -mkdir /user/balewski
- list its contents (should be empty now, but no error): hadoop fs -ls

Exercise 1: create, load, and read back text files to HDFS
{code}
$ vi testfile1
This is file 1
This is to test HDFS

$ vi testfile2
This is file 2
This is to test HDFS again

$ hadoop fs -mkdir input
$ hadoop fs -put testfile* input/
$ hadoop fs -cat input/testfile1
$ hadoop fs -cat input/testfile*
$ hadoop fs -get input input
$ ls input/
{code}
Exercise 2: run a Hadoop job from the examples package
{code}
$ hadoop fs -mkdir wordcount-in
$ hadoop fs -put /global/scratch/sd/lavanya/hadooptutorial/wordcount/* wordcount-in/
$ hadoop jar /usr/common/tig/hadoop/hadoop-0.20.2+228/hadoop-0.20.2+228-examples.jar wordcount wordcount-in wordcount-op
$ hadoop fs -ls wordcount-op
$ hadoop fs -cat wordcount-op/p* | grep Darcy
{code}
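The job itself needs the cluster, but the counts the wordcount example produces can be approximated locally with standard shell tools. This is only a sketch, not the Hadoop job (the real example tokenizes in Java, so edge cases differ); the local wordcount-in/ directory and sample.txt file here are illustrative stand-ins, not the HDFS paths above:

```shell
# Local approximation of what the wordcount example computes:
# split input on whitespace, then count occurrences of each word.
mkdir -p wordcount-in
printf 'a b a\nb a\n' > wordcount-in/sample.txt   # tiny stand-in input
cat wordcount-in/* | tr -s '[:space:]' '\n' | sort | uniq -c | sort -rn
```

The `sort | uniq -c` pair plays the role of the shuffle and reduce phases: identical words are brought together, then counted.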
Monitor its progress from these URLs:
http://maghdp01.nersc.gov:50030/
http://maghdp01.nersc.gov:50070/
To re-run a job you must first CLEAN UP the old output files: hadoop dfs -rmr wordcount-op
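The reason for the cleanup is that Hadoop refuses to write into an existing output directory. The same remove-then-rerun pattern, sketched with plain local directories standing in for the HDFS paths (an analogue, not the HDFS commands themselves):

```shell
# Local analogue of the re-run cleanup: the output directory must not
# exist before the job runs, so remove it first.
outdir=wordcount-op
rm -rf "$outdir"        # stands in for: hadoop dfs -rmr wordcount-op
mkdir "$outdir"         # the next job run recreates its output here
```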
Next run Hadoop with 4 reducers: hadoop jar /usr/common/tig/hadoop/hadoop-0.20.2+228/hadoop-0.20.2+228-examples.jar wordcount -Dmapred.reduce.tasks=4 wordcount-in wordcount-op
A suggestion: change the user permissions to allow me to read the Hadoop output, because Hadoop owns everything by default on the scratch disk.
Or use the provided script: fixperms.sh /global/scratch/sd/balewski/hadoop/wordcount
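I did not see the source of fixperms.sh, so this is only a guess at its core: a recursive chmod that makes a Hadoop-owned output tree world-readable. The demo-out/ tree below is fabricated so the snippet is self-contained; the actual script, paths, and options may differ:

```shell
# Hypothetical sketch of what fixperms.sh presumably does.
# chmod's a+rX grants everyone read, plus execute (directory search)
# only on directories and files that are already executable.
mkdir -p demo-out/sub                       # stand-in output tree
echo 'Darcy 42' > demo-out/sub/part-00000
chmod 700 demo-out/sub                      # simulate Hadoop-owned perms
chmod 600 demo-out/sub/part-00000
chmod -R a+rX demo-out                      # the presumed core of fixperms.sh
```

The capital X (rather than x) matters: it avoids marking plain data files as executable while still letting other users descend into the directories.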