Submit a job request

Open job.hadoop.new with vi:

login1% ls

hadoop-0.20.2-new hadoop-0.20.2.tar.gz job.hadoop.new

hadoop-0.20.2-new.tar.gz job.hadoop

login1% vi job.hadoop.new

The first part of job.hadoop.new looks like the following:

#!/bin/bash

#-----------------------------------------------------------------------------

# This Longhorn job script is designed to create a vnc session on

# visualization nodes through the SGE batch system. Once the job

# is scheduled, check the output of your job (which by default is

# stored in your home directory in a file named vncserver.out)

# and it will tell you the port number that has been setup for you so

# that you can attach via a separate VNC client to the longhorn login

# node (login1.longhorn.tacc.utexas.edu).

# Note that for security, we recommend setting up a tunneled VNC

# session in order to connect via a client (more information on doing

# this is available at the User Guide link below). Once you connect,

# you should see a single xterm running which you can use to launch

# any X application (eg. Paraview or VisIt)

# Note: you can fine tune the SGE submission variables below as

# needed. Typical items to change are the runtime limit, location of

# the job output, and the allocation project to submit against (it is

# commented out for now, but is required if you have multiple

# allocations).

# To submit the job, issue: "qsub /share/tacc_scripts/job.vnc"

# For more information, please consult the User Guide at:

# http://services.tacc.utexas.edu/index.php/longhorn-user-guide

#-----------------------------------------------------------------------------

#$ -V # Inherit the submission environment

#$ -cwd # Start job in submission dir

#$ -N Hadoop-test # Job name

#$ -j y # Combine stderr and stdout into stdout

#$ -o $HOME/$JOB_NAME.out # Name of the output file

#$ -pe 1way 128 # Request 1 vis node (Max: 8way 384)

#$ -q hadoop # Queue name

#$ -P data

#$ -l h_rt=6:00:00 # runtime (hh:mm:ss) - 6 hours (currently limited to 24 hours).

#$ -A **** # Replace **** with your Project Name

#--------------------------------------------------------------------------

# ---- You normally should not need to edit anything below this point -----

#--------------------------------------------------------------------------

...

[Line 132-138]

# we need vglclient to run to have graphics across multi-node jobs

vglclient >& /dev/null &

VGL_PID=$!

export HADOOP_HOME=/home/00791/xwj/Longhorn/hadoop-0.20.2-new

export JAVA_HOME=/share/apps/teragrid/jdk1.6.0_19-64bit/

export PATH=$HADOOP_HOME/bin:${PATH}

Change the highlighted values to fit your setup.

Line 136
- Change the following line to point to your own Hadoop directory. You can find the path by running pwd inside that directory.

export HADOOP_HOME=/home/00791/xwj/Longhorn/hadoop-0.20.2-new

Options:

-pe: For a Hadoop job, wayness should be 1. Thus, choose among:
- -pe 1way 128 (preferred)
- -pe 1way 256
- -pe 1way 384 (other users cannot use the system while your job runs)

-l: Increase h_rt if you want to use the Hadoop cluster for a longer time (currently limited to 24 hours).
-A: Your project name. You can check your project name at login.

When you ssh to the Longhorn server, the login message lists your projects:

Note: Ranger filesystems are not currently available on Longhorn

---------------------- Project balances for user user_id -----------------------

| Name Avail SUs Expires |

| ProjectName 49975 |
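The edits described above can also be scripted instead of made by hand in vi. The following is a minimal sketch, not part of the original job script: the function name set_job_values is hypothetical, and it assumes the script contains one line starting with "export HADOOP_HOME=" and one line starting with "#$ -A ", as in the listing above. It also assumes the path and project name contain no "|" or "&" characters, since those are special to sed here.

```shell
#!/bin/bash
# set_job_values SCRIPT HADOOP_HOME PROJECT   (hypothetical helper)
# Prints a copy of SCRIPT with the HADOOP_HOME line and the "#$ -A" line
# replaced. The original file is left untouched.
set_job_values() {
    local script=$1 hadoop_home=$2 project=$3
    sed -e "s|^export HADOOP_HOME=.*|export HADOOP_HOME=${hadoop_home}|" \
        -e "s|^#[\$] -A .*|#\$ -A ${project}|" \
        "$script"
}
```

For example, run from inside your Hadoop directory so that pwd gives the right path, with ProjectName standing in for the name shown at login:

set_job_values ../job.hadoop.new "$(pwd)" ProjectName > ../job.hadoop.mine

Check the result with diff before submitting it.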

Submit the job request by typing:

login1% qsub job.hadoop.new

-------------------------------------------------------------------------------

-- Welcome to TACC's Longhorn Visualization System, an NSF TeraGrid Resource --

-------------------------------------------------------------------------------

--> Checking that you specified -V...

--> Checking that you specified a time limit...

--> Checking that you specified a queue...

--> Testing that the specified project type is valid...

--> Setting Longhorn project...

--> Checking that you specified a parallel environment...

--> Checking that you specified a valid parallel environment name...

--> Checking that the minimum and maximum PE counts are the same...

--> Checking that the number of PEs requested is valid...

--> Ensuring absence of dubious h_vmem,h_data,s_vmem,s_data limits...

--> Requesting valid memory configuration (mt=31.3G)...

--> Verifying HOME file-system availability...

--> Verifying SCRATCH file-system availability...

--> Checking ssh setup...

--> Checking that you didn't request more cores than the maximum...

--> Checking that you don't already have the maximum number of jobs...

--> Checking that your time limit isn't over the maximum...

--> Checking available allocation...

--> Submitting job...

Your job 38286 ("Hadoop-test") has been submitted
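If you want to keep the job number for later use, it can be pulled out of the last line of the qsub output. This is a small sketch that assumes the confirmation line has exactly the form shown above; the function name job_id_from_qsub is hypothetical.

```shell
#!/bin/bash
# Read qsub output on stdin and print the job number from the line
# 'Your job <number> ("<name>") has been submitted'.
# Prints nothing if no such line is present.
job_id_from_qsub() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*has been submitted.*/\1/p'
}
```

For example:

qsub job.hadoop.new | job_id_from_qsub

The number can then be passed to the standard SGE commands qstat -j (show job details) or qdel (cancel the job).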

The output file YourJobName.out appears in your home folder. With the sample settings, it is ~/Hadoop-test.out. Find your designated port number by typing:

tail Hadoop-test.out

Sample output is shown further down this page. Your port number is the one on the "got login node VNC port" line.

Congratulations! You are now ready to run Hadoop on your TACC account. If you do not need the web interfaces of JobTracker and HadoopAdmin, you can skip the next page, VNC to TACC.

>> If you want a visual interface, with your port number, proceed to the next step: VNC to TACC

>> Control your submitted jobs

>> You can use ssh -X or -Y to run firefox remotely on your own Linux machine.

login1% tail Hadoop-test.out

running on node c202-122

using default VNC server /opt/apps/tightvnc/1.3.10/bin/vncserver

memory limit set to 46960806 kilobytes

set wayness to 1

got VNC display :1

local (compute node) VNC port is 5901

got login node VNC port 10222

Your VNC server is now running!

To connect via VNC client: SSH tunnel port 10222 to login1.longhorn.tacc.utexas.edu:10222

Then connect to localhost::10222
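The port lookup and the tunnel instruction in this output can be combined in a short script. This is a sketch only: the function names vnc_port and tunnel_command are hypothetical, "username" is a placeholder for your own Longhorn login, and the ssh -L command is one common way to set up the tunnel that the output describes. The command is printed, not run, so you can check it first.

```shell
#!/bin/bash
# Print the login-node VNC port found in a job output file, taken from
# the line "got login node VNC port <number>". If the file contains
# several such lines, the last one wins.
vnc_port() {
    sed -n 's/^got login node VNC port \([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Print an ssh tunnel command for that port.
# $1 = job output file, $2 = login name (placeholder "username" if omitted).
tunnel_command() {
    local port host=login1.longhorn.tacc.utexas.edu
    port=$(vnc_port "$1")
    if [ -z "$port" ]; then
        echo "no VNC port found in $1" >&2
        return 1
    fi
    echo "ssh -L ${port}:${host}:${port} ${2:-username}@${host}"
}
```

For example, on the login node:

tunnel_command ~/Hadoop-test.out username

Run the printed command on your own machine, then point your VNC client at localhost with the same port.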
