This is all about how to compute on coyote. There's another site with other information at:
https://wikis.mit.edu/confluence/display/ParsonsLabMSG/Coyote
Gaining access:
I'm pretty sure that [greg] still helps out with this, so you would probably need to ask him for access. Also, if you are working off campus, you need to log into your on-campus account first and then ssh to coyote. If you have questions about this, just ask me. Also, put yourself on the mailing list by following the other link above.
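For off-campus access, the two-hop login looks something like this (the intermediate hostname is just a placeholder; use whichever on-campus machine you normally log into):
ssh username@your-on-campus-host.mit.edu
ssh username@coyote.mit.edu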
What is available:
You can find out with module avail
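For example, module avail lists everything, and I believe you can filter by name prefix; module list shows what you currently have loaded:
module avail
module avail python
module list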
modules of interest (to me):
module add python/2.7.3
module add atlas/3.8.3
module add suitesparse/20110224
module add numpy/1.5.1
module add scipy/0.11.0
module add biopython
module add matplotlib/1.1.0
module add matlab
(the unversioned matlab module above is the 2009 release)
module add matlab/2012b
QIIME has been installed:
module add python/2.7.6
module add qiime/1.8.0
Interactive computing:
I was able to get onto a node with:
qsub -I -l nodes=1
But when I tried to use matlab there (module add matlab) it didn't work (although it did work after ssh'ing).
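If you need an interactive session on a specific queue or with more resources, the standard Torque/PBS options should work (I haven't tested all of these on coyote; queue names are listed further down):
qsub -I -q short -l nodes=1:ppn=2,walltime=02:00:00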
To run matlab with its graphical window, first log in with X11 forwarding:
ssh -X user@coyote.mit.edu
ssh -X super-genius
module add matlab/2012b
matlab
Submitting multiple jobs:
Before running the script below, make sure to load the following modules (I tried without them and got an error loading argparse):
module add python/2.7.3
module add atlas/3.8.3
module add suitesparse/20110224
module add biopython
module add java/1.6.0_21
You can also just source csmillie's .bashrc to make sure it works (as long as there isn't anything else in your own .bashrc that you need).
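For example:
source /home/csmillie/.bashrc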
Also, there are different timed queues, so if you get this working, make sure it submits to the right queue. From what I can tell there are short, long, and ultra; I'm not sure what the exact time limits are for each.
From Mark and Chris-
I've been using a script Chris wrote which works pretty well: /home/csmillie/bin/ssub
What it does
It streamlines job submission. If you give it a list of commands, it will (1) create scripts for them, and (2) submit them as a job array. You can give it the list of commands as command line arguments or through a pipe.
Quick examples
1. Submit a single command to the cluster:
ssub "python /path/to/script.py > /path/to/output.txt"
2. Submit multiple commands to the cluster (use a semicolon separator):
ssub "python /path/to/script1.py; python /path/to/script2.py"
3. Submit a list of commands to the cluster (newline separator):
cat /list/of/commands.txt | ssub
Detailed example (working directory: /home/csmillie/alm/mammals/aln/95/)
In this directory, I have 12,352 fasta files I want to align. I can do this on 100 nodes quite easily:
1. First, I create a list of commands:
for x in `ls *fst`; do y=${x%.*}; echo muscle -in $x -out $y.aln; done > commands.txt
The output looks like this:
...
muscle -in O95_9990.fst -out O95_9990.aln
muscle -in O95_9991.fst -out O95_9991.aln
muscle -in O95_9992.fst -out O95_9992.aln
muscle -in O95_9993.fst -out O95_9993.aln
...
2. Then I submit these commands as a job array:
cat commands.txt | ssub
How to configure it
Copy it to your ~/bin (or wherever). Then edit the top of the script:
uname = your username
tmpdir = directory where scripts are created
max_size = number of nodes you want to use
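As a rough sketch only (I haven't checked the script's actual syntax, so treat these values as placeholders), the configurable lines at the top might look something like:
uname="your_username"              # your coyote username
tmpdir="/home/your_username/tmp"   # where the generated scripts and job arrays go
max_size=100                       # number of nodes to use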
Other things
It automatically creates random filenames for your scripts and job arrays. These files are created in the directory specified by "tmpdir". It can also submit individual scripts instead of a job array.
Coyote queue
qstat -Qf | more
This will tell you the specifics of each queue. There is also no priority allocation, so please be polite and choose the right queue for your job.
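To pull out just the queue names and their time limits, something like this should work (the exact attribute names in the qstat output may vary):
qstat -Qf | grep -E 'Queue:|walltime'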
Submitting multiple files to process in the same job
Example from Spence -
I wanted to write bash scripts that would submit multiple files for the same analysis command on coyote. I used PBS_ARRAYID, which takes on the values you designate with the -t option of qsub.
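As a minimal sketch of how the two fit together (hypothetical filenames): submitting with qsub -t 1-3 job.sh runs job.sh three times with PBS_ARRAYID set to 1, 2, and 3, so inside the script you can do something like:
INPUT="sample_${PBS_ARRAYID}.fna"
echo "processing ${INPUT}"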
I got access to qiime functions by adding the following line to the bottom of my .bashrc file:
export PATH="$PATH:/srv/pkg/python/python-2.7.6/pkg/qiime/qiime-1.8.0/qiime-1.8.0-release/bin"
Then I source my .bashrc file in my submission script (see below). The DATA variable is just shorthand for the directory where I store my data.
To run all this, I created the file below then typed the following at the command line:
$ module add python/2.7.6
$ module add qiime/1.8.0
$ qsub -q long -t 1-10 pickRepSet.sh
(the -t option will vary my PBS_ARRAYID variable from 1 to 10, iterating through my 10 experimental files).
#!/bin/sh
#filename: pickRepSet.sh
#
# PBS script to run a job on the myrinet-3 cluster.
# The lines beginning #PBS set various queuing parameters.
#
# -N Job Name
#PBS -N pickRepSet
#
# -l resource lists that control where job goes
#PBS -l nodes=1
#
# Where to write output
#PBS -e stderr
#PBS -o stdout
#
# Export all my environment variables to the job
#PBS -V
#
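# Load my environment (including the qiime PATH from .bashrc) and set a shortcut to my data directory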
source /home/sjspence/.bashrc
DATA="/net/radiodurans/alm/sjspence/data/140509_ML1/"
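# Pick a representative sequence for each OTU in sample number PBS_ARRAYID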
pick_rep_set.py -i ${DATA}fwd_otu/uclust_picked_otus${PBS_ARRAYID}/ML1-${PBS_ARRAYID}_filt_otus.txt -f ${DATA}fwd_filt/dropletBC/ML1-${PBS_ARRAYID}_filt.fna -o ${DATA}fwd_otu/repSet/ML1-${PBS_ARRAYID}_rep.fna
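After submitting, you can keep an eye on the array with qstat (I believe the -t flag expands a job array into its individual tasks on Torque, though I haven't checked coyote's version):
qstat -u $USER
qstat -t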