SABER Quickstart Guide
An (SC)² Grid Computing Project

The (SC)² grid project, SABER, uses grid technology to expand computing capabilities for (SC)² members.

The goal of the project is deployment of a production-quality grid computing platform providing an environment for computational scientists to explore grid computing in their scientific disciplines.

How to Use this Quickstart Guide: This guide is intended to facilitate use of SABER machines and the resources they provide. This is not an exhaustive guide to all the technical aspects of SABER. It is, however, a good starting point. This guide should provide enough basic information to allow the average user to run jobs.

System Architecture and Configuration

SABER, as a grid resource, is an inherently dynamic "system." The specific resources currently registered for use in SABER include an Alphaserver ES40 cluster at PSC, Intel-based clusters at WVU and NETL, and a Condor flock set up on the PCs in the CTC training facility at PSC.

Stay Informed

As a SABER user, it is imperative that you stay informed of changes to the SABER environment. Important information is posted to PSC's general bboards, which can be read through various facilities on most PSC systems.

Getting Help

If you run into problems, contact saber-help@psc.edu.

Access to SABER

Getting an allocation

There are two types of grants available: starter grants and production grants. Starter grants are appropriate as precursors to large requests. Production grants are large awards for users with extensive computational requirements.

See http://sc-2.psc.edu/grants/ for further information on submitting a proposal for a grant.

Logging In

To connect to a SABER system, ssh to ben.psc.edu or energy.cluster.wvu.edu. You can use one of two formats: username@hostname, or hostname -l username.

   % ssh joeuser@ben.psc.edu   (or)
   % ssh joeuser@energy.cluster.wvu.edu

or


   % ssh ben.psc.edu -l joeuser   (or)
   % ssh energy.cluster.wvu.edu -l joeuser

Passwords

Use the kpasswd command to change your password on ben.psc.edu or energy.cluster.wvu.edu. Changing your password on one machine will not change it on the other.

Setting the Environment

To use grid functions, the Globus tools must be in your path. Type echo $PATH at the command line. If the output includes /usr/local/globus/globus-2.4.3/bin and /usr/local/globus/globus-2.4.3/sbin, your environment is already set up.

If not, in csh or tcsh:


   % setenv GLOBUS_LOCATION /usr/local/globus/globus-2.4.3
   % source $GLOBUS_LOCATION/etc/globus-user-env.csh

or, in a Bourne shell:


   $ GLOBUS_LOCATION=/usr/local/globus/globus-2.4.3
   $ export GLOBUS_LOCATION
   $ . $GLOBUS_LOCATION/etc/globus-user-env.sh
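
The PATH check described above can also be scripted. The following is a minimal sketch for a Bourne-compatible shell (not csh). The helper name path_contains is hypothetical and not part of Globus. The two directories are the ones named in this guide.

```shell
#!/bin/sh
# path_contains DIR: succeed if DIR is one of the colon-separated
# entries in $PATH. (Hypothetical helper, not a Globus command.)
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Report whether the Globus directories named in this guide are on the PATH.
for dir in /usr/local/globus/globus-2.4.3/bin /usr/local/globus/globus-2.4.3/sbin; do
    if path_contains "$dir"; then
        echo "found:   $dir"
    else
        echo "missing: $dir"
    fi
done
```

If either directory is reported as missing, set GLOBUS_LOCATION and source the environment script as shown above.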

Credentials

To be authenticated, you need (1) a grid proxy created from an X.509 certificate, and (2) your Distinguished Name (DN) registered in the PSC grid mapfiles. The second step is done only once.

Both ben.psc.edu and energy.cluster.wvu.edu use a kx.509 system alongside the existing Kerberos infrastructure to issue X.509 user certificates. After obtaining a Kerberos ticket with kinit, use the kx509 command to get a short-term X.509 certificate. This certificate is tied to the Kerberos ticket: its lifetime is limited to the ticket lifetime, so when your Kerberos ticket expires, so does your grid proxy.


   % kinit joeuser@PSC.EDU

   joeuser@PSC.EDU's Password:

   % kx509
   % klist
   Credentials cache: FILE:/tmp/krb5cc_00000
        Principal: joeuser@PSC.EDU


    Issued                 Expires              Principal       
  Jun 10 14:39:57   Jun 10 21:19:10   krbtgt/PSC.EDU@PSC.EDU          
  Jun 10 14:39:57   Jun 10 21:19:10   afs@PSC.EDU             
  Jun 10 14:40:12   Jun 10 21:19:10   kca_service/pscuxb.psc.edu@PSC.EDU
  Jun 10 14:40:12   Jun 10 21:19:10   kca_service/gridinfo.psc.edu@PSC.EDU
  Jun 10 14:40:13   Jun 10 21:19:10   kx509/certificate@PSC.EDU 

To create your proxy from the X.509 certificate, issue the kxlist -p command. You can then examine or delete your grid proxy with grid-proxy-info and grid-proxy-destroy, respectively.


   % kxlist -p

   Service kx509/certificate
    issuer= /C=US/O=Pittsburgh Supercomputing Center/CN=PSC Kerberos Certification Authority
    subject= /C=US/O=Pittsburgh Supercomputing Center/OU=PSC Kerberos Certification Authority/CN=joeuser/UID=joeuser/emailAddress=joeuser@PSC.EDU
    serial=08AA
    hash=67d8fd3f

You will need to put your DN in the PSC mapfile (only once) by visiting this website. It will ask for your DN, which you can find with the grid-proxy-info -subject command. The entire output of the grid-proxy-info -subject command is your DN.


   % grid-proxy-info -subject
    /C=US/O=Pittsburgh Supercomputing Center/OU=PSC Kerberos Certification Authority/CN=joeuser/USERID=joeuser/Email=joeuser@PSC.EDU
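
Before working through the credential steps above (kinit, then kx509, then kxlist -p, then grid-proxy-info), it can help to confirm that the tools are installed. The following is a minimal sketch for a Bourne-compatible shell. The helper name missing_cmds is hypothetical. The command names it checks are the ones used in this section.

```shell
#!/bin/sh
# missing_cmds NAME...: print each NAME that cannot be found on the PATH.
# (Hypothetical helper for this sketch.)
missing_cmds() {
    for c in "$@"; do
        command -v "$c" >/dev/null 2>&1 || echo "$c"
    done
}

# Check the credential tools used in this section.
missing=$(missing_cmds kinit kx509 kxlist grid-proxy-info)
if [ -n "$missing" ]; then
    echo "not on PATH:" $missing
else
    echo "all credential tools found; run kinit, then kx509, then kxlist -p"
fi
```

If a tool is reported as not on the PATH, revisit Setting the Environment before continuing.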

grid-proxy-info

Use the grid-proxy-info command to view details about your proxy. Use the -help flag to list the options that select which pieces of information the command reports.


   % grid-proxy-info
   subject  : /C=US/O=Pittsburgh Supercomputing Center/OU=PSC Kerberos Certification Authority/CN=joeuser/USERID=joeuser/Email=joeuser@PSC.EDU
   issuer   : /C=US/O=Pittsburgh Supercomputing Center/CN=PSC Kerberos Certification Authority
   identity : /C=US/O=Pittsburgh Supercomputing Center/OU=PSC Kerberos Certification Authority/CN=joeuser/USERID=joeuser/Email=joeuser@PSC.EDU
   type     : end entity credential
   strength : 512 bits
   path     : /tmp/x509up_u20024
   timeleft : 5:56:46
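
Because the proxy expires with the Kerberos ticket, a script may want to know how much time remains. The sketch below converts a timeleft value in the H:MM:SS form shown above into seconds. It works on a string, so it can be tried without a proxy. The function name and the 10-minute threshold are hypothetical. grid-proxy-info may also offer an option that reports the remaining time directly; check its -help output.

```shell
#!/bin/sh
# timeleft_to_seconds H:MM:SS: print the total number of seconds.
# awk does the arithmetic so that fields such as "08" are read as decimal.
timeleft_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Example using the value from the sample output above.
left=$(timeleft_to_seconds "5:56:46")
echo "$left seconds left"

# Hypothetical threshold: warn when fewer than 10 minutes remain.
if [ "$left" -lt 600 ]; then
    echo "proxy expires soon; rerun kinit, kx509 and kxlist -p"
fi
```

For the sample value 5:56:46 this prints "21406 seconds left" and no warning.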

grid-proxy-destroy

Use the grid-proxy-destroy command to kill your grid proxy. If you want to verify that it has been destroyed, type grid-proxy-info. If it has been killed, you will get an error stating that the proxy could not be found.

File Transfer

We recommend transferring files using scp, gsiscp, or GridFTP's client program globus-url-copy.

scp and gsiscp

scp is the easiest way to copy small, single files to the SABER platforms. It uses public key encryption to authenticate and encrypt communications. gsiscp is a version of scp based instead on the 'Grid Security Infrastructure' (GSI). This method relies on having a valid X.509 certificate.

While scp and gsiscp are easy to use, they perform poorly when transferring many files or large files. GridFTP-based programs should be used for most SABER file transfers.

The syntax of gsiscp is the same as that of scp. The remote file specification can take one of two forms: user@system:/path-to-file or system:/path-to-file. If the user is not given as part of the remote file specification, the username on the local system is assumed.

To copy myprog.f to joeuser's src directory on energy.cluster.wvu.edu:


   % gsiscp myprog.f joeuser@energy.cluster.wvu.edu:~joeuser/src/myprog.f

To copy every file in the current local directory to your home directory on ben:


   % gsiscp * ben.psc.edu:

scp can be used in place of gsiscp in the above example. scp and gsiscp will work well for the majority of small file transfers.
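
The rule for the remote file specification (an explicit user before the "@", otherwise the local username) can be illustrated with a short sketch for a Bourne-compatible shell. The helper name remote_user is hypothetical, and the sketch ignores any per-host settings in your ssh configuration.

```shell
#!/bin/sh
# remote_user SPEC: print the login name implied by SPEC, where SPEC is
# user@system:/path-to-file or system:/path-to-file.
# (Hypothetical helper; it only inspects the string.)
remote_user() {
    target=${1%%:*}                    # drop ":/path-to-file"
    case "$target" in
        *@*) echo "${target%%@*}" ;;   # explicit user before the "@"
        *)   id -un ;;                 # no user given: local username
    esac
}

remote_user "joeuser@ben.psc.edu:/home/joeuser/myprog.f"   # prints joeuser
remote_user "ben.psc.edu:/home/joeuser/myprog.f"           # prints your local username
```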

GridFTP

GridFTP is a high-performance, secure, reliable data transfer protocol optimized for high-bandwidth, wide-area networks. GridFTP is based on FTP, the popular internet file transfer protocol.

GridFTP provides the following protocol features:

The source and destination arguments of globus-url-copy may be URLs, local files, or standard input/output. The following table shows some acceptable values for these arguments:

Source/Destination        Description
------------------------  --------------------------------------------------
file:<fullpath>           For local files -- relative pathnames not allowed
file://<hostfullpath>
gsiftp://<hostpath>       For remote files -- relative paths allowed
http://<path to file>     For accessing web files -- relative paths allowed
https://<path to file>
- (dash)                  Standard input and output

globus-url-copy

The globus-url-copy client program is a GridFTP client that may be used to transfer files from the command line. The format is:


   % globus-url-copy source_url destination_url

The format for the local filename is file:// followed by the absolute file name. For example, user joeuser would refer to file myfile.txt in his home directory (/home/joeuser) as:


   file:///home/joeuser/myfile.txt

Note that three slashes are necessary. The first two are part of file:// and the third is the beginning of the full file specification /home/joeuser/myfile.txt.

The format for the remote file is gsiftp://machine-name followed by the absolute file name. For example, if user joeuser's home directory on energy.cluster.wvu.edu was /home/joeuser/, then he would specify:


   gsiftp://energy.cluster.wvu.edu/home/joeuser/myfile.txt

for the file myfile.txt in /home/joeuser.
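
The two URL forms described above are easy to mistype, especially the three slashes in the local form. The following sketch for a Bourne-compatible shell builds both URLs from absolute paths and prints the resulting transfer command rather than running it. The helper names local_url and remote_url are hypothetical. The host and paths are the ones used in the examples above.

```shell
#!/bin/sh
# Hypothetical helpers that build the two URL forms described above.

# local_url ABSOLUTE_PATH: print file://ABSOLUTE_PATH (three slashes in total).
local_url() {
    case "$1" in
        /*) echo "file://$1" ;;
        *)  echo "local_url: path must be absolute: $1" >&2; return 1 ;;
    esac
}

# remote_url HOST ABSOLUTE_PATH: print gsiftp://HOST followed by the path.
# This sketch follows the guide's examples and insists on an absolute path.
remote_url() {
    case "$2" in
        /*) echo "gsiftp://$1$2" ;;
        *)  echo "remote_url: path must be absolute: $2" >&2; return 1 ;;
    esac
}

src=$(local_url /home/joeuser/myfile.txt)
dst=$(remote_url energy.cluster.wvu.edu /home/joeuser/myfile.txt)

# Print the command instead of running it, so the sketch can be tried
# on a machine without Globus installed.
echo "globus-url-copy $src $dst"
```

To perform the transfer, run the printed command on a machine where Globus is in your path and you hold a valid grid proxy.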

Running Jobs

The Globus toolkit provides commands for submitting and managing grid jobs.

globus-job-run

Executes one command on a remote system and returns. It is similar in function to rsh. The format for globus-job-run is:


   % globus-job-run remote-host[/jobmanager] command

The command must be given with the full path, e.g., /bin/ls instead of just ls. For example, to run uname on ben, use:


   % globus-job-run ben.psc.edu/jobmanager-pbs /bin/uname -a

You can stage in a file to execute with globus-job-run using the -s flag. If you have a script "myscript" which contains:


   #!/bin/csh

   /bin/chmod u+x /home/johndoe/a.out
   /home/johndoe/a.out

then you can use


   % globus-job-run ben.psc.edu/jobmanager-pbs -s myscript

to run the executable a.out on ben.psc.edu. Note that absolute paths to all commands are required (e.g., /bin/chmod, /home/johndoe/a.out). You can get more information about how to use globus-job-run by invoking it with the -help switch:


   % globus-job-run -help

globus-job-submit

Submits a job to a remote machine and runs it in the background. Once the job has been submitted, a contact string (used by the commands that monitor the job) is returned, the connection to the remote host closes, and control returns to the user. This command is useful for large batch jobs, because you can exit the system and return later to query the job status. Note that the executable must already be on the remote machine. Unlike globus-job-run, globus-job-submit has no staging function.

The format is:


   % globus-job-submit remote-machine executable-command

Again, you must supply the full path to the executable command. globus-job-submit returns a full contact string, which looks like:


   https://energy.cluster.wvu.edu:nnnnn/nnnn/nnnnnnnnnn/

or


   https://ben.psc.edu:nnnnn/nnnn/nnnnnnnnnn/

This contact string is used as the argument in the commands to query the status of the job (globus-job-status), to retrieve the job output (globus-job-get-output), and to stop a running job and delete the remote copy of the output (globus-job-clean).

globus-job-status

Returns the status as PENDING, ACTIVE, DONE or FAILED.


   % globus-job-status contact-string

globus-job-get-output

Retrieves the output from the remote machine.


   % globus-job-get-output contact-string

Job output is returned to stdout on the local machine. You can get the output multiple times, until it is deleted using globus-job-clean. Redirection can be used to save the output to a file.

globus-job-clean

Stops a job if it is still running, and deletes the remote copy of the output:


   % globus-job-clean contact-string
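
The submit, status, get-output, clean cycle described above can be wrapped in a small polling loop. The following is a minimal sketch for a Bourne-compatible shell, not a definitive implementation. STATUS_CMD and POLL_SECONDS are hypothetical settings added for this sketch. STATUS_CMD defaults to globus-job-status and can be pointed at a stand-in command for testing.

```shell
#!/bin/sh
# Hypothetical settings for this sketch.
STATUS_CMD=${STATUS_CMD:-globus-job-status}   # command that prints the job status
POLL_SECONDS=${POLL_SECONDS:-30}              # pause between status queries

# wait_for_job CONTACT: poll until the job reports DONE or FAILED.
# Prints the final status; returns 0 for DONE, 1 for FAILED, and
# 2 if the status command produced no output.
wait_for_job() {
    while :; do
        status=$($STATUS_CMD "$1")
        case "$status" in
            DONE)   echo "DONE";   return 0 ;;
            FAILED) echo "FAILED"; return 1 ;;
            "")     echo "no status returned" >&2; return 2 ;;
            *)      sleep "$POLL_SECONDS" ;;   # e.g. PENDING or ACTIVE: keep waiting
        esac
    done
}

# Typical use, with the contact string returned by globus-job-submit
# stored in the variable "contact":
#   wait_for_job "$contact" && globus-job-get-output "$contact" > job.out
#   globus-job-clean "$contact"
```

Retrieve the output before running globus-job-clean, because cleaning deletes the remote copy.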

Examples

Note: globus-job-submit (primarily for batch jobs) can be used almost interchangeably with globus-job-run in the examples below.

Simple job:

File simple.sh is on remotehost. Job is run from localhost, and output is echoed to the screen.


   remotehost% cat /home/joeuser/simple.sh
   #!/bin/sh
   echo "I'm a simple test script"
   remotehost% exit

   localhost% globus-job-run remotehost /home/joeuser/simple.sh
   I'm a simple test script
   localhost%

Local job staged to remote host:

File local.sh exists on localhost. It is staged over to remotehost for execution, and the output is echoed to the screen.


   localhost% cat /home/joeuser/local.sh
   #!/bin/sh
   echo "I'm a local script"
   localhost% ssh remotehost
   remotehost% cat /home/joeuser/local.sh
   cat: /home/joeuser/local.sh: No such file or directory
   remotehost% exit
   localhost% globus-job-run remotehost -s /home/joeuser/local.sh
   I'm a local script

Saving output to remote host:

File simple.sh exists on remotehost. The job is submitted from localhost, and the output is saved on remotehost as simple.out with the -stdout switch.

   localhost% globus-job-run  remotehost -stdout /home/joeuser/simple.out /home/joeuser/simple.sh
   localhost% cat /home/joeuser/simple.out
   cat: /home/joeuser/simple.out: No such file or directory
   localhost% ssh remotehost
   remotehost% cat /home/joeuser/simple.out
   I'm a simple test script

Saving output to local host:

File simple.sh is on remotehost. Job is submitted from localhost. The output is written to remote.out, then staged back to localhost with the -stdout -s switches. File remote.out is not saved on remotehost.


   localhost% globus-job-run  remotehost -stdout -s remote.out  /home/joeuser/simple.sh
   localhost% cat /home/joeuser/remote.out
   I'm a simple test script
   localhost% ssh remotehost
   remotehost% cat /home/joeuser/remote.out
   cat: /home/joeuser/remote.out: No such file or directory

MPI job:

File simple.c is on remotehost. It is compiled there, and the job is then submitted from localhost with globus-job-run. Note the -x '(jobtype=mpi)' switch, which declares that this is an MPI job, and -np, which sets the number of processors requested. The output is echoed back to the screen.

   remotehost% cat /home/joeuser/simple.c
   #include <stdio.h>
   #include "mpi.h"

   int main(int argc, char *argv[])
   {
       int rank, size;
       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
       MPI_Comm_size(MPI_COMM_WORLD, &size);

       printf("Hello world! I'm %d of %d\n", rank, size);
       MPI_Finalize();
       return 0;
   }
   remotehost% mpicc -o /home/joeuser/simple.x /home/joeuser/simple.c
   remotehost% exit
   localhost% globus-job-run remotehost -x '(jobtype=mpi)' -np 5 /home/joeuser/simple.x
   Hello world! I'm 4 of 5
   Hello world! I'm 3 of 5
   Hello world! I'm 2 of 5
   Hello world! I'm 1 of 5
   Hello world! I'm 0 of 5
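
In the run above the ranks happen to print in descending order. In general, MPI does not guarantee the order in which output from different ranks appears. If ordered output is easier to read, the lines can be sorted by rank. The following sketch for a Bourne-compatible shell uses a hypothetical helper, sort_by_rank, and relies on the rank being the fourth whitespace-separated field of the message printed by simple.c.

```shell
#!/bin/sh
# sort_by_rank: order "Hello world! I'm R of N" lines by the rank R,
# which is the 4th whitespace-separated field. (Hypothetical helper.)
sort_by_rank() {
    sort -n -k 4,4
}

# Sample lines copied from the run above. In practice, pipe the output
# of globus-job-run into sort_by_rank instead.
printf '%s\n' \
    "Hello world! I'm 4 of 5" \
    "Hello world! I'm 3 of 5" \
    "Hello world! I'm 0 of 5" | sort_by_rank
```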



© Pittsburgh Supercomputing Center (PSC), SuperComputing Science Consortium.
URL:  http://sc-2.psc.edu/saber/index.html
Revised: Friday, 18-Jan-2008 12:02:36 EST