Getting Started Using the PolyHub Grid

The PolyHub grid is implemented as an Open Science Grid (OSG) Virtual Organization (VO). We are currently formalizing our affiliation with OSG and creating an OSG PolyHub VO support center. The central PolyHub grid site (located at the University of Tennessee) is now available to PolyHub users. To use these resources, complete the following steps.

Obtaining a Personal Grid Certificate

All communication between users and systems on the Open Science Grid is secured, authenticated, and authorized through a public key infrastructure and chain of trust. Both users and compute systems must have a valid grid certificate in order to communicate (sending secure email, executing grid jobs, accessing grid data storage, etc.). PolyHub members must first obtain a grid certificate from the DOE Grids Certificate Service. As a PolyHub EVO member, you should register for your certificate with the following information (specific to the PolyHub VO):

Registration Authority (affiliation) OSG
Virtual Organization name PolyHub*
Name of Sponsor Brian Edwards
Sponsor's Email bedward1@utk.edu
Sponsor's Phone Number (865) 974-9596

* Once the PolyHub OSG VO status is formalized, you will be able to register under the VO name "PolyHub".

You will receive an email containing information on obtaining your grid certificate file.

PolyHub VO Registration

Once you have your grid certificate, you need to register with the PolyHub VO. You do this through the PolyHub VOMS service. Before registering, you must import the grid certificate into your browser along with the DOEGrids certificate authority files. Once your registration is processed, you will have access to all OSG sites that support the PolyHub VO.

Installing the OSG client software

To access grid resources, you will need to install a minimal set of software. This software is distributed by the OSG through the Virtual Data Toolkit. You will need access to a computer running Linux. This can be your desktop machine or a shared Linux system (on a compute cluster, for instance). The OSG provides a good HOW-TO on installing the OSG client software. Here is an example of installing it on your Linux desktop:

# first install "pacman", a package management system
wget http://physics.bu.edu/pacman/sample_cache/tarballs/pacman-3.16.1.tar.gz
tar --no-same-owner -xzvf pacman-3.16.1.tar.gz
cd pacman-3.16.1
source setup.sh
cd ..

# next use pacman to install the OSG client: "OSG:client"
export VDT_LOCATION=/usr/local/OSG
mkdir -p "$VDT_LOCATION"
cd $VDT_LOCATION
pacman -get OSG:client
# answer some questions
# pacman downloads and installs the software in the cwd
vi vdt/etc/vdt-update-certs.conf # set up the CAs
source setup.sh # or setup.csh

Now you should be ready to use the grid.
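
A quick way to confirm that the installation put the client tools on your PATH is sketched below. The function name missing_tools is hypothetical and is not part of the OSG client; the tool names passed to it in the usage comment are the commands used later on this page.

```shell
# Hypothetical helper (not part of the OSG client): print every named
# command that cannot be found on PATH, one per line.
# Returns non-zero if at least one command is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Usage, after "source setup.sh" in $VDT_LOCATION:
#   missing_tools grid-proxy-init globusrun globus-url-copy \
#       globus-job-run globus-job-submit
```

If the function prints nothing, all of the listed commands were found.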

Activating your Grid Certificate

To use the grid client software, you need to configure your environment with your grid certificate. This is referred to as obtaining a grid proxy. First, make sure that your grid certificate is located in the directory ~/.globus/ (this is the default location that the software searches). Make sure you have set up your OSG environment (source setup.sh) and execute grid-proxy-init:

> source $VDT_LOCATION/setup.sh
> grid-proxy-init
Enter GRID pass phrase for this identity:
Your identity: /DC=org/DC=doegrids/OU=People/CN=YourName
Creating proxy .............................................. Done
Your proxy is valid until: Wed Jun  4 21:01:27 2008
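
If grid-proxy-init cannot find your credentials, a check along the following lines may help. This is a minimal sketch that assumes the conventional Globus file names usercert.pem and userkey.pem, which this page does not spell out; the function name check_grid_credentials is hypothetical.

```shell
# Hypothetical helper: verify that a credential directory (default
# ~/.globus) holds a certificate and a private key, and that the key
# is not accessible to group or others.
# Assumes the conventional file names usercert.pem and userkey.pem.
check_grid_credentials() {
    dir="${1:-$HOME/.globus}"
    if [ ! -f "$dir/usercert.pem" ]; then
        echo "missing $dir/usercert.pem"
        return 1
    fi
    if [ ! -f "$dir/userkey.pem" ]; then
        echo "missing $dir/userkey.pem"
        return 1
    fi
    # Characters 5-10 of the "ls -l" mode string are the group and
    # other permission bits; all six should be "-".
    perms=$(ls -l "$dir/userkey.pem" | cut -c5-10)
    if [ "$perms" != "------" ]; then
        echo "userkey.pem is accessible to group/others; try: chmod 400 $dir/userkey.pem"
        return 1
    fi
    echo "credentials look ok"
}

# Usage:
#   check_grid_credentials            # checks ~/.globus
#   check_grid_credentials /some/dir  # checks another directory
```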

Now test your installation against the PolyHub grid site:

> globusrun -a -r osg.polyhub.org
GRAM Authentication test successful
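
The proxy is only valid for a limited time (see the "valid until" line in the grid-proxy-init output above), so long sessions may need to renew it. The sketch below shows one way to decide when to do so, using the seconds-remaining value reported by grid-proxy-info -timeleft. The function name proxy_needs_renewal and the one-hour default threshold are assumptions chosen for illustration.

```shell
# Hypothetical helper: succeed (exit 0) if a proxy with $1 seconds of
# lifetime left should be renewed, given a minimum of $2 seconds
# (default 3600, an arbitrary threshold chosen for this sketch).
proxy_needs_renewal() {
    left="$1"
    min="${2:-3600}"
    [ "$left" -lt "$min" ]
}

# Usage on a machine with the OSG client set up (not run here):
#   left=$(grid-proxy-info -timeleft 2>/dev/null) || left=0
#   if proxy_needs_renewal "$left"; then
#       grid-proxy-init
#   fi
```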

Using Grid Storage Resources

Once you have an active grid proxy (from your grid certificate), you can use it to copy data to and from the grid site. This is usually required before running a grid job unless you have prearranged for the proper files to be on the grid site. The minimum protocol that OSG sites support for data transfer is GridFTP. You can use the globus-url-copy utility for this (provided as part of the OSG:client package):

> source $VDT_LOCATION/setup.sh
> grid-proxy-init
> globus-url-copy file:////home/username/job.sh \
     gsiftp://osg.polyhub.org/data/EVO/username/job.sh
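
Copying results back from the grid site uses the same command with the source and destination swapped. The sketch below builds the two kinds of URL from a host name and absolute paths; the helper names file_url and gsiftp_url and the file name output.dat are hypothetical. file_url produces the standard file:///absolute/path form.

```shell
# Hypothetical helpers for building the URLs that globus-url-copy
# takes. Both require absolute paths.
file_url() {
    # $1 = absolute path on the local machine
    case "$1" in
        /*) echo "file://$1" ;;
        *)  echo "file_url: path must be absolute: $1" >&2; return 1 ;;
    esac
}

gsiftp_url() {
    # $1 = grid site host name, $2 = absolute path on that site
    case "$2" in
        /*) echo "gsiftp://$1$2" ;;
        *)  echo "gsiftp_url: path must be absolute: $2" >&2; return 1 ;;
    esac
}

# Usage with an active proxy (not run here); output.dat is a
# placeholder for whatever file your job produces:
#   globus-url-copy \
#       "$(gsiftp_url osg.polyhub.org /data/EVO/username/output.dat)" \
#       "$(file_url /home/username/output.dat)"
```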

Good documentation on OSG data access is available on the "Consuming_storage" page at opensciencegrid.org.

Using Grid Computing Resources

Once your data and your application have been copied to the grid site, you are ready to execute your grid job:

> source $VDT_LOCATION/setup.sh
> grid-proxy-init
# first a test
> globus-job-run osg.polyhub.org/jobmanager /usr/bin/uptime
# now submit the job
> globus-job-submit osg.polyhub.org/jobmanager /path/to/application
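
Unlike globus-job-run, globus-job-submit returns immediately and prints a job contact string that identifies the job. A minimal sketch of waiting for such a job follows. It assumes that globus-job-status prints a single state word such as PENDING, ACTIVE, DONE, or FAILED; the function name wait_for_job and the STATUS_CMD, POLL_SECONDS, and MAX_POLLS variables are hypothetical and exist only in this sketch.

```shell
# Hypothetical helper: poll a submitted job until it finishes.
# $1 = job contact string printed by globus-job-submit
# STATUS_CMD   command used to query the state (default globus-job-status)
# POLL_SECONDS delay between polls (default 30)
# MAX_POLLS    give up after this many polls (default 120)
wait_for_job() {
    contact="$1"
    status_cmd="${STATUS_CMD:-globus-job-status}"
    interval="${POLL_SECONDS:-30}"
    max="${MAX_POLLS:-120}"
    tries=0
    while [ "$tries" -lt "$max" ]; do
        state=$($status_cmd "$contact")
        case "$state" in
            DONE)   echo "DONE"; return 0 ;;
            FAILED) echo "FAILED"; return 1 ;;
            *)      sleep "$interval" ;;
        esac
        tries=$((tries + 1))
    done
    echo "TIMEOUT"
    return 2
}

# Usage with an active proxy (not run here):
#   contact=$(globus-job-submit osg.polyhub.org/jobmanager /path/to/application)
#   wait_for_job "$contact" && globus-job-get-output "$contact"
```

Polling at a modest interval keeps the load on the site's gatekeeper low; the 30-second default here is only a starting point.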

Further details are available on the "Running Jobs" page at opensciencegrid.org.

More Information

The Open Science Grid has a wealth of information available to users to assist in using the OSG for their work. Consult the OSG Technical Documentation for more information.