GridSAM Quick Start Guide

Introduction

This quick start guide walks you through deploying an instance of the GridSAM web service and then submitting and monitoring your first JSDL job. For advanced deployment scenarios and configuration, please consult the Deployment Guide in the GridSAM Service module.

Installing GridSAM using the OMII Stack Installer

The OMII distribution bundles GridSAM in the Managed Programme component. Please follow the instructions given in the Managed Programme section of the OMII Distribution 3.0 User Guide to install GridSAM.

Once you have verified that the GridSAM service and client have been set up successfully, you can proceed to the section "Submitting a Job" in this guide.

Installing GridSAM from the binary distribution

If you are installing GridSAM separately from the OMII 3 release, follow the instructions given here.

Pre-requisites

The following pre-requisites and their indirect dependencies must be installed and tested prior to installing GridSAM.

  • Open Middleware Infrastructure Institute - Client (3.3.x or later): The OMII Client must be installed and tested according to the OMII installation procedure. This is ONLY REQUIRED if you intend to use the GridSAM client tools. The installation directory is referred to as OMIICLIENT_HOME in this document. The GridSAM tools assume it to be $HOME/OMIICLIENT unless specified otherwise.
  • Open Middleware Infrastructure Institute - Server (3.3.x or later): The OMII Server must be installed and tested according to the OMII installation procedure. This is ONLY REQUIRED if you intend to deploy a GridSAM service instance. The installation directory is referred to as OMII_HOME in this document. The GridSAM tools assume it to be /usr/local/OMII unless specified otherwise.
  • Apache Ant - 1.6.5+: Apache Ant (version 1.6.5 or above) must be installed and available in your PATH in order to install the GridSAM client and server distributions without using the OMII Stack Installer.
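
The Ant requirement above can be checked before starting the installation. The following is a minimal Python sketch, not part of GridSAM. It assumes that `ant -version` prints a banner containing the word "version" followed by a dotted version number (for example "Apache Ant version 1.6.5 compiled on ..."); the function names are made up for this illustration.

```python
import re
import shutil
import subprocess

MIN_ANT = (1, 6, 5)  # minimum Ant version named in the pre-requisites


def parse_ant_version(banner):
    """Extract (major, minor, patch) from the banner printed by `ant -version`.

    Assumes the banner contains "version X.Y.Z"; returns None otherwise.
    """
    match = re.search(r"version (\d+)\.(\d+)(?:\.(\d+))?", banner)
    if match is None:
        return None
    return tuple(int(part or 0) for part in match.groups())


def ant_is_recent_enough(banner, minimum=MIN_ANT):
    """Return True if the banner reports a version at or above `minimum`."""
    version = parse_ant_version(banner)
    return version is not None and version >= minimum


def check_ant():
    """Look up ant in PATH, run `ant -version`, and report the result."""
    ant = shutil.which("ant")
    if ant is None:
        return "ant not found in PATH"
    banner = subprocess.run(
        [ant, "-version"], capture_output=True, text=True
    ).stdout
    if ant_is_recent_enough(banner):
        return "ant OK"
    return "ant too old or unrecognised: " + banner.strip()
```

Calling check_ant() returns a one-line summary that can be printed before running the install targets.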

Downloading GridSAM

Official GridSAM binary releases can be downloaded here. You can also obtain the continuous build to enjoy the latest bug fixes and cutting-edge features.

Distributions are packaged in .tar.gz and .zip formats. The package named GridSAM-X.Y.Z-server.[tar.gz|zip] is the server binary distribution, while GridSAM-X.Y.Z-client.[tar.gz|zip] is the client binary distribution. The package named GridSAM-X.Y.Z-src.[tar.gz|zip] contains the source distribution.

To follow the Quick Start Guide, you must download both the server and the client distributions.

To build a binary release from source code, please consult the Developer Guide for instructions.

Installing the GridSAM service

  1. Please ensure the OMII container is not running.
  2. Place the GridSAM-X.Y.Z-server.[tar.gz|zip] archive in a temporary directory (e.g. /tmp).
  3. Unpack the archive using GNU tar or the unzip command.
    $> cd /tmp
    
    $> tar zxvf GridSAM-X.Y.Z-server.tar.gz
    
    or
    
    $> unzip GridSAM-X.Y.Z-server.zip
    
    $> ls -la gridsam-server
    drwxr-xr-x    8 wwhl  wwhl       272 May 27 12:26 .
    drwx------   18 wwhl  wwhl       612 May 27 13:10 ..
    -rw-r--r--    1 wwhl  wwhl    100872 May 26 15:01 LICENCE.3RDPARTY.txt
    -rw-r--r--    1 wwhl  wwhl      1531 May 26 15:01 LICENCE.txt
    -rw-r--r--    1 wwhl  wwhl      4274 May 27 12:21 build.xml
    drwxr-xr-x    3 wwhl  wwhl       102 May 26 15:01 config
    drwxr-xr-x   43 wwhl  wwhl      1462 May 27 12:26 docs
    -rw-r--r--    1 wwhl  wwhl  16385683 May 27 12:21 gridsam.war
  4. If you have a previous version of the GridSAM service installed, first uninstall it. Run the command from the unpacked gridsam-server directory, which contains build.xml,
    $> cd /tmp/gridsam-server
    $> ant uninstall -Domii.server.home=${OMII_HOME} -Dtomcat.home=${OMII_HOME}/jakarta-tomcat-5.0.25
  5. To install, execute the installation Ant script,
    $> cd /tmp/gridsam-server
    $> ant install -Domii.server.home=${OMII_HOME} -Dtomcat.home=${OMII_HOME}/jakarta-tomcat-5.0.25
  6. Look for the line "BUILD SUCCESSFUL" in the output, which indicates that the installation has completed successfully.

Installing the GridSAM client

  1. Place the GridSAM-X.Y.Z-client.[tar.gz|zip] archive in a temporary directory (e.g. /tmp).
  2. Unpack the archive using GNU tar or the unzip command.
    $> cd /tmp
    
    $> tar zxvf GridSAM-X.Y.Z-client.tar.gz
    
    or
    
    $> unzip GridSAM-X.Y.Z-client.zip
    
    $> ls -la gridsam-client
    drwxr-xr-x    9 wwhl  wwhl     306 May 27 12:26 .
    drwx------   18 wwhl  wwhl     612 May 27 13:10 ..
    -rw-r--r--    1 wwhl  wwhl  100872 May 26 15:01 LICENCE.3RDPARTY.txt
    -rw-r--r--    1 wwhl  wwhl    1531 May 26 15:01 LICENCE.txt
    drwxr-xr-x   12 wwhl  wwhl     408 May 27 12:21 bin
    -rw-r--r--    1 wwhl  wwhl    3098 May 27 12:21 build.xml
    drwxr-xr-x    3 wwhl  wwhl     102 May 26 15:01 data
    drwxr-xr-x   43 wwhl  wwhl    1462 May 27 12:26 docs
    drwxr-xr-x   15 wwhl  wwhl     510 May 27 12:21 lib
  3. If you have a previous version of the GridSAM client installed, first uninstall it. Run the command from the unpacked gridsam-client directory, which contains build.xml,
    $> cd /tmp/gridsam-client
    $> ant uninstall -Domii.client.home=${OMIICLIENT_HOME}
  4. To install, execute the installation Ant script,
    $> cd /tmp/gridsam-client
    $> ant install -Domii.client.home=${OMIICLIENT_HOME}
  5. Look for the line "BUILD SUCCESSFUL" in the output, which indicates that the installation has completed successfully.

Running the GridSAM Service

Restart the OMII container after the installation has completed.

$> cd ${OMII_HOME}/jakarta-tomcat-5.0.25/bin
$> ./start_base.sh
Starting up tomcat
Using CATALINA_BASE:   /usr/local/OMII/jakarta-tomcat-5.0.25
Using CATALINA_HOME:   /usr/local/OMII/jakarta-tomcat-5.0.25
Using CATALINA_TMPDIR: /usr/local/OMII/jakarta-tomcat-5.0.25/temp
Using JAVA_HOME:       /usr/local/jdk1.4
Waiting....!....!.... Started.
$>

The default setup provides local fork launching. Please consult the Deployment Guide for advanced configuration instructions.

To verify the GridSAM service is deployed successfully, consult the GridSAM log file $OMII_HOME/jakarta-tomcat-5.0.25/logs/gridsam.log and look for the lines

...
2005-04-20 09:29:37,394 INFO  [JobManagerConfigurator] (main:) GridSAM machinery initialising...
2005-04-20 09:29:39,189 INFO  [JobManagerConfigurator] (main:) loading module description from classpath jobmanager.xml
2005-04-20 09:29:46,597 INFO  [JobManagerConfigurator] (main:) GridSAM machinery initialised
...

The GridSAM service has initialised correctly if the log file contains no ERROR messages originating from the gridsam webapp.
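
The log check described above can be scripted. The following is a rough Python sketch that assumes the log line layout shown in the excerpt (date, time, level, bracketed source, message). Treating FATAL lines as errors is an assumption on top of what this guide states, and the function name is made up for this illustration.

```python
import re

# Matches lines such as:
# 2005-04-20 09:29:46,597 INFO  [JobManagerConfigurator] (main:) GridSAM machinery initialised
LOG_LINE = re.compile(
    r"^\S+ \S+ (?P<level>[A-Z]+)\s+\[(?P<source>[^\]]+)\]\s+(?P<rest>.*)$"
)


def summarise_gridsam_log(text):
    """Return (initialised, error_lines) for the contents of gridsam.log.

    initialised is True if the "GridSAM machinery initialised" line was seen.
    error_lines lists every ERROR line (and, as an assumption, FATAL lines).
    """
    initialised = False
    errors = []
    for line in text.splitlines():
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip stack traces and other free-form lines
        if match.group("level") in ("ERROR", "FATAL"):
            errors.append(line)
        elif match.group("rest").endswith("GridSAM machinery initialised"):
            initialised = True
    return initialised, errors
```

Read the log file into a string and pass it to summarise_gridsam_log; the service looks healthy when the first value is True and the list is empty.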

You can also use the installation test tool to verify that the service has been deployed properly:

$> cd /tmp/gridsam-server
$> ant test-install -Dtomcat.port=${YOUR_TOMCAT_PORT}

Submitting a Job

Assuming you have access to a GridSAM service (i.e. the OMII container trusts the certificate configured in your OMII Client distribution) and you have been given the URL to a WSDL document describing the remote GridSAM service, you can start submitting jobs to the GridSAM instance.

For example, suppose you have deployed a GridSAM instance on localhost by following this Quick Start guide. To ensure the service is running, you can use a browser to inspect the WSDL document of the GridSAM instance at http://localhost:18080/gridsam/services/gridsam?WSDL.
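
The same check can be made without a browser. The following is a small Python sketch using only the standard library. It assumes the WSDL document is served at the endpoint address followed by "?WSDL", as in the URL above, and that the response contains the word "definitions" (the WSDL root element name); the function names are made up for this illustration.

```python
import urllib.error
import urllib.request


def wsdl_url(endpoint):
    """Return the WSDL address for a service endpoint (endpoint + '?WSDL')."""
    return endpoint.split("?", 1)[0] + "?WSDL"


def service_is_reachable(endpoint, timeout=10):
    """Fetch the WSDL document and report whether it looks like WSDL.

    Returns False on any connection or HTTP error.
    """
    try:
        with urllib.request.urlopen(wsdl_url(endpoint), timeout=timeout) as response:
            body = response.read(4096)  # the root element appears early in the document
    except (urllib.error.URLError, OSError):
        return False
    return b"definitions" in body
```

For the local deployment in this guide, the call would be service_is_reachable("http://localhost:18080/gridsam/services/gridsam").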

Create a Job Submission Description Language (JSDL) document to describe the job you would like to submit, for example:

<?xml version="1.0" encoding="UTF-8"?>
<JobDefinition xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl">
    <JobDescription>
        <Application>
            <POSIXApplication xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
                <Executable>/bin/echo</Executable>
                <Argument>hello world</Argument>
            </POSIXApplication>
        </Application>
    </JobDescription>
</JobDefinition>

The JSDL document describes a POSIX job, which executes the /bin/echo executable with the argument "hello world". Save this file to a directory of your choosing (e.g. $HOME/helloworld.jsdl).
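
If you prefer to generate such documents rather than write them by hand, the following Python sketch builds an equivalent JSDL document with the standard library. The two namespace URIs are taken from the example above; the function name and the "jsdl-posix" prefix used in the output are choices made for this illustration.

```python
import xml.etree.ElementTree as ET

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_posix_jsdl(executable, arguments=()):
    """Return a JSDL document (as a string) describing a simple POSIX job."""
    # Serialise JSDL elements without a prefix and POSIX elements as jsdl-posix:*
    ET.register_namespace("", JSDL_NS)
    ET.register_namespace("jsdl-posix", POSIX_NS)

    job = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    description = ET.SubElement(job, f"{{{JSDL_NS}}}JobDescription")
    application = ET.SubElement(description, f"{{{JSDL_NS}}}Application")
    posix = ET.SubElement(application, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX_NS}}}Executable").text = executable
    for argument in arguments:
        # One Argument element per command-line argument
        ET.SubElement(posix, f"{{{POSIX_NS}}}Argument").text = argument
    return ET.tostring(job, encoding="unicode")
```

For example, build_posix_jsdl("/bin/echo", ["hello world"]) produces a document with the same elements as the one shown above, which can then be written to $HOME/helloworld.jsdl.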

To submit the job to a GridSAM instance, use the gridsam-submit command.

Tip: The gridsam-* commands are installed in gridsam/bin inside the OMII Client distribution. Add the directory to your PATH environment variable, so that the gridsam-* commands are available to your shell.

$> ${OMIICLIENT_HOME}/gridsam/bin/gridsam-submit -help
usage: gridsam-submit [options] path-to-jsdl
GridSAM Job Submission Client
 -j <path-to-jsdl>               path to the JSDL file
 -myproxy                        enable myproxy
 -myproxyhost <hostname>         hostname of the MyProxy server
 -myproxypassword                password to access the MyProxy
                                 credential. The user will be prompted if the value is omitted.
 -myproxyport <port-number>      port of the MyProxy server. If omitted,
                                 default to 7512
 -myproxyuser <username>         username of the MyProxy credential. If
                                 omitted, the system property user.name is used.
 -s <service-endpoint-address>   The endpoint address of the GridSAM
                                 service
 -sn <service-name>              Logical name of the GridSAM service
                                 defined in ~/.gridsam/services.properties
                    
                
$> ${OMIICLIENT_HOME}/gridsam/bin/gridsam-submit \
    -s "http://localhost:18080/gridsam/services/gridsam" \
    ${HOME}/helloworld.jsdl
urn:gridsam:ff80808201afccc00101afccc3900001

The gridsam-submit command writes the job identifier to the standard output upon successful submission. The job identifier can then be used to obtain status or control the execution of the submitted job.
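
Because the job identifier is written to standard output, a script can capture it for later use. The following is a minimal Python sketch. It uses only the -s option and the positional JSDL path shown above. It assumes that identifiers start with "urn:gridsam:" as in the example, and that the command exits with a non-zero status on failure, which this guide does not state; the function names are made up for this illustration.

```python
import subprocess

JOB_ID_PREFIX = "urn:gridsam:"  # prefix seen in the example identifier


def extract_job_id(stdout_text):
    """Return the first output line that looks like a job identifier, or None."""
    for line in stdout_text.splitlines():
        line = line.strip()
        if line.startswith(JOB_ID_PREFIX):
            return line
    return None


def submit(gridsam_submit, endpoint, jsdl_path):
    """Run gridsam-submit and return the job identifier it prints.

    gridsam_submit is the path to the command, for example
    $OMIICLIENT_HOME/gridsam/bin/gridsam-submit.
    """
    result = subprocess.run(
        [gridsam_submit, "-s", endpoint, jsdl_path],
        capture_output=True,
        text=True,
        check=True,  # assumption: a failed submission exits non-zero
    )
    job_id = extract_job_id(result.stdout)
    if job_id is None:
        raise RuntimeError("no job identifier in output: " + result.stdout)
    return job_id
```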

Monitoring a Job

Given a GridSAM job identifier, the user who submitted the job (i.e. the user identified by the certificate configured in the OMII client installation) can use the gridsam-status command to query the state of the job.

$> ${OMIICLIENT_HOME}/gridsam/bin/gridsam-status -help
usage: gridsam-status [-s service-endpoint-address | -sn service-name] [-j job-id]
                      [-o omii-conf-dir] [-xml] [-wssec on | off] job-id
GridSAM Job Status Client
 -j <job-id>                     ID of the job to be queried
 -s <service-endpoint-address>   The endpoint address of the GridSAM
                                 service
 -sn <service-name>              Logical name of the GridSAM service
                                 defined in ~/.gridsam/services.properties
 -x                              display job status as GridSAM JobStatus
                                 XML


$> ${OMIICLIENT_HOME}/gridsam/bin/gridsam-status -s "http://localhost:18080/gridsam/services/gridsam" \
    -j urn:gridsam:ff80808201afccc00101afccc3900001
Job Progress: pending -> staging-in -> staged-in -> active -> executed -> staging-out -> staged-out -> done

-- pending - 2005-01-26T16:13:47+00:00 --
job is being scheduled
-- staging-in - 2005-01-26T16:13:47+00:00 --
Staging files...
-- staged-in - 2005-01-26T16:13:47+00:00 ---
0 files staged in
-- active - 2005-01-26T16:13:48+00:00 --
'/bin/echo hello world' is being forked
-- executed - 2005-01-26T16:13:48+00:00 --
'/bin/echo hello world' completed with exit code 0
-- staging-out - 2005-01-26T16:13:48+00:00 --
Staging files out...
-- staged-out - 2005-01-26T16:13:48+00:00 --
0 files staged out
-- done - 2005-01-26T16:13:48+00:00 --
Job completed

**************
Job Properties
**************
urn:gridsam:exitcode=0

The gridsam-status command displays the progress and properties of the job identified by the -j parameter, which must have been submitted previously to the GridSAM instance.

To retrieve machine-parsable output, use the -x parameter to instruct the command to display the job status as a JobStatus XML document.

$> ${OMIICLIENT_HOME}/gridsam/bin/gridsam-status \
        -s "http://localhost:18080/gridsam/services/gridsam" \
        -x urn:gridsam:ff80808201afccc00101afccc3900001
<?xml version="1.0" encoding="UTF-8"?>
<ns1:JobStatus xmlns:ns1="http://www.icenigrid.org/service/gridsam">
  <ns1:Stage>
    <ns1:State>pending</ns1:State>
    <ns1:Description>job is being scheduled</ns1:Description>
    <ns1:Time>2005-01-26T16:13:47+00:00</ns1:Time>
  </ns1:Stage>
  <ns1:Stage>
    <ns1:State>staging-in</ns1:State>
    <ns1:Description>Staging files...</ns1:Description>
    <ns1:Time>2005-01-26T16:13:47+00:00</ns1:Time>
  </ns1:Stage>
  <ns1:Stage>
    <ns1:State>staged-in</ns1:State>
    <ns1:Description>0 files staged in</ns1:Description>
    <ns1:Time>2005-01-26T16:13:47+00:00</ns1:Time>
  </ns1:Stage>
  <ns1:Stage>
    <ns1:State>active</ns1:State>
    <ns1:Description>'/bin/echo hello world' is being forked</ns1:Description>
    <ns1:Time>2005-01-26T16:13:48+00:00</ns1:Time>
  </ns1:Stage>
  <ns1:Stage>
    <ns1:State>executed</ns1:State>
    <ns1:Description>'/bin/echo hello world' completed with exit code 0</ns1:Description>
    <ns1:Time>2005-01-26T16:13:48+00:00</ns1:Time>
  </ns1:Stage>
  <ns1:Stage>
    <ns1:State>staging-out</ns1:State>
    <ns1:Description>Staging files out...</ns1:Description>
    <ns1:Time>2005-01-26T16:13:48+00:00</ns1:Time>
  </ns1:Stage>
  <ns1:Stage>
    <ns1:State>staged-out</ns1:State>
    <ns1:Description>0 files staged out</ns1:Description>
    <ns1:Time>2005-01-26T16:13:48+00:00</ns1:Time>
  </ns1:Stage>
  <ns1:Stage>
    <ns1:State>done</ns1:State>
    <ns1:Description>Job completed</ns1:Description>
    <ns1:Time>2005-01-26T16:13:48+00:00</ns1:Time>
  </ns1:Stage>
  <ns1:Property name="urn:gridsam:exitcode">0</ns1:Property>
</ns1:JobStatus>
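
The XML output above is straightforward to process in a script. The following Python sketch parses a JobStatus document into a list of stages and a dictionary of properties. The namespace URI and element names are taken from the output above. Treating the last Stage element as the current state is an assumption based on the ordering in this example, and the function names are made up for this illustration.

```python
import xml.etree.ElementTree as ET

GRIDSAM_NS = {"gs": "http://www.icenigrid.org/service/gridsam"}


def parse_job_status(xml_text):
    """Return (stages, properties) from a JobStatus XML document.

    stages is a list of (state, time, description) tuples in document order.
    properties maps property names to their text values.
    """
    root = ET.fromstring(xml_text)
    stages = [
        (
            stage.findtext("gs:State", namespaces=GRIDSAM_NS),
            stage.findtext("gs:Time", namespaces=GRIDSAM_NS),
            stage.findtext("gs:Description", namespaces=GRIDSAM_NS),
        )
        for stage in root.findall("gs:Stage", GRIDSAM_NS)
    ]
    properties = {
        prop.get("name"): prop.text
        for prop in root.findall("gs:Property", GRIDSAM_NS)
    }
    return stages, properties


def current_state(stages):
    """Return the state of the last stage, or None if there are no stages."""
    return stages[-1][0] if stages else None
```

For the document above, the exit code would be available as properties["urn:gridsam:exitcode"].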

Terminating a Job

Given a GridSAM job identifier, the user who submitted the job (i.e. the user identified by the certificate configured in the OMII client installation) can use the gridsam-terminate command to terminate a running job.

$> ${OMIICLIENT_HOME}/gridsam/bin/gridsam-terminate \
        -s "http://localhost:18080/gridsam/services/gridsam" \
        urn:gridsam:ff80808201afccc00101afccc3900001

Job termination in GridSAM happens asynchronously. You can use the gridsam-status command to determine whether the job has been completely terminated.
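
Since termination is asynchronous, a script typically has to poll until the job reaches a final state. The following Python sketch shows one way to do this. The state lookup is passed in as a function (for example, one that runs gridsam-status and returns the latest state name), so the loop itself makes no assumptions about the command line. The set of final state names is an assumption: only "done" appears in this guide, while "failed" and "terminated" are guesses that should be checked against your GridSAM version.

```python
import time

# Assumed final states; only "done" is shown in this guide.
TERMINAL_STATES = frozenset({"done", "failed", "terminated"})


def wait_for_terminal_state(fetch_state, interval=5.0, timeout=300.0,
                            sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_state() until it returns a final state, then return that state.

    fetch_state: function with no arguments returning the current state name.
    interval:    seconds to wait between polls.
    timeout:     seconds after which TimeoutError is raised.
    sleep/clock: injectable for testing.
    """
    deadline = clock() + timeout
    while True:
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(
                f"job still in state {state!r} after {timeout} seconds"
            )
        sleep(interval)
```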

Advanced Job Submission

File Staging

A job usually involves data files that need to be staged into the remote job execution environment and staged out at the end of the execution.

The GridSAM service supports a variety of data staging methods (e.g. FTP, HTTP, WebDAV, SFTP). For unsecured usage, the GridSAM distribution bundles a small anonymous FTP server that lets users stage files to and from the machine on which they run the gridsam-submit command.

WARNING: Please ensure you understand the security implications of FTP before considering this mode of data staging. By using the gridsam-ftp-server command, you are potentially opening up your file system for remote anonymous access.

For example, the following JSDL document describes a job that runs the /bin/cat executable on the virtual files dir1/file1.txt and dir2/subdir1/file2.txt made available by the GridSAM service by staging in the files from http:// and ftp:// sources respectively. The standard output and standard error streams of the application are written to the virtual files stdout.txt and stderr.txt. The virtual file stdout.txt will be staged out to the ftp:// target upon successful execution of the job.

<JobDefinition xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl">
    <JobDescription>
        <Application>
            <POSIXApplication xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
                <Executable>/bin/cat</Executable>
                <Argument>dir1/file1.txt</Argument>
                <Argument>dir2/subdir1/file2.txt</Argument>
                <Output>stdout.txt</Output>
                <Error>stderr.txt</Error>
            </POSIXApplication>
        </Application>
        <DataStaging>
            <FileName>dir1/file1.txt</FileName>
            <CreationFlag>overwrite</CreationFlag>
            <Source>
                <URI>http://www.doc.ic.ac.uk/~wwhl/download/helloworld.txt</URI>
            </Source>
        </DataStaging>
        <DataStaging>
            <FileName>dir2/subdir1/file2.txt</FileName>
            <CreationFlag>overwrite</CreationFlag>
            <Source>
                <URI>ftp://anonymous@myhost:19245/input/file.txt</URI>
            </Source>
        </DataStaging>
        <DataStaging>
            <FileName>stdout.txt</FileName>
            <CreationFlag>overwrite</CreationFlag>
            <Target>
                <URI>ftp://anonymous@myhost:19245/output/file.txt</URI>
            </Target>
        </DataStaging>
    </JobDescription>
</JobDefinition>
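
Before submitting a job with several DataStaging elements, it can help to list which files are staged in from where and staged out to where. The following Python sketch extracts that information from a JSDL document. The element names and namespace come from the example above; the function name is made up for this illustration.

```python
import xml.etree.ElementTree as ET

JSDL = "{http://schemas.ggf.org/jsdl/2005/11/jsdl}"


def list_staging(jsdl_text):
    """Return a list of (file_name, direction, uri) tuples.

    direction is "source" for stage-in entries and "target" for stage-out.
    """
    root = ET.fromstring(jsdl_text)
    entries = []
    for staging in root.iter(JSDL + "DataStaging"):
        name = staging.findtext(JSDL + "FileName")
        for direction in ("Source", "Target"):
            uri = staging.findtext(f"{JSDL}{direction}/{JSDL}URI")
            if uri is not None:
                entries.append((name, direction.lower(), uri.strip()))
    return entries
```

Printing the result makes it easy to confirm that the ftp:// host and port in each URI match the FTP server you intend to use.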

To serve files in the directory /my/datastaging/directory through anonymous FTP, use the gridsam-ftp-server command with the -d and -p options. Running the command without options prints its usage.

$> ${OMIICLIENT_HOME}/gridsam/bin/gridsam-ftp-server
    2005-04-20 09:44:29,701 FATAL [GridSAMFTPServer] (main:) Invalid command-line options: -p-d
    usage: gridsam-ftp-server: GridSAM Anonymous FTP Server
     -d <dir>    Root Directory
     -l          Server only available to localhost
     -p <port>   FTP Port

                
$> ${OMIICLIENT_HOME}/gridsam/bin/gridsam-ftp-server -d /my/datastaging/directory -p 19245
2005-01-26 17:00:51,958 WARN  [GridSAMFTPServer] (main:) . is exposed through FTP at ftp://anonymous@127.0.0.2:19245/
2005-01-26 17:00:51,963 WARN  [GridSAMFTPServer] (main:) Please make sure you understand the security implication of using anonymous FTP for file staging.
FtpServer.server.config.root.dir = /my/datastaging/directory
FtpServer.server.config.data = /home/testuser/.gridsam/ftp-19106c7:101aff7dd50:-8000
FtpServer.server.config.server.host = myhost.somedomain.net
FtpServer.server.config.port = 19245
Started FTP

The directory /my/datastaging/directory is exposed as the root directory of the FTP server, which is accessible on the running host at port 19245. It can be accessed with any FTP client, for example on UNIX:

$> ftp myhost.somedomain.net 19245
Connected to myhost.somedomain.net.
220 Service ready for new user
Name (myhost.somedomain.net:yourname): anonymous
331 Guest login ok, send your complete e-mail address as password
Password:
230 User logged in, proceed
Remote system type is UNIX.
ftp>

For testing purposes on the same host, you can use the -l parameter to make the FTP server available only to localhost. Otherwise the server is bound to the public network interface of the host.

GridFTP File Staging and MyProxy

GridSAM supports GridFTP file transfer (gsiftp://) and GRAM submission. In order to delegate the identity of the user to the GridSAM server, a JSDL extension is introduced in GridSAM to allow a MyProxy credential to be retrieved and used by GridSAM to act on behalf of the user.

Users can add the myproxy:MyProxy element to the JSDL manually (jsdl:JobDefinition/myproxy:MyProxy), for example:

<?xml version="1.0" encoding="UTF-8"?>
<JobDefinition xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl">
    <JobDescription>
        <Application>
            <POSIXApplication xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
                <Executable>/bin/echo</Executable>
                <Argument>hello world</Argument>
            </POSIXApplication>
        </Application>
    </JobDescription>
    <MyProxy xmlns="urn:gridsam:myproxy">
        <ProxyServer>myproxy.ncsa.uiuc.edu</ProxyServer>
        <ProxyServerDN>/C=US/O=National Center for Supercomputing Applications/CN=bosco.ncsa.uiuc.edu</ProxyServerDN>
        <ProxyServerPort>7512</ProxyServerPort>
        <ProxyServerUserName>myusername</ProxyServerUserName>
        <ProxyServerPassPhrase>mypassphrase</ProxyServerPassPhrase>
        <ProxyServerLifetime>7512</ProxyServerLifetime>
    </MyProxy>
</JobDefinition>

Or let the command-line tool do the hard work:

$> ${OMIICLIENT_HOME}/gridsam/bin/gridsam-submit -sn ngs \
        -myproxy -myproxyuser myusername -myproxyhost myproxy.ncsa.uiuc.edu \
        -j ${HOME}/helloworld.jsdl
MyProxy passphrase: ********
urn:gridsam:ff80808201afccc00101afccc3900001

The credential will be used by the GridSAM service when a GridFTP file staging or a Globus GRAM submission is requested.
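
When JSDL documents are generated by a script, the MyProxy element can be added programmatically as well. The following Python sketch appends a MyProxy element to an existing JSDL document. The element names and the urn:gridsam:myproxy namespace are taken from the manual example above. Which child elements are optional is not stated in this guide, so the sketch simply omits the DN and lifetime when they are not supplied; that behaviour, the default port and the function name are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
MYPROXY_NS = "urn:gridsam:myproxy"


def add_myproxy(jsdl_text, server, username, passphrase,
                port=7512, server_dn=None, lifetime=None):
    """Return jsdl_text with a MyProxy element appended to JobDefinition."""
    ET.register_namespace("", JSDL_NS)
    ET.register_namespace("myproxy", MYPROXY_NS)

    root = ET.fromstring(jsdl_text)
    proxy = ET.SubElement(root, f"{{{MYPROXY_NS}}}MyProxy")

    def child(name, value):
        ET.SubElement(proxy, f"{{{MYPROXY_NS}}}{name}").text = str(value)

    # Same element order as the manual example
    child("ProxyServer", server)
    if server_dn is not None:
        child("ProxyServerDN", server_dn)
    child("ProxyServerPort", port)
    child("ProxyServerUserName", username)
    child("ProxyServerPassPhrase", passphrase)
    if lifetime is not None:
        child("ProxyServerLifetime", lifetime)
    return ET.tostring(root, encoding="unicode")
```

Note that the passphrase ends up in the document in plain text, exactly as in the manual example, so the resulting file should be protected accordingly.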

MPI Application

GridSAM implements a non-standard JSDL extension for executing MPI applications. To submit an MPI application, your JSDL needs to specify an mpi:MPIApplication element similar to the jsdl-posix:POSIXApplication element.

<JobDefinition xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl">
    <JobDescription>
        <Application>
            <mpi:MPIApplication xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix" xmlns:mpi="urn:gridsam:mpi">
                <Executable>/usr/local/applications/chemistry/dlpoly_3.01/DLPOLY.Y</Executable>
                <Output>stdout.txt</Output>
                <Environment name="NGSMODULES">gm:dlpoly:intel-math</Environment>
                <mpi:ProcessorCount>8</mpi:ProcessorCount>
            </mpi:MPIApplication>
        </Application>
        <DataStaging>
            <FileName>stdout.txt</FileName>
            <CreationFlag>overwrite</CreationFlag>
            <DeleteOnTermination>true</DeleteOnTermination>
            <Target>
                <URI>ftp://gridsam.lesc.doc.ic.ac.uk:45521/public/test.txt</URI>
            </Target>
        </DataStaging>
    </JobDescription>
</JobDefinition>

The mpi:MPIApplication element is an extension to the standard jsdl-posix:POSIXApplication element. The mpi:ProcessorCount element specifies the number of processors to be used by the application. The example above executes the DLPOLY application on the National Grid Service. MPI support is currently only implemented by the Globus 2.4.3 plugin. Please consult your service administrator to see whether MPI is supported.

Frequently used GridSAM Services

For most gridsam-* commands, the user needs to specify the endpoint address (URL) of the GridSAM service to be used. This is cumbersome to remember and type.

GridSAM clients allow users to refer to a service by name by storing frequently used service locations in a properties file.

Create a directory named .gridsam in the user's home directory:

$> mkdir ~/.gridsam

Create a file named services.properties in the directory you have just created. The file contains a list of name-value pairs, where each value is the endpoint address (URI) of a GridSAM service. For example:

imperial=http://www.ic.ac.uk/gridsam/services/gridsam
ngs=http://www.ngs.ac.uk/gridsam/services/gridsam

Once the entries are defined, users can refer to a service by name. For example, to use the Imperial GridSAM service,

$> ${OMIICLIENT_HOME}/gridsam/bin/gridsam-submit -sn imperial uname.jsdl
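
Scripts that wrap the gridsam-* commands may want to read the same file. The following Python sketch parses the simple name=value form shown above. It is not a full Java properties parser (no escapes, line continuations or ":" separators), and the function names are made up for this illustration.

```python
from pathlib import Path


def load_services(text):
    """Parse name=value lines into a dict, skipping blanks and comments."""
    services = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        name, separator, value = line.partition("=")
        if separator:
            services[name.strip()] = value.strip()
    return services


def resolve_service(name, path=None):
    """Look up a service name in ~/.gridsam/services.properties."""
    if path is None:
        path = Path.home() / ".gridsam" / "services.properties"
    return load_services(Path(path).read_text())[name]
```

With the file above, resolve_service("imperial") would return http://www.ic.ac.uk/gridsam/services/gridsam.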

Shutting down GridSAM

GridSAM persists unfinished jobs and restarts them when the container restarts. However, it is still wise to shut down the container gracefully.

$> cd ${OMII_HOME}/jakarta-tomcat-5.0.25/bin
$> ./shutdown_base.sh
Shutting down tomcat
Server running under PID of 17171 detected
Waiting... HTTP server stopped.
Waiting Process stopped.

You should see a message in the GridSAM log file $OMII_HOME/jakarta-tomcat-5.0.25/logs/gridsam.log

..
2005-04-20 09:29:46,597 INFO  [JobManagerConfigurator] (main:) GridSAM machinery initialised
2005-04-20 09:49:13,080 INFO  [JobManagerConfigurator] (main:) GridSAM machinery shutdown properly
..

What next?

By now, you should have successfully deployed the GridSAM Web Service that uses the Fork DRMConnector for launching jobs. You should also have submitted a simple job to a GridSAM instance and monitored its state changes using the GridSAM tools.

If you will be administering a GridSAM Web Service, you are advised to consult the Deployment Guide in the GridSAM Service module to configure the GridSAM service for production usage according to your architectural requirements and backend infrastructure.

If you are an end-user of a GridSAM Web Service, you are advised to read through the Job Submission Description Language specification to understand the features of the language. Also, you should study the GridSAM JSDL feature matrix to understand what you can/cannot do with the current GridSAM implementations. You can find more information on the GridSAM client-side tools in the User Guide in the GridSAM Client module.