Chapter 9: A Distributed Grid Application
A first test
The example in this first test and the complete workflow example that follows use a 24-minute, full-quality DV format file named movie.dv. The file is too large to offer for download, but it is easy to create using various programs on nearly any operating system.
The movie.dv file used to test this tutorial was constructed using iMovie on Mac OS X. A video clip can be imported into iMovie and edited to 24 minutes in length. It may then be exported to a .dv file using the File -> Share -> Quicktime -> Full Quality DV option.
We will run a first test by slicing off the first two minutes of the input movie using the mencoder command. As user jane do the following:
[jane@nodeA dvstream]$ mencoder movie.dv -ss 00:00:00 -endpos 00:02:00 -ovc copy -oac pcm -o slice01
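The same command pattern extends to the rest of the movie. The following is only a sketch: the helper name slice_cmd and the output names slice02 through slice12 are assumptions made for illustration, and it assumes that mencoder interprets -endpos as a duration measured from the -ss position. It prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Print (do not run) the mencoder command for the Nth two-minute slice.
# slice_cmd and the names slice02..slice12 are illustrative assumptions.
slice_cmd() {
    n=$1                            # 1-based slice number
    start=$(( (n - 1) * 120 ))      # start offset in seconds
    # Format the start offset as hh:mm:ss for -ss.
    ss=$(printf '%02d:%02d:%02d' \
        $(( start / 3600 )) $(( start % 3600 / 60 )) $(( start % 60 )))
    # -endpos is assumed to be a duration measured from the -ss position.
    printf 'mencoder movie.dv -ss %s -endpos 00:02:00 -ovc copy -oac pcm -o slice%02d\n' \
        "$ss" "$n"
}

# A 24-minute movie yields 12 two-minute slices.
i=1
while [ "$i" -le 12 ]; do
    slice_cmd "$i"
    i=$(( i + 1 ))
done
```

The first line printed matches the command shown above for slice01; the later lines differ only in the -ss offset and the output name.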
We will now send off the two-minute slice to nodeB to be re-encoded as mpeg4. We begin by sending the job to the simple fork jobmanager.
Create a job description RSL file so that it looks as shown below:
[jane@nodeA ~]$ cat encode-slice01.rsl
<job>
<executable>/usr/bin/mencoder</executable>
<directory>${GLOBUS_USER_HOME}</directory>
<argument>slice01</argument>
<argument>-vf</argument>
<argument>harddup</argument>
<argument>-ovc</argument>
<argument>lavc</argument>
<argument>-oac</argument>
<argument>lavc</argument>
<argument>-lavcopts</argument>
<argument>vcodec=mpeg4:vqmax=4:acodec=ac3:abitrate=128</argument>
<argument>-of</argument>
<argument>avi</argument>
<argument>-o</argument>
<argument>mpeg4_slice01</argument>
<stdout>slice01.stdout</stdout>
<stderr>slice01.stderr</stderr>
<fileStageIn>
<transfer>
<sourceUrl>gsiftp://nodea.ps.univa.com/home/jane/slice01</sourceUrl>
<destinationUrl>file:///${GLOBUS_USER_HOME}/slice01</destinationUrl>
<rftOptions>
<parallelStreams>4</parallelStreams>
</rftOptions>
</transfer>
</fileStageIn>
<fileStageOut>
<transfer>
<sourceUrl>file:///${GLOBUS_USER_HOME}/mpeg4_slice01</sourceUrl>
<destinationUrl>gsiftp://nodea.ps.univa.com/home/jane/mpeg4_slice01</destinationUrl>
<rftOptions>
<parallelStreams>4</parallelStreams>
</rftOptions>
</transfer>
<transfer>
<sourceUrl>file:///${GLOBUS_USER_HOME}/slice01.stdout</sourceUrl>
<destinationUrl>gsiftp://nodea.ps.univa.com/home/jane/slice01.stdout</destinationUrl>
</transfer>
<transfer>
<sourceUrl>file:///${GLOBUS_USER_HOME}/slice01.stderr</sourceUrl>
<destinationUrl>gsiftp://nodea.ps.univa.com/home/jane/slice01.stderr</destinationUrl>
</transfer>
</fileStageOut>
<fileCleanUp>
<deletion>
<file>file:///${GLOBUS_USER_HOME}/slice01</file>
</deletion>
<deletion>
<file>file:///${GLOBUS_USER_HOME}/mpeg4_slice01</file>
</deletion>
<deletion>
<file>file:///${GLOBUS_USER_HOME}/slice01.stdout</file>
</deletion>
<deletion>
<file>file:///${GLOBUS_USER_HOME}/slice01.stderr</file>
</deletion>
</fileCleanUp>
</job>
Note that in the .rsl file above we have added options for the RFT service (<rftOptions>) so that 4 parallel data streams are used to stage in slice01 and to stage out mpeg4_slice01. We do this because these files are fairly large and we want to maximize the data transfer rate. We could tune the transfer parameters further but will not do so in this tutorial.
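Every slice-specific name in the job description contains the string slice01, so a description for another slice can be derived by substitution. This is a minimal sketch, not part of the Globus tools: rsl_for_slice is a hypothetical helper, and the name slice02 in the usage line is likewise an assumption.

```shell
#!/bin/sh
# Rewrite a slice01 job description (read from stdin) for another slice.
# rsl_for_slice is a hypothetical helper name used for illustration.
rsl_for_slice() {
    # $1: the new slice name, for example slice02.
    # Every occurrence of slice01 is replaced, including mpeg4_slice01,
    # slice01.stdout and slice01.stderr.
    sed "s/slice01/$1/g"
}

# Usage (assumes encode-slice01.rsl exists in the current directory):
#   rsl_for_slice slice02 < encode-slice01.rsl > encode-slice02.rsl
```

Because the substitution is purely textual, it is worth inspecting the generated file before submitting it.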
With the .rsl file created, submit the job to the fork jobmanager on nodeB:
[jane@nodeA ~]$ globusrun-ws -submit -S -F https://nodeb.ps.univa.com:8443/wsrf/services/ManagedJobFactoryService -f encode-slice01.rsl
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:9e656590-a575-11da-b131-0011d8b1eb22
Termination time: 02/25/2006 20:39 GMT
Current job state: StageIn
Current job state: Active
Current job state: StageOut
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
After a few minutes the job should complete. The stdout and stderr from the job will have been transferred back to nodeA along with the encoded file, as the listing below shows:
[jane@nodeA ~]$ ls -l
total 467956
drwxr-xr-x 2 jane users 4096 Feb 24 14:33 dvstream
-rw-r--r-- 1 jane users 1877 Feb 24 14:28 encode-slice01.rsl
-rw-r--r-- 1 jane users 32 Feb 23 14:00 helloworld.txt
-rw-r--r-- 1 jane users 29821376 Feb 24 14:40 mpeg4_slice01
-rw-r--r-- 1 jane users 463 Feb 23 14:04 simple-stage-job.rsl
-rw-r--r-- 1 jane users 448583340 Feb 24 11:44 slice01
-rw-r--r-- 1 jane users 142 Feb 24 14:40 slice01.stderr
-rw-r--r-- 1 jane users 273170 Feb 24 14:40 slice01.stdout
The newly encoded file is in the file named mpeg4_slice01.
You can use 'cat' to check the stderr from the remote job. It should look something like this (the error messages can be ignored):
[jane@nodeA ~]$ cat slice01.stderr
ERROR: Could not open required DirectShow codec qdv.dll.
Skipping frame!
Skipping frame!
Skipping frame!
Skipping frame!
Skipping frame!
The stdout from the job should look something like this:
[jane@nodeA ~]$ head slice01.stdout
MEncoder dev-Fedora-GS-CVS-051128-12:25-4.0.1 (C) 2000-2005 MPlayer Team
CPU: Intel Pentium 4/Celeron D Prescott; Xeon Nocona (Family: 15, Stepping: 4)
CPUflags: Type: 15 MMX: 1 MMX2: 1 3DNow: 0 3DNow2: 0 SSE: 1 SSE2: 1
Compiled with runtime CPU detection - WARNING - this is not optimal!
To get best performance, recompile MPlayer with --disable-runtime-cpudetection.
success: format: 0 data: 0x0 - 0x1abcd6ac
AVI file format detected.
VIDEO: [dvc ] 720x480 24bpp 29.673 fps 28486.3 kbps (3477.3 kbyte/s)
[V] filefmt:3 fourcc:0x20637664 size:720x480 fps:29.67 ftime:=0.0337
Now try running the job via the PBS jobmanager:
[jane@nodeA ~]$ globusrun-ws -submit -S -F https://nodeb.ps.univa.com:8443/wsrf/services/ManagedJobFactoryService -Ft PBS -f encode-slice01.rsl
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:c16dc97e-a589-11da-9d3e-0011d8b1eb22
Termination time: 02/25/2006 23:03 GMT
Current job state: StageIn
Current job state: Pending
Current job state: Active
Current job state: StageOut
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
The output should be the same:
[jane@nodeA ~]$ ls -l
total 467956
drwxr-xr-x 2 jane users 4096 Feb 24 14:46 dvstream
-rw-r--r-- 1 jane users 1877 Feb 24 14:28 encode-slice01.rsl
-rw-r--r-- 1 jane users 32 Feb 23 14:00 helloworld.txt
-rw-r--r-- 1 jane users 29821376 Feb 24 17:04 mpeg4_slice01
-rw-r--r-- 1 jane users 463 Feb 23 14:04 simple-stage-job.rsl
-rw-r--r-- 1 jane users 448583340 Feb 24 14:34 slice01
-rw------- 1 jane users 299 Feb 24 17:04 slice01.stderr
-rw------- 1 jane users 273170 Feb 24 17:04 slice01.stdout
Now submit the job in batch mode and check the status periodically using the job endpoint reference (EPR) file. The EPR file is essentially the contact string for the job. You can have 'globusrun-ws' save it to a file using the '-o' flag:
[jane@nodeA ~]$ globusrun-ws -submit -batch -o slice01.epr -S -F https://nodeb.ps.univa.com:8443/wsrf/services/ManagedJobFactoryService -Ft PBS -f encode-slice01.rsl
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:3b08bea0-a58b-11da-9773-0011d8b1eb22
Termination time: 02/25/2006 23:13 GMT
To check the status of the submitted job use 'globusrun-ws' with the '-status' and '-job-epr-file' flags:
[jane@nodeA ~]$ globusrun-ws -status -job-epr-file slice01.epr
Current job state: Active
[jane@nodeA ~]$ globusrun-ws -status -job-epr-file slice01.epr
Current job state: Active
[jane@nodeA ~]$ globusrun-ws -status -job-epr-file slice01.epr
Current job state: Active
[jane@nodeA ~]$ globusrun-ws -status -job-epr-file slice01.epr
Current job state: CleanUp
[jane@nodeA ~]$ globusrun-ws -status -job-epr-file slice01.epr
Current job state: Done
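Rather than repeating the status command by hand, the check can be wrapped in a small polling loop. The following is a sketch under stated assumptions: job_state, wait_for_job, and POLL_INTERVAL are hypothetical names, the 30-second default is arbitrary, and a failed job is assumed to be reported as "Current job state: Failed" in the same format as the states shown above.

```shell
#!/bin/sh
# Extract the state name from "Current job state: <state>" lines on stdin.
job_state() {
    sed -n 's/^Current job state: //p'
}

# Poll the job until it reaches a terminal state.
# wait_for_job, POLL_INTERVAL and the 30-second default are illustrative.
wait_for_job() {
    epr=$1
    while :; do
        state=$(globusrun-ws -status -job-epr-file "$epr" | job_state)
        # No state line usually means the status query itself failed.
        [ -n "$state" ] || return 1
        echo "Current job state: $state"
        case $state in
            Done)   return 0 ;;
            Failed) return 1 ;;   # assumed to be reported like other states
        esac
        sleep "${POLL_INTERVAL:-30}"
    done
}

# Usage: wait_for_job slice01.epr && ls -l mpeg4_slice01
```

The loop gives up as soon as the status query produces no state line, so a lost or destroyed job does not leave it polling forever.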
When the job is completed the results should be the same:
[jane@nodeA ~]$ ls -l
total 467960
drwxr-xr-x 2 jane users 4096 Feb 24 14:46 dvstream
-rw-r--r-- 1 jane users 1877 Feb 24 14:28 encode-slice01.rsl
-rw-r--r-- 1 jane users 32 Feb 23 14:00 helloworld.txt
-rw-r--r-- 1 jane users 29821376 Feb 24 17:15 mpeg4_slice01
-rw-r--r-- 1 jane users 463 Feb 23 14:04 simple-stage-job.rsl
-rw-r--r-- 1 jane users 448583340 Feb 24 14:34 slice01
-rw-r--r-- 1 jane users 475 Feb 24 17:13 slice01.epr
-rw------- 1 jane users 299 Feb 24 17:15 slice01.stderr
-rw------- 1 jane users 273170 Feb 24 17:15 slice01.stdout
You can repeat the tests on nodeC:
[jane@nodeA ~]$ globusrun-ws -submit -S -F https://nodec.ps.univa.com:8443/wsrf/services/ManagedJobFactoryService -f encode-slice01.rsl
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:44884840-a58d-11da-b452-0011d8b1eb22
Termination time: 02/25/2006 23:28 GMT
Current job state: StageIn
Current job state: Active
Current job state: StageOut
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
[jane@nodeA ~]$ ls -l
total 467956
drwxr-xr-x 2 jane users 4096 Feb 24 14:46 dvstream
-rw-r--r-- 1 jane users 1877 Feb 24 14:28 encode-slice01.rsl
-rw-r--r-- 1 jane users 32 Feb 23 14:00 helloworld.txt
-rw-r--r-- 1 jane users 29821376 Feb 24 17:29 mpeg4_slice01
-rw-r--r-- 1 jane users 463 Feb 23 14:04 simple-stage-job.rsl
-rw-r--r-- 1 jane users 448583340 Feb 24 14:34 slice01
-rw-r--r-- 1 jane users 142 Feb 24 17:29 slice01.stderr
-rw-r--r-- 1 jane users 273170 Feb 24 17:29 slice01.stdout
Now submit the same job on nodeC via the SGE jobmanager:
[jane@nodeA ~]$ globusrun-ws -submit -S -F https://nodec.ps.univa.com:8443/wsrf/services/ManagedJobFactoryService -Ft SGE -f encode-slice01.rsl
Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:92c22c06-a58d-11da-95ac-0011d8b1eb22
Termination time: 02/25/2006 23:30 GMT
Current job state: StageIn
Current job state: Active
Current job state: StageOut
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.
[jane@nodeA ~]$ ls -l
total 467956
drwxr-xr-x 2 jane users 4096 Feb 24 14:46 dvstream
-rw-r--r-- 1 jane users 1877 Feb 24 14:28 encode-slice01.rsl
-rw-r--r-- 1 jane users 32 Feb 23 14:00 helloworld.txt
-rw-r--r-- 1 jane users 29821376 Feb 24 17:32 mpeg4_slice01
-rw-r--r-- 1 jane users 463 Feb 23 14:04 simple-stage-job.rsl
-rw-r--r-- 1 jane users 448583340 Feb 24 14:34 slice01
-rw-r--r-- 1 jane users 142 Feb 24 17:32 slice01.stderr
-rw-r--r-- 1 jane users 273170 Feb 24 17:32 slice01.stdout
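The submissions in this test differ only in the target host and the optional factory type passed with -Ft. As a rough summary, the sketch below rebuilds each command line from those two values; submit_cmd is a hypothetical helper, and it only prints the commands instead of running them.

```shell
#!/bin/sh
# Print the globusrun-ws submit command for a host and optional factory type.
# submit_cmd is a hypothetical helper; it prints the command, it does not run it.
submit_cmd() {
    host=$1     # for example nodeb.ps.univa.com
    ftype=$2    # PBS, SGE, or empty for the default fork jobmanager
    url="https://$host:8443/wsrf/services/ManagedJobFactoryService"
    if [ -n "$ftype" ]; then
        echo "globusrun-ws -submit -S -F $url -Ft $ftype -f encode-slice01.rsl"
    else
        echo "globusrun-ws -submit -S -F $url -f encode-slice01.rsl"
    fi
}

# The four submissions used in this test, in order.
submit_cmd nodeb.ps.univa.com
submit_cmd nodeb.ps.univa.com PBS
submit_cmd nodec.ps.univa.com
submit_cmd nodec.ps.univa.com SGE
```

Keeping the job description fixed and varying only the factory endpoint and type is what allows the same encode-slice01.rsl to run unchanged under fork, PBS, and SGE.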