Chapter 3: Deploying Sun Grid Engine (SGE)
Testing SGE
As the root user you should make sure that the SGE daemons are running:
[root@nodeC sge-root]# ps auwwwx|grep sge
root  9159  0.0  0.3  106340  3800  ?  Sl  10:43  0:00  /opt/sge-root/bin/lx24-x86/sge_qmaster
root  9179  0.0  0.2   48424  2400  ?  Sl  10:43  0:00  /opt/sge-root/bin/lx24-x86/sge_schedd
root  9610  0.0  0.1    5176  1820  ?  S   10:53  0:00  /opt/sge-root/bin/lx24-x86/sge_execd
If any of the SGE daemons are not running, start them by running the following three commands as root:
/opt/sge-root/bin/lx24-x86/sge_qmaster
/opt/sge-root/bin/lx24-x86/sge_schedd
/opt/sge-root/bin/lx24-x86/sge_execd
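The daemon check above can also be scripted. The following is a minimal sketch, not part of SGE: the function name missing_sge_daemons is hypothetical, and it assumes the three daemon names shown in the ps output above. It reads a process listing on standard input and prints the name of each daemon it cannot find.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SGE): print the name of every
# expected SGE daemon that does not appear in the process listing read
# from standard input.
missing_sge_daemons() {
    listing=$(cat)
    for daemon in sge_qmaster sge_schedd sge_execd; do
        # Match the daemon name at the start of a command or after a '/',
        # followed by a space or the end of the line.
        if ! printf '%s\n' "$listing" | grep -Eq "(^|/)${daemon}( |\$)"; then
            printf '%s\n' "$daemon"
        fi
    done
}
```

One possible use is `ps -e -o args | missing_sge_daemons`, which prints nothing when all three daemons are present. Each name it does print corresponds to one of the three start commands above.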
As the root user you can also check the state of the compute node and the queue:
[root@nodeC sge-root]# /opt/sge-root/bin/lx24-x86/qstat -f
queuename                  qtype  used/tot.  load_avg  arch      states
all.q@nodeC.ps.univa.com   BIP    0/2        0.00      lx24-x86
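The used/tot. column shows how many of a queue instance's slots are occupied. As a rough sketch, assuming the column layout shown above (queue name in the first column, slot counts in the third), the hypothetical function queue_free_slots below reads 'qstat -f' output on standard input and prints the number of free slots for each queue instance.

```shell
#!/bin/sh
# Hypothetical helper: read `qstat -f` output on standard input and print
# "<queue instance> <free slots>" for every line whose third column looks
# like slot counts (for example 0/2).
queue_free_slots() {
    awk '$3 ~ /^[0-9]+(\/[0-9]+)+$/ {
        n = split($3, f, "/")
        # The last field is the total, the one before it the used count.
        printf "%s %d\n", $1, f[n] - f[n-1]
    }'
}
```

For the output above, `qstat -f | queue_free_slots` would print `all.q@nodeC.ps.univa.com 2`, since none of the two slots is in use.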
Before submitting a job, you need to add nodeC as a submit host, that is, a host from which jobs may be submitted. You can do that using the 'qconf' command as shown below:
[root@nodeC sge-root]# /opt/sge-root/bin/lx24-x86/qconf -as nodec
nodeC.ps.univa.com added to submit host list
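To avoid adding a host that is already registered, the submit host list can be checked first. The sketch below assumes the list is supplied one host name per line on standard input, for example from 'qconf -ss'. The function name is_submit_host is hypothetical. The comparison ignores case and accepts either the short or the fully qualified name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the host given as $1 appears in the
# submit host list read from standard input (one host name per line).
# Matching is case-insensitive and accepts short or fully qualified names.
is_submit_host() {
    want=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    tr '[:upper:]' '[:lower:]' | awk -v want="$want" '
        { short = $1; sub(/\..*/, "", short) }
        $1 == want || short == want { found = 1 }
        END { exit found ? 0 : 1 }'
}
```

A possible use is `qconf -ss | is_submit_host nodec || qconf -as nodec`, which adds the host only when it is not yet listed.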
Next you can submit a simple test job as shown:
[root@nodeC sge-root]# /opt/sge-root/bin/lx24-x86/qsub /opt/sge-root/examples/jobs/simple.sh
Your job 1 ("simple.sh") has been submitted.
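The job number in the confirmation message is what later 'qstat' calls and the output file names refer to. A small sketch for capturing it in a script follows. The function name job_id_from_qsub is hypothetical, and it assumes the message has exactly the form shown above.

```shell
#!/bin/sh
# Hypothetical helper: extract the numeric job ID from a qsub
# confirmation line of the form
#   Your job 1 ("simple.sh") has been submitted.
job_id_from_qsub() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}
```

For example, `id=$(/opt/sge-root/bin/lx24-x86/qsub /opt/sge-root/examples/jobs/simple.sh | job_id_from_qsub)` would leave the job number in the variable id.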
You can query for the state of the job using 'qstat' as shown:
[root@nodeC sge-root]# /opt/sge-root/bin/lx24-x86/qstat
job-ID  prior    name       user  state  submit/start at      queue                     slots  ja-task-ID
1       0.55500  simple.sh  root  r      02/13/2006 11:07:36  all.q@nodeC.ps.univa.com  1
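A script that needs to wait for a job can poll 'qstat' until the job no longer appears in the listing. The sketch below assumes the column order shown above (job ID first, state fifth). The function name job_state is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `qstat` output on standard input and print the
# state column for each line whose job ID equals $1. The exit status is
# non-zero when the job is not listed at all.
job_state() {
    awk -v id="$1" '
        $1 == id { print $5; found = 1 }
        END { exit found ? 0 : 1 }'
}
```

A simple wait loop could then look like `while qstat | job_state 1 >/dev/null; do sleep 5; done`. The five-second interval is an arbitrary choice.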
[root@nodeC sge-root]# /opt/sge-root/bin/lx24-x86/qstat -f
queuename                  qtype  used/tot.  load_avg  arch      states
all.q@nodeC.ps.univa.com   BIP    0/2        0.00      lx24-x86
Next, use the "Jane User" account to make sure that a non-root user can submit and run jobs:
[root@nodeC sge-root]# su - jane
Before submitting a job, the environment for 'jane' needs to be set up:
[jane@nodeC ~]$ export SGE_ROOT=/opt/sge-root
[jane@nodeC ~]$ source /opt/sge-root/default/common/settings.sh
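The settings file lives under the SGE root in a directory named after the cell, which is "default" in the path used above. A login script may want to confirm that the file exists before sourcing it. The following sketch shows one way to do that. The function name sge_settings_file is hypothetical, and using "default" as the fallback cell name is an assumption taken from that path.

```shell
#!/bin/sh
# Hypothetical helper: print the path of settings.sh for the SGE root
# given as $1 and the cell given as $2 (assumed to be "default" when
# omitted). Fails with a message on stderr if the file is not readable.
sge_settings_file() {
    root=$1
    cell=${2:-default}
    file="$root/$cell/common/settings.sh"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    printf '%s\n' "$file"
}
```

In a shell startup file this could be used as `f=$(sge_settings_file /opt/sge-root) && . "$f"`, so that a missing installation produces one clear message instead of a failed source command.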
User jane can check the state of SGE:
[jane@nodeC ~]$ /opt/sge-root/bin/lx24-x86/qstat -f
queuename                  qtype  used/tot.  load_avg  arch      states
all.q@nodeC.ps.univa.com   BIP    0/2        0.00      lx24-x86
User jane can submit a job as shown:
[jane@nodeC ~]$ /opt/sge-root/bin/lx24-x86/qsub /opt/sge-root/examples/jobs/simple.sh
Your job 2 ("simple.sh") has been submitted.
User jane can query on a job's state as shown:
[jane@nodeC ~]$ /opt/sge-root/bin/lx24-x86/qstat
job-ID  prior    name       user  state  submit/start at      queue                     slots  ja-task-ID
1       0.00000  simple.sh  jane  qw     02/13/2006 11:12:57  all.q@nodeC.ps.univa.com  1
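The state column uses short codes: 'qw' in the listing above means the job is queued and waiting, while 'r' in the earlier listing means it is running. The sketch below translates a few common codes into words. The function name describe_job_state is hypothetical, and the table covers only a handful of states, not all of them.

```shell
#!/bin/sh
# Hypothetical helper: translate a few common qstat state codes into a
# short description. Only a small subset of possible states is covered.
describe_job_state() {
    case $1 in
        qw) echo "pending: queued and waiting to be scheduled" ;;
        t)  echo "transferring to an execution host" ;;
        r)  echo "running" ;;
        E*) echo "error" ;;
        *)  echo "other state: $1" ;;
    esac
}
```

For example, `describe_job_state qw` prints a line saying the job is pending.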
When the job completes, user jane should find two files in her home directory, one holding the job's stdout and one holding its stderr:
[jane@nodeC ~]$ ls
simple.sh.e2 simple.sh.o2
[jane@nodeC ~]$ cat simple.sh.o2
Mon Feb 13 11:13:06 CST 2006
Mon Feb 13 11:13:26 CST 2006
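The file names follow the pattern seen above: the job name, then '.o' or '.e', then the job number. A quick scripted sanity check is to confirm that the stdout file exists and that the stderr file, if present, is empty. This is a minimal sketch under that naming assumption. The function name job_output_clean is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if <dir>/<name>.o<id> exists and
# <dir>/<name>.e<id> is absent or empty. <dir> is assumed to be the
# current directory when omitted.
job_output_clean() {
    name=$1
    id=$2
    dir=${3:-.}
    [ -f "$dir/$name.o$id" ] && [ ! -s "$dir/$name.e$id" ]
}
```

For the job above, `job_output_clean simple.sh 2 && echo "job 2 wrote no errors"` would report success as long as simple.sh.e2 is empty.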