SGE is written and distributed by Sun Microsystems under the Sun Industry Standards Source License, and is available free.
The home page for SGE is located at: http://gridengine.sunsource.net
node000-node019 batch only jobs
node020-node039 batch or parallel jobs
node040-node065 parallel only jobs
Therefore :
The "q" commands are located in /opt/sge/bin/<dir>.
cappl : dir = glinux
hydra : dir = lx24-amd64
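The directory mapping above can be wrapped in a small helper so that login scripts do not hard-code the path. This is only a sketch: the function name sge_bin_dir is made up for illustration, and it covers just the two clusters listed above. Any other cluster name is reported as unknown rather than guessed.

```sh
#!/bin/sh
# Hypothetical helper (not part of SGE): print the directory that holds
# the "q" commands for a given cluster, using the mapping listed above.
sge_bin_dir() {
    case "$1" in
        cappl) echo "/opt/sge/bin/glinux" ;;
        hydra) echo "/opt/sge/bin/lx24-amd64" ;;
        *)     echo "unknown cluster: $1" >&2; return 1 ;;
    esac
}

# Example use (commented out so the sketch has no side effects):
#   PATH="$(sge_bin_dir hydra):$PATH"; export PATH
```

With the directory on PATH, qsub, qstat and qdel can be called without typing the full path.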
After the script has been customized to one's needs, it can be submitted to Grid Engine for execution. The qsub (/opt/sge/bin/lx24-amd64/qsub on hydra) command is used to submit jobs to the job queue:
qsub sge_script
After running the above command, users will see a message similar to the following:
your job 132 ("IMB-MPI1") has been submitted
In the above example, the number "132" represents the SGE job number and "IMB-MPI1" is the name of the job that is being submitted to the queue.
The qstat (/opt/sge/bin/lx24-amd64/qstat on hydra) command can be used to display information about the job queues and the running jobs.
qstat
________________________________________________________________________________
job-ID  prior  name      user     state  submit/start at      queue         master  ja-task-ID
--------------------------------------------------------------------------------
132     0      IMB-MPI1  guest23  r      05/02/2006 09:49:01  cl_name003.q  MASTER
132     0      IMB-MPI1  guest23  r      05/02/2006 09:49:01  cl_name003.q  SLAVE
132     0      IMB-MPI1  guest23  r      05/02/2006 09:49:01  cl_name005.q  SLAVE
132     0      IMB-MPI1  guest23  r      05/02/2006 09:49:01  cl_name007.q  SLAVE
132     0      IMB-MPI1  guest23  r      05/02/2006 09:49:01  cl_name009.q  SLAVE
132     0      IMB-MPI1  guest23  r      05/02/2006 09:49:01  cl_name015.q  SLAVE
________________________________________________________________________________
where cl_name =
for cappl : cappl
for hydra : node
for kong : node
The above information can appear convoluted to someone who just wants a quick look at the number of processors their job is running on and the length of time it has been running. The userstat command can be used instead of qstat. The userstat command displays information about specific jobs. (Note that this example was run on the cluster "cappl.njit.edu". On other clusters, such as hydra.njit.edu and kong.njit.edu, the applicable node names will appear in the "Host" column, and the applicable values will appear for the total number of CPUs, memory, etc.)
_________________________________________________________________________
BATCH QUEUE Total Jobs: 1 Active Jobs: 1
Job-ID Prior Name User State Submit/Start CPUs
132 0 IMB-MPI1 guest23 r 05/02/2006 09:49:01 5
HOSTS Total Nodes: 17 Down nodes: 0
Host CPUs Load Memory Memory Use Swap Swap Use
cluster 34 0.75 16.8G 1.1G 2.0G 160.0K
cappl 2 0.01 1010.3M 103.5M 2.0G 160.0K
cappl000 2 0.00 1010.3M 59.6M 0.0K 0.0K
cappl001 2 0.00 1010.3M 59.4M 0.0K 0.0K
cappl002 2 0.00 1010.3M 59.4M 0.0K 0.0K
cappl003 2 0.11 1010.3M 71.2M 0.0K 0.0K
cappl004 2 0.00 1010.3M 59.1M 0.0K 0.0K
cappl005 2 0.14 1010.3M 70.6M 0.0K 0.0K
cappl006 2 0.00 1010.3M 59.6M 0.0K 0.0K
cappl007 2 0.13 1010.3M 70.6M 0.0K 0.0K
cappl008 2 0.00 1010.3M 59.6M 0.0K 0.0K
cappl009 2 0.09 1010.3M 70.2M 0.0K 0.0K
cappl010 2 0.00 1010.3M 59.2M 0.0K 0.0K
________________________________________________________________________
For additional information on qstat and userstat, see their corresponding man
pages.
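For scripted use, the job number and the processor count can also be pulled out of the qsub and qstat text shown above. The following is a minimal sketch, not a supported tool: the function names job_id_from_qsub and slave_count are made up for illustration, and the parsing assumes the message and column layout of the examples above, which may differ between SGE versions. Counting only the SLAVE rows is an assumption chosen because it matches the "CPUs" value of 5 that userstat reports for job 132.

```sh
#!/bin/sh
# Hypothetical helpers (not part of SGE), written against the sample
# output shown above.

# Print the job number from a qsub message such as:
#   your job 132 ("IMB-MPI1") has been submitted
job_id_from_qsub() {
    printf '%s\n' "$1" | sed -n 's/^[Yy]our job \([0-9][0-9]*\) .*/\1/p'
}

# Read qstat output on stdin and count the SLAVE rows for one job ID.
# Assumes the job ID is the first column and SLAVE is the last column.
slave_count() {
    awk -v id="$1" '$1 == id && $NF == "SLAVE" { n++ } END { print n + 0 }'
}

# Example use on a cluster (commented out; it needs a working SGE setup):
#   id=$(job_id_from_qsub "$(qsub sge_script)")
#   qstat | slave_count "$id"
```

The job number obtained this way can be passed to any command that expects a job number.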
If a job is running in the queue and removal of the job is desired, the qdel command can be used to delete the job from the queue.
qdel 132
The above command will print a message similar to the following:
guest23 has registered the job 132 for deletion
After running qdel, the job will no longer appear in the queue, since it has been removed.
lynx /usr/share/doc/bsc-doc-1.0/sge/SGE.html
The URL above gives example scripts for single jobs and parallel jobs.
If you start userstat, you will see all your SGE jobs. The first indication that something is wrong is that userstat reports a down node (SGE has lost contact with it). If you move the cursor to the job number (down arrow) and press <Return>, you will see only the nodes being used by your jobs. Next, enter "n" to move to the lower nodes window. Scrolling down will show nodes that are down.
You can enter "h" to get a userstat help screen, or "man userstat".
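The down-node check described above can also be done on a saved copy of the userstat screen. This is a sketch under stated assumptions: the function name down_nodes is made up, it assumes the text has been captured to a file or pipe by some means not covered here, and it relies on the "HOSTS Total Nodes: ... Down nodes: ..." line having the layout of the example earlier in this document.

```sh
#!/bin/sh
# Hypothetical helper (not part of userstat): read saved userstat text on
# stdin and print the number that follows "Down nodes:".
down_nodes() {
    sed -n 's/.*Down nodes: *\([0-9][0-9]*\).*/\1/p'
}

# Example use with a saved screen (file name is illustrative only):
#   n=$(down_nodes < userstat.txt)
#   [ "$n" -gt 0 ] && echo "userstat reports $n down node(s)"
```

A result greater than zero corresponds to the down-node warning sign described above.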
Grid Engine HOWTOs
Grid Engine Documents
Online Manual Pages