University Computing Systems
HPC Linux cluster - cappl.njit.edu
This page provides general information about the cappl Linux cluster for its users.
General Information about cappl
AFS on cappl
Software Available on cappl
User Logins
Printing from cappl
Getting Help
General Information
The cappl cluster consists of 16 Dell PowerEdge 1750s which make up the slave nodes
and a single Dell PowerEdge 1750 as the master node. Every node, including the
master, has 1GB of RAM and two Intel(R) Xeon(TM) processors operating at 2.80GHz.
The operating system installed on the master node is Red Hat Enterprise Linux
Version 4.0 (Nahant Update 3), running the Linux 2.6 kernel.
The slave nodes are connected via Gigabit Ethernet (GigE) to a Cisco Catalyst XXX switch.
There are several Message Passing Interface (MPI) implementations available for use
on cappl. See the Software section below for additional details.
The cluster management software in use is Warewulf, which provides a framework
for managing clusters. A single image containing only the essential components
needed for operation is created and stored on the master node; each slave node
pulls a copy of this image at boot time, making the slave nodes essentially
diskless. After a node boots, the OS image, which is only about 45MB, resides
on a RAM disk.
The disks on the master node are set up in a RAID 1 (mirrored) configuration.
If one of the disks should fail, the system will continue to be operational.
Master node disk layout:
FileSystem   Size    Purpose
/            12.0GB  Root file system
/boot        2.0GB   OS boot loader files (GRUB)
/home        37.0GB  User home directories
/opt         11.0GB  Locally installed software
/usr/vice    3.4GB   AFS client files and disk cache
Disk layout on each slave node:
FileSystem       Size    Purpose
/scratch         76.0GB  Scratch space available for temporary storage
/usr/vice/cache  3.5GB   The AFS disk cache used by afsd
[ swap ]         2.0GB   Local swap space
The following NFS mounts are mounted from the master:
/opt
/home
AFS on cappl
As previously mentioned, the cappl cluster is an AFS client.
Version 1.4.0 of the OpenAFS software is running on the master and all of the
slave nodes.
A local disk cache is used to cache AFS files on local disk space, which speeds up
access to frequently accessed files.
When you log in to cappl.njit.edu using ssh, you will obtain your AFS token
automatically, provided that ssh keys have not been set up. See
http://web.njit.edu/all_topics/SSH for additional information on SSH keys.
To check the status of your AFS token, run:
/usr/bin/tokens
and the output should be similar to:
Tokens held by the Cache Manager:
User’s (AFS ID 22966) tokens for afs@cad.njit.edu [Expires Apr 27 14:47]
--End of list--
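If your token has expired, it can usually be renewed manually. The following is a
hedged sketch using the classic OpenAFS klog command; the exact procedure depends
on the site's Kerberos configuration, so treat the command choice as an assumption:

```shell
# Renew an AFS token; klog is the traditional OpenAFS command.
# On Kerberos 5 sites the equivalent is often: kinit followed by aklog.
klog

# Verify that a fresh token was obtained:
/usr/bin/tokens
```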
In order to be able to read from and write into your AFS home directory, you
must obtain your AFS token.
At the present time there is no way to securely obtain one’s AFS Kerberos
token on each of the slave nodes when running MPI jobs. Because of this
limitation, a local home directory on cappl (/home) must be used instead
of your AFS home directory. However, one will still be able to use software
that is installed in AFS without first obtaining a token. This limitation
only affects the use of one’s AFS home directory during MPI jobs.
Software
Software that is specific to cappl includes the MPI Libraries
and Sun Grid Engine Software.
The cappl cluster, including all of the slave nodes, is an AFS client, so the AFS
file space is available on all nodes. As a result, all software available on other
Linux AFS clients is also available on cappl.
There are a number of MPI implementations available on cappl. The available
libraries include:
MPICH
MPICH/2
LAM
OpenMPI
The above libraries are installed in /opt/mpi on cappl.
The list of available compilers on cappl currently includes
gcc version 3 (3.4.5)
At this time no commercially available compilers such as Portland or Pathscale
are available on cappl.
It is important to note that the above MPI implementations are not loaded by
default. The module command is a utility that assists the user in setting up
the needed environment, i.e., the execution and library paths.
There are various modules which are available, to see them issue the following
command:
module avail
--------------------------- /usr/share/modules/modulefiles --------------------
dot mpi/lam-gnu4 mpi/mpich-pgi null sge
module-cvs mpi/lam-pgi mpi/mpich2-gnu3 pathsc use.own
module-info mpi/mpich-gnu3 mpi/mpich2-gnu4 pgi52
modules mpi/mpich-gnu4 mpi/ompi-gnu3 pgi60
mpi/lam-gnu3 mpi/mpich-pathsc mpi/ompi-gnu4 pgi61
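As an illustration, a user might load one of the MPI modules listed above and then
build and launch a small program. This is a sketch only: the module name is taken
from the listing above, while hello.c and the process count are placeholders:

```shell
# Load the MPICH build compiled with gcc 4 (module name from the listing above)
module load mpi/mpich-gnu4

# Compile an MPI program with the wrapper compiler the module puts on the path
mpicc -o hello hello.c

# Launch 4 processes; on cappl, production runs should be submitted through
# Sun Grid Engine rather than started directly on the nodes
mpirun -np 4 ./hello
```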
The above output shows all modules that are available to be loaded.
The sge module is used for Sun Grid Engine. In order to run jobs on cappl, all users
must have the sge module loaded. By default, users who source
/afs/cad/solaris/local/etc/std-cshrc in their ~/.cshrc files will have the sge module
loaded.
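For users who do not already source the site-wide setup, the relevant line in
~/.cshrc would look like the following (a minimal sketch; any other local
customizations would go alongside it):

```shell
# ~/.cshrc -- source the site-wide csh setup, which loads the sge module
source /afs/cad/solaris/local/etc/std-cshrc
```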
A user may see what modules are loaded by executing the following command:
module list
Currently Loaded Modulefiles:
1) sge
The above output shows that the sge module is currently loaded.
If the sge module is not loaded, e.g., because the user does not source
/afs/cad/solaris/local/etc/std-cshrc in their .cshrc file, it can be loaded
with the following command:
module load sge
Additional information on the module command can be found on the module man
page.
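Once the sge module is loaded, jobs are run on the cluster by submitting them to
Sun Grid Engine. The following is a minimal sketch, not a site-approved template;
the job name, the "mpich" parallel environment name, and the slot count are all
assumptions that may differ on cappl:

```shell
#!/bin/csh
# myjob.sh -- minimal SGE submission script (names below are assumptions)
#$ -N hello_mpi       # job name
#$ -cwd               # run the job from the current working directory
#$ -pe mpich 4        # request 4 slots in an assumed "mpich" parallel environment

mpirun -np 4 ./hello
```

Such a script would typically be submitted with "qsub myjob.sh", and its status
checked with "qstat". See the Sun Grid Engine usage page for site specifics.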
User Logins
Your login information, i.e., username and password, is the same on cappl as
it is on all other AFS clients.
Printing
There are currently no printers configured on cappl.
Support
The cappl cluster is managed by University Computing Systems, IST. Any
questions regarding usage or operation of the cluster should be directed to
sys@oak.njit.edu .
Before contacting UCS, users should consult the Sun Grid Engine usage page
to see if the answer to a particular question can be found there.