University Computing Systems
HPC Linux cluster - cappl.njit.edu
General Information
- Specifications
- The cappl cluster consists of 16 Dell PowerEdge 1750s, which make up the
slave nodes, and a single Dell PowerEdge 1750 for the master node. All
of the nodes, including the master, have 1GB RAM and 2 Intel(R) Xeon(TM)
processors operating at 2.80GHz.
- The operating system installed on the master node is Red Hat Enterprise
Linux 4 (Nahant Update 3), running Linux kernel 2.6.
- The slave nodes are connected via Gigabit Ethernet (GigE) to a Cisco
Catalyst XXX switch. Several Message Passing Interface (MPI)
implementations are available for use on cappl. See MPI below for
additional details.
- The cluster management software in use is Warewulf, which provides
a framework for managing clusters. A single image is created with only
the essential components needed for operation. This image is stored on
the master node, and each slave node pulls a copy of it at boot time,
so the slave nodes need no local OS installation. After a node boots,
the OS image, which is only 45MB, resides in a RAM disk.
- The disks on the master node are set up in a RAID 1 (mirrored)
configuration. If one of the disks fails, the system will continue to
operate.
- Direct connections to cappl are made using ssh only, and must be done from
within the NJIT network. If a connection to cappl must be made from outside
of the NJIT network, then a VPN client must be used, or a connection to one
of the public AFS systems afs<n>.njit.edu must be used as a jump-in point
to cappl.
Information on obtaining and using the VPN client can be found at:
http://telecom.njit.edu/vpnc/index.html
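The jump-in route described above can be sketched as a small shell helper. This is only an illustration: the function name cappl_ssh_command, the username jdoe, and the variable JUMP_HOST are hypothetical, and the jump host must be one of the public AFS systems mentioned above. The helper prints the command rather than running it, so it can be inspected first.

```shell
#!/bin/sh
# Print (do not run) an ssh command that reaches cappl through a jump host.
# $1 = username (assumed to be the same on both systems)
# $2 = jump host name (one of the public AFS systems)
cappl_ssh_command() {
    user=$1
    jump_host=$2
    # -t requests a terminal on the jump host so the inner ssh is interactive.
    printf 'ssh -t %s@%s ssh %s@cappl.njit.edu\n' "$user" "$jump_host" "$user"
}

# Example (jdoe and JUMP_HOST are placeholders):
#   cappl_ssh_command jdoe "$JUMP_HOST"
```

The nested form is used here because it works with any ssh client and needs no extra configuration on the user's machine.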
Master node disk layout:
FileSystem         Size     Purpose
/                  12.0GB   Root File System
/boot               2.0GB   OS boot loader files (GRUB)
/home              37.0GB   User Home Directories
/opt               11.0GB   Locally installed software
/usr/vice           3.4GB   AFS client files and disk cache
Disk layout on each slave node:
FileSystem         Size     Purpose
/scratch           76.0GB   Scratch space available for temporary storage
/usr/vice/cache     3.5GB   The AFS disk cache used by afsd
[ swap ]            2.0GB   Local swap space
The following NFS mounts are mounted from the master:
/opt
/home
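Since /scratch is listed per slave node and is not among the NFS mounts, a job that uses it is presumably working on node-local disk. A minimal sketch of creating a private working directory there follows; the function name make_scratch_dir and the job. prefix are hypothetical, and the mktemp utility is assumed to be installed on the nodes.

```shell
#!/bin/sh
# Create a private working directory under a scratch base and print its path.
# $1 = base directory; defaults to /scratch.
make_scratch_dir() {
    base=${1:-/scratch}
    # mktemp replaces XXXXXX with a unique suffix and creates the directory.
    mktemp -d "$base/job.XXXXXX"
}

# Example usage inside a job script:
#   workdir=$(make_scratch_dir) || exit 1
#   trap 'rm -rf "$workdir"' EXIT   # clean up when the script ends
#   cd "$workdir"
```

Results that must outlive the job would still need to be copied back to /home, which is shared from the master.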
AFS on Cappl
- As previously mentioned, the cappl cluster is an AFS client.
Version 1.4.0 of the OpenAFS Software is
running on all of the slave nodes and the master.
- A local disk cache stores AFS files on local disk, which speeds up
access to frequently used files.
- When you log in to cappl.njit.edu using ssh, you obtain your AFS
token, provided that ssh keys have not been set up. See: http://web.njit.edu/all_topics/SSH
for additional information on SSH keys.
- To check the status of your AFS token, run:
/usr/bin/tokens
The output should be similar to:
Tokens held by the Cache Manager:
User’s (AFS ID 22966) tokens for afs@cad.njit.edu [Expires Apr 27 14:47]
--End of list--
In order to be able to read from and write into your AFS home directory, you
must obtain your AFS token.
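The token check can also be scripted. The sketch below assumes the output of /usr/bin/tokens has the format shown above; the function name has_afs_token is hypothetical. It only tests whether a token line is present and does not examine the expiry time.

```shell
#!/bin/sh
# Succeed if the text on stdin (output of /usr/bin/tokens) lists an AFS token.
has_afs_token() {
    # A held token appears as "... tokens for afs@<cell> [Expires ...]".
    grep -q 'tokens for afs@'
}

# Example usage:
#   if /usr/bin/tokens | has_afs_token; then
#       echo "AFS token present"
#   else
#       echo "No AFS token found"
#   fi
```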
At the present time there is no way to securely obtain one’s AFS Kerberos
token on each of the slave nodes when running MPI jobs. Because of this
limitation, MPI jobs must use a local home directory on cappl (/home)
instead of your AFS home directory. However, software that is installed
in AFS can still be used without first obtaining a token. This
limitation only affects the use of one’s AFS home directory during MPI
jobs.
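One way to guard against this limitation is to check, before starting an MPI job, that the working directory is not inside AFS. The sketch below assumes that AFS paths begin with /afs, as the paths on this page do; the function name is_afs_path is hypothetical.

```shell
#!/bin/sh
# Succeed if the given path lies inside the AFS file space.
is_afs_path() {
    case "$1" in
        /afs|/afs/*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example usage before launching an MPI job:
#   if is_afs_path "$PWD"; then
#       echo "Run MPI jobs from /home on cappl, not from AFS" >&2
#       exit 1
#   fi
```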
Monitor usage
Ganglia
Printing
There are currently no printers configured on cappl.
MPI
- There are several MPI implementations available on cappl. The available
libraries are:
- MPICH
- MPICH2
- LAM
- OpenMPI
The above libraries are installed in /opt/mpi on cappl.
- The list of compilers currently available on cappl is:
At this time no commercially available compilers, such as those from Portland
Group or Pathscale, are available on cappl.
- It is important to mention that the above MPI implementations are not
loaded by default. The module command is a utility that can be
used to assist the user in setting up the needed environment, i.e.,
the execution and library paths.
Various modules are available. To see them, issue the following
command:
module avail
--------------------------- /usr/share/modules/modulefiles --------------------
dot           mpi/lam-gnu4      mpi/mpich-pgi    null    sge
module-cvs    mpi/lam-pgi       mpi/mpich2-gnu3  pathsc  use.own
module-info   mpi/mpich-gnu3    mpi/mpich2-gnu4  pgi52
modules       mpi/mpich-gnu4    mpi/ompi-gnu3    pgi60
mpi/lam-gnu3  mpi/mpich-pathsc  mpi/ompi-gnu4    pgi61
The above shows all modules which are available to be loaded.
- The sge module is used for Sun Grid Engine. In order to run jobs on
cappl, all users must have the sge module loaded. By default, users
who source /afs/cad/solaris/local/etc/std-cshrc in their ~/.cshrc files
will have the sge module loaded.
- A user may see what modules are loaded by executing the following
command:
module list
Currently Loaded Modulefiles:
1) sge
The above output shows that the sge module is currently loaded.
- If the sge module is not loaded, e.g., because the user does not
source /afs/cad/solaris/local/etc/std-cshrc in their ~/.cshrc file,
the user can load the sge module with the following command:
module load sge
Additional information on the module command can be found on the module man
page.
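The check-then-load steps above can be combined in a script. The following is a sketch only: the function name module_is_loaded is hypothetical, and it assumes that module list output has the "N) name" form shown above. It parses text from standard input, so it does not itself call the module command.

```shell
#!/bin/sh
# Succeed if module name $1 appears in `module list` output read from stdin.
module_is_loaded() {
    # Drop the "N)" markers, put one word per line, then match the name exactly.
    sed 's/[0-9][0-9]*)//g' | tr -s ' \t' '\n\n' | grep -qxF -- "$1"
}

# Example usage (the 2>&1 is there because the module command commonly
# writes its listing to standard error):
#   module list 2>&1 | module_is_loaded sge || module load sge
```

Matching the whole name avoids treating, for example, mpi/mpich-gnu4 as loaded when only mpi/mpich2-gnu4 is.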
Software
- Software that is specific to cappl includes the MPI Libraries
and Sun Grid Engine Software.
- The cappl master and all of the slave nodes are AFS clients, so the
AFS file space is available on every node. As a result, all software
available on other Linux AFS clients is also available on cappl.
Support
The cappl cluster is managed by University Computing Systems, IST. Any
questions regarding usage or operation of the cluster should be directed to
sys@oak.njit.edu.
Before contacting UCS, users should consult the Sun Grid Engine usage page
to see if the answer to a particular question can be found there.
User logins
Your login information, i.e., username and password, is the same on cappl as
it is on all other AFS clients.