Zest is Syracuse University's research computing high-performance computing (HPC) cluster. Zest is a non-interactive Linux environment intended for analyses that require extensive parallelism or extended run times.

Warning: IDEs and Development

Zest is not intended to be used as a development environment. Activities on the cluster should be limited to submitting jobs, doing light editing with a text editor such as nano or vim, and running small tests that use a single core for no more than a few minutes. The use of IDEs such as Jupyter, Spyder, VSCode, etc. is prohibited, as these programs can interfere with other users of the system or, in the worst case, impact the system as a whole. If you need a development environment, please contact us so that more appropriate resources can be provided.

Tip

Looking for OrangeGrid? While similar, the Zest and OrangeGrid clusters are distinct environments. Information about OrangeGrid is available on the OrangeGrid (OG) | HTCondor Support home page.


Accessing Zest

To access Zest, simply make an SSH connection using your NetID, specifying the login node you have been assigned. The example below uses 'its-zest-login2.syr.edu'; refer to the access email you received from Research Computing staff for your node information. The cluster supports connections from the Windows command prompt (CMD), programs like PuTTY, and file transfer via SCP and WinSCP.

Code Block: Example SSH Connection
ssh netid@its-zest-login2.syr.edu
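
Files can be transferred through the same login node with SCP. A minimal sketch, assuming you are copying data between your local machine and your cluster home directory (the file names and paths are placeholders):

Code Block: Example SCP Transfer (sketch)
# Copy a local file to your home directory on the cluster (run from your local machine).
scp mydata.csv netid@its-zest-login2.syr.edu:/home/netid/
# Copy results back from the cluster to the current local directory.
scp netid@its-zest-login2.syr.edu:/home/netid/results.out .
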
Tip: Off Campus?

Off-campus networks cannot access the cluster directly. Users off campus need to connect to campus via Remote Desktop Services (RDS). RDS will provide you with a Windows 10 desktop that has access to your G: drive, campus OneDrive, and the research clusters. Once connected, SSH and SCP are available from the Windows command prompt, or you can use PuTTY and WinSCP, which are also installed. Full instructions and details for connecting to RDS are available on the RDS home page. Note that Azure VPN is an alternative option, but it is not available for all users. See the Azure VPN page for more details.

In rare cases where RDS is not an option, the research computing team may provide remote access via a bastion host.

Connecting Via a Bastion Host

To connect via a bastion host, first SSH to the bastion host specified by Research Computing staff. Note, however, that this connection will require a Google Authenticator passcode. If you have not already configured the Google Authenticator app, instructions are provided below.

[Image: bastion login]

Once on the bastion host, simply SSH normally to the login node you have been provided an account for. 

[Image: oglogin]
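
Putting the two hops together, a session might look like the sketch below. The bastion hostname here is only a placeholder; use the host provided by Research Computing staff and your assigned login node.

Code Block: Example Bastion Host Connection (sketch)
# Hop 1: SSH to the bastion host (placeholder hostname); you will be prompted
# for your NetID password and a Google Authenticator verification code.
ssh netid@bastion.syr.edu
# Hop 2: from the bastion host, SSH normally to your assigned login node.
ssh netid@its-zest-login2.syr.edu
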

Steps to Set Up Google Authenticator

1) If not already installed, download and install the Google Authenticator application from the App Store (Apple) or Google Play (Android).

2) Use your SSH client to connect to its-condor-t1.syr.edu. If you need to download an SSH client, PuTTY is a good option for Windows and Unix; it can be downloaded from here. Apple users can use the built-in Terminal application.

[Image: putty, condor-t1]

3) Maximize your SSH window (you will need a large window to display the QR code that you will scan with the Google Authenticator application).

4) When prompted, use your SU NetID and password to log in.

[Image: initial putty]

5) It will display basic instructions for setting up your two-factor authentication and then wait at a prompt before continuing. AGAIN, BE SURE TO MAXIMIZE the SSH session window before you go to the next step so that the barcode is fully displayed on the screen.


[Image: initial login]

[Image: qr code]

6) Once you continue, it will display a key and barcode. Use the Google Authenticator application you installed in step 1 to scan the barcode or enter the key.

7) This should log you in successfully. On subsequent logins, you'll enter your NetID password at the Password prompt, then the 6-digit Google Authenticator one-time password at the Verification prompt. Enter the 6-digit code without any spaces, even if Google Authenticator shows a space in the number string.

[Image: google auth app]


Zest | Slurm Commands & Cluster Info

Slurm Commands

Once connected, the basic commands below will help you get started.

Code Block
# Show node information. Use this to view available nodes and resources.
sinfo

# Show job queue.
squeue

# Display job accounting information.
sacct

# Submit job script.
sbatch [script name]

# Start an interactive job.
salloc [options] [command]

# Launch parallel job steps; usually run within an sbatch script.
srun [options] [command]

# Display running job status.
sstat [jobid]

# Cancel a job.
scancel [jobid]

Learn more basics with the Slurm Quick Start User Guide.
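
As a quick illustration of day-to-day use, the commands below show how these tools are commonly combined; the job ID 781 is just an example value.

Code Block: Example Slurm Command Usage (sketch)
# Show only your own jobs in the queue.
squeue -u $USER
# Show accounting information for a specific job (example job ID).
sacct -j 781
# Cancel that job if it is no longer needed.
scancel 781
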

Zest Cluster Local Storage

Note the default local storage locations. 

Resource | Description
/home/NetID/ | NFS-based user home directory, available throughout the cluster
/tmp/ | Temporary fast local storage, only persistent for the current job
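
Because /tmp is only persistent for the duration of a job, a common pattern is to stage input into /tmp, run against the fast local copy, and copy results back to your home directory before the job ends. A minimal sketch, with placeholder file names and a placeholder analysis program:

Code Block: Using /tmp in a Job Script (sketch)
#!/bin/bash
#
#SBATCH --nodes=1
#SBATCH --ntasks=1
#
# Stage input data into fast local scratch (placeholder paths).
cp /home/netid/project/input.dat /tmp/
cd /tmp
# Run the analysis against the local copy (my_analysis is a placeholder program).
/home/netid/project/my_analysis input.dat > results.out
# Copy results back to the NFS home directory before the job ends.
cp /tmp/results.out /home/netid/project/
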


Lmod Commands

Lmod is also available on the Zest cluster; examples are below.

Code Block
# Show all available modules.
module avail

# Load the module environment.
module load [name]

# Search for module names matching a string.
module spider [string]

# Search module name or description.
module keyword [string]

# List currently loaded modules.
module list

# Unload a module from environment.
module unload [name]

# Remove all modules.
module purge

# Save currently loaded modules to collection name. 
module save [name]

# Shows all saved collections.
module savelist

# Restore modules from collection name. 
module restore [name]

# Display all Lmod options.
module help
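
As an example of a typical workflow, the commands below load a compiler and MPI stack before building code and save them as a reusable collection. The module names are taken from the list of available modules later on this page; the collection name is just an illustration.

Code Block: Example Lmod Workflow (sketch)
# Load the GNU compiler family and OpenMPI.
module load gnu8 openmpi3
# Confirm what is currently loaded.
module list
# Save the current set as a named collection (name is an example).
module save buildenv
# In a later session, restore the same environment in one step.
module restore buildenv
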

Submitting Jobs (with Examples)

Submitting jobs on the Zest cluster requires the creation of an SBATCH script. Below are common examples including the use of MPI and GPUs.

Basic SBATCH Example

Below is a basic SBATCH example. 

Code Block
#!/bin/bash
#
#SBATCH --nodes=1 
#SBATCH --ntasks=3 
#SBATCH --cpus-per-task=1
#SBATCH --mail-type=ALL
#SBATCH --mail-user=netid@syr.edu # replace netid with your NetID 
#
# This runs hostname three times (tasks) on a single node 
#
srun hostname

Assuming the above is 'job1.sh', use the 'sbatch' command to submit the job as seen below. 

Code Block
netid@its-zest-login1:[~]$ sbatch job1.sh 
Submitted batch job 781
netid@its-zest-login1:[~]$ more slurm-781.out
node1002
node1002
node1002
netid@its-zest-login1:[~]$
Note

Note that the default output for jobs will be located in slurm-{jobid}.out.
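
If you prefer a different output location, Slurm's --output and --error directives can be added to the job script; a minimal sketch (%j expands to the job ID):

Code Block: Custom Output File (sketch)
#!/bin/bash
#
#SBATCH --output=job1-%j.out
#SBATCH --error=job1-%j.err
#
srun hostname
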


MPI SBATCH Example

Use srun instead of mpirun, since OpenMPI is supported by Slurm.

Code Block
#!/bin/bash
#
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=20 
#SBATCH --cpus-per-task=1 
#SBATCH --mail-type=ALL
#SBATCH --mail-user=netid@syr.edu
#
module load imb # Loads the Intel MPI Benchmarks (IMB)
srun IMB-MPI1

Example GPU worker interactive job

Use an interactive shell allocation to compile and test applications.

Code Block
# This will start an interactive shell on a Supermicro GPU system
# using 20 CPUs and 2 GPUs. If the resource is open, you’ll get a shell on 
# a worker node. Otherwise, srun will hang until the resource is available.

netid@its-zest-login1:[~]$ srun --pty -p geforce -c 20 --gres=gpu:2 bash 
[netid@node1024 ~]$ cp -Rp /usr/local/cuda/samples CUDA_SAMPLES 
[netid@node1024 ~]$ cd CUDA_SAMPLES
[netid@node1024 ~/CUDA_SAMPLES]$ make -j20 all 
[netid@node1024 ~/CUDA_SAMPLES]$ exit 
netid@its-zest-login1:[~]$


GPU SBATCH Example

Ensure you are using the geforce Slurm partition.

Code Block
#!/bin/bash
#
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1 
#SBATCH --cpus-per-task=10 
#SBATCH --partition=geforce 
#SBATCH --gres=gpu:2 
#SBATCH --mail-type=ALL
#SBATCH --mail-user=netid@syr.edu 
#
nvidia-smi

GPU SBATCH Using MPI Example

Note the use of srun with the full path to the MPI binary.

Code Block
#!/bin/bash 
#
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=1 
#SBATCH --cpus-per-task=10 
#SBATCH --partition=geforce 
#SBATCH --gres=gpu:2 
#SBATCH --mail-type=ALL
#SBATCH --mail-user=netid@syr.edu 
#
module load cuda
srun /home/netid/CUDA_SAMPLES/bin/x86_64/linux/release/simpleMPI



Zest FAQ

Can I Use Docker with Zest?

The cluster doesn't support Docker directly; however, you can import Docker containers into Singularity. More information on Singularity is available here: https://docs.sylabs.io/guides/3.6/user-guide/
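
For example, a Docker image can be pulled and converted into a Singularity image file and then run. The ubuntu image below is only an illustration; the singularity module appears in the list of available Lmod modules further down this page.

Code Block: Running a Docker Image via Singularity (sketch)
# Load the Singularity module.
module load singularity
# Pull a Docker image and convert it to a Singularity image file (example image).
singularity pull docker://ubuntu:20.04
# Run a command inside the resulting container image.
singularity exec ubuntu_20.04.sif cat /etc/os-release
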

What packages are available on the login and worker nodes? 

Package | Description
yasm | Modular Assembler
gnu8-compilers-ohpc | The GNU C Compiler and Support Files
ohpc-gnu8-io-libs | OpenHPC IO libraries for GNU
ohpc-autotools | OpenHPC autotools
openblas-gnu8-ohpc | An optimized BLAS library based on GotoBLAS2
openmpi3-pmix-slurm-gnu8-ohpc | A powerful implementation of MPI
lmod-defaults-gnu8-openmpi3-ohpc | OpenHPC default login environments
imb-gnu8-openmpi3-ohpc | Intel MPI Benchmarks (IMB)
kernel-devel | Development package for building kernel modules
kernel-headers | Header files for the Linux kernel for use by glibc
dkms | Dynamic Kernel Module Support Framework
libstdc++ | GNU Standard C++ Library
boost-gnu8-openmpi3-ohpc | Boost free peer-reviewed portable C++ source libraries
hwloc-ohpc | Portable Hardware Locality
scalapack-gnu8-openmpi3-ohpc | A subset of LAPACK routines
singularity-ohpc | Application and environment virtualization
gnuplot | A program for plotting mathematical expressions and data
motif-devel | Development libraries and header files
tcl-devel | Tcl scripting language development environment
tk-devel | Tk graphical toolkit development files
qt | Qt toolkit
qt-devel | Development files for the Qt toolkit
libXScrnSaver | X.Org X11 libXss runtime library

 What Lmod modules are available on the login and worker nodes?

Module | Description
adios: adios/1.13.1 | The Adaptable IO System (ADIOS)
anaconda2: anaconda2/2019.7 | Python environment
anaconda3: anaconda3/2019.7 | Python environment
autotools | Autotools Developer utilities
boost: boost/1.70.0 | Boost free peer-reviewed portable C++ source libraries
cmake: cmake/3.14.3 | CMake is an open-source, cross-platform family of tools designed to build, test and package software
cuda: cuda/10-1 | NVIDIA CUDA libraries
gnu8: gnu8/8.3.0 | GNU Compiler Family (C/C++/Fortran for x86_64)
gromacs: gromacs/2019.4 |
hdf5: hdf5/1.10.5 | A general purpose library and file format for storing scientific data
hwloc: hwloc/2.0.3 | Portable Hardware Locality
imb: imb/2018.1 | Intel MPI Benchmarks (IMB)
mpich: mpich/3.3.1 | MPICH MPI implementation
mvapich2: mvapich2/2.3.1 | OSU MVAPICH2 MPI implementation
netcdf: netcdf/4.6.3 | C Libraries for the Unidata network Common Data Form
netcdf-cxx: netcdf-cxx/4.3.0 | C++ Libraries for the Unidata network Common Data Form
netcdf-fortran: netcdf-fortran/4.4.5 | Fortran Libraries for the Unidata network Common Data Form
openblas: openblas/0.3.5 | An optimized BLAS library based on GotoBLAS2
openmpi3: openmpi3/3.1.4 | A powerful implementation of MPI
phdf5: phdf5/1.10.5 | A general purpose library and file format for storing scientific data
pmix: pmix/2.2.2 |
pnetcdf: pnetcdf/1.11.1 | A Parallel NetCDF library (PnetCDF)
prun: prun/1.3 | Job launch utility for multiple MPI families
scalapack: scalapack/2.0.2 | A subset of LAPACK routines redesigned for heterogeneous computing
singularity: singularity/3.2.1 | Application and environment virtualization




Getting Help

Questions about Research Computing? Any questions about using or acquiring research computing resources or access can be directed to researchcomputing@syr.edu.