
Computing Servers


Fundamentals

The computation servers are shared and accessible to all members of the laboratory. However, access requires membership in the hpc_users group (which comes with a mailing-list subscription). Ask any sysadmin for that membership.
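
To check whether you already belong to that group, you can for instance run:

  # list your groups and look for hpc_users
  id -nG | grep -w hpc_users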

Some users have a higher priority on the servers. The list of prioritized users is stored in the file /etc/security/limits.conf, which anyone can read. If you think you should be on this list, contact the HPC referent of your team (or your team manager). Only teams directly involved in financing the servers are eligible.
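
For example, to see which accounts currently have an entry there (assuming the entries use the standard "priority" item of limits.conf):

  # show the priority assignments on the current server
  grep priority /etc/security/limits.conf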

Some servers are fully shared and have no priorities: sully and roadrunner.


Server features

All servers run Ubuntu 18.04 LTS 64-bit. They are cloned from the same model as the desktop workstations, which makes it easy to prototype programs on the clients before running them on the servers.

Name: processors, cores, RAM; GPUs; disk space; installed software (Maple, Magma or Matlab)

ace: 2× Intel Xeon E5-2620 v4 2.1 GHz, 32 cores, 128 GB RAM; 4× GeForce GTX TITAN X (3072 cores, 12 GB each); 3.7 TB disk; Matlab

appo: 4× AMD Opteron 6282SE 2.6 GHz, 64 cores, 512 GB RAM; no GPU; 2.5 TB disk; Matlab

bly: 2× Intel Xeon E5-2680 v2 2.80 GHz, 40 cores, 512 GB RAM; 2× Tesla K40M (2880 cores, 12 GB each); 9.9 TB disk; Matlab

cody: 4× AMD Opteron 6282SE 2.6 GHz, 64 cores, 512 GB RAM; no GPU; 2.5 TB disk; Matlab

colt: 2× Intel Xeon E5-2640 v4 2.4 GHz, 40 cores, 256 GB RAM; 8× MSI GeForce GTX 1080 Ti (3584 cores, 11 GB each); 1.9 TB disk; Matlab

fox (not available): 4× AMD Opteron 6174 2.2 GHz, 48 cores, 256 GB RAM; no GPU; 2.5 TB disk; Matlab

jet: 2× Intel Xeon E5-2680 v3 2.50 GHz, 48 cores, 512 GB RAM; 2× Tesla K80 (4992 cores, 24 GB per card; 2496 cores and 12 GB per GPU); 1.2 TB disk; Matlab

neyo: 2× Intel Xeon E5-2620 v4 2.10 GHz, 32 cores, 512 GB RAM; 4× ASUS GeForce GTX 1080 Ti (3584 cores, 11 GB each); 2.4 TB disk; Matlab

rex (not available): 4× AMD Opteron 6174 2.2 GHz, 48 cores, 256 GB RAM; no GPU; 2.5 TB disk; Matlab

SRV-GPU01: 2× Intel Xeon Gold 5115 2.40 GHz, 40 cores, 256 GB RAM; 8× PNY GeForce GTX 1080 Ti (3584 cores, 11 GB each); 3.7 TB disk; Matlab

SRV-GPU02: 2× Intel Xeon E5-2640 v4 2.40 GHz, 40 cores, 256 GB RAM; 8× MSI GeForce GTX 1080 Ti (3584 cores, 11 GB each); 3.7 TB disk; Matlab

sully: up to 32 cores, up to 128 GB RAM (processor model not listed); Maple and Magma

yorn: 2× Intel Xeon E5-2620 v4 2.10 GHz, 32 cores, 512 GB RAM; 4× ASUS GeForce GTX 1080 Ti (3584 cores, 11 GB each); 1.8 TB disk; Matlab

Supervision

You can view the load of the computation servers over the last 24 hours at any time at http://supervision.greyc.fr.
Note that the graphs are only updated every two minutes.

Best practices

Here are some common-sense rules that should keep experiments running smoothly (a combined example is sketched after the list).

Do not start more processes (or threads) than there are cores on the server.

Limit output to stdout or stderr as much as possible, as it significantly slows down computations.

Use the server's local disk space in priority.

Using the NFS service directories for I/O is forbidden.

Using the other NFS shares for computations with intensive I/O needs is not recommended.

Do not leave htop running permanently: the monitoring itself consumes CPU.

It is advisable to run your console inside screen, so the session can be recovered if the network connection fails. If you are allergic to screen, use nohup at a minimum.
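
Putting several of these rules together, a job launch could look like this (a sketch; train.sh and its --threads option are hypothetical stand-ins for your own program):

  # size the parallelism to the machine
  NCORES=$(nproc)
  # work on the server's local disk, not on NFS
  mkdir -p /data/$USER/experiment1 && cd /data/$USER/experiment1
  # survive disconnections and keep stdout/stderr out of the terminal
  nohup ./train.sh --threads "$NCORES" > run.log 2>&1 &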


Available disk space

The servers offer additional disk space, of varying size:

Path       Intensive I/O    Size              Quotas                             Backup
/data/     yes              varies by server  no (access granted to hpc_users)   no
/ceph/     yes              65 TB             no (access granted to hpc_users)   no (data replicated)
/cluster/  not recommended  9.6 TB            no (access granted to hpc_users)   on request
/home/     forbidden        1 TB              yes, 20 GB                         yes


The directory /cluster/ contains shared data, possibly subject to licenses (corpora). Access Control Lists (ACLs) restrict access to authorized persons only. If you do not have access to a resource, contact its owner.
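
To check who may read a given resource, inspect its ACL (the corpus path below is hypothetical):

  # display the access control list of a shared corpus
  getfacl /cluster/some-corpus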


Screen cheat sheet

Useful commands to type in a normal shell (e.g. bash):

Start screen: screen

Reconnect from another console: screen -r

Share the screen with several users: screen -x

Steal the screen (detach it elsewhere and reattach here): screen -d -r

List the screens already launched: screen -ls

Inside screen itself, some useful shortcuts:

Ctrl-a c: create a new terminal

Ctrl-a k: kill (destroy) the current terminal

Ctrl-a A: rename the current terminal

Ctrl-a n: switch to the next terminal

Ctrl-a p: switch to the previous terminal

Ctrl-a NUM: go to terminal NUM

Ctrl-a d: detach (send screen to the background, to be resumed later)
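
A typical session combining these commands might look like this (job.sh is a hypothetical script):

  screen            # start a new session
  ./job.sh          # launch the computation inside it
  # press Ctrl-a d to detach; the job keeps running
  # later, possibly after reconnecting to the server:
  screen -r         # reattach to the running session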


Priority principles

The following only applies to servers where priorities have been set up:

Every user runs at the default priority of 15 on a server.

Partially prioritized users are at 10.

Fully prioritized users are at 5.

For the list of priorities on a given server, look at the contents of /etc/security/limits.conf (an example excerpt is sketched at the end of this section). In case of a wrong priority assignment, contact your referent:

Team Image: A. Lechervy

Team MAD: L. Jeanpierre.

In case of urgent needs related to the deadlines of the priority teams, the referents will try to better organize the distribution of computations on the servers, if necessary asking the system administrators for a cleanup.
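
For reference, priority entries in /etc/security/limits.conf look roughly like this (hypothetical usernames; the values match the levels listed above):

  # <user>   <type>  <item>     <value>
  alice      -       priority   5     # fully prioritized
  bob        -       priority   10    # partially prioritized
  # all other users run at the default of 15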


Access from outside the laboratory

SSH gateways are reachable from outside the laboratory. Using the generic name gw.greyc.fr should lead you to one of them. To reach the other servers, you must either hop through the SSH gateways (ProxyJump), or configure VPN access to the DSI Unicaen network. The VPN solution is more convenient if you have to perform data transfers; however, VPN access is only granted to people with standardized machines.
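
To reach an internal server through the gateway in a single command, an SSH configuration like the following can be used (myserver and your_login are hypothetical placeholders):

  # ~/.ssh/config
  Host gw
      HostName gw.greyc.fr
      User your_login

  Host myserver
      HostName myserver.greyc.fr
      User your_login
      ProxyJump gw

With this in place, ssh myserver (or scp for data transfers) hops through the gateway transparently.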


Maintenance and communication

To avoid disrupting running experiments, the servers are only restarted after asking the owners of the computations in progress. Maintenance announcements are sent exclusively to the GREYC hpc_users mailing list. If you are not subscribed to this list, please ask your referent, or an administrator if you do not have one.


Using CUDA

On a machine with an NVIDIA graphics card, you can run CUDA processes. How to:

  1. Check the status with "nvidia-smi". This command should return information about the NVIDIA graphics card(s) installed in the machine. If it does not, contact an administrator.
  2. Edit your PATH and LD_LIBRARY_PATH to use the CUDA environment:
    1. export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/lib/:$LD_LIBRARY_PATH
    2. export PATH=/usr/local/cuda/bin:$PATH
  3. Test the cards: /usr/local/cuda/extras/demo_suite/deviceQuery
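
To avoid retyping step 2 at every login, the two exports can be appended to your shell profile (a sketch, assuming bash):

  # make the CUDA environment permanent
  cat >> ~/.bashrc <<'EOF'
  export PATH=/usr/local/cuda/bin:$PATH
  export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/lib/:$LD_LIBRARY_PATH
  EOF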