This page is a quick, rough Google Translate rendering of the French version.
Fundamentals
The compute servers are shared and open to all members of the laboratory. Access does, however, require membership of the hpc_users group (which goes together with a subscription to the mailing list). Ask any system administrator for that membership.
Some users have a higher priority on the servers. The list of prioritized users is in the file /etc/security/limits.conf, which you can read yourself. If you think you should be on this list, contact the HPC referent for your team (or your team manager). Only teams that directly contributed to funding the servers are eligible.
Some servers are fully shared and have no priorities: sully and roadrunner.
Server features
All servers run Ubuntu 18.04 LTS (64-bit). They are cloned from the same model as the desktop workstations, which makes it easy to prototype programs on a workstation before running them on the servers.
Name | Processors | Cores | RAM | GPU | Disk space | Maple | Magma | Matlab |
---|---|---|---|---|---|---|---|---|
N315L-G16G01.ressource.unicaen.fr (ace) | 2 × Intel Xeon E5-2620 v4, 2.1 GHz | 32 | 128 GB | 4 × GeForce GTX TITAN X (3072 cores, 12 GB RAM) | 3.7 TB | X | | |
N315L-G12C02.ressource.unicaen.fr (appo) | 4 × AMD Opteron 6282SE, 2.6 GHz | 64 | 512 GB | | 2.5 TB | X | | |
N315L-G13G01.ressource.unicaen.fr (bly) | 2 × Intel Xeon E5-2680 v2, 2.80 GHz | 40 | 512 GB | 2 × Tesla K40M (2880 cores, 12 GB RAM) | 9.9 TB | X | | |
N315L-G12C01.ressource.unicaen.fr (cody) | 4 × AMD Opteron 6282SE, 2.6 GHz | 64 | 512 GB | | 2.5 TB | X | | |
N315L-G17G01.ressource.unicaen.fr (colt) | 2 × Intel Xeon E5-2640 v4, 2.4 GHz | 40 | 256 GB | 8 × MSI GeForce GTX 1080 Ti (3584 cores, 11 GB RAM) | 3.5 TB | X | | |
N315L-G14G01.ressource.unicaen.fr (jet) | 2 × Intel Xeon E5-2680 v3, 2.50 GHz | 48 | 512 GB | 2 × Tesla K80 (4992 cores, 24 GB RAM; 2496 cores and 12 GB per GPU) | 1.2 TB | X | | |
N315L-G17G02.ressource.unicaen.fr (neyo) | 2 × Intel Xeon E5-2620 v4, 2.10 GHz | 32 | 512 GB | 4 × ASUS GeForce GTX 1080 Ti (3584 cores, 11 GB RAM) | 2.4 TB | X | | |
N315L-G18G01.ressource.unicaen.fr (SRV-GPU01) | 2 × Intel Xeon Gold 5115, 2.40 GHz | 40 | 256 GB | 8 × PNY GeForce GTX 1080 Ti (3584 cores, 11 GB RAM) | 3.5 TB | X | | |
N315L-G18G02.ressource.unicaen.fr (SRV-GPU02) | 2 × Intel Xeon E5-2640 v4, 2.40 GHz | 40 | 256 GB | 8 × MSI GeForce GTX 1080 Ti (3584 cores, 11 GB RAM) | 3.5 TB | X | | |
sully | max 32 | N/A | max 128 GB | X | X | |||
N315L-G17G03.ressource.unicaen.fr (yorn) | 2 × Intel Xeon E5-2620 v4, 2.10 GHz | 32 | 512 GB | 4 × ASUS GeForce GTX 1080 Ti (3584 cores, 11 GB RAM) | 1.8 TB | X | | |
Supervision
At any time, you can view the load of the compute servers over the last 24 hours at http://supervision.greyc.fr.
Note that the graphs are only refreshed every two minutes.
Best practices
Here are some common-sense rules that should help experiments run smoothly.
Do not start more processes (or threads) than there are cores on the server.
Limit output (stdout or stderr) as much as possible, because it slows computations down significantly.
Prefer the server's local disk space.
Using the NFS service directories for I/O is forbidden.
Using the other NFS services for I/O-intensive computations is not recommended.
Do not leave htop running permanently, because it consumes CPU.
Use screen so that you can recover your console if the network connection fails. If you are allergic to screen, at least use nohup.
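As an illustration of the first and last rules, here is a minimal sketch, not an official launcher. The helper names `cap_workers` and `run_detached`, the program `./my_experiment`, its `--threads` option and the per-user log path are all hypothetical placeholders.

```shell
# Count the cores available on this server.
cores=$(nproc)

# Cap a requested worker count at the number of cores.
# usage: cap_workers REQUESTED
cap_workers() {
    if [ "$1" -gt "$cores" ]; then
        echo "$cores"
    else
        echo "$1"
    fi
}

# Run a command that survives logout, with stdout and stderr sent to a
# log file rather than the terminal.
# usage: run_detached LOGFILE COMMAND [ARGS...]
run_detached() {
    logfile=$1
    shift
    nohup "$@" > "$logfile" 2>&1 &
}

# Example (placeholders): at most 64 workers, log kept on the local disk.
# run_detached "/data/$(id -un)/run.log" ./my_experiment --threads "$(cap_workers 64)"
```

Writing to a log file does not make heavy output free, so reducing what the program prints is still the first thing to do.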
Available disk space
The servers offer several disk spaces of varying size and characteristics.
Path | Intensive I/O | Size | Quotas | Backup |
---|---|---|---|---|
/data/ | Yes | depends on the server | No. Access granted to hpc_users | No |
/ceph/ | Yes | 65 TB | No. Access granted to hpc_users | No (replicated data) |
/cluster/ | not recommended | 9.6 TB | No. Access granted to hpc_users | On request |
/home/ | forbidden | 1 TB | Yes, 20 GB | Yes |
The /cluster/ directory contains shared data, some of it subject to licenses (corpora). Access control lists (ACLs) restrict access to authorized people only. If you do not have access to a resource, contact its owner.
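One possible way to keep I/O on the local disk is to create a fresh working directory under /data/ for each run. This is only a sketch: the helper name `make_workdir` and the per-user, per-run directory layout are assumptions, not a convention stated on this page.

```shell
# Create a per-run working directory under a base path and print it.
# usage: make_workdir [BASE]   (BASE defaults to /data)
make_workdir() {
    base=${1:-/data}
    # Hypothetical layout: BASE/<login>/run-<timestamp>
    dir="$base/$(id -un)/run-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dir" && echo "$dir"
}

# Example: check the free space on the local disk, then work there.
# df -h /data
# workdir=$(make_workdir) && cd "$workdir"
```

Since /data/ is not backed up, results worth keeping should be copied elsewhere once the run is finished.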
Screen cheat sheet
Useful commands to type in a regular shell (such as bash):
Start screen: screen
Reconnect from another console: screen -r
Share the session between several consoles: screen -x
Take over a session attached elsewhere: screen -d -r
List the sessions already running: screen -ls
Inside screen, some useful shortcuts:
Ctrl-a c: create a new terminal
Ctrl-a k: kill (destroy) the current terminal
Ctrl-a A: rename the current terminal
Ctrl-a n: next terminal
Ctrl-a p: previous terminal
Ctrl-a NUM: go to terminal number NUM
Ctrl-a d: detach (send screen to the background, to be resumed later)
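The commands above can be combined to launch a job directly in a detached, named session. The following is a sketch under stated assumptions: the helper name `start_in_screen`, the session name `exp1` and the program `./my_experiment` are placeholders.

```shell
# Start COMMAND in a new detached screen session called NAME.
# usage: start_in_screen NAME COMMAND [ARGS...]
start_in_screen() {
    name=$1
    shift
    # -d -m starts the session detached, -S gives it a name.
    screen -dmS "$name" "$@"
}

# Example (placeholder names):
# start_in_screen exp1 ./my_experiment   # runs in the background
# screen -ls                             # the session should be listed
# screen -r exp1                         # reattach; Ctrl-a d detaches again
```

Naming sessions makes it easier to pick the right one with screen -r when several are running.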
Priority principles
The following applies only to servers where priorities have been set up (a lower value means a higher priority):
By default, every user runs on a server with priority 15.
Partially prioritized users have priority 10.
Prioritized users have priority 5.
For the list of priorities on a server, look at the contents of /etc/security/limits.conf. If a priority is wrongly assigned, contact your referent:
Team Image: A. Lechervy
Team MAD: L. Jeanpierre.
When priority teams have an urgent need tied to deadlines, the referents try to organize the distribution of computations on the server as well as possible, asking the system administrators for a cleanup if necessary.
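To read the configured priorities without scanning the whole file, something like the sketch below may help. It assumes the priorities are declared with the `priority` item of limits.conf (lines of the form `<domain> <type> priority <value>`), which this page does not confirm. The helper name `list_priorities` is hypothetical.

```shell
# Print "<domain> <value>" for every priority entry of a limits file.
# usage: list_priorities [FILE]   (FILE defaults to /etc/security/limits.conf)
list_priorities() {
    file=${1:-/etc/security/limits.conf}
    # Skip comments; keep lines whose third field (the item) is "priority".
    awk '$1 !~ /^#/ && $3 == "priority" { print $1, $4 }' "$file"
}

# Example:
# list_priorities   # entries configured on this server
# nice              # niceness of the current shell
```

If the value reported by nice does not match what you expect from the file, that is worth raising with your referent.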
Access from outside the laboratory
SSH gateways are reachable from outside the laboratory. The generic name gw.greyc.fr should lead you to one of them. To reach the other servers, you need to hop through the SSH gateways (proxy jump).
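A proxy jump can be done in a single command with the -J option of OpenSSH. The sketch below assumes that your login is the same on the gateway and on the target server. The helper name `ssh_via_gateway`, the login `jdoe` and the server name `SERVER` are placeholders.

```shell
# Open an SSH session on SERVER, hopping through the gateway.
# usage: ssh_via_gateway LOGIN SERVER
ssh_via_gateway() {
    ssh -J "$1@gw.greyc.fr" "$1@$2"
}

# Example (placeholder login and server name):
# ssh_via_gateway jdoe SERVER
```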
Staff whose laptop is managed or co-managed by DSI Campus 2 support (machines identified C301L-XXXXXX or N302L-XXXXXX) can request VPN access by contacting support. This solution is more convenient if you have to perform data transfers.
Requests for VPN access to the institution's network are made on the request management site: https://gedemande.unicaen.fr/.
Maintenance and communication
To avoid disrupting running experiments, servers are restarted only after consulting the owners of the running computations. Maintenance is announced only on the GREYC HPC-users mailing list. If you are not subscribed to this list, ask your referent, or an administrator if you do not have a referent.
Using CUDA
On a machine with an NVIDIA graphics card, you can run CUDA processes. To do so:
- Check the status with "nvidia-smi". This command should return information about the NVIDIA graphics card(s) installed in the machine. If it does not, contact an administrator.
- Edit your PATH and LD_LIBRARY_PATH to use the CUDA environment:
  - export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/lib/:$LD_LIBRARY_PATH
  - export PATH=/usr/local/cuda/bin:$PATH
- Test the cards: /usr/local/cuda/extras/demo_suite/deviceQuery
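The steps above can be wrapped in two small helpers, for example in a script you source before launching CUDA jobs. This is a sketch, not an official setup script: the function names `setup_cuda_env` and `check_gpus` are hypothetical, and the paths are the ones given above.

```shell
# Prepend the CUDA directories to PATH and LD_LIBRARY_PATH.
# usage: setup_cuda_env [CUDA_HOME]   (defaults to /usr/local/cuda)
setup_cuda_env() {
    cuda_home=${1:-/usr/local/cuda}
    export PATH="$cuda_home/bin:$PATH"
    # Only add a ":" separator when LD_LIBRARY_PATH is already set.
    export LD_LIBRARY_PATH="$cuda_home/lib64:/usr/local/lib/${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Report the GPUs, or say what to do when nvidia-smi is missing.
check_gpus() {
    if command -v nvidia-smi > /dev/null 2>&1; then
        nvidia-smi
    else
        echo "nvidia-smi not found: contact an administrator" >&2
        return 1
    fi
}

# Example:
# check_gpus && setup_cuda_env && /usr/local/cuda/extras/demo_suite/deviceQuery
```

Unlike the plain export line, this variant avoids leaving a trailing ":" in LD_LIBRARY_PATH when the variable was previously empty. A trailing ":" is an empty entry, which the dynamic loader treats as the current directory.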
Other computing resources available to GREYC members
See CRIANN.