- Work with systems staff to enhance our Configuration Management infrastructure
- Evaluate performance impacts of planned operating system changes
- Update and expand existing systems monitoring capabilities
- Develop automation tools for systems administration
- Maintain account management procedures to support a growing number of researchers and scientists
- Provide technical support to researchers using HPC resources, troubleshoot problems, and develop appropriate computational strategies
- Consult and collaborate with scientist co-workers to determine best system configurations for applications
- Minimum of 5 years of Red Hat, CentOS, and/or Ubuntu Linux system administration experience
- Minimum of 3 years scripting experience with Bash, Perl or Python
- Minimum of 3 years of HPC Linux cluster administration, including job queueing and resource management, parallel filesystems, and network management
- Prior experience with configuration management tools such as Ansible, Chef, or Puppet
- Prior experience with physical Linux hardware
Preferred:
- Experience with HPC job queueing systems such as SLURM or PBS
- Experience with HPC management software such as Bright Cluster Manager
- Experience managing parallel and cluster file systems such as NFS, BeeGFS, GPFS, Weka, or Lustre
- Network management experience, especially InfiniBand
- Experience with GPU compute in HPC and desktop environments
- Experience with container technologies such as Docker and Singularity