HPC Cluster Administrator III

17 Nov 2024

Vacancy expired!

Administrator Unique III (FIII3) - North Chicago, IL

Job # ABBVJP13377

AbbVie | W2 contract | 6 months (possible extension)

40-hour work weeks | Mon-Fri | Business hours (on-site 3-4 days and remote 1-2 days per week)

Responsibilities & Duties
  • Primary responsibility: support and administer on-premises Linux resources, including HPC Linux clusters and scientific Linux desktops. We are looking for someone with HPC expertise to assist with two on-premises HPC clusters: one with 50 nodes and another with 8 nodes that will be refreshed at the end of the year. The work includes investigating issues that users report, assisting with application or software installs on a cluster, and monitoring the clusters.
  • Specific responsibilities include:
  • Troubleshoot and resolve hardware, operating system and application issues in HPC and desktop environments
  • Work with systems staff to enhance our Configuration Management infrastructure
  • Evaluate performance impacts of planned operating system changes
  • Update and expand existing systems monitoring capabilities
  • Develop automation tools for systems administration
  • Maintain account management procedures to support a growing number of researchers and scientists
  • Provide technical support to researchers using HPC resources, troubleshoot problems and develop appropriate computational strategies
  • Consult and collaborate with scientist co-workers to determine best system configurations for applications

Requirements

QUALIFICATIONS & SKILLS
  • Minimum of 5 years of Red Hat, CentOS and/or Ubuntu Linux system administration experience
  • Minimum of 3 years of scripting experience with Bash, Perl or Python
  • Minimum of 3 years of HPC Linux cluster administration experience, including job queueing and resource management, parallel filesystems and network management
  • Prior experience with configuration management tools, such as Ansible, Chef, Puppet
  • Prior experience with physical Linux hardware
  • NFS experience.

Preference
  • Experience with HPC job queueing systems such as SLURM or PBS
  • Experience with HPC management software such as Bright Cluster Manager
  • Experience managing parallel and cluster file systems such as NFS, BeeGFS, GPFS, Weka, or Lustre
  • Network management experience, especially InfiniBand
  • Experience with GPU compute in HPC and desktop environments
  • Experience with container technologies such as Docker and Singularity
  • Environment module experience with Tcl or Lua modules
  • Experience with physical Linux hardware, whether servers or workstations.
  • The smaller cluster runs Bright Cluster Manager, which the replacement cluster will also use, so experience with it is a plus.
  • Parallel file system knowledge.
  • BeeGFS experience
  • GPFS experience
  • Lustre or Weka experience.
  • GPU experience.
  • Docker or Singularity experience.
  • Environment Modules experience.

Attributes For Success
  • Provide technical expertise to improve HPC cluster management, performance, and resiliency
  • Analytical thinking and problem solving skills for effective troubleshooting and problem resolution
  • Ability to work both independently and as part of the team; flexibility in dealing with assignments and in working on several projects simultaneously
  • Ability to effectively communicate with people of diverse backgrounds and computer knowledge

  • ID: #22910795
  • State: Illinois, North Chicago 60064, USA
  • City: North Chicago
  • Salary: $55 - $85
  • Job type: Contract
  • Posted: 2021-11-17
  • Deadline: 2021-12-17
  • Category: Et cetera