An L2 HPC (High-Performance Computing) Engineer with an application skillset is responsible for supporting, troubleshooting, and maintaining HPC infrastructure and assisting users with scientific and engineering applications. They operate between infrastructure and application layers, ensuring optimal performance and availability of both.
Core Responsibilities:
HPC Cluster Support:
Manage day-to-day operations of HPC clusters (Slurm, PBS, LSF), monitor jobs and node health, and handle user issues at L2.
Application Support & Optimization:
Support scientific/engineering applications (ANSYS, Gaussian, GROMACS, OpenFOAM, etc.) including installation, configuration, tuning, and parallel execution optimization (MPI/OpenMP).
User & Job Management:
Handle user access and environment setup (modules, environment variables), and resolve job scheduling issues.
Performance Monitoring:
Use tools like Ganglia, Prometheus, or Nagios to monitor cluster and job performance.
OS & Middleware Maintenance:
Update and patch the OS (Linux/RHEL/CentOS), compilers (Intel, GNU), and libraries (MPI, BLAS, CUDA).
Collaboration:
Work with L3/engineering teams for complex issues and contribute to environment upgrades or migrations.
Software Module Systems (Lmod/Environment Modules)
Preferred Experience:
4-6 years in HPC environments
Exposure to GPU workloads
Understanding of parallel computing fundamentals
Ability to interact with application end-users and researchers