As a core member of the Roche Science Infrastructure (RSI) team, the HPC Integration Engineer will be responsible for the end-to-end shared HPC solution and its effective use in support of scientific and end-user services. Building relationships and working closely with developers and operators of external services as well as members of the RSI team, Enterprise Operations and Engineering groups, the successful candidate will rely on their experience, knowledge and expertise to support, enhance and optimize the use of shared HPC by the scientific computing community globally at Roche. The successful candidate will be responsible for defining the strategy for how enhanced services that utilize HPC resources interact, providing an end-to-end solution, and participating in and contributing to the broader scientific community. The HPC Integration Engineer will also be responsible for contributing to the development of the overall elastic compute strategy and integrating new capabilities into HPC. Recognized as an expert in the field with demonstrated complex problem-solving abilities, they will provide technical consultancy to other members of Infrastructure Services. Some experience in mentoring and leading others in small team environments is highly desirable. The position is global and may be placed in one of several geographic locations, with South San Francisco highly preferred, followed by Kaiseraugst, Switzerland.
Collaborating with various scientific computing communities to assist in the adoption and utilization of a globally distributed and accessible hybrid HPC environment.
Building a competency for the new Elastic Compute capabilities and services.
Providing consulting, guidance and best practices for utilizing advanced scheduling and hybrid computing for scientific computing with Elastic Compute Services.
Working with Service Reliability Engineers (SREs) to enable continuous enhancement and deployment while maintaining stability and compatibility with scientific computing services and solutions, including monitoring and metrics/KPI collection.
Working with RSI service owners (job scheduler, Linux, storage) to support the technical needs of scientific computing groups.
Working with key stakeholders in assisting with the integration of scientific solutions that utilize HPC services.
Working with both RSI Engineers and key stakeholders to develop standards and processes for the build, test, deployment and lifecycle management of scientific applications.
The principal HPC Integration Engineer will be an experienced IT or Science professional with a Bachelor’s degree (advanced degree preferred) in a relevant field of technology, science or business, possessing the following qualifications:
5-10 years of experience as a High-Performance Computing consumer/developer, including extensive experience as part of, or working with, scientific computing groups across multiple disciplines.
In-depth knowledge of HPC environments and the various ways they can be used to support larger scientific computing ecosystems.
Familiarity with hybrid computing and how to leverage scalable computing through job scheduling APIs.
Knowledge of configuration management tools and CI/CD testing and delivery toolchains (e.g., Ansible, Jenkins).
Familiarity with web services and Kubernetes and how they can be integrated with or leverage HPC services and environments.
Experience with scientific workloads, including workload profiling and optimization.
Familiarity with, and hands-on experience using, HPC job schedulers and parallel file systems.
Working knowledge of scripting and programming languages such as C, C++, Fortran, Bash, CSH, TCSH, Perl, Python.
Good organization skills to balance and prioritize work, and ability to multitask.
Good communication skills for working with support personnel, customers, and managers.