Senior HPC System-of-Systems Scientist

Date: May 4, 2022

Location: Oak Ridge, TN, US, 37830-8050

Company: Oak Ridge National Laboratory

Requisition Id 7029 

 

 

Overview: 

The INTERSECT initiative at ORNL is building interconnected “Smart Labs of the Future” enabled through a common One-ORNL system-of-systems computational infrastructure. The initiative identifies compelling autonomous science projects and develops a system-of-systems architecture to document required data management, data analysis workflows, experiment management, and AI/ML capabilities. Computer scientists, data scientists, and domain scientists then collaborate closely in the spirit of co-design to develop and integrate these capabilities within diverse “Smart Labs” across the ORNL campus.

The Data Lifecycle and Scalable Workflows group at Oak Ridge National Laboratory is seeking a highly motivated senior scientist to research, design, and deploy capabilities for next-generation leadership computing system of systems. The successful candidate should have a deep understanding of one or more of the following:

  • System-of-Systems Architectures
  • Open System Architecture
  • Edge Compute Systems
  • Microservice Architectures
  • Cloud Architectures
  • System-System Interconnectivity (e.g., Message Brokers, Network Topologies, Network Abstraction Layers)

This position is part of the Advanced Technologies Section within the National Center for Computational Sciences (NCCS) Division.

The Advanced Technologies Section offers scientific, technical, operational, and thought leadership by developing, hardening, and deploying solutions for compute- and data-intensive computing environments.

The NCCS provides state-of-the-art computational and data science infrastructure, coupled with dedicated technical and scientific professionals, to accelerate scientific discovery and engineering advances across a broad range of disciplines. NCCS hosts the Oak Ridge Leadership Computing Facility, one of DOE’s National User Facilities. NCCS will deploy an exascale system in 2021.
 

Major Duties and Responsibilities

  • Lead the system-of-systems architecture development used to interconnect HPC, edge compute nodes, experimental instruments, and other data-creating devices.
  • Lead the development of microservice architectures that support the automation of advanced experiment and data analysis workflows.
  • Lead the requirements gathering, procurement, and NRE activities for various NCCS compute and data system projects as well as DOE-sponsored vendor research programs.
  • Advise senior leadership on upcoming leadership compute systems, edge systems, data system designs, hybrid system-of-systems/cloud architectures, procurements, and deployments.
  • Lead collaborations with internal and external researchers on a wide variety of interconnected HPC and edge compute platform design projects.
  • Lead, integrate and deploy research in hardware-software design and configurations for heterogeneous large-scale system-of-systems.
  • Perform scaling and performance evaluations of modern heterogeneous and extreme-scale high-performance computing systems.
  • Research and develop new capabilities that execute on ORNL’s leading data infrastructures.
  • Collaborate in authoring peer-reviewed papers, technical papers, reports, and proposals.
  • Perform service, including mentoring, participating in conference committees, and maintaining membership and leadership

 

Basic Qualifications

  • M.S. in computer science, computer engineering, computational science, or a related field and 12+ years of experience or equivalent, or a B.S. in one of these fields and 15+ years of experience.
  • Experience creating architectures that interconnect independent systems into an interoperable system of systems.
  • Experience in the development and/or implementation of advanced microservice architectures.

 

Preferred Qualifications

  • Ph.D. in computer science, computer engineering, computational science, or a related field and 6+ years of experience or equivalent.
  • Demonstrated research experience and ability to apply cutting-edge technologies to system prototypes and ultimately to a stable operational system.
  • Experience with creating and/or implementing common open system architectures.
  • Experience integrating and utilizing edge compute devices.
  • Experience developing and/or implementing cloud architectures and associated technologies.
  • Demonstrated experience in integrating a variety of tools/software (research and commercial) within a heterogeneous ecosystem.
  • Experience with one or more of:
      • Working with compute accelerators, heterogeneous nodes, and system architectures.
      • Hierarchical memory architectures and/or storage class memories.
      • High-performance interconnects.
      • Heterogeneous system design and deployment.
  • Experience developing operational models using the DoD Architecture Framework.
  • Understanding of modern software development practices.
  • Understanding of hardware/software interactions.
  • Experience with Docker, databases, and visualizations.
  • Proven ability to analyze, propose, develop, and deploy solutions to existing problems.
  • Experience conducting Software Development Life Cycle (SDLC) analysis, including software analysis, code analysis, requirements analysis, software review, identification of code metrics, system risk analysis, security analysis, and software reliability analysis.
  • Excellent interpersonal skills, oral and written communication skills, and strong personal motivation.

 

Special Requirement:

This position requires access to technology that is subject to export control requirements. Successful candidates must be qualified for such access without an export control license.

 

ORNL Ethics and Conduct:

As a member of the ORNL scientific community, you will be expected to commit to ORNL's Research Code of Conduct. Our full code of conduct and a statement by the Lab Director's office can be found here: https://www.ornl.gov/content/research-integrity

 

Benefits at ORNL:  

UT-Battelle offers an exceptional benefits package that includes a matching 401(k), pension plan, paid vacation, and medical/dental plan. On-site amenities include a credit union, a medical clinic, and free fitness facilities.

 

Relocation:  

UT-Battelle offers a wide range of relocation benefits for individuals and families to make it easier to come and work here. If you are invited to interview, please ask your recruiter about relocating with ORNL.

 

This position will remain open for a minimum of 5 days, after which it will close when a qualified candidate is identified and/or hired.

We accept Word (.doc, .docx), Adobe (unsecured .pdf), Rich Text Format (.rtf), and HTML (.htm, .html) files up to 5MB in size. Resumes from third-party vendors will not be accepted; these resumes will be deleted and the candidates submitted will not be considered for employment.


If you have trouble applying for a position, please email ORNLRecruiting@ornl.gov.


ORNL is an equal opportunity employer. All qualified applicants, including individuals with disabilities and protected veterans, are encouraged to apply. UT-Battelle is an E-Verify employer.