HPC Linux Storage Engineer
Date: Jan 21, 2026
Location: Oak Ridge, TN, US, 37831
Company: Oak Ridge National Laboratory
Requisition Id 15790
Overview
Oak Ridge National Laboratory (ORNL), home to some of the world’s most powerful supercomputers, is seeking highly skilled professionals to support large-scale storage systems, high-speed parallel file systems, and archival solutions critical to advancing scientific discovery and innovation. As part of ORNL’s leadership-class computing ecosystem, you will play a vital role in designing, deploying, optimizing, and maintaining infrastructure that powers cutting-edge research across diverse scientific domains.
This evergreen posting represents multiple opportunities across ORNL’s high-performance computing (HPC) environment, supporting scalable, reliable, and secure computing and storage capabilities. Applications are reviewed on an ongoing basis as new positions become available to meet the dynamic needs of our world-class computing facility.
Job Duties and Responsibilities May Include:
- Design and Management of Infrastructure: Architect, deploy, and manage large-scale storage systems and HPC platforms to support research, scientific, and enterprise workloads. Develop and implement solutions for structured, unstructured, and archival data storage, focusing on scalability, reliability, and performance.
- Systems Analysis and Development: Apply systems analysis techniques to consult with users/customers, determine functional requirements, and design, test, or optimize storage and computational solutions tailored to their needs. Develop, document, and modify solutions, including system prototypes and automated workflows, to enhance operational efficiency.
- Performance, Optimization, and Troubleshooting: Ensure the performance, availability, scalability, and security of diverse infrastructure environments. Diagnose and resolve complex operational challenges quickly and effectively, applying advanced performance optimization techniques for a wide range of workloads.
- Collaboration and Best Practices: Work closely with stakeholders from research, technical, and operational teams to understand workflows, identify opportunities for improvement, and deliver effective solutions. Define, implement, and enforce best practices, standards, and procedures across projects and teams.
- Automation and Innovation: Automate system configuration, provisioning, monitoring, and maintenance to reduce manual effort and downtime. Evaluate emerging technologies and tools to continuously improve system capabilities, adapt to changing needs, and plan for future advancements.
- Support and Maintenance: Support critical infrastructure through participation in a 24/7 on-call rotation and off-hours maintenance windows. Resolve hardware and software issues in coordination with vendors, ensuring minimal impact on operations.
Basic Qualifications
- Bachelor’s degree in computer science, engineering, information technology, or a related field, and at least 5 years of professional experience managing Linux/UNIX systems in heterogeneous environments. An equivalent combination of education and experience will be considered.
- Demonstrated experience with high-performance computing (HPC) storage systems and enterprise storage platforms (e.g., Lustre, GPFS, BeeGFS, or WEKA).
- Proficiency in scripting languages (e.g., Python, Bash, Perl) and configuration management, automation, and version control tools (e.g., Ansible, Puppet, Git).
- Strong communication, collaboration, and problem-solving skills with the ability to design and implement solutions independently.
Preferred Qualifications
- Active DOE Q, DoD Top Secret, or TS/SCI clearance.
- Hands-on experience with HPC cluster technologies, including job schedulers (e.g., Slurm) and system deployment tools (e.g., Warewulf, PXE boot, Bright Cluster Manager).
- Expertise in high-performance parallel file systems, tape library systems, and storage and networking technologies (e.g., RAID, ZFS, NVMe-oF, InfiniBand).
- Familiarity with performance monitoring tools (e.g., Grafana, Nagios), benchmarking systems, and I/O optimization techniques.
- Experience with virtualization and containerization platforms (e.g., VMware, KVM, Podman, Apptainer).
- Background in open source development, including submitting patches upstream, and building custom Linux packages (e.g., RPM for RHEL).
- Demonstrated ability to troubleshoot and optimize high-performance storage, compute, and networking systems in HPC environments.
- Experience documenting technical processes and contributing to complex technical projects in government, scientific, or highly technical settings.
Hybrid Eligibility
These positions are based in Oak Ridge, Tennessee, and require onsite presence. We offer a flexible work environment that supports both the organization and the employee. A hybrid working arrangement may be available, allowing you to work from home periodically while reporting onsite to the Oak Ridge, Tennessee, location on a regular weekly basis.
Special Requirement
This position requires the ability to obtain and maintain clearance from the Department of Energy. As such, this position is a Workplace Substance Abuse Program (WSAP) testing designated position. WSAP positions require passing a pre-placement drug test and participation in an ongoing random drug testing program.
About ORNL
As a U.S. Department of Energy (DOE) Office of Science national laboratory, ORNL has an impressive 80-year legacy of addressing the nation’s most pressing challenges. Our team is made up of over 7,000 dedicated and innovative individuals! Our goal is to create an environment where a variety of perspectives and backgrounds are valued, ensuring ORNL is known as a top choice for employment. These principles are essential for supporting our broader mission to drive scientific breakthroughs and translate them into solutions for energy, environmental, and security challenges facing the nation.
Why Join Us
- Work on the world’s most powerful supercomputers, including Frontier, the first system to achieve exascale performance.
- Enable breakthrough science in fields like fusion energy, climate modeling, AI, and national security.
- Collaborate with diverse teams of scientists, engineers, and technologists from across the DOE complex and academia.
- Grow your career in a mission-driven, innovation-focused environment with access to professional development and leadership opportunities.
- Enjoy life in East Tennessee, with a thriving research community, scenic outdoor recreation, and a high quality of life.
This position will remain open for a minimum of 5 days, after which it will close when a qualified candidate is identified and/or hired.
We accept Word (.doc, .docx), Adobe (unsecured .pdf), Rich Text Format (.rtf), and HTML (.htm, .html) files up to 5 MB in size. Resumes from third-party vendors will not be accepted; these resumes will be deleted and the candidates submitted will not be considered for employment.
If you have trouble applying for a position, please email ORNLRecruiting@ornl.gov.
ORNL is an equal opportunity employer. All qualified applicants, including individuals with disabilities and protected veterans, are encouraged to apply. UT-Battelle is an E-Verify employer.