Machine Learning Data Engineer

Date: Sep 12, 2023

Location: Oak Ridge, TN, US, 37830

Company: Oak Ridge National Laboratory

Requisition Id 11602 


Our Organization:

As a U.S. Department of Energy (DOE) Office of Science national laboratory, ORNL has an extraordinary 80-year history of solving the nation’s biggest problems. We have a dedicated and creative staff of over 6,000 people! Our vision for diversity, equity, inclusion, and accessibility (DEIA) is to cultivate an environment and practices that foster diversity in ideas and in the people across the organization, as well as to ensure ORNL is recognized as a workplace of choice. These elements are critical for enabling the execution of ORNL’s broader mission to accelerate scientific discoveries and their translation into energy, environment, and security solutions for the nation.


ORNL is home to Frontier, the world’s fastest and first exascale supercomputer—providing an open science environment to develop solutions that touch us all. With direct access to Frontier, we can simulate and engineer solutions that only exascale computing can enable.


The Analytics and AI Methods at Scale Group (AAIMS) at Oak Ridge National Laboratory is seeking qualified and driven applicants for a Machine Learning Data Engineer position in the broad area of AI for science projects. Research and development activities include, but are not limited to, scientific data collection, transformation, feature engineering, and large-scale natural language modeling and understanding. In this role, you will have the opportunity to work on some of the most challenging and impactful research and development, and to collaborate with both computer scientists and domain scientists to build end-to-end data and ML pipelines and services that facilitate and expedite the ML-assisted scientific discovery process.


As a Machine Learning Data Engineer, you should be comfortable around Linux, SQL, Python, containers, Pandas, Spark, and source control in a highly collaborative environment.


Major Duties and Responsibilities:

  • Mobilize and lead data analysis activities on projects with a focus on common deliverables, goals, and timelines, including data preparation, transformation, and feature engineering, in collaboration with scientists and engineers.
  • Research and evaluate emerging technologies and approaches from the broader ML community.
  • Evaluate and deploy scalable AI frameworks and tools, and execute them on high-performance computing (HPC) resources, in close collaboration with research staff and computing technical staff.
  • Troubleshoot data analysis issues, including implementation issues, hyperparameter choices, and modeling decisions.
  • Quickly and clearly summarize analyses, following best practices in documentation, data visualization, and provenance tracking for reproducibility.
  • Assist in preparation of manuscripts and dissemination of research results in publications and conferences.
  • Develop high-quality Python code following best practices in the community; manage code and data through version control systems and community hubs such as Hugging Face.


Basic Qualifications:

  • B.S. and 2+ years of relevant experience or an M.S. and 1+ year of relevant experience.
  • Degree concentration should be in Computer Science/Engineering or closely related field.
  • Experience with Python for data science.
  • Experience with PyTorch and/or TensorFlow.


Preferred Qualifications:

  • Understanding of supervised and unsupervised learning, reinforcement learning, and deep learning.
  • Experience with CUDA programming.
  • Experience with MPI programming and collective communication primitives.
  • Applied research experience in at least one machine learning discipline, such as natural language processing, image processing and classification, or related areas.
  • Excellent communication skills for conveying technical material to both scientists and non-scientists in both written and oral presentations.
  • Self-disciplined work ethic and eagerness to tackle challenging research problems.
  • Ability to communicate and work on diverse and interdisciplinary teams.


Benefits at ORNL:

ORNL offers competitive pay and benefits programs to attract and retain talented people. The laboratory offers many employee benefits, including medical and retirement plans and flexible work hours, to help you and your family stay happy and healthy. Employee amenities such as on-site fitness, banking, and cafeteria facilities are also provided for convenience.


In addition, we offer a flexible work environment that supports both the organization and the employee. A hybrid/onsite working arrangement may be available with this position.

Other benefits include: Prescription Drug Plan, Dental Plan, Vision Plan, 401(k) Retirement Plan, Contributory Pension Plan, Life Insurance, Disability Benefits, Generous Vacation and Holidays, Parental Leave, Legal Insurance with Identity Theft Protection, Employee Assistance Plan, Flexible

If you have difficulty using the online application system or need an accommodation to apply due to a disability, please email: or call 1.866.963.9545.




This position will remain open for a minimum of 5 days, after which it will close when a qualified candidate is identified and/or hired.

We accept Word (.doc, .docx), Adobe (unsecured .pdf), Rich Text Format (.rtf), and HTML (.htm, .html) up to 5MB in size. Resumes from third party vendors will not be accepted; these resumes will be deleted and the candidates submitted will not be considered for employment.

If you have trouble applying for a position, please email

ORNL is an equal opportunity employer. All qualified applicants, including individuals with disabilities and protected veterans, are encouraged to apply. UT-Battelle is an E-Verify employer.

Nearest Major Market: Knoxville