Skip to the content.

banner

ORNL AI Initiative: Secure, Trustworthy, and Energy-Efficient AI for Science Tutorial Series

ORNL AI Initiative’s tutorial series is designed to equip the scientific community with cutting-edge methodologies, technologies developed within ORNL and specifically through the AI initiative.

Each half-day tutorial is structured to cover a broad spectrum of AI for science applications, from graph neural networks in materials modeling to machine learning in hyperspectral image reconstruction.

The tutorials are designed for researchers of all levels who seek to deepen their understanding of AI for Science. Our primary audience includes researchers with a foundation in AI, aiming to explore and implement advanced, trustworthy AI techniques within scientific contexts. Additionally, we invite researchers and practitioners from the industry, as well as graduate, postgraduate, and undergraduate students from universities who are engaged in AI research and aspire to acquire knowledge specifically in AI for science. Although we will touch upon fundamental concepts in AI for science, our curriculum does not encompass introductory AI topics.

The tutorials will place a strong emphasis on open-source principles and non-sensitive topics and content. We will share open-source code, provide detailed code walkthroughs, and offer hands-on exercises to enhance learning and engagement.

The tutorials will be held virtually, allowing participants to join virtually. Microsoft Teams link will be announced.

Registration is not required for ORNL participants. For non-ORNL participants, an entry pass is required even for virtual attendance. For non-US citizens, this process may take up to 45 days. Please email Jessica Woods at woodsjb@ornl.gov and Taylor Bullock at bullocktm@ornl.gov. Include a short bio and explain why you wish to attend the tutorial series. Note that the tutorials will be recorded and made available online after 2-3 weeks. Therefore, please request an entry pass only if you need to attend virtually.

⤾ Back to top

Next Tutorial

Past Tutorial

Material Property Prediction with Large Scale GNNs
Time: 03/28/2024, 2:00 PM - 4:00 PM (ET)
Virtual (Microsoft Teams)
Massimiliano Lupo Pasini
Oak Ridge National Laboratory

 
MLP
Massimiliano Lupo Pasini
Staff Scientist
Computing and Computational Sciences Directorate, ORNL

Abstract

During the tutorial, we will show how to use HydraGNN (https://github.com/ORNL/HydraGNN), our scalable implementation of multi-task learning graph neural networks developed at Oak Ridge National Laboratory #ORNL as part of the ORNL Artificial Intelligence Initiative. After discussing scientific motivations behind the need to develop scalable GNN training to support scientific applications of interest to the US Department of Energy (DOE), we will cover a hands-on session with examples of increasing difficulty. In particular, we will show how HydraGNN can be used to scale the training of GNN models on millions of atomic structures on OLCF-Summit, NERSC-Perlmutter, and OLCF-Frontier. To this day, we achieved linear scaling on these DOE leadership class supercomputing facilities using up to 1,024 GPUs.

Bio

Massimiliano (Max) Lupo Pasini obtained his Bachelor of Science and Master of Science in Mathematical Engineering at the Politecnico di Milano in Milan, Italy. The focus of his undergraduate and master studies was statistics and discretization techniques and reduction order models for partial differential equations. He obtained his PhD in Applied Mathematics at Emory University in Atlanta (GA) in May 2018. The main topic of his doctorate work was the development of efficient and resilient linear solvers for upcoming computing architectures moving towards exascale. Upon graduation, Max joined the Oak Ridge National Laboratory (ORNL) as a Postdoctoral Researcher Associate in the Scientific Computing Group at the National Center for Computational Sciences (NCCS).

In November 2019, Max became a computational scientist in the Scalable Algorithms and Coupled Physics Group in the Advanced Computing Methods for Engineered Systems Section of the Computational Sciences and Engineering Division at ORNL. In November 2020, Max became a data scientist in the Computational Coupled Physics group within the same section.

Max’s research focuses on the development of surrogate and generative AI models for material sciences, scalable hyper parameter optimization techniques for deep learning (DL) models, and acceleration of computational methods for physics applications. He was the technical lead of the Surrogates Models for Material Properties product within the Artificial Intelligence for Science and Discovery (AISD) thrust of the ORNL Artificial Intelligence Initiative in the fiscal years FY21, FY22, and FY23. He is currently the interim technical lead of the AISD thrust for fiscal years FY24 and FY25.

slides

⤾ Back to top


ExpM+NF: Are normalizing flows the key to unlocking the exponential mechanism? Our efforts to advance differentially private machine learning
Time: 04/25/2024, 2:00 PM - 4:00 PM (ET)
Hybrid (4100, J302; Microsoft Teams)
Robert A Bridges
Oak Ridge National Laboratory

 
RAB
Robert A Bridges
Research Scientist
Computing and Computational Sciences Directorate, ORNL

Abstract

Training machine learning models on data with privacy concerns is an increasingly common and needed situation. Differential privacy provides a mathematical framework for rigorously guaranteeing privacy while releasing information—roughly speaking, one makes a deterministic algorithm into a randomized algorithm (e.g., by adding noise to the output) in such a way that it masks the original data. Of course, accuracy is sacrificed for the gain of privacy. Unfortunately, the state of the art in differentially private stochastic gradient descent (DPSGD), which is the state of the art for private machine learning, is inadequate in practice as it leads to insufficient accuracy-privacy tradeoffs. How can we transform this field to produces sufficiently accurate models while simultaneously protecting our privacy? This talk presents our work to explore a new approach to differentially private machine learning.

First necessary background will be given. I’ll provide an introduction to differential privacy requiring no background in the field with a focus on differentially private machine learning. Our target is understanding the state of the art, DPSGD. The second portion of background material will introduce normalizing flows models—a family of deep learning neural networks that, by design, preserve probability densities, and hence can be used to approximate intractable probability densities. Basic understanding of supervised machine learning and neural networks is required.

Once equipped with the needed background, I’ll present our new approach, ExpM+NF. This exploratory project investigates if we can use the Exponential Mechanism—a differential privacy mechanism for optimization—to train machine learning models with privacy. Historically, the exponential mechanism has been sidelined as it requires sampling from an intractable distribution. Can normalizing flow models, deep neural networks designed for handling intractable distributions be the key? Our year 1 work seeks to answer the question “Can we train a model by sampling from the exponential mechanism via an auxiliary normalizing flow?”. Experiments and results will be presented. Our current (year 2) work seeks to prove a differential privacy guarantee that incorporates the normalizing flow’s approximation to the “real” exponential mechanism distribution.

Bio

Robert A. Bridges (Bobby), is a Senior Research Mathematician at ORNL sitting in the National Security Sciences Directorate. Bobby currently focuses on advancing differentially private machine learning (as this talk may suggest), and using mathematical system models to identify feedback vulnerabilities. Bobby holds an undergraduate degree from Creighton University in mathematics as well as a PhD in mathematics from Purdue University (and he is very happy with both teams’ deep run in the March Madness tourney!) Leaving pure mathematics, Bobby was fortunate to obtain a postdoc position at ORNL in 2012, where he has since supported a wide variety of projects with machine learning, statistical, and applied math expertise. Previous work includes leading projects on intra-vehicle network security and reverse engineering algorithms, leading large-scale experiments testing commercial security tools, and serving for 2 years as the Cybersecurity Research Group Leader.

slides


PI3NN: Uncertainty Quantification for ML models
Time: 06/28/2024, 9:00 AM - 12:00 PM (ET)
Virtual (Microsoft Teams)
Dan Lu
Oak Ridge National Laboratory

   
DL SL
Dan Lu
Senior Research Scientist
Computing and Computational Sciences Directorate, ORNL
Siyan Liu
Research Scientist
Computing and Computational Sciences Directorate, ORNL

Abstract

In machine learning (ML), understanding and quantifying prediction uncertainties is crucial, particularly when decisions rely on ML outputs. In this tutorial, we will introduce PI3NN (https://github.com/liusiyan/UQnet), a prediction interval-based method for uncertainty quantification (UQ) of ML models.

We will begin with an overview of UQ in ML, discussing its significance and impact across various research fields. We will then explore the basics of the PI3NN method, providing practical examples to demonstrate its broad applicability in diverse scientific domains. The tutorial will include a hands-on session where participants can familiarize themselves with this method through examples that include dense neural networks with synthetic data and Long Short-Term Memory networks with real observations. We will guide participants step-by-step through the process in a Google Colab Notebook format, enhancing both understanding and practical skills.

Bio

Dan Lu is a Senior Staff Scientist in the Computational Earth Sciences Group in the Computational Sciences and Engineering Division. Her research interests include Machine Learning (ML), Uncertainty Quantification (UQ), Surrogate Modeling, Inverse Modeling, Multimodel Analysis, Sensitivity Analysis, Experimental Design, and Numerical Simulations of Hydrological and Earth Systems. Dan is the PI of multiple projects funded by both the Office of Science and Applied Offices within DOE, including an UQ for ML project, an Earth System Modeling project, a Hydropower project, and a Geological Carbon Storage project. She has served on several editorial boards of international journals, organized dozens of technical workshops, authored over 70 publications, and developed two software. Dan earned her Ph.D. in Computational Hydrology at Florida State University in 2012, and received the DOE Early Career Award in 2023.

Siyan Liu is a research scientist at Oak Ridge National Laboratory (ORNL), working with the Computational Earth Sciences Group (CESG) and AI initiative. He finished his Ph.D. degree in Chemical & Petroleum Engineering and worked as a postdoc at ORNL before his current role. He has broad research interests, including developing various AI models and foundational models, uncertainty quantification, HPC and scalable AI model training, AI for science applications, numerical simulation of fluid flow in porous media, computational fluid dynamics (CFD), reservoir simulation, apply data analytics and artificial intelligence on other science and engineering problems, and high-performance parallel computing.


VAE-NCDE: a generative time series model for probabilistic multivariate time series forecasting


Time: 07/26/2024, 9:00 AM - 12:00 PM (ET)
Virtual (Microsoft Teams)
William L Gurecky
Oak Ridge National Laboratory

 
WLG
William L Gurecky
Research Scientist
Computing and Computational Sciences Directorate, ORNL

Abstract

TBA

Bio

William is an R&D associate in the Nuclear Energy and Fuel Cycle Division at Oak Ridge National Laboratory. He currently works with the Power Reactor Modeling group developing performant software solutions to address industry identified challenge problems. His research interests include applying Bayesian statistics and machine learning to reactor physics problems. As the principle investigator of a laboratory directed research and development project, William developed a flexible parallel optimization package, ML-PSA, that joins modern machine learning methods with multi-fidelity physics tools to solve multi-constrained and combinatorial optimization problems encountered in reactor core design.

William co-developed the crud simulation code, MAMBA, that is a coupled component of the Virtual Environment for Reactor Applications (VERA). MAMBA allows prediction of crud-induced power shifts, constituting a key capability in the Consortium for the Advanced Simulation of LWRs’ (CASL) technical portfolio. Additionally, he develops reduced order models and applies statistical inference techniques to a variety of model calibration tasks in MAMBA and CTF.

William holds a B.Sc. in Mechanical Engineering (2013), an M.S.E. in Mechanical Engineering (2015) and a Ph.D. in Nuclear & Radiation Engineering (2018) from the University of Texas at Austin. In his doctoral work, William developed a statistical downscaling method to capture the influence of fine-scale flow details down stream from spacer grids on the growth rate of crud in PWRs.


ML-Based Sparse Image Reconstruction On The Edge


Time: 10/18/2024, 9:00 AM - 12:00 PM (ET)
Virtual (Microsoft Teams)
Rama K. Vasudevan, Narasinga Rao Miniskar, Anthony Cabrera, and Singanallur V. Venkatakrishnan
Oak Ridge National Laboratory

       
RKV NRM AMC VVS
Rama K. Vasudevan
Group Leader, Data NanoAnalytics (DNA) Group
Center for Nanophase Materials Sciences, ORNL
Narasinga Rao Miniskar
Senior R&D Staff
Computer Science and Mathematics Division, ORNL
Anthony Cabrera
Research Scientist
Computer Science and Mathematics Division, ORNL
S. V. Venkatakrishnan
R&D Staff
Electrification and Energy Infrastructures Division, ORNL

Abstract Hyper-spectral imaging is of key importance to rapidly determining the functional properties of materials when deployed from conventional microscopy techniques, but also sees significant applications in areas of optical (satellite) imagery and medical imaging technologies for e.g., determining changes to land use, functional changes to organs, etc. The sparse scanned image data from scanning probe microscopy instruments improves overall scanning speed but the original image need to be reconstructed from this sparse data, which is very challenging. In this tutorial, we walk through the hyperspectral image reconstruction traditional method (Biharmonic) and AI based methods (Deep Inverse), and provide an insight on deploying these algorithms on FPGA.

Tentative agenda:

Bio

Schedule

Please reach out if you are interested in presenting at a future event

Date Title ORNL Tutorial Lead
03/28/2024 Material Property Prediction with Large Scale GNNs Massimiliano Lupo Pasini
04/25/2024 ExpM+NF: Are normalizing flows the key to unlocking the exponential mechanism? Our efforts to advance differentially private machine learning Robert A Bridges
06/28/2024 PI3NN: Uncertainty Quantification for ML models Dan Lu
Siyan Liu
07/26/2024 VAE-NCDE: a generative time series model for probabilistic multivariate time series forecasting William L Gurecky
10/18/2024 ML-HSIR: Machine Learning based Hyperspectral Image Reconstruction the edge Narasinga Rao Miniskar
12/13/2024 Intro to causal and explainable models in materials science Ayana Ghosh

⤾ Back to top

Organization

For questions, please contact us.

       
Jong Youl Choi Yan Liu Chen Zhang Prasanna Balaprakash
Jong Youl Choi
HPC Data Research Scientist
Computer Science and Mathematics Division
ORNL
Yan Liu
Computational Scientist
Computational Sciences and Engineering Division
ORNL
Chen Zhang
Computational Scientist
Computer Science and Mathematics Division
ORNL
Prasanna Balaprakash
Director of AI Programs
Distinguished R&D Staff Scientist
Computing and Computational Sciences Directorate, ORNL

⤾ Back to top