SAFE-AI

About

SAFE-AI: The first workshop on Systems and Architectures for Encrypted AI

Venue: New York II (Upstairs)

This workshop on Encrypted AI will explore cutting-edge techniques for accelerating and democratizing privacy-preserving AI, focusing on Fully Homomorphic Encryption (FHE) and Secure Multi-party Computation (MPC). Designed in two parts, the workshop will first host sessions with invited speakers from academia and industry, who will share the latest advances in FHE for AI, real-world applications, and challenges in scaling encrypted AI. The workshop will cover a broad range of topics on systems and architectures for encrypted AI including acceleration techniques across the computing stack, compilers, tools, DSLs, and applications for encrypted AI.

The second part of the workshop is a hands-on tutorial centered on the Cinnamon Framework for scale-out encrypted AI, providing participants with a comprehensive understanding of architecture, compilers, DSL, and encrypted AI applications and models. Attendees will learn practical skills for creating, exploring, and deploying encrypted AI workflows, leveraging Cinnamon’s modular approach to tackle the complexity of FHE on multi-chip systems. By the end of the workshop, participants will be equipped with tools and knowledge to scale AI models securely, integrate FHE into diverse AI applications, and bridge the gap between theoretical encryption techniques and practical, high-performance AI solutions. This event is ideal for researchers, engineers, and industry professionals invested in privacy-preserving AI.

Program

Time	Event
8:30am - 8:50am	Coffee
8:50am - 9:00am	Welcome Remarks
9:00am - 9:45am	Invited Talk 1: GPU-driven FHE: Real-Time Private AI Inference Competing Custom ASIC. Speaker: Jung Ho Ahn (Seoul National University)
9:45am - 10:30am	Invited Talk 2: Fast and Accessible FHE-Secured Deep Learning with the Orion Compilation Framework. Speaker: Brandon Reagen (New York University)
10:30am - 11:00am	Break
11:00am - 11:45am	Invited Talk 3: Towards Unified Architectural Support for Privacy-Preserving Computing. Speaker: Mingyu Gao (Tsinghua University)
11:45am - 12:30pm	Invited Talk 4: HEIR: A Universal Compiler for Fully Homomorphic Encryption. Speaker: Asra Ali (Google)
12:30pm - 2:00pm	Lunch
2:00pm - 3:30pm	Cinnamon Tutorial Part 1: The first part of the tutorial will explore writing and running FHE programs in the Cinnamon Framework.
3:30pm - 4:00pm	Break
4:00pm - 5:30pm	Cinnamon Tutorial Part 2: The second part of the tutorial will explore the new parallelization techniques in the Cinnamon compiler and the Cinnamon architectural simulator.
5:30pm	Closing Remarks

Talks

Talks featuring speakers from across academia and industry on the latest developments in encrypted computing.

9:00am - 9:45am

GPU-driven FHE: Real-Time Private AI Inference Competing Custom ASIC

Abstract

A significant challenge in accelerating fully homomorphic encryption (FHE) for practical applications is achieving high performance. DARPA, through the DPRIVE program in 2020, set a demanding performance target: running inference on a 7-layer convolutional neural network (CNN) against the CIFAR-10 dataset within 25ms. Using two H100 GPUs, we have achieved a latency of 24.8ms with 87% accuracy without fabricating an ASIC. This presentation will outline the series of techniques we have developed and leveraged since 2019 to accelerate CKKS, a widely adopted FHE scheme for machine learning. These techniques include low-degree polynomial approximation (AESPA), an agile CKKS GPU library (Cheddar), and advanced channel packing strategies inspired by HyPHEN and NeuJeans, all optimized for powerful GPUs.

Speaker: Jung Ho Ahn (Seoul National University)

Jung Ho Ahn is a professor at Seoul National University. Professor Ahn received his Ph.D. in electrical engineering from Stanford University (2007), was a senior research scientist at HP Labs before joining Seoul National University, and took a sabbatical at Google (2016). His research interests include bridging the gap between the performance/efficiency demand of emerging applications (including homomorphic encryption) and the performance/efficiency potential of modern and future massively parallel systems, with a particular focus on memory subsystems. Professor Ahn is a Hall of Fame member of HPCA, ISCA, and MICRO.

9:45am - 10:30am

Fast and Accessible FHE-Secured Deep Learning with the Orion Compilation Framework

Abstract

In this talk I’ll present an overview of my lab’s work on private neural inference over the last 5 years. This will include a brief overview of our neural network optimizations and various hardware accelerators. The core focus of the talk will be our new FHE neural network compiler named Orion, which will appear in the main ASPLOS conference. Orion automates and optimizes FHE, providing SOTA performance with a PyTorch programming model, making FHE more accessible than ever before! I’ll also explain how we pack vectors, leverage state-of-the-art FHE techniques, optimize level management and bootstrap placement, handle re-scaling, fit polynomial approximations, and deal with large data structures (e.g., weights and keys). To conclude, we will present a demo of our code and show how it can be used.

Speaker: Brandon Reagen (New York University)

Dr. Reagen is a professor of Electrical and Computer Engineering at New York University with a focus on computer architecture. He leads a group of 7 PhD students whose current focus is to make cryptographic computing practical through algorithmic, compiler, and hardware optimizations for homomorphic encryption, zero-knowledge proofs, and secure multi-party computation. Prior to cryptographic computing, he made significant contributions to accelerator design and hardware for machine learning. The work has been recognized with multiple best paper nominations and Top Pick/honorable mention awards. He has further implemented many ideas, participating in multiple chip tape-outs. He is an active performer on the DARPA DPRIVE, PROWESS, and COOP programs, was recognized as a DARPA Riser in 2022, and won his NSF CAREER award in 2024. He has authored over 50 papers and holds one patent. He holds a Ph.D. in Computer Science from Harvard and B.S. degrees in Computer Engineering and Applied Mathematics from UMass Amherst.

11:00am - 11:45am

Towards Unified Architectural Support for Privacy-Preserving Computing

Abstract

Privacy-preserving computing involves several categories of cryptographic protocols such as fully homomorphic encryption (FHE), multi-party computation (MPC), and zero-knowledge proof (ZKP). In this talk I will start with our recent work that designed specialized accelerators for ZKP as well as functional units for ZKP and FHE. These efforts lead us to explore opportunities to design unified architectures that efficiently support not only widely varying parameters in each protocol, but also multiple different protocols. Given the still rapid development of cryptographic protocols, such flexibility in hardware is highly desirable.

Speaker: Mingyu Gao (Tsinghua University)

Mingyu Gao is an associate professor of computer science in the Institute for Interdisciplinary Information Sciences (IIIS) at Tsinghua University. His research interests lie in the fields of computer architecture and systems, including efficient memory architectures, scalable data processing, and hardware system security. He has published in top-tier conferences including ISCA, MICRO, ASPLOS, HPCA, OSDI, SIGMOD, and VLDB. Mingyu received his PhD in Electrical Engineering from Stanford University in 2018.

11:45am - 12:30pm

HEIR: A Universal Compiler for Fully Homomorphic Encryption

Abstract

The HEIR project is a compiler toolchain for fully homomorphic encryption (FHE) that aims to incorporate all major techniques and algorithmic optimizations in FHE, accelerate FHE research, and lower the barrier to production FHE deployments. As a universal compiler, it aims to support any FHE scheme along with a set of extensible optimizations and generate code for any FHE hardware accelerator. With that goal, it can act as a common benchmarking platform for new research ideas and hardware accelerators. In this talk, we’ll demonstrate how to use HEIR’s Python frontend to generate code for a variety of platforms. We’ll also highlight some of the FHE techniques implemented in HEIR, such as data-oblivious program transformations, data layout optimization, relinearization placement, vectorization strategies, and more.

Speaker: Asra Ali (Google)

Asra Ali is a software engineer at Google currently working on compilers for fully homomorphic encryption. Her track record includes developing and maintaining many open source projects around privacy and security, including Google’s Ring Learning with Errors implementation, Sigstore’s transparency log Rekor, and The Update Framework’s Go module. She has also worked on real-world privacy-preserving applications using FHE, resulting in one patent. Her background is in abstract math, and she has published research in number theory.

Tutorial

We will be hosting a two-part tutorial on the state-of-the-art Cinnamon framework for scale-out encrypted AI. This tutorial will be highly beneficial to anyone who wants to learn more about using FHE for encrypted AI. We ask participants to bring their laptops for the tutorial. All material for the tutorial will be provided in the form of a git repository and a Docker container. Please ensure that you have git and Docker installed on your system. More instructions can be found here: Docker Desktop and git. We recommend that participants use Visual Studio Code for this tutorial. If this is not possible, the tutorial can also be run locally through a Jupyter server.

System Requirements

OS: Linux, macOS, or WSL on Windows
Software Requirements: Docker, git
Highly Recommended: Visual Studio Code

System Setup

Because internet bandwidth at the workshop location may be poor, we ask tutorial participants to pull the Docker container in advance.

docker pull sidjay10/cinnamon_tutorial:latest
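As an optional check before the workshop, the following sketch verifies that the two required tools (docker and git) are on your PATH and that the tutorial image has already been pulled. It is not part of the official tutorial material; the helper names `have_tool` and `missing_tools` are hypothetical and introduced only for illustration.

```shell
#!/bin/sh
# Pre-flight check for the tutorial prerequisites (illustrative sketch only).

# Return 0 if the named command is on PATH, 1 otherwise.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Print the name of each missing tool from the argument list, one per line.
missing_tools() {
    for tool in "$@"; do
        have_tool "$tool" || echo "$tool"
    done
}

missing=$(missing_tools docker git)
if [ -n "$missing" ]; then
    # $missing is intentionally unquoted so the names print on one line.
    echo "Missing prerequisites:" $missing
else
    echo "docker and git found."
    # 'docker image inspect' succeeds only if the image exists locally
    # and the Docker daemon is reachable.
    if docker image inspect sidjay10/cinnamon_tutorial:latest >/dev/null 2>&1; then
        echo "Tutorial image is available locally."
    else
        echo "Tutorial image not found; run: docker pull sidjay10/cinnamon_tutorial:latest"
    fi
fi
```

Saving this as a file and running it with `sh` prints which prerequisite, if any, still needs attention. If Docker is installed but not running, the image check reports the image as not found.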

Part 1

2:00pm-3:30pm

In Part 1 of the tutorial, we will introduce the Cinnamon DSL, Cinnamon compiler and the Cinnamon emulator. We will cover the basics of writing, compiling and executing simple FHE applications using the Cinnamon framework. We will also cover an example of writing and running an encrypted logistic regression inference using the Cinnamon framework.

Part 2

4:00pm-5:30pm

In Part 2 of the tutorial, we will explore Cinnamon’s newly developed optimizations, parallelism, and scale-out techniques for encrypted AI, using the Cinnamon compiler and the Cinnamon accelerator simulator, with encrypted MNIST inference as an example.

Organizers

Siddharth Jayashankar, Wenting Zheng, Dimitrios Skarlatos

Carnegie Mellon University

Contact

Dimitrios Skarlatos (dskarlat@cs.cmu.edu)