Student Research Opportunities

Contact: Xuhao Chen

We are looking for graduate and undergraduate students to join our lab. Feel free to reach out if you are interested in machine learning systems, computer architecture, and/or high-performance computing. Our projects have the potential to become MEng thesis work, and we have 6-A program opportunities available. If you are interested, please send your CV to cxh@mit.edu and fill in the recruiting form.

Research Summary: The AI revolution is transforming industries and having a significant impact on society. However, AI is computationally expensive and hard to scale, which poses great challenges for computer system design. Our lab is broadly interested in computer system architecture and high-performance computing, particularly for scaling AI and ML computation.

Top-tier systems & HPC conferences: [OSDI, SOSP], [ASPLOS], [ISCA], [VLDB], [SIGMOD], [SC], [PPoPP].

Interesting MLSys topics: Transformer (Attention), Mixture-of-Experts, Vector Similarity Search, Deep Recommendation Models, Graph Machine Learning, Graph Sampling, Graph Algorithms, Robotics, Large Language Models, GPU Acceleration, Model Serving, Graph Databases, Graph Transformer, Diffusion, Generative AI, Reinforcement Learning.

Below are some ongoing research projects.

Scalable Vector Database [Elx Link]

Recent advances in deep learning map almost all types of data (e.g., images, videos, documents) into high-dimensional vectors. Queries on high-dimensional vectors enable complex semantic analysis that was previously difficult, if not impossible; they have thus become the cornerstone of many important online services such as search, e-commerce, and recommendation systems.

Vector Database

In this project we aim to build a massive-scale vector database on multi-CPU and multi-GPU platforms. In a vector database, the major operation is searching for the k vectors closest to a given query vector, known as k-Nearest-Neighbor (kNN) search. At massive data scale, exact search is too expensive, so Approximate Nearest-Neighbor (ANN) search is used in practice instead. One of the most promising ANN approaches is the graph-based approach: first construct a proximity graph on the dataset, connecting pairs of vectors that are close to each other, then, for each query, traverse the proximity graph to find the vectors closest to the query vector. In this project we will build a vector database using a graph-based ANN search algorithm that supports billion-scale datasets.
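The traversal described above can be sketched in a few lines. Below is a minimal, illustrative best-first search over a toy proximity graph, the core loop behind HNSW- and DiskANN-style indexes; `graph` maps each vector id to its neighbor list, and all names are placeholders, not part of any real library:

```python
import heapq

def greedy_search(graph, vectors, query, entry, k=2):
    """Best-first traversal of a proximity graph: repeatedly expand the
    nearest unexplored candidate until no candidate can improve the top-k."""
    def dist(u):
        return sum((a - b) ** 2 for a, b in zip(vectors[u], query))

    visited = {entry}
    candidates = [(dist(entry), entry)]       # min-heap by distance to query
    results = [(-dist(entry), entry)]         # max-heap (negated) of best k so far
    while candidates:
        d, u = heapq.heappop(candidates)
        if d > -results[0][0] and len(results) >= k:
            break  # nearest unexplored candidate is worse than the worst result
        for v in graph[u]:
            if v not in visited:
                visited.add(v)
                dv = dist(v)
                if len(results) < k or dv < -results[0][0]:
                    heapq.heappush(candidates, (dv, v))
                    heapq.heappush(results, (-dv, v))
                    if len(results) > k:
                        heapq.heappop(results)  # drop current worst
    return sorted((-d, v) for d, v in results)  # [(distance, id)], nearest first
```

The same loop scales to billions of vectors because each query touches only a small, data-dependent fraction of the graph; the engineering challenge is laying out the graph and vectors for CPU caches, GPUs, and SSDs.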

Qualifications:

References

  1. A Real-Time Adaptive Multi-Stream GPU System for Online Approximate Nearest Neighborhood Search, CIKM 2024.
  2. ParlayANN, PPoPP 2024.
  3. CAGRA, ICDE 2024.
  4. iQAN, PPoPP 2023.
  5. Billion-scale similarity search with GPUs
  6. HNSW
  7. DiskANN [Video] [Slides] [Slides2]
  8. intel/ScalableVectorSearch
  9. IntelLabs/VectorSearchDatasets

Zero-Knowledge Proof [Elx Link]

Zero-knowledge proof (ZKP) is a cryptographic method of proving the validity of a statement without revealing anything other than the validity of the statement itself. This “zero-knowledge” property is attractive for many privacy-preserving applications, such as blockchain and cryptocurrency systems. Despite its great potential, ZKP is notoriously compute-intensive, which hampers its real-world adoption. Recent advances in cryptography, known as zk-SNARKs, have brought ZKP closer to practical use. Although a zk-SNARK enables fast verification of the proof, proof generation is still quite expensive and slow.

Zero-Knowledge Proof

In this project, we will explore ZKP acceleration using algorithmic innovations, software performance engineering, and parallel hardware such as GPUs, FPGAs, or even ASICs. We aim to investigate and implement efficient algorithms for accelerating elliptic-curve computation. We will also explore acceleration opportunities for the major operations, e.g., finite-field arithmetic, Multi-Scalar Multiplication (MSM), and Number-Theoretic Transforms (NTT).
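As a rough illustration of why MSM is a prime acceleration target, here is a sketch of Pippenger's bucket method. Real provers operate on elliptic-curve points; this toy version uses integer addition modulo a prime as a stand-in group operation so the windowing and bucket-accumulation structure is visible. The modulus and all names are hypothetical:

```python
# Hypothetical stand-in modulus; a real prover would work in a curve group.
P = 2**61 - 1

def msm_naive(scalars, points):
    """Reference MSM: sum of s_i * g_i in the stand-in group (Z mod P, +)."""
    return sum(s * g for s, g in zip(scalars, points)) % P

def msm_pippenger(scalars, points, c=4):
    """Pippenger's bucket method with c-bit windows."""
    nbits = max(s.bit_length() for s in scalars)
    nwin = (nbits + c - 1) // c
    total = 0
    for w in reversed(range(nwin)):            # most-significant window first
        total = (total * (1 << c)) % P         # c "doublings" of the accumulator
        buckets = [0] * (1 << c)               # one bucket per window digit
        for s, g in zip(scalars, points):
            digit = (s >> (w * c)) & ((1 << c) - 1)
            if digit:
                buckets[digit] = (buckets[digit] + g) % P
        # Running-sum trick: sum_j j * buckets[j] using ~2 * 2^c additions
        running = window_sum = 0
        for j in range(len(buckets) - 1, 0, -1):
            running = (running + buckets[j]) % P
            window_sum = (window_sum + running) % P
        total = (total + window_sum) % P
    return total
```

The bucket accumulation is embarrassingly parallel across windows and buckets, which is exactly what GPU and ASIC MSM engines exploit; the expensive part they optimize is that each "+" above is really an elliptic-curve point addition.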

Qualifications:

References

  1. Accelerating Multi-Scalar Multiplication for Efficient Zero Knowledge Proofs with Multi-GPU Systems, ASPLOS 2024.
  2. GZKP, ASPLOS 2023.
  3. MSMAC: Accelerating Multi-Scalar Multiplication for Zero-Knowledge Proof, DAC 2024.
  4. PipeZK: Accelerating Zero-Knowledge Proof with a Pipelined Architecture, ISCA 2021.
  5. Hardcaml MSM: A High-Performance Split CPU-FPGA Multi-Scalar Multiplication Engine, FPGA 2024.
  6. Accelerating Zero-Knowledge Proofs Through Hardware-Algorithm Co-Design, MICRO 2024.

Deep Recommendation System [Elx Link]

Deep Learning Recommendation Models (DLRMs) are widely used across industry to provide personalized recommendations to users and consumers. They are the backbone of user engagement in industries such as e-commerce, entertainment, and social networking. DLRMs have two stages: (1) training, in which the model learns to minimize the difference between predicted and actual user interactions, and (2) inference, in which the model provides recommendations based on new data. Traditionally, GPUs have been the hardware of choice for DLRM training because of its high computational demand, while CPUs have been widely used for DLRM inference due to tight latency requirements that restrict the batch size. A key bottleneck in inference is the high compute and memory-bandwidth demand, which contributes greatly to the load on data centers and computing clusters.

Deep Recommendation System

In this project, we will explore GPU optimizations for the embedding stage of the DLRM inference pipeline, which has traditionally run only on CPUs. Initially, we would like to explore CPU-GPU coupled schemes, for instance using GPU memory as extra cache space (e.g., to store more embeddings or to memoize sparse-feature computations), as well as multi-GPU cluster computation to further accelerate inference for more complex models. The goal is to profile existing inference frameworks and implement novel ones that achieve substantial speedups for DLRM inference on GPUs.
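To illustrate the kind of caching scheme mentioned above, here is a toy simulation of an embedding table whose hot rows live in a small "device" cache with LRU eviction. This is a sketch under simplifying assumptions: both tiers are plain Python structures, whereas a real system would keep the cache in GPU memory and pay a PCIe/NVLink transfer on every miss. The class and all names are hypothetical:

```python
from collections import OrderedDict

class CachedEmbeddingTable:
    """Toy model of a host-resident embedding table with a small device cache."""

    def __init__(self, table, cache_rows=2):
        self.table = table              # full table on the "host" (CPU DRAM)
        self.cache = OrderedDict()      # hot rows on the "device" (GPU), LRU order
        self.cache_rows = cache_rows
        self.hits = self.misses = 0

    def lookup(self, idx):
        if idx in self.cache:
            self.hits += 1
            self.cache.move_to_end(idx)         # refresh LRU position
        else:
            self.misses += 1                    # would be a host-to-device copy
            self.cache[idx] = self.table[idx]
            if len(self.cache) > self.cache_rows:
                self.cache.popitem(last=False)  # evict least-recently-used row
        return self.cache[idx]

    def embedding_bag(self, indices):
        """Sum-pool the embedding rows of one sparse feature, as in DLRM."""
        rows = [self.lookup(i) for i in indices]
        return [sum(col) for col in zip(*rows)]
```

Because real sparse-feature accesses are highly skewed, even a small cache of hot rows can absorb most lookups; measuring that hit rate on production-like traces is one starting point for the project.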

Qualifications:

References

  1. Optimizing CPU Performance for Recommendation Systems At-Scale, ISCA 2023.
  2. GRACE, ASPLOS 2023.
  3. EVStore, ASPLOS 2023.

Graph AI Systems [Elx Link]

Deep learning is good at capturing hidden patterns in Euclidean data (images, text, videos). But what about applications whose data comes from non-Euclidean domains and is represented as graphs, with complex relationships and interdependencies between objects? That’s where Graph AI and Graph ML come in. Handling the complexity of graph data and graph algorithms requires innovation in every layer of the computer system, including both software and hardware.

An overview of graph neural networks for anomaly detection in e-commerce

In this project we will design and build efficient graph AI systems that support scalable graph AI computing. In particular, we will build software frameworks for graph AI and ML, e.g., graph neural networks (GNNs), graph pattern mining (GPM), and graph sampling, as well as hardware accelerators that further enhance system efficiency and scalability.
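As a tiny illustration of the computation a GNN framework must make fast, the sketch below implements one simplified GraphSAGE-style layer (mean aggregation over a node and its neighbors, then a linear transform and ReLU) in plain Python. A real framework would batch and fuse these operations on GPUs over billions of edges; all names here are illustrative:

```python
def gnn_layer(adj, feats, weight):
    """One simplified mean-aggregation GNN layer.

    adj:    dict {node: [neighbor, ...]}
    feats:  dict {node: [f_in floats]}
    weight: nested list of shape [f_in][f_out]
    """
    out = {}
    for u, nbrs in adj.items():
        # Mean of the node's own features and its neighbors' features.
        group = [feats[u]] + [feats[v] for v in nbrs]
        agg = [sum(col) / len(group) for col in zip(*group)]
        # Linear transform h = agg @ weight, followed by ReLU.
        h = [sum(a * weight[i][j] for i, a in enumerate(agg))
             for j in range(len(weight[0]))]
        out[u] = [max(0.0, x) for x in h]
    return out
```

The irregular, neighbor-dependent memory accesses in the aggregation step are what distinguish GNN workloads from dense deep learning and motivate specialized systems and accelerators.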

Qualifications:

References

  1. F^2CGT, VLDB 2024.
  2. gSampler, SOSP 2023 [Code]
  3. NextDoor, EuroSys 2021 [Code]
  4. Scalable Graph Sampling on GPUs with Compressed Graph, CIKM 2022.

Graph AI for Financial Security [Elx Link]

The advent of cryptocurrency introduced by Bitcoin ignited an explosion of technological and entrepreneurial interest in payment processing. Dampening this excitement was Bitcoin’s bad reputation. Many criminals used Bitcoin’s pseudonymity to hide in plain sight, conducting ransomware attacks and operating dark marketplaces for the exchange of illegal goods and services.

This project offers a golden opportunity to apply machine learning to financial forensics. Bitcoin transactions naturally form a financial transaction graph, to which we can apply graph machine learning and graph pattern mining techniques to automatically detect illegal activities. We will explore the identification and clustering of frequent subgraphs to uncover money-laundering patterns, and perform link prediction on wallets (nodes) to unveil the malicious actors behind the scenes.

Qualifications:

References

  1. GraphPrompt
  2. Glass

AI/ML for Performance Engineering [Elx Link]

Generative AI, such as Large Language Models (LLMs), has been used successfully to generate computer programs, a.k.a. code generation. However, model performance degrades substantially when LLMs are asked to do code optimization, a.k.a. software performance engineering (SPE), i.e., to generate not just correct but fast code.

An overview of AI coder

This project aims to leverage the capabilities of LLMs to revolutionize the area of automatic code optimization. We focus on transforming existing sequential code into high-performance, parallelized code, optimized for specific parallel hardware.

Qualifications:

References

  1. Performance-Aligned LLMs for Generating Fast Code
  2. Learning Performance Improving Code Edits
  3. Can Large Language Models Write Parallel Code?
  4. MPIrigen: MPI Code Generation through Domain-Specific Language Models
  5. The Landscape and Challenges of HPC Research and LLMs

Efficient Robotics Computing [Elx Link]

The advancement of robotics technology is rapidly changing the world we live in. With predictions of 20 million robots by 2030 and a market capitalization of US$210 billion by 2025, it is clear that robotics will play an increasingly important role in society. To become widespread, robots must meet the demands of real-world environments, which requires them to be autonomous and capable of performing complex artificial intelligence (AI) tasks in real time.

robot

In this project we aim to build software and hardware systems for robotics.

Qualifications:

References

  1. Phillip B. Gibbons (CMU)'s lab. Tartan: Microarchitecting a Robotic Processor, ISCA 2024.
  2. Tor Aamodt (UBC)'s lab. Collision Prediction for Robotics Accelerators, ISCA 2024.
  3. Tor Aamodt (UBC)'s lab. Energy-Efficient Realtime Motion Planning, ISCA 2023.
  4. Phillip B. Gibbons (CMU)'s lab. Agents of Autonomy: A Systematic Study of Robotics on Modern Hardware, SIGMETRICS 2023.
  5. Sabrina M. Neuman and Vijay Janapa Reddi (Harvard)'s lab. RoboShape, ISCA 2023.
  6. Lydia Kavraki (Rice)'s lab.
  7. Sophia Shao (Berkeley)'s lab. RoSÉ.
  8. A Survey of FPGA-Based Robotic Computing.
  9. Phillip B. Gibbons (CMU)'s lab. RACOD, ISCA 2022.