About me
I am an Assistant Professor in the College of Information and Computer Sciences (CICS) at the University of Massachusetts Amherst, the flagship campus of the UMass system. I received my Ph.D. in Electrical Engineering from North Carolina State University in 2020. I am a member of the Programming Languages and Systems at Massachusetts (PLASMA) lab at UMass. My research lies in machine learning systems, with an emphasis on improving the speed, scalability, and reliability of machine learning.
My research aims to address a fundamental question in machine learning adoption: How can we create machine learning systems that efficiently deliver reliable predictions to meet the requirements of diverse applications running on various systems? To tackle this, my group focuses on reducing the costs of model development and enabling deep learning in resource-constrained edge environments. Our approaches leverage insights into the inherent trade-offs between accuracy and efficiency in machine learning, along with principles of system design such as composability, pipelining, and locality awareness. Ultimately, we aim to democratize machine learning by making it a readily accessible technology applicable to a wide array of real-world scenarios.
I am currently on leave from UMass, working at AWS on developing efficient and reliable LLM systems for safe and intelligent cloud operations. Please ping me at huiguan@amazon.com if you are interested in discussing this exciting new area or exploring research internship opportunities at AWS.
News
[May. 2025]: Congratulations to Lijun Zhang, Hanmei Yang, and Jin Zhou on successfully defending their PhD theses. Lijun will join Amazon as a Postdoctoral Scientist, Hanmei will join Meta, and Jin will join NVIDIA. We wish them a wonderful new journey.
[Apr. 2025]: Congratulations to Sandeep for our work on Scaling Graph Neural Network Training on Large Graphs, accepted to MLSys'25. The work proposes split parallelism, which distributes mini-batch GNN training workloads across multiple GPUs to reduce redundant data movement and computation and thus accelerate training.
[Feb. 2025]: Congratulations to Sohaib for successfully defending his PhD thesis on "Optimized Resource Allocation for Serving Deep Learning Models".
[Feb. 2025]: Congratulations to Sohaib and Qizheng for our work "DiffServe: Efficiently Serving Text-to-Image Diffusion Models with Query-Aware Model Scaling" accepted to MLSys'25. We use diffusion models as a case study to demonstrate the potential of model cascading in improving model serving system efficiency (e.g., higher throughput and lower SLO violation rates).
[Oct. 2024]: Congratulations to Lijun Zhang for our work "Attack-Resilient Image Watermarking Using Stable Diffusion" accepted to NeurIPS'24.
[Oct. 2024]: Congratulations to Kunjal Panchal for our work "Thinking Forward: Memory-Efficient Federated Finetuning of Language Models" accepted to NeurIPS'24.
[Mar. 2024]: Congratulations to Mohammad for our work "CACTUS: Dynamically Switchable Context-aware micro-Classifiers for Efficient IoT Inference" accepted to MobiSys'24.
[Mar. 2024]: Congratulations to Sohaib for our work "Loki: A System for Serving ML Inference Pipelines with Hardware and Accuracy Scaling" accepted to HPDC'24.
[Mar. 2024]: Thanks for the gift funds from Adobe and Dolby.
[Feb. 2024]: Thanks to the NSF for the CAREER award on Adaptive Deep Learning Systems Towards Edge Intelligence.
[Feb. 2024]: Congratulations to Qizheng Yang for our work "GMorph: Accelerating Multi-DNN Inference via Model Fusion" accepted to EuroSys'24.
[Sept. 2023]: Congratulations to Kunjal Panchal for our work on "Flow: Per-instance Personalized Federated Learning" accepted to NeurIPS'23.
[Sept. 2023]: Congratulations to Lijun Zhang for the 2023 IBM PhD Fellowship Award.
[Sept. 2023]: Congratulations to Sohaib Ahmad for our work on "Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling" accepted to ASPLOS'24.
[Sept. 2023]: Thanks to the NSF for supporting our project Memory-Driven Full-Stack Collaboration for Embedded Systems. With collaborators, we will bring the power of deep learning to resource-constrained embedded systems!
[Aug. 2023]: Thanks to the NSF for supporting our project Deep Learning on Anomaly Detection for Human Dynamics and Hazard Response. With collaborators, we will work on graph machine learning for anomaly detection.
[Aug. 2023]: Congratulations to Juelin and Sandeep for their work on "Accelerating Subgraph Enumeration Using Auxiliary Graphs" accepted to PACT'23.
[May. 2023]: Our work on "Flash: Concept Drift Adaptation in Federated Learning" is accepted to ICML'23. It proposes a novel adaptive optimizer that simultaneously addresses both data heterogeneity and concept drift in federated learning.
[May. 2023]: Our work on "Automatically Marginalized MCMC in Probabilistic Programming" is accepted to ICML'23. It proposes automatic marginalization to make sampling with Hamiltonian Monte Carlo more efficient.
[May. 2023]: Our work on "NUMAlloc: A Faster NUMA Memory Allocator" is accepted to ISMM'23.
[May. 2023]: Our work on "GSplit: Scaling Graph Neural Network Training on Large Graphs via Split-Parallelism" is on arXiv.
[Apr. 2023]: Our work on Re-thinking computation offload for efficient inference on IoT devices with duty-cycled radios is accepted to MobiCom'23.
[Oct. 2022]: I'm excited to share that we have received an Amazon Research Award for our proposal "Groot: A GPU-Resident System for Efficient Graph Machine Learning" at UMass Amherst. Learn more about the program on the website.
[Sept. 2022]: Our work on AutoMTL: A Programming Framework for Automating Efficient Multi-Task Learning is accepted to NeurIPS'22. Congratulations to Lijun. The project is open-sourced.
[Sept. 2022]: Thanks to the NSF for supporting our project Transparently Scaling Graph Neural Network Training to Large-Scale Models and Graphs.
[Jul. 2022]: Our work on Fine-Grained Personalized Federated Learning Through Dynamic Routing is accepted to CrossFL'2022 Workshop @MLSys. Congratulations to Kunjal.
[Jul. 2022]: Our work on Improving Subgraph Representation Learning via Multi-View Augmentation is accepted to AI4Science'22 Workshop @ICML.
[May. 2022]: Our work "A Tree-Structured Multi-Task Model Recommender" is accepted to AutoML'22. Congratulations to Lijun. The project is open-sourced.
[May. 2022]: Welcome to Qizheng Yang, a new PhD student joining our lab this summer.
[Mar. 2022]: Thanks to the NVIDIA Academic Hardware Grant Program for supporting the project "Multitasking-Centric Optimization for Deep Learning Applications".
[Mar. 2022]: Our paper "Rethinking Hard-Parameter Sharing in Multi-Domain Learning" is accepted to ICME'22. Congratulations to Lijun.
[Mar. 2022]: Our paper "Enabling Near Real-Time NLU-Driven Natural Language Programming through Dynamic Grammar Graph-Based Translation" is accepted to CGO'22.
[Mar. 2022]: Our paper "COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression" is accepted to VLDB'22.
[Nov. 2021]: Our collaborative project with Prof. Zhou Lin on "Accelerating Fragment-Based Quantum Chemistry via Machine Learning" received a UMass ADVANCE Collaborative Research Seed Grant.
[Oct. 2021]: Our paper "FreeLunch: Compression-based GPU Memory Management for Convolutional Neural Networks" is accepted to MCHPC'21 Workshop, in conjunction with SC'21.
[Oct. 2021]: Our paper "Recurrent Neural Networks Meet Context-Free Grammar: Two Birds with One Stone" is accepted to ICDM'21.
[June 2021]: Our paper "Scalable Graph Neural Network Training: The Case for Sampling" has appeared in the ACM SIGOPS Operating Systems Review.
[June 2021]: Our paper CoCoPIE is accepted to CACM'21.
[June 2021]: Our paper NumaPerf is accepted to ICS'21.
[May 2021]: I have received an Adobe Research Collaboration Grant on developing resource-efficient deep multi-task learning solutions.
[May 2021]: Our paper "Reuse-Centric Kmeans Configuration" is accepted to Information Systems. Congratulations to Lijun.
Awards
- NSF CAREER Award, 2024
- Amazon Research Award, 2022
- NCSU Electrical and Computer Engineering Outstanding Dissertation Award, 2020
- IBM PhD Fellowship, 2015-2018