Hui Guan's Homepage

2025

[EXAIT@ICML’25] Reimagining Parameter Space Exploration with Diffusion Models. [PDF]
Lijun Zhang, Xiao Liu, Hui Guan
First Exploration in AI Today Workshop at ICML (EXAIT at ICML 2025)
[MLSys’25] SPA: Scaling Graph Neural Network Training on Large Graphs via Probabilistic Splitting. [PDF]
Sandeep Polisetty, Juelin Liu, Yi Fung, Seung-Hwan Lim, Hui Guan, Marco Serafini
The Eighth Annual Conference on Machine Learning and Systems, Santa Clara, May 12-15, 2025 (Acceptance Rate = 22% (61/271))
[MLSys’25] DiffServe: Efficiently Serving Text-to-Image Diffusion Models with Query-Aware Model Scaling. [PDF]
Sohaib Ahmad (co-first author), Qizheng Yang (co-first author), Haoliang Wang, Ramesh K. Sitaraman, and Hui Guan
The Eighth Annual Conference on Machine Learning and Systems, Santa Clara, May 12-15, 2025. (Acceptance Rate = 22% (61/271))
[VLDB’25] Graph neural network training systems: A performance comparison of full-graph and mini-batch. [PDF]
Saurabh Bajaj, Hojae Son, Juelin Liu, Hui Guan, and Marco Serafini
Proceedings of the VLDB Endowment, 2025

2024

[IEEE Access’24] Information-Enhanced Graph Neural Network for Transcending Homophily Barriers. [PDF]
Xiao Liu, Lijun Zhang, Hui Guan
In IEEE Access, 2024.
[MLforSys@NeurIPS’24] Understanding and Alleviating Memory Issue in RLHF for LLMs. [PDF]
Jin Zhou, Hanmei Yang, Steven Jiaxun Tang, Mingcan Xiang, Hui Guan, Tongping Liu
NeurIPS’24 Workshop MLforSys, Dec 14, 2024, Vancouver
[AI4Mat@NeurIPS’24] Integrating Graph Neural Networks and Many-Body Expansion Theory for Potential Energy Surfaces. [PDF]
Siqi Chen, Zhiqiang Wang, Xianqi Deng, Yili Shen, Cheng-Wei Ju, Jun Yi, Lin Xiong, Guo Ling, Dieaa Alhmoud, Hui Guan, Zhou Lin
NeurIPS’24 Workshop AI4Mat, Dec 15, 2024, Vancouver
[NeurIPS’24] Attack-Resilient Image Watermarking Using Stable Diffusion. [PDF][Code]
Lijun Zhang, Xiao Liu, Antoni Viros i Martin, Cindy Xiong Bearfield, Yuriy Brun, Hui Guan
NeurIPS ’24, Mon, Dec 9, 2024 – Sun, Dec 15, 2024, Vancouver
[NeurIPS’24] Thinking Forward: Memory-Efficient Federated Finetuning of Language Models. [PDF][Code]
Kunjal Panchal, Nisarg Parikh, Sunav Choudhary, Lijun Zhang, Yuriy Brun, Hui Guan
NeurIPS ’24, Mon, Dec 9, 2024 – Sun, Dec 15, 2024, Vancouver
[ACM MM’24] AdapMTL: Adaptive Pruning Framework for Multitask Learning Model. [PDF]
Mingcan Xiang, Jiaxun Tang, Qizheng Yang, Hui Guan, Tongping Liu
ACM MM ’24, October 28-November 1, 2024, Melbourne, VIC, Australia
https://doi.org/10.1145/3664647.3681426
[MIPR’24] Structured Pruning for Multi-Task Deep Neural Networks. [PDF]
Siddhant Garg, Lijun Zhang, Hui Guan
International Conference on Multimedia Information Processing and Retrieval, August 07, 2024.
[MobiSys’24] CACTUS: Dynamically Switchable Context-aware micro-Classifiers for Efficient IoT Inference. [PDF][Code]
Mohammad Mehdi Rastikerdar, Jin Huang, Shiwei Fang, Hui Guan, Deepak Ganesan.
The 22nd ACM International Conference on Mobile Systems, Applications, and Services (MobiSys), Tokyo, Japan, June 3-7, 2024.
https://doi.org/10.1145/3643832.3661888
[HPDC’24] Loki: A System for Serving ML Inference Pipelines with Hardware and Accuracy Scaling. [PDF]
Sohaib Ahmad, Hui Guan, Ramesh K. Sitaraman.
The 33rd International Symposium on High-Performance Parallel and Distributed Computing (HPDC’24), Pisa, Italy, June 3-7, 2024. (Acceptance Rate = 17% (26/152))
[EuroSys’24] GMorph: Accelerating Multi-DNN Inference via Model Fusion. [PDF][Code]
Qizheng Yang, Tianyi Yang, Mingcan Xiang, Lijun Zhang, Haoliang Wang, Marco Serafini, Hui Guan.
The 2024 European Conference on Computer Systems (EuroSys), April 22-25, 2024.
https://doi.org/10.1145/3627703.3650074
[ASPLOS’24] Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling. [PDF][Code]
Sohaib Ahmad, Hui Guan, Brian D. Friedman, Thomas Williams, Ramesh K. Sitaraman, Thomas Woo.
The 2024 ACM Conference on Architectural Support for Programming Languages and Operating Systems, April 27-May 1, 2024.
https://doi.org/10.1145/3617232.3624849

2023

[NeurIPS’23] Flow: Per-instance Personalized Federated Learning. [PDF][Code]
Kunjal Panchal, Sunav Choudhary, Nisarg Parikh, Lijun Zhang, Hui Guan.
The 2023 Conference on Neural Information Processing Systems, Dec. 10-16, 2023.
[PACT’23] GraphMini: Accelerating Graph Pattern Matching Using Auxiliary Graphs. [PDF][Code]
Juelin Liu, Sandeep Polisetty, Hui Guan, Marco Serafini.
The 32nd International Conference on Parallel Architectures and Compilation Techniques, Oct. 21-25, 2023.
[TNNLS’23] A Tree-Structured Multi-Task Model Architectures Recommendation System. [PDF][Code]
Lijun Zhang, Xiao Liu, Hui Guan.
IEEE Transactions on Neural Networks and Learning Systems, 2023.
[Manuscript] GSplit: Scaling Graph Neural Network Training on Large Graphs via Split-Parallelism. [PDF]
Sandeep Polisetty, Juelin Liu, Kibi Falus, Yi Ren Fung, Seung-Hwan Lim, Hui Guan, Marco Serafini.
arXiv, 2023.
[ICML’23] Flash: Concept Drift Adaptation in Federated Learning. [PDF]
Kunjal Panchal, Sunav Choudhary, Subrata Mitra, Koyel Mukherjee, Somdeb Sarkhel, Saayan Mitra, Hui Guan.
40th International Conference on Machine Learning, Jul. 23-29, 2023
[ICML’23] Automatically marginalized MCMC in probabilistic programming. [PDF]
Jinlin Lai, Javier Burroni, Hui Guan, Daniel Sheldon.
40th International Conference on Machine Learning, Jul. 23-29, 2023
[ISMM’23] NUMAlloc: A Faster NUMA Memory Allocator. [PDF]
Hanmei Yang, Xin Zhao, Jin Zhou, Wei Wang, Sandip Kundu, Bo Wu, Hui Guan, and Tongping Liu.
ACM SIGPLAN International Symposium on Memory Management, 2023.
[MobiCom’23] Re-thinking computation offload for efficient inference on IoT devices with duty-cycled radios. [PDF]
Jin Huang, Hui Guan, Deepak Ganesan.
The 29th International Conference on Mobile Computing and Networking, Madrid, Spain, Oct. 2-6, 2023
[IEEE Access’23] An Alternative Hard-Parameter Sharing Paradigm for Multi-Domain Learning. [PDF]
Lijun Zhang, Qizheng Yang, Xiao Liu, Hui Guan.
In IEEE Access, 2023.

2022

[NeurIPS’22] AutoMTL: A Programming Framework for Automating Efficient Multi-Task Learning. [PDF][Code]
Lijun Zhang, Xiao Liu, Hui Guan.
36th Conference on Neural Information Processing Systems (NeurIPS 2022), November 28, 2022. (Acceptance rate: 25.6%)
[AutoML’22] A Tree-Structured Multi-Task Model Recommender. [PDF][Code][Teaser][Video]
Lijun Zhang, Xiao Liu, Hui Guan.
1st International Conference on Automated Machine Learning, July 25-27, 2022. (Acceptance rate: 19.2%)
[ICME’22] Rethinking Hard-Parameter Sharing in Multi-Domain Learning. [PDF]
Lijun Zhang, Qizheng Yang, Xiao Liu, Hui Guan.
IEEE International Conference on Multimedia and Expo (ICME), Taipei, Taiwan, July 18-22, 2022. (Acceptance rate: 29%)
[CGO’22] Enabling Near Real-Time NLU-Driven Natural Language Programming through Dynamic Grammar Graph-Based Translation. [PDF]
Zifan Nan, Xipeng Shen, Hui Guan.
The 2022 International Symposium on Code Generation and Optimization (CGO), Seoul, South Korea, 2022.
[VLDB’22] COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression. [PDF]
Sian Jin, Chengming Zhang, Xintong Jiang, Yunhe Feng, Hui Guan, Guanpeng Li, Shuaiwen Leon Song, and Dingwen Tao.
In International Conference on Very Large Data Bases, 2022.
[CrossFL’22] Flow: Fine-grained Personalized Federated Learning through Dynamic Routing. [PDF][Poster]
Kunjal Panchal, Hui Guan
CrossFL 2022 Workshop @ MLSys’22
[AI4Science@ICML’22] Improving Subgraph Representation Learning via Multi-View Augmentation. [PDF][talk]
Yili Shen, Jiaxu Yan, Cheng-Wei Ju, Jun Yi, Zhou Lin, Hui Guan
ICML 2022 AI4Science Workshop

2021

[ICDM’21] Recurrent Neural Networks Meet Context-Free Grammar: Two Birds with One Stone. [PDF]
Hui Guan, Umang Chaudhary, Yuanchao Xu, Lin Ning, Lijun Zhang, and Xipeng Shen.
In IEEE International Conference on Data Mining, 2021 (short paper). (Acceptance rate: 20% (198/990))
[ICS’21] NumaPerf: Predictive and Comprehensive NUMA Profiling. [PDF]
Xin Zhao, Jin Zhou, Hui Guan, Wei Wang, Xu Liu, Tongping Liu.
In Proceedings of International Conference on Supercomputing, 2021. (Acceptance rate: 25% (39/157))
[CC’21] Deep NLP-Based Co-Evolvement for Synthesizing Code Analysis from Natural Language. [PDF]
Zifan Nan, Hui Guan, Xipeng Shen, and Chunhua Liao.
In The ACM SIGPLAN 2021 International Conference on Compiler Construction, 2021.
[OSR’21] Scalable Graph Neural Network Training: The Case for Sampling. [PDF]
Marco Serafini, Hui Guan.
In ACM SIGOPS Operating Systems Review, 2021.
[MCHPC’21] FreeLunch: Compression-based GPU Memory Management for Convolutional Neural Networks. [PDF]
Shaurya Patel, Tongping Liu, Hui Guan.
In MCHPC’21 Workshop.
[CACM’21] CoCoPIE: Enabling Real-Time AI on Off-the-Shelf Mobile Devices via Compression-Compilation Co-Design. [PDF]
Hui Guan, Shaoshan Liu, Xiaolong Ma, Wei Niu, Bin Ren, Xipeng Shen, Yanzhi Wang, Pu Zhao. (Authors in Alphabetical Order)
In Communications of the ACM, 2021.
[InformationSystems’21] Reuse-Centric K-Means Configuration. [PDF]
Lijun Zhang, Hui Guan, Yufei Ding, Xipeng Shen, Hamid Krim.
Information Systems, 2021.

2020 and Before

[FSE’20] HISyn: Human Learning-Inspired Natural Language Programming. [PDF]
Zifan Nan, Hui Guan, Xipeng Shen.
In The ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Sacramento, California, United States, November 2020. (Acceptance rate: 28% (101/360))
[MLSys’20] FLEET: Flexible Efficient Ensemble Training for Heterogeneous Deep Neural Networks. [PDF]
Hui Guan, Laxmikant Kishor Mokadam, Xipeng Shen, Robert Patton.
MLSys’20. (Acceptance rate: 20.0% (34/170)).
[TPDS’20] An Automatic Synthesizer of Advising Tools for High Performance Computing. [PDF]
Hui Guan, Xipeng Shen, and Hamid Krim.
In IEEE Transactions on Parallel and Distributed Systems (TPDS), 2020
[NeurIPS’19] In-Place Zero-Space Memory Protection for CNN. [PDF]
Hui Guan, Lin Ning, Zhen Lin, Xipeng Shen, Huiyang Zhou, and Seung-Hwan Lim.
In Advances in Neural Information Processing Systems, pp. 5735-5744. 2019. (Acceptance rate: 21.2% (1428/6743))
[PLDI’19] Wootz: a Compiler-based Framework for Fast CNN Pruning via Composability. [PDF]
Hui Guan, Xipeng Shen, and Seung-Hwan Lim.
In Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 717-730. ACM, 2019. (Acceptance rate: 27.7% (76/274))
[ICDE’19] Adaptive Deep Reuse: Accelerating CNN Training on the Fly. [PDF]
Lin Ning, Hui Guan, and Xipeng Shen.
In 2019 IEEE 35th International Conference on Data Engineering (ICDE), pp. 1538-1549. IEEE, 2019. (Acceptance rate: 18%)
[MLSys@NeurIPS’19] Post-Training 4-bit Quantization on Embedding Tables. [PDF]
Hui Guan, Andrey Malevich, Jiyan Yang, Jongsoo Park, and Hector Yuen.
MLSys Workshop on Systems for ML @ NeurIPS, 2019.
[SC’18] Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines. [PDF]
Randall Pittman, Hui Guan, Xipeng Shen, Seung-Hwan Lim, and Robert M. Patton.
In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, p. 64. IEEE, 2018. (Acceptance rate: 23%)
[ICDE’18] Reuse-Centric K-Means Configuration. [PDF]
Hui Guan, Yufei Ding, Xipeng Shen, and Hamid Krim.
In 2018 IEEE 34th International Conference on Data Engineering (ICDE), pp. 1224-1227. IEEE, 2018. (short paper) (Acceptance rate: 23%)
[SysML’18] TOP: A Compiler-Based Framework for Optimizing Machine Learning Algorithms through Generalized Triangle Inequality. [PDF]
Yufei Ding, Lin Ning, Hui Guan, Xipeng Shen, Madanlal Musuvathi, Todd Mytkowicz.
SysML, Feb 16th, 2018, Stanford University.
[SC’17] Egeria: a Framework for Automatic Synthesis of HPC Advising Tools through Multi-Layered Natural Language Processing. [PDF]
Hui Guan, Xipeng Shen, and Hamid Krim.
In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, p. 10. ACM, 2017. (Acceptance rate: 18% (61/327))
[PLDI’17] Generalizations of the Theory and Deployment of Triangular Inequality for Compiler-Based Strength Reduction. [PDF]
Yufei Ding, Lin Ning, Hui Guan, and Xipeng Shen.
In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 33-48. ACM, 2017. (Acceptance rate: 15% (47/322))
[SPAWC’16] A topological collapse for document summarization. [PDF]
Hui Guan, Wen Tang, Hamid Krim, James Keiser, Andrew Rindos, and Radmila Sazdanovic.
In 2016 IEEE 17th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pp. 1-5. IEEE, 2016.