I am currently a Founding Engineer at Eigen AI, where I work on model customization and serving infrastructure acceleration.
About
I received my Ph.D. in Computer Science from Purdue University in August 2024, under the supervision of Professor Xuehai Qian. Prior to that, I earned my bachelor's degree from Tsinghua University in 2018, where I worked with Professor Tianling Ren and Professor Shouyi Yin. Before transferring to Purdue in Fall 2022, I spent four years (2018 to 2022) as a Ph.D. student in the Viterbi School of Engineering at the University of Southern California. After completing my Ph.D., I spent some time at the University of Pittsburgh, where I worked on computational fluid dynamics (CFD) simulation with analog quantum computing alongside Professor Junyu Liu, Professor Peyman Givi, and Professor Juan Jose Mendoza Arenas.
Current Work at Eigen AI
As a Founding Engineer at Eigen AI, I focus on optimizing the serving and inference of Large Language Models (LLMs), Vision-Language Models (VLMs), and multimodal generative models. My work encompasses:
- Inference Optimization: Applying techniques including speculative decoding, token merging, sparsity exploitation, and quantization to accelerate model inference while maintaining output quality.
- Serving Infrastructure: Building robust serving systems with comprehensive observability for monitoring cluster health, performance metrics, and resource utilization.
- Endpoint Management: Developing infrastructure to make model endpoints more manageable, stable, and reliable for production deployments.
- Model Customization: Enabling efficient fine-tuning and adaptation of foundation models for specific use cases and customer requirements.
Ph.D. Research (Quantum Computer Architecture)
My Ph.D. research applied computer architecture and systems principles to quantum computing, focusing on hardware-software co-design:
- Quantum Compilation & Optimization: Developing compiler frameworks (AccQOC, NAPA, EPOC) that translate high-level quantum programs into optimized hardware-native pulse sequences for improved circuit fidelity and reduced execution time.
- Distributed Quantum Systems: Designing compilation and resource allocation techniques (ECDQC, EDDQC) for multi-QPU systems, minimizing communication overhead and optimizing qubit placement across distributed quantum processors.
- Hardware-Software Co-Design: Building system abstractions that expose hardware characteristics (topology, noise, calibration) to enable architecture-aware optimizations at both the circuit and pulse level.
- Performance Analysis: Developing methodologies for fidelity estimation, randomized benchmarking, and error characterization to evaluate and improve quantum system performance.
- Quantum-Inspired Algorithms: Applying insights from quantum mechanics to design optimization algorithms (Quantum Hamiltonian Descent) for classical combinatorial problems such as graph partitioning.
Publications
Selected publications (see Google Scholar for the full list):
Mini Library
Lecture Notes & Courses
- Quantum Computation - Caltech's quantum computation course by John Preskill
- Quantum Information Science I - Scott Aaronson's comprehensive notes on quantum computation
- Quantum Information Science II - Advanced quantum computing course notes by Scott Aaronson
- Quantum Computation - Berkeley course by Umesh Vazirani
- Qubits, Quantum Mechanics and Computers - Additional quantum computing notes by Umesh Vazirani
- Quantum Information Theory - Course materials by Mark M. Wilde
- Quantum Algorithms - Comprehensive lecture notes by Andrew Childs
- Surviving as a Quantum Computer in a Classical World - Textbook by Daniel Gottesman
Contact
Email: nothing.personal.strict.business@gmail.com
Links:
Google Scholar | LinkedIn | Download CV