We work at the intersection of AI and hardware to make AI computing sustainable.



Publications
Transformers in the Dark: Navigating Unknown Search Spaces via Bandit Feedback
TABED: Test-Time Adaptive Ensemble Drafting for Robust Speculative Decoding in LVLMs
Draft-based Approximate Inference for LLMs
ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs
Inference-Aligned SFT for Diffusion LLMs via Group-based Trajectory Sampling
XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization
Counting Guidance for High Fidelity Text-to-Image Synthesis
VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data
Parameter-Efficient Fine-Tuning of State Space Models
State-offset Tuning: State-based Parameter-Efficient Fine-Tuning for State Space Models
Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing
Can MLLMs Perform Multimodal In-Context Learning for Text-to-Image Generation?

