Publications

ISCA
LUT Tensor Core: A Software-Hardware Co-Design for LUT-Based Low-Bit LLM Inference

Zhiwen Mo, Lei Wang, Jianyu Wei, Zhichen Zeng, Shijie Cao, Lingxiao Ma, Naifeng Jing, Ting Cao, Jilong Xue, Fan Yang, Mao Yang

The 52nd International Symposium on Computer Architecture (ISCA ’25), 2025
Internal
A GPU Kernel Research Project

Zhiwen Mo, et al.

Research in progress, 2025
OSDI
PipeThreader: Software-Defined Pipelining for Efficient DNN Execution

Yu Cheng, Lei Wang, Yining Shi, Yuqing Xia, Lingxiao Ma, Jilong Xue, Yang Wang, Zhiwen Mo, Feiyang Chen, Fan Yang, Mao Yang, Zhi Yang

The 19th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’25), 2025
arXiv
TileLang: A Composable Tiled Programming Model for AI Systems

Lei Wang, Yu Cheng, Yining Shi, Zhengju Tang, Zhiwen Mo, Wenhao Xie, Lingxiao Ma, Yuqing Xia, Jilong Xue, Fan Yang, Zhi Yang

arXiv preprint arXiv:2504.17577, 2025
arXiv
Rethinking Optimal Verification Granularity for Compute-Efficient Test-Time Scaling

Hao Mark Chen, Guanxi Lu, Yasuyuki Okoshi, Zhiwen Mo, Masato Motomura, Hongxiang Fan

arXiv preprint arXiv:2505.11730, 2025
DAC
Enabling Multiple Tensor-wise Operator Fusion for Transformer Models on Spatial Accelerators

Lei Xu, Zhiwen Mo, Qin Wang, Jianfei Jiang, Naifeng Jing

The 61st ACM/IEEE Design Automation Conference (DAC ’24), 2024