Publications
ISCA
LUT Tensor Core: A Software-Hardware Co-Design for LUT-Based Low-Bit LLM Inference
Zhiwen Mo, Lei Wang, Jianyu Wei, Zhichen Zeng, Shijie Cao, Lingxiao Ma, Naifeng Jing, Ting Cao, Jilong Xue, Fan Yang, Mao Yang
The 52nd International Symposium on Computer Architecture (ISCA ’25), 2025
2025
Internal
A GPU Kernel Research Project
Zhiwen Mo, et al.
Research in progress, 2025
2025
OSDI
PipeThreader: Software-Defined Pipelining for Efficient DNN Execution
Yu Cheng, Lei Wang, Yining Shi, Yuqing Xia, Lingxiao Ma, Jilong Xue, Yang Wang, Zhiwen Mo, Feiyang Chen, Fan Yang, Mao Yang, Zhi Yang
The 19th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’25), 2025
2025
arXiv
TileLang: A Composable Tiled Programming Model for AI Systems
Lei Wang, Yu Cheng, Yining Shi, Zhengju Tang, Zhiwen Mo, Wenhao Xie, Lingxiao Ma, Yuqing Xia, Jilong Xue, Fan Yang, Zhi Yang
arXiv preprint arXiv:2504.17577, 2025
2025
arXiv
Rethinking Optimal Verification Granularity for Compute-Efficient Test-Time Scaling
Hao Mark Chen, Guanxi Lu, Yasuyuki Okoshi, Zhiwen Mo, Masato Motomura, Hongxiang Fan
arXiv preprint arXiv:2505.11730, 2025
2025
DAC
Enabling Multiple Tensor-wise Operator Fusion for Transformer Models on Spatial Accelerators
Lei Xu, Zhiwen Mo, Qin Wang, Jianfei Jiang, Naifeng Jing
The 61st ACM/IEEE Design Automation Conference (DAC ’24), 2024
2024