Publications
arXiv · 2026
Rethinking Compute Substrates for 3D-Stacked Near-Memory LLM Decoding: Microarchitecture-Scheduling Co-Design
Chenyang Ai, Yixing Zhang, Haoran Wu, Yudong Pan, Lechuan Zhao, Wenhui OU
Studies 3D-stacked near-memory processing for LLM decoding and proposes a microarchitecture-scheduling co-design that emphasizes area efficiency, reconfigurability, and effective multi-core scheduling.
arXiv:2604.04253 · Submitted on April 5, 2026; revised on April 9, 2026.