Dongkuan Wu

Email: dongkuanwu@gmail.com · 12211229@mail.sustech.edu.cn · Tel: (+86) 15147183485 · Supervisor: Prof. Hao Yu (SUSTech)
Focus: LLM compression (quantization, pruning), FPGA-based acceleration and edge deployment

Experience

Infix — Intern

LLM optimization & deployment
Sep. 2025 — Present

Toolchain for Model Compression & VCU128 FPGA Deployment

SUSTech, Prof. Hao Yu's Lab
  • Built a reusable HW–SW co-design flow spanning offline per-group GPTQ quantization, parameter packing, and Verilog kernel integration; deployed DeepSeek-Distill-Qwen2.5-7B, Qwen2-7B, and ChatGLM2/3-6B (GPTQ) on a Xilinx VCU128 FPGA.
  • Rearranged and packed quantized tensors into AXI-Stream blocks to reduce runtime transposition; integrated with the Edge-LLM framework to improve on-chip utilization and throughput.
2024 — 2025

One‑Shot Sparsity Mask Rebuilder (LLM‑Barber)

SUSTech, advised by Yupeng Su
  • Proposed a block-aware reconstruction that jointly rebuilds sparsity masks across Self-Attention and MLP blocks for globally optimized pruning; validated the method with extensive experiments across models and sparsity settings.
2024 — 2025

Rotation‑driven Mixed‑Precision Quantization

UCSD, Mingu Kang Lab
  • Partitioned each layer's weights and activations and applied orthogonal rotations to flatten outliers, enabling fine-grained mixed-precision quantization while preserving end-to-end accuracy.
Mar. 2025 — Jun. 2025

Publications

LLM‑Barber: Block‑Aware Rebuilder for Sparsity Mask in One‑Shot for Large Language Models
Yupeng Su, Ziyi Guan, Xiaoqun Liu, Tianlai Jin, Dongkuan Wu · arXiv:2408.10631
Accepted to ICCAD 2025
2024 — 2025
APTQ+: Attention‑FFN‑aware Post‑Training Quantization for a Layer‑wise LLM Accelerator on FPGA
Ziyi Guan†, Zhengfei Chen†, Dongkuan Wu, Ao Shen, Yifan Guo, Yupeng Su
Under submission to IEEE TCAD 2026
2025

Education

Southern University of Science and Technology (SUSTech)

School of Microelectronics & Engineering · GPA: 3.73/4.0 · Supervisor: Prof. Hao Yu
Sept. 2022 — Jun. 2026

University of California, San Diego (UCSD)

Three-month program (Reinforcement Learning, Machine Learning Algorithms, Digital IC) · GPA: 3.81/4.0 · advised by Prof. Mingu Kang
Mar. 2025 — Jun. 2025

Skills

Programming: Python, C/C++, Verilog, MATLAB
Lab/Domain: LLM compression (quantization, pruning); model deployment on edge devices; Cadence simulation & verification
Hardware/EDA: Xilinx VCU128; AXI‑Stream data packing; FPGA inference toolchain