Dongkuan Wu

Email: dongkuanwu@gmail.com · 12211229@mail.sustech.edu.cn · Tel: (+86) 15147183485 · Supervisor: Hao Yu (SUSTech)

Focus: LLM compression (quantization, pruning), FPGA-based acceleration and edge deployment

Experience

Infix — Intern

LLM optimization & deployment

Sep. 2025 — Present

Toolchain for Model Compression & VCU128 FPGA Deployment

SUSTech, Hao Yu Lab

Built a reusable HW–SW co‑design flow spanning offline per‑group GPTQ quantization, parameter packing, and Verilog kernel integration; ran GPTQ‑quantized DeepSeek‑Distill‑Qwen 2.5‑7B, Qwen 2‑7B, and ChatGLM 2/3‑6B on the VCU128.
Rearranged and packed quantized tensors into AXI‑Stream blocks to reduce runtime transposition; coordinated with the Edge‑LLM framework to improve on‑chip utilization and throughput.

2024 — 2025

One‑Shot Sparsity Mask Rebuilder (LLM‑Barber)

SUSTech, advised by Yupeng Su

Proposed a block‑aware mask reconstruction that integrates sparsity across self‑attention and MLP blocks for globally optimized pruning; ran extensive experiments across multiple models and settings.

2024 — 2025

Rotation‑driven Mixed‑Precision Quantization

UCSD, Mingu Kang Lab

Partitioned each layer’s weights and activations and applied orthogonal rotations to flatten outliers, enabling fine‑grained mixed‑precision quantization while preserving end‑to‑end accuracy.

Mar. 2025 — Jun. 2025

Publications

LLM‑Barber: Block‑Aware Rebuilder for Sparsity Mask in One‑Shot for Large Language Models
Yupeng Su, Ziyi Guan, Xiaoqun Liu, Tianlai Jin, Dongkuan Wu · arXiv:2408.10631
Accepted to ICCAD 2025

2024 — 2025

APTQ+: Attention‑FFN‑aware Post‑Training Quantization for a Layer‑wise LLM Accelerator on FPGA
Ziyi Guan†, Zhengfei Chen†, Dongkuan Wu, Ao Shen, Yifan Guo, Yupeng Su
Under submission to IEEE TCAD 2026

2025

Education

Southern University of Science and Technology (SUSTech)

School of Microelectronics & Engineering · GPA: 3.73/4.0 · Supervisor: Prof. Hao Yu

Sept. 2022 — Jun. 2026

University of California, San Diego (UCSD)

Three‑month program (Reinforcement Learning, Machine Learning Algorithms, Digital IC) · GPA: 3.81/4.0 · with Prof. Mingu Kang

Mar. 2025 — Jun. 2025

Skills

Programming: Python, C/C++, Verilog, MATLAB

Lab/Domain: LLM compression (quantization, pruning); model deployment on edge devices; Cadence simulation & verification

Hardware/EDA: Xilinx VCU128; AXI‑Stream data packing; FPGA inference toolchain