Homepage of Zhen Dong

PhD Student at UC Berkeley

Research Interests

Hardware and software co-design for efficient deep learning.

Model compression for classification/object detection/NLP on embedded platforms.

AutoML and hardware-aware neural architecture search.

Computer architectures beyond von Neumann, such as in-memory computing.

Coursework


University of California at Berkeley:

Visual Object and Activity Recognition (4.00)

RISC-V CPU on FPGA Lab (4.00)

Digital Circuits and Computer Architecture (4.00)

Applications of Parallel Computers (4.00)

Statistical Learning Theory (4.00)


Peking University: (Rank 1/327 in EECS)

Digital Logic (4.00)

Principles of Digital Integrated Circuits (4.00)

Analog Circuits (3.99)

Advanced Analog Integrated Circuits Design (3.99)


Micro-Nano Integrated System (4.00)

Fundamentals of Solid State Physics (3.98)

Fundamentals of Semiconductor Materials (3.97)

Physics of Semiconductor (3.98)

Semiconductor Device Physics (3.98)

Principle of Integrated Circuits Process (3.99)


  • Winner of 2018-2020 Berkeley Fellowship.

  • IEEE Transactions on Electron Devices (TED) Reviewer.

  • IEEE Transactions on Neural Networks and Learning Systems (TNNLS) Reviewer.

  • AWS Research Credits Award and Google Cloud Research Credits Award.

  • Tang Lixin Scholarship for outstanding students in China. (top 0.5%)

  • Tang Lixin 1st Prize Scholarship for graduate students studying abroad. (top 0.05%)

  • SenseTime Scholarship, National Scholarship and Fang Zheng Scholarship. (top 1%)

  • Pacemaker to Triple-A student and Triple-A student (twice) at Peking University.

  • 1st Prize in the Chinese Olympiad in Physics and the Chinese Physics Competition for college students.

  • Princeton University Math Competition (PUMac): Top three among all participants in geometry group.

  • Top Ten Undergraduate Research Award at PKU EECS.

  • Outstanding Graduate of Peking University and Outstanding Graduate of Beijing.

Publications


  • Zhen Dong*, Dequan Wang*, Qijing Huang*, Yizhao Gao, Yaohui Cai, Bichen Wu, Kurt Keutzer, John Wawrzynek. “CoDeNet: Algorithm-hardware Co-design for Deformable Convolution”, under review.
  • Zhen Dong, Zhewei Yao, Yaohui Cai, Daiyaan Arfeen, Amir Gholami, Michael W. Mahoney, Kurt Keutzer. “HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks”, NeurIPS 2020.
  • Yaohui Cai*, Zhewei Yao*, Zhen Dong*, Amir Gholami, Michael W. Mahoney, Kurt Keutzer. “ZeroQ: A Novel Zero Shot Quantization Framework”, CVPR 2020.
  • Sheng Shen*, Zhen Dong*, Jiayu Ye*, Linjian Ma, Zhewei Yao, Amir Gholami, Michael W. Mahoney, Kurt Keutzer. “Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT”, Spotlight, AAAI 2020.
  • Zhen Dong, Zhewei Yao, Daiyaan Arfeen, Yaohui Cai, Michael Mahoney, Kurt Keutzer. “Trace Weighted Hessian-Aware Quantization”, Oral, Opt-Workshop, NeurIPS 2019.
  • Q. Huang, D. Wang, Y. Gao, Y. Cai, Zhen Dong, B. Wu, K. Keutzer and J. Wawrzynek. “Algorithm-hardware Co-design for Deformable Convolution”, Oral, EMC2-Workshop, NeurIPS 2019.
  • Zhen Dong, Yaohui Cai, Amir Gholami, Tianjun Zhang, Kurt Keutzer. “Ultra-low Bit Quantization for Visual Wake Word Challenge”, 2nd Place at VWW Competition, CVPR 2019.
  • Zhen Dong*, Zhewei Yao*, Amir Gholami*, Michael W. Mahoney, Kurt Keutzer. “HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision”, ICCV 2019.
  • Zhen Dong, Zheng Zhou, Zefan Li, Peng Huang, Lifeng Liu, Xiaoyan Liu, Jinfeng Kang. “Convolutional Neural Networks for Image Recognition and Online Learning Based on RRAM Devices.” IEEE Transactions on Electron Devices 2018, p.793-801.
  • Jinfeng Kang, Zhen Dong, Peng Huang, Renze Han, Lifeng Liu, Xiaoyan Liu. China patent about 3D RRAM.
  • Huang, P., Li, Z., Zhen Dong, Han, R., Zhou, Z., Zhu, D., Liu, L., Liu, X. and Kang, J. “Binary Resistive Switching Device Based Electronic Synapse with Spike-Rate-Dependent-Plasticity for Online Learning.” ACS Applied Electronic Materials 2018, pp. 845-853.
  • Zhen Dong, Z. Zhou, Z. F. Li, C. Liu, Y. N. Jiang, P. Huang, L. F. Liu, X. Y. Liu, and J. F. Kang. “RRAM based convolutional neural networks for high accuracy pattern recognition and online learning tasks.” Oral, VLSI-SNW 2017, pp. 145-146. IEEE, 2017.
  • Runze Han, Peng Huang, Yachen Xiang, Chen Liu, Zhen Dong, et al. “A Novel Convolution Computing Paradigm Based on NOR Flash Array With High Computing Speed and Energy Efficiency.” IEEE Transactions on Circuits and Systems, p.1-12.
  • Xinxin Wang, Peng Huang, Zhen Dong, Zheng Zhou, Yuning Jiang, Runze Han, Lifeng Liu, Xiaoyan Liu, Jinfeng Kang. “A Novel RRAM-based Adaptive-Threshold LIF Neuron Circuit for High Recognition Accuracy.” International Symposium on VLSI Technology, Systems and Applications (VLSI-TSA), pp. 1-2.
  • Zheng Zhou, Chen Liu, Wensheng Shen, Zhen Dong, Zhe Chen, Peng Huang, Lifeng Liu, Xiaoyan Liu, Jinfeng Kang. “The Characteristics of Binary Spike-Time-Dependent Plasticity in HfO2-Based RRAM and Applications for Pattern Recognition.” Nanoscale Research Letters, 12(1), p.244.
  • P. Huang, D. B. Zhu, C. Liu, Z. Zhou, Zhen Dong, H. Jiang, W. S. Shen, L. F. Liu, X. Y. Liu, and J. F. Kang. “RTN based Oxygen Vacancy Probing Method for Ox-RRAM Reliability Characterization and Its Application in Tail Bits.” International Electron Devices Meeting (IEDM) 2017, pp. 21-4.

Research Experience

PhD Student, Electrical Engineering and Computer Sciences, UC Berkeley

Advisor: Prof. Kurt Keutzer

Research on Hessian-AWare Quantization (HAWQ, HAWQ-V2, ZeroQ), Nov 2018 – present

  • Propose a second-order method to decide the mixed-precision configuration and the block-wise fine-tuning order.
  • Prove a theorem that justifies using the trace of the Hessian as the sensitivity metric, and conduct fast Pareto frontier optimization.
  • Extend HAWQ to segmentation and object detection tasks, achieving state-of-the-art results.
  • Conduct fast end-to-end quantization without fine-tuning and without using any training/test data.

Research on HW-SW Co-design and NAS (HAWQ-V3, CoDeNet), Jan 2019 – present

  • Propose efficient deformable operations for object detection on embedded FPGAs.
  • Design a new FPGA core with ultra-low-precision arithmetic.
  • Conduct HW-SW joint architecture search and efficient implementation of mixed-precision NNs on CPU/GPU/FPGAs.

Research on Efficient Natural Language Processing (Q-BERT), June 2019 – present

  • Propose a new method to reduce the model size of BERT-base for applications on edge devices.
  • Use second-order information to help reduce communication during distributed training.
  • Mixed-precision distributed training on the cloud or efficient fine-tuning on the edge.

Research Intern, Facebook AI

Research on efficient natural language processing (NLP) with limited resources, May 2020 – August 2020.

Undergraduate visiting researcher program (UGVR), Stanford University

Advisor: Prof. H.-S. Philip Wong

Research on utilizing RRAM arrays for large-scale networks and transfer learning.

Research on building tools based on statistical ML for analyzing energy consumption and delay in 3D RRAM arrays.

Research Intern, SenseTime Corporation

Research on 4-bit model compression (both weight and activation) on RetinaNet for the SenseTime database.

Research Assistant, EECS School, Peking University

Advisor: Prof. Jinfeng Kang

Research on spike-time-dependent plasticity (STDP) characteristics in Oxide-RRAM for brain-inspired computing.

Research on NVM-based hardware implementation of convolutional neural networks.

Talks


  • “Efficient Neural Networks through Systematic Quantization”, BAIR/CPAR/BDD Seminar 2020, [slides], [link].
  • “ZeroQ: A novel Zero-Shot Quantization Framework”, Real-Time Intelligent Secure Explainable Systems (RISELab) Retreat 2020, Lake Tahoe (online), US, [slides].
  • Berkeley AI Research (BAIR)/ Berkeley Deep Drive (BDD) Workshop 2020, Santa Rosa, US.
  • “Q-BERT: Hessian Based Quantization of BERT”, AAAI 2020, New York, US, [slides].
  • “Hessian-Aware trace-Weighted Quantization”, Beyond First-Order Methods in ML Workshop at NeurIPS 2019, Vancouver, Canada.
  • Real-Time Intelligent Secure Explainable Systems (RISELab) Retreat 2019, Monterey, US.
  • Berkeley AI Research (BAIR)/ Berkeley Deep Drive (BDD) Workshop 2019, Berkeley, US.
  • Visual Wake Word Challenge, LPIRC Workshop at CVPR 2019, Long Beach, US, [slides], [link].
  • “RRAM Based Convolutional Neural Networks for High Accuracy Pattern Recognition and Online Learning Tasks”, VLSI-SNW 2017, Kyoto, Japan, [slides].


UC Berkeley, CA, 94709