I’m Wei Wen (温伟), a Ph.D. candidate at Duke University, advised by Dr. Hai (Helen) Li and Dr. Yiran Chen. My research focuses on deep learning and its applications in computer vision and natural language processing. Recently, I have focused on understanding learning algorithms, structural learning for accurate and efficient deep neural networks, and optimization algorithms for distributed deep learning.
I will intern at Google Brain this summer. I have enjoyed research internships at Facebook Research, Microsoft Research (Redmond & Asia), and HP Labs, where I incorporated my research into industrial AI products.
My Ph.D. thesis is “Efficient and Scalable Deep Learning”. Committee: Hai (Helen) Li, Yiran Chen, Robert Calderbank, Yangqing Jia, and Guillermo Sapiro. My research topics include the SmoothOut algorithm, which escapes sharp minima in deep neural networks to understand generalization; TernGrad, a ternary-gradient SGD that overcomes the communication bottleneck in distributed deep learning; and structural learning algorithms that learn sparse and low-rank structures in deep neural networks for faster inference.
Industrial Experience
More on LinkedIn.
- Google Brain, Research Intern, Mountain View, 05/2019-08/2019
- Facebook Research, Research Intern, AI Infra & Applied Machine Learning, Menlo Park, 05/2018-08/2018
- Mentor: Yangqing Jia
- Caffe2 and Personalization; Distributed Machine Learning
- Microsoft Research, Research Intern, Web Search and Business AI, Redmond & Bellevue, 05/2017-07/2017
- Mentor: Yuxiong He
- Machine Reading Comprehension; Recurrent Neural Nets
- HP Labs, Research Intern, Platform Architecture Group, Palo Alto, 05/2016-08/2016
- Mentor: Cong Xu. Manager: Paolo Faraboschi
- Distributed Deep Learning
- Agricultural Bank of China, Software Engineer, Software Development Center, Beijing, 07/2013-07/2014
- Microsoft Research Asia, Research Intern, Mobile Computer Vision, Beijing, 04/2013-06/2013
- Tencent Inc., Software Engineer Intern, Advertising Platform and Products Division, Beijing, 07/2012-09/2012
Publications
- Sangkug Lym, Armand Behroozi, Wei Wen, Ge Li, Yongkee Kwon, Mattan Erez, “Mini-batch Serialization: CNN Training with Inter-layer Data Reuse”, SysML Conference 2019. [paper]
- Wei Wen, Yandan Wang, Feng Yan, Cong Xu, Yiran Chen, Hai Li, “SmoothOut: Smoothing Out Sharp Minima to Improve Generalization in Deep Learning”, preprint. [paper][code]
- Wei Wen, Yuxiong He, Samyam Rajbhandari, Minjia Zhang, Wenhan Wang, Fang Liu, Bin Hu, Yiran Chen, Hai Li, “Learning Intrinsic Sparse Structures within Long Short-Term Memory”, the 6th International Conference on Learning Representations (ICLR), 2018. [poster][paper][code]
- Wei Wen, Cong Xu, Feng Yan, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li, “TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning”, the 31st Annual Conference on Neural Information Processing Systems (NIPS), 2017. (Oral, 40/3240=1.2%. Available in PyTorch/Caffe2.) [paper][video][slides][code][poster]
- Wei Wen, Cong Xu, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li, “Coordinating Filters for Faster Deep Neural Networks”, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. [paper][code][poster]
- Wei Wen, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li, “Learning Structured Sparsity in Deep Neural Networks”, the 30th Annual Conference on Neural Information Processing Systems (NIPS), 2016. Acceptance Rate: 568/2500=22.7%. (Integrated into Intel Nervana.) [paper][code][poster]
- Jingchi Zhang, Wei Wen, Michael Deisher, Hsin-Pai Cheng, Hai Li, Yiran Chen, “Learning Efficient Sparse Structures in Speech Recognition”, International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019.
- Hsin-Pai Cheng, Yuanjun Huang, Xuyang Guo, Feng Yan, Yifei Huang, Wei Wen, Hai Li, Yiran Chen, “Differentiable Fine-grained Quantization for Deep Neural Network Compression”, NeurIPS 2018 CDNNRIA Workshop. [paper]
- Jongsoo Park, Sheng Li, Wei Wen, Ping Tak Peter Tang, Hai Li, Yiran Chen, Pradeep Dubey, “Faster CNNs with Direct Sparse Convolutions and Guided Pruning”, the 5th International Conference on Learning Representations (ICLR), 2017. [paper][code][media]
- Chunpeng Wu, Wei Wen, Yiran Chen, Hai Li, “A Compact DNN: Approaching GoogLeNet-Level Accuracy of Classification and Domain Adaptation”, CVPR, 2017. [paper]
- Yandan Wang, Wei Wen, Linghao Song, Hai Li, “Classification Accuracy Improvement for Neuromorphic Computing Systems with One-level Precision Synapses”, ASP-DAC, 2017. (Best Paper Award). [paper]
- Wei Wen, Chunpeng Wu, Yandan Wang, Kent Nixon, Qing Wu, Mark Barnell, Hai Li, Yiran Chen, “A New Learning Method for Inference Accuracy, Core Occupation, and Performance Co-optimization on TrueNorth Chip”, 53rd ACM/EDAC/IEEE Design Automation Conference (DAC), 2016. Acceptance Rate: 152/876=17.4%. (Best Paper Nomination, 16/876=1.83%). [paper]
- Wei Wen, Chi-Ruo Wu, Xiaofang Hu, Beiye Liu, Tsung-Yi Ho, Xin Li, Yiran Chen, “An EDA Framework for Large Scale Hybrid Neuromorphic Computing Systems”, 52nd ACM/EDAC/IEEE Design Automation Conference (DAC), 2015. Acceptance Rate: 162/789=20.5%. (Best Paper Nomination, 7/789=0.89%). [paper]
- Yandan Wang, Wei Wen, Beiye Liu, Donald Chiarulli, Hai Li, “Group Scissor: Scaling Neuromorphic Computing Design to Big Neural Networks”, 54th ACM/EDAC/IEEE Design Automation Conference (DAC), 2017. Acceptance Rate: 24%. [paper]
- Jongsoo Park, Sheng R. Li, Wei Wen, Hai Li, Yiran Chen, Pradeep Dubey, “Holistic SparseCNN: Forging the Trident of Accuracy, Speed, and Size”, arXiv 1608.01409, 2016. (in Intel Developer Forum 2016, pages 41-43). [paper][code]
Talks and Presentations
- UC Berkeley, Scientific Computing and Matrix Computations Seminar, “On Matrix Sparsification and Quantization for Efficient and Scalable Deep Learning”, 10/10/2018
- Cornell University, AI Seminar, “Efficient and Scalable Deep Learning”, 10/05/2018
- NIPS 2017 oral presentation, “TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning”, 12/06/2017
- Alibaba DAMO Academy, “Deep Learning in Cloud-Edge AI Systems”, Sunnyvale, CA, 06/28/2018
- “Deep Learning in the Cloud and in the Fog”, [Blog@AI科技评论]
- “Deep Learning in Cloud-Edge AI Systems”, [Video in Mandarin @将门创投]
- “Lifting Efficiency in Deep Learning – For both Training and Inference”, [Video in Mandarin @机器之心]
- “Scalable Event-driven Neuromorphic Learning Machines 3”, Intel Strategic Research Alliances (ISRA) with UC Berkeley, UC Irvine, University of Pittsburgh, and UCSD, 10/27/2016
- “A Predictive Performance Model of Distributed Deep Learning on Heterogeneous Systems”, Final Intern Talk, HP Labs, 08/23/2016
- “Variation-Aware Predictive Performance Model for Distributed Deep Learning”, Summer Intern Fair Poster, HP Labs, 08/02/2016
- “An Overview of Deep Learning Accelerator”, Seminar, HP Labs, 07/18/2016
Activities
- Reviewer for NeurIPS, ICML, ICLR, CVPR, ICCV, TPAMI, TNNLS, TCAD, Neurocomputing, TCBB, ICME, etc.
- Activity volunteer, Machine Learning for Girls, FEMMES (Female Excelling More in Math, Engineering, and Science) Capstone at Duke University, 02/2018
- Conference volunteer, ESWEEK 2016, Pittsburgh, PA, USA, 10/2016
Teaching
- TA: CEE 690/ECE 590: Introduction to Deep Learning, Duke University, Fall 2018
- TA: STA561/COMPSCI571/ECE682: Probabilistic Machine Learning, Duke University, Spring 2019
Education
- Ph.D. in Electrical and Computer Engineering, Duke University, 08/2014-12/2019 (Expected)
- First three years at the University of Pittsburgh; moved to Duke with my advisors.
- M.S. in Electronic and Information Engineering, Beihang University, Beijing, China, 09/2010-01/2013
- B.S. in Electronic and Information Engineering, Beihang University, Beijing, China, 09/2006-07/2010