I’m Wei Wen (温伟), a Ph.D. student at Duke University, supervised by Dr. Hai Li and Dr. Yiran Chen. My current research focuses on scalable and efficient machine learning for cloud-edge AI systems. I’m also interested in machine learning in general.
Recently, I have worked on distributed training and efficient inference methods for deep learning. More specifically, I proposed TernGrad SGD to overcome the communication bottleneck in distributed deep learning, and worked on optimization methods to learn structurally sparse and lower-rank deep neural networks for faster inference.
I have interned at Facebook Research in Menlo Park, Microsoft Research in Redmond, and HP Labs in Palo Alto.
- Wei Wen, Yandan Wang, Feng Yan, Cong Xu, Yiran Chen, Hai Li, “SmoothOut: Smoothing Out Sharp Minima for Generalization in Large-Batch Deep Learning”, preprint. [paper][code]
- Wei Wen, Yuxiong He, Samyam Rajbhandari, Minjia Zhang, Wenhan Wang, Fang Liu, Bin Hu, Yiran Chen, Hai Li, “Learning Intrinsic Sparse Structures within Long Short-Term Memory”, the 6th International Conference on Learning Representations (ICLR), 2018. [poster][paper][code]
- Wei Wen, Cong Xu, Feng Yan, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li, “TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning”, the 31st Annual Conference on Neural Information Processing Systems (NIPS), 2017. (Oral, 40/3240=1.2%). [paper][video][slides][code][poster]
- Wei Wen, Cong Xu, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li, “Coordinating Filters for Faster Deep Neural Networks”, Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. [paper][code][poster]
- Wei Wen, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li, “Learning Structured Sparsity in Deep Neural Networks”, the 30th Annual Conference on Neural Information Processing Systems (NIPS), 2016. Acceptance Rate: 568/2500=22.7%. (Integrated into Intel Nervana) [paper][code][poster]
- Jongsoo Park, Sheng Li, Wei Wen, Ping Tak Peter Tang, Hai Li, Yiran Chen, Pradeep Dubey, “Faster CNNs with Direct Sparse Convolutions and Guided Pruning”, the 5th International Conference on Learning Representations (ICLR), 2017. [paper][code]
- Chunpeng Wu, Wei Wen, Yiran Chen, Hai Li, “A Compact DNN: Approaching GoogLeNet-Level Accuracy of Classification and Domain Adaptation”, CVPR, 2017. [paper]
- Yandan Wang, Wei Wen, Linghao Song, Hai Li, “Classification Accuracy Improvement for Neuromorphic Computing Systems with One-level Precision Synapses”, ASP-DAC, 2017. (Best Paper Award). [paper]
- Wei Wen, Chunpeng Wu, Yandan Wang, Kent Nixon, Qing Wu, Mark Barnell, Hai Li, Yiran Chen, “A New Learning Method for Inference Accuracy, Core Occupation, and Performance Co-optimization on TrueNorth Chip”, 53rd ACM/EDAC/IEEE Design Automation Conference (DAC), 2016. Acceptance Rate: 152/876=17.4%. (Best Paper Nomination, 16/876=1.83%). [paper]
- Wei Wen, Chi-Ruo Wu, Xiaofang Hu, Beiye Liu, Tsung-Yi Ho, Xin Li, Yiran Chen, “An EDA Framework for Large Scale Hybrid Neuromorphic Computing Systems”, 52nd ACM/EDAC/IEEE Design Automation Conference (DAC), 2015. Acceptance Rate: 162/789=20.5%. (Best Paper Nomination, 7/789=0.89%). [paper]
- Yandan Wang, Wei Wen, Beiye Liu, Donald Chiarulli, Hai Li, “Group Scissor: Scaling Neuromorphic Computing Design to Big Neural Networks”, 54th ACM/EDAC/IEEE Design Automation Conference (DAC), 2017. Acceptance Rate: 24%. [paper]
- Jongsoo Park, Sheng R. Li, Wei Wen, Hai Li, Yiran Chen, Pradeep Dubey, “Holistic SparseCNN: Forging the Trident of Accuracy, Speed, and Size”, arXiv 1608.01409, 2016. (in Intel Developer Forum 2016, pages 41-43). [paper][code]
Talks and Presentations
- “Deep Learning in Cloud-Edge AI Systems”, Alibaba DAMO Academy, Sunnyvale, CA, 06/28/2018
- “Deep Learning in the Cloud and in the Fog”, [Blog@AI科技评论]
- “Deep Learning in Cloud-Edge AI Systems”, [Video in Mandarin @将门创投]
- “Lifting Efficiency in Deep Learning – For both Training and Inference”, [Video in Mandarin @机器之心]
- “Scalable Event-driven Neuromorphic Learning Machines 3”, Intel Strategic Research Alliances (ISRA) – UC Berkeley, UC Irvine, Univ of Pitt, UCSD, 10/27/2016
- “A Predictive Performance Model of Distributed Deep Learning on Heterogeneous Systems”, Final Intern Talk, HP Labs, 08/23/2016
- “Variation-Aware Predictive Performance Model for Distributed Deep Learning”, Summer Intern Fair Poster, HP Labs, 08/02/2016
- “An Overview of Deep Learning Accelerator”, Seminar, HP Labs, 07/18/2016
- Paper reviewer, Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 05/2018
- Paper reviewer, Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 04/2018
- Activity volunteer, Machine Learning for Girls, FEMMES (Females Excelling More in Math, Engineering, and Science) Capstone at Duke University, 02/2018
- Paper reviewer, Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 02/2018
- Paper reviewer, IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 01/2018
- Paper reviewer, IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 08/2017
- Conference volunteer, ESWEEK 2016, October 2-7, Pittsburgh, PA, USA, 10/2016
- Paper reviewer, NIPS 2016
Ph.D. in Electrical and Computer Engineering, Duke University, Durham, NC, United States
09/2014-08/2017 (Transferred to Duke University)
Ph.D. in Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA, United States
Master in Electronic and Information Engineering, Beihang University, Beijing, China
Bachelor in Electronic and Information Engineering, Beihang University, Beijing, China
Employment & Internship
More on LinkedIn.
Facebook Research – Caffe2 & Applied Machine Learning & AI Infra, Menlo Park, CA, USA, 05/2018-08/2018
Research Intern, Mentor: Yangqing Jia
– Caffe2/PyTorch 1.0
– Distributed Machine Learning
Microsoft Research – Web Search and AI, Redmond & Bellevue, WA, USA, 05/2017-07/2017
Summer Intern, Mentors: Yuxiong He & Fang Liu
– Machine Reading Comprehension
– Recurrent Neural Nets
HP Labs – Platform Architecture Group, Palo Alto, CA, USA, 05/31/2016-08/31/2016
Summer Intern, Mentor: Cong Xu, Manager: Paolo Faraboschi
– Worked on distributed deep learning.
Agricultural Bank of China – Software Development Center, Beijing, 07/2013-07/2014
Software Developer, Supervisor: Mr. Lei Fan
– Developed web services for online bank transactions.
Microsoft Research – Mobile and Sensing Systems Group, Beijing, China, 04/2013-06/2013
Research Intern, Supervisor: Dr. Guobin Shen
– Worked on computer vision to build a mobile system that generates a frontal view for the user even when the user is at a slant viewing angle.
Tencent Inc. – Advertising Platform and Products Division, Beijing, China, 07/2012-09/2012
Summer Intern, Software Developer, Supervisor: Mr. Yanan Zhao
– Developed a Java web spider to crawl webpages from the Internet;
– Developed an HTML parser;
– Developed MVC-framework-based advertising websites.