Large-Scale Deep Learning in Heterogeneous Distributed Systems (Xiaowen Chu et al.)

A key driving force behind the success of deep learning is the growing computing power of multi-core and many-core processors such as GPUs, FPGAs, and ASICs. As training data sizes and the complexity of deep neural networks grow, efficiently utilizing the limited, expensive, and shared computing and communication resources of a heterogeneous distributed system to support large-scale deep learning tasks from different users has become an important issue for cloud service providers. Our ultimate goal is to make deep learning tasks as fast as possible by (1) exploiting the hardware's potential to the limit; (2) optimizing the related software components; and (3) designing smart resource allocation and task scheduling for multiple simultaneous deep learning tasks.
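As a toy illustration of point (3), the sketch below (a hypothetical example of ours, not the group's actual scheduler) uses a longest-processing-time greedy heuristic to assign deep learning jobs to heterogeneous devices with differing speed factors; all job times and speeds are made-up numbers.

```python
def greedy_schedule(job_times, device_speeds):
    """Greedily assign jobs (given as base runtimes) to heterogeneous
    devices (given as relative speed factors), longest job first.

    A job with base runtime t takes t / speed on a device; each job is
    placed on the device that yields the earliest finish time.
    Returns the (job, device) assignment and the resulting makespan.
    """
    loads = [0.0] * len(device_speeds)  # accumulated busy time per device
    assignment = []
    # Longest-processing-time-first order: a classic greedy heuristic
    # for makespan minimization on machines of different speeds.
    for j, t in sorted(enumerate(job_times), key=lambda x: -x[1]):
        best = min(range(len(device_speeds)),
                   key=lambda d: loads[d] + t / device_speeds[d])
        loads[best] += t / device_speeds[best]
        assignment.append((j, best))
    return assignment, max(loads)

# Example: four jobs, one fast GPU (speed 2.0) and one slow GPU (speed 1.0).
assignment, makespan = greedy_schedule([4, 3, 2, 1], [2.0, 1.0])
print(assignment, makespan)
```

In a real system the per-device runtimes would come from profiling or performance models (as in the benchmarking work below), and the objective might trade off makespan against energy under DVFS, but the greedy structure is the same.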


Our Impact:

Selected Publications:

  1. S. Shi, Q. Wang, P. Xu, and X.-W. Chu, “Benchmarking State-of-the-Art Deep Learning Software Tools,” arXiv:1608.07249, 2016.
  2. S. Shi, P. Xu, and X.-W. Chu, “Supervised Learning Based Algorithm Selection for Deep Neural Networks,” IEEE ICPADS 2017 (International Conference on Parallel and Distributed Systems), Shenzhen, China, Dec. 2017.
  3. Q. Wang and X.-W. Chu, “GPGPU Power Estimation with Core and Memory Frequency Scaling,” GreenMetrics 2017, in conjunction with ACM Sigmetrics 2017, Urbana-Champaign, USA, June 2017. (To appear in ACM Performance Evaluation Review.)
  4. V. Chau, X.-W. Chu, H. Liu, and Y.-W. Leung, “Energy Efficient Job Scheduling with DVFS for CPU-GPU Heterogeneous Systems,” ACM e-Energy 2017, Hong Kong, May 2017.
  5. X. Mei, X.-W. Chu, Y.-W. Leung, H. Liu, and Z. Li, “Energy Efficient Real-time Task Scheduling on CPU-GPU Hybrid Clusters,” IEEE Infocom 2017, Atlanta, GA, USA, May 1-4, 2017.
  6. X. Mei, Q. Wang, and X.-W. Chu, “A Survey and Measurement Study of GPU DVFS on Energy Conservation,” Digital Communications and Networks, Vol. 3, No. 2, Pages 89-100, May 2017.
  7. X. Mei and X.-W. Chu, “Dissecting GPU Memory Hierarchy through Microbenchmarking,” IEEE Transactions on Parallel and Distributed Systems, Vol. 28, No. 1, Pages 72-86, Jan. 2017. (An earlier short version was presented at IFIP NPC 2014.)
  8. X. Mei, L. Yung, K. Zhao, and X.-W. Chu, “A Measurement Study of GPU DVFS on Energy Conservation,” USENIX HotPower’13, co-located with the 24th ACM Symposium on Operating Systems Principles (SOSP), Pennsylvania, USA, November 2013.

For further information on this research topic, please contact Dr. Xiaowen Chu.