The source code and experimental data for

Benchmarking State-of-the-Art Deep Learning Software Tools

(Version 4, 10 September 2016)


Declaration: The runtime performance of each software tool depends not only on the hardware platform, but also on the third-party libraries and the network configuration files. Our results reflect only the performance of the tested networks with the associated configuration files and the specified third-party libraries on our testing machines; they do not necessarily represent the best performance achievable by each tool.


List of Currently Tested Software Tools

Caffe, CNTK, TensorFlow, Torch


List of Currently Tested Neural Networks


Fully Connected Networks (FCNs)

Caffe

Source Code: CaffeFCNs.zip
Usage example: caffe train -solver=ffn26752-b32-solver-GPU.prototxt -gpu=0

Data: created by the script createFakeDataForCaffeFCN.py
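
The generator script itself is only distributed inside the zip archive above; as a rough illustration, the following is a minimal sketch of how fake FCN training data for Caffe could be produced, assuming an HDF5Data input layer. The sample count, feature width (guessed from the "ffn26752" prototxt name) and class count are illustrative assumptions, and the actual createFakeDataForCaffeFCN.py may use a different format or different dimensions.

import numpy as np
import h5py

num_samples = 1024     # assumed number of fake samples
feature_dim = 26752    # assumed input width, guessed from the prototxt name
num_classes = 1000     # assumed number of output classes

# Random features and labels; Caffe's HDF5Data layer expects float data.
features = np.random.randn(num_samples, feature_dim).astype(np.float32)
labels = np.random.randint(0, num_classes, size=num_samples).astype(np.float32)

with h5py.File('fcn_fake_train.h5', 'w') as f:
    f.create_dataset('data', data=features)
    f.create_dataset('label', data=labels)

# The HDF5Data layer reads a text file that lists the HDF5 files to load.
with open('fcn_fake_train.txt', 'w') as f:
    f.write('fcn_fake_train.h5\n')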

Revision History:
1. Use "caffe train" instead of "caffe time" to measure the time of each mini-batch.

CNTK

Source Code: CNTKFCNs.zip
Usage example: cntk configFile=fcn5.cntk configName=fcn5 deviceId=0 minibatch=32

Data: created by the script createDataForCNTKFCN.py
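
As with the Caffe case above, the script is only distributed inside the zip archive; the sketch below illustrates one way random FCN samples could be written in CNTK text format (one "|labels ... |features ..." record per line). The dimensions are illustrative assumptions, and the actual createDataForCNTKFCN.py may target a different reader or format.

import numpy as np

num_samples = 1024    # assumed number of fake samples
feature_dim = 512     # assumed input width
num_classes = 1000    # assumed number of output classes

with open('fcn_fake_train.txt', 'w') as f:
    for _ in range(num_samples):
        features = np.random.rand(feature_dim)
        label = np.random.randint(num_classes)
        one_hot = ['1' if i == label else '0' for i in range(num_classes)]
        f.write('|labels ' + ' '.join(one_hot) +
                ' |features ' + ' '.join('%.4f' % v for v in features) + '\n')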

TensorFlow

Source Code: TensorFlowFCNs.zip
Usage example: python fcn5.py -e 2 -b 32 -i 4 -d 0

Data: randomly generated in the source code.
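
For the TensorFlow and Torch FCN benchmarks there is no separate data file: input batches are synthesized at run time. The sketch below shows, schematically, what this means and how the time of each mini-batch can be measured; the shapes, the train_step placeholder and the flag interpretation (-b batch size, -i iterations) are assumptions for illustration, not the actual contents of fcn5.py.

import time
import numpy as np

batch_size = 32       # assumed meaning of the -b flag in the usage example
iterations = 4        # assumed meaning of the -i flag in the usage example
feature_dim = 512     # assumed input width
num_classes = 1000    # assumed number of output classes

def train_step(x, y):
    """Hypothetical stand-in for one forward/backward pass of the FCN."""
    pass

# One random mini-batch per iteration; the wall-clock time of each step is
# recorded, which gives the average time per mini-batch.
times = []
for _ in range(iterations):
    x = np.random.randn(batch_size, feature_dim).astype(np.float32)
    y = np.random.randint(0, num_classes, size=batch_size)
    start = time.time()
    train_step(x, y)
    times.append(time.time() - start)

print('average time per mini-batch: %.4f s' % (sum(times) / len(times)))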

Torch

Source Code: TorchFCNs.zip
Usage example: th fcn5.lua -nGPU 1 -deviceId 0 -batchSize 32 -nEpochs 2 -nIterations 4

Data: randomly generated in the source code.

Convolutional Neural Networks (CNNs)

AlexNet

Caffe

Source Code: CaffeAlexNet.zip
Usage example: caffe train -solver=alexnet-b256-solver-GPU.prototxt -iterations=10 -gpu=0

Data: created by the script createFakeImageNetForCaffeCNN.py
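
The fake ImageNet generator is likewise only included in the zip archive; as a rough illustration, the sketch below writes random 256x256 RGB JPEGs together with a "filename label" list file of the kind Caffe's ImageData layer consumes. The image count, size and label range are assumptions, and the actual createFakeImageNetForCaffeCNN.py may instead build an LMDB database.

import os
import numpy as np
from PIL import Image

num_images = 256      # assumed number of fake images
num_classes = 1000    # assumed ImageNet-style label range
out_dir = 'fake_imagenet'
os.makedirs(out_dir, exist_ok=True)

with open('fake_imagenet_train.txt', 'w') as listing:
    for i in range(num_images):
        # Random 256x256 RGB image saved as JPEG.
        pixels = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)
        path = os.path.join(out_dir, 'img_%05d.jpg' % i)
        Image.fromarray(pixels).save(path)
        listing.write('%s %d\n' % (path, np.random.randint(num_classes)))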

Revision History:
1. Use "caffe train" instead of "caffe time" to measure the time of each mini-batch.

CNTK

Source Code: CNTKAlexNet.zip
Usage example: cntk configFile=alexnet.cntk configName=alexnet deviceId=0 minibatchSize=256 epochSize=2560 maxEpochs=4

Data: created by the scripts createFakeImageNetDataForCNTKCNN.py and createLabelMapForCNTKCNN.py

Revision History:
1. Add the option "prefetch=true" to the configuration file, which may result in slightly better performance.

TensorFlow

Source Code: TensorFlowAlexNet.zip
Usage example: python alexnetbm.py -e 4 -b 256 -i 10 -d 0

Data: randomly generated in the source code.

Torch

Source Code: TorchAlexNet.zip
Usage example: th alexnetbm.lua -nGPU 1 -deviceId 0 -batchSize 256 -nEpochs 4 -nIterations 10

Data: randomly generated in the source code.

ResNet-50

Caffe

Source Code: CaffeResNet.zip
Usage example: caffe train -solver=resnet-b32-solver-GPU.prototxt -iterations=4 -gpu=0

Data: the same as Caffe AlexNet data.

CNTK

Source Code: CNTKResNet.zip
Usage example: cntk configFile=resnet.cntk configName=resnet deviceId=0 minibatchSize=32 epochSize=2 maxEpochs=2

Data: the same as CNTK AlexNet data.

Revision History:
1. Add the option "prefetch=true" to the configuration file, which may result in slightly better performance.

TensorFlow

Source Code: TensorFlowResNet.zip
Usage example: python resnetbm.py -e 2 -b 32 -i 4 -d 0

Data: randomly generated in the source code.

Revision History:
1. Correct a bug that caused a smaller network than intended to be run.

Torch

Source Code: TorchResNet.zip
Usage example: th resnetbm.lua -depth 50 -nGPU 1 -deviceId 0 -batchSize 256 -nEpochs 4 -nIterations 10 -dataset imagenet

Data: randomly generated in the source code.

Recurrent Neural Networks (RNNs)

LSTM

CNTK

Source Code: CNTKLSTM.zip
Usage example: cntk configFile=lstm.cntk configName=lstm deviceId=0 minibatchSize=256 epochSize=8192 maxEpochs=1

Data: the Penn Treebank data distributed with CNTK.

TensorFlow

Source Code: TensorFlowLSTM.zip
Usage example: python lstm.py --batchsize 256 --iters 10 --seqlen 32 --numlayer 2 --hiddensize 256 --device 0 --data_path ~/data/PennTreeBank

Data: the Penn Treebank data distributed with CNTK.
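
Unlike the other benchmarks, the LSTM tests read real text (the Penn Treebank corpus shipped with CNTK) rather than synthetic data. The sketch below shows one conventional way such text is turned into (batchsize, seqlen) mini-batches of word ids; the file name and preprocessing details are assumptions for illustration, and the actual lstm.py may handle the data differently.

import collections
import numpy as np

batchsize, seqlen = 256, 32          # match the usage example above

with open('ptb.train.txt') as f:     # assumed file name inside --data_path
    words = f.read().replace('\n', ' <eos> ').split()

# Map each word to an integer id, most frequent words first.
vocab = {w: i for i, (w, _) in enumerate(collections.Counter(words).most_common())}
ids = np.array([vocab[w] for w in words], dtype=np.int32)

# Drop the tail so the data reshapes evenly, then cut into fixed-length batches.
num_batches = len(ids) // (batchsize * seqlen)
ids = ids[:num_batches * batchsize * seqlen]
lanes = ids.reshape(batchsize, -1)   # one long row of word ids per batch lane
inputs = [lanes[:, i * seqlen:(i + 1) * seqlen] for i in range(num_batches)]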

Torch

Source Code: TorchLSTM.zip
Usage example: th lstm.lua --seqlen 32 --batchsize 256 --iters 10 --hiddensize {256,256} --cuda --lstm --startlr 1 --cutoff 5 --maxepoch 1 --device 0

Data: generated by the source code.

Acknowledgements

Previous versions (obsolete)