Benchmarking State-of-the-Art Deep Learning Software Tools

The source code and experimental data of

Benchmarking State-of-the-Art Deep Learning Software Tools

(Version 3, 4 September 2016)

Declaration: The runtime performance of each software tool depends not only on the hardware platform, but also on the third-party libraries and the network configuration files. Our results reflect only the performance of the tested networks with the associated configuration files and the specified third-party libraries on our testing machines; they are not necessarily the best performance that each software tool can achieve.

Update <07 Sept 2016>: The CNTK team reported a bug in our ResNet-50 network configuration. The TensorFlow results on ResNet-50 reported in this Version 3 are not correct. We will release the corrected version very soon.

List of Currently Tested Software Tools

List of Currently Tested Neural Networks

Fully Connected Networks
Convolutional Neural Networks: AlexNet, ResNet-50

Recurrent Neural Networks

Fully Connected Networks (FCNs)

Caffe

Source Code: CaffeFCNs.zip
Usage example: caffe time -model=fcn5-b32.prototxt -gpu=0

Data: created by the script: createFakeDataForCaffeFCN.py

CNTK

Source Code: CNTKFCNs.zip
Usage example: cntk configFile=fcn5.cntk configName=fcn5 deviceId=0 minibatch=32

Data: created by the script: createDataForCNTKFCN.py

TensorFlow

Source Code: TensorFlowFCNs.zip
Usage example: python fcn5.py -e 2 -b 32 -i 4 -d 0

Data: randomly generated in the source code.

Revision History:
1. Set the input data to be generated on the CPU side, the same as in the other tools.

Torch

Source Code: TorchFCNs.zip
Usage example: th fcn5.lua -nGPU 1 -deviceId 0 -batchSize 32 -nEpochs 2 -nIterations 4

Data: randomly generated in the source code.

Convolutional Neural Networks (CNNs)

AlexNet

Caffe

Source Code: CaffeAlexNet.zip
Usage example: caffe time -model=alexnet-b256.prototxt -iterations=10 -gpu=0

Data: created by the script: createFakeImageNetForCaffeCNN.py

Revision History:
1. Remove the two dropout operations from the network.

CNTK

Source Code: CNTKAlexNet.zip
Usage example: cntk configFile=alexnet.cntk configName=alexnet deviceId=0 minibatchSize=256 epochSize=2560 maxEpochs=4

Data: created by the scripts: createFakeImageNetDataForCNTKCNN.py and createLabelMapForCNTKCNN.py

Revision History:
1. Remove the dropout operations from the network.
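
As a rough check on the CNTK AlexNet usage example above: assuming epochSize counts samples per epoch, the settings minibatchSize=256 and epochSize=2560 give 10 minibatches per epoch, which appears to line up with the -iterations=10 and -i 10 settings used for Caffe and TensorFlow. The helper name below is hypothetical and is not part of CNTKAlexNet.zip.

```python
def minibatches_per_epoch(epoch_size, minibatch_size):
    """Number of minibatches needed to cover epoch_size samples.

    Assumption: epoch_size is a sample count, so a partial final
    minibatch still counts as one minibatch (ceiling division).
    """
    if epoch_size <= 0 or minibatch_size <= 0:
        raise ValueError("epoch_size and minibatch_size must be positive")
    return -(-epoch_size // minibatch_size)


# Values taken from the CNTK AlexNet usage example.
per_epoch = minibatches_per_epoch(epoch_size=2560, minibatch_size=256)
total = per_epoch * 4  # maxEpochs=4
print(per_epoch, total)  # 10 40
```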

TensorFlow

Source Code: TensorFlowAlexNet.zip
Usage example: python alexnet.py -e 4 -b 256 -i 10 -d 0

Data: randomly generated in the source code.

Revision History:
1. Set dimension of input data to 224x224x3;
2. Remove all padding settings (i.e., use 'SAME' padding);
3. Revise the output of the 4th convolution layer to 384 instead of 256;
4. Set the input data to be generated on the CPU side.
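
To illustrate what the padding revision changes, the sketch below applies TensorFlow's documented output-size rules for 'SAME' and 'VALID' padding to a 224-pixel input. The 11x11 kernel with stride 4 is an assumption borrowed from the first convolution layer of the original AlexNet; the values in alexnet.py may differ, and the function name is hypothetical.

```python
import math


def conv_output_size(in_size, kernel, stride, padding):
    """Spatial output size of a convolution under TensorFlow's padding rules."""
    if padding == "SAME":
        # Output depends only on the input size and the stride.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # No padding: the kernel must fit entirely inside the input.
        return math.ceil((in_size - kernel + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


# Assumed first-layer settings (11x11 kernel, stride 4) on a 224x224 input.
print(conv_output_size(224, 11, 4, "SAME"))   # 56
print(conv_output_size(224, 11, 4, "VALID"))  # 54
```

Because the two modes yield different feature-map sizes, the padding choice affects the amount of computation in every later layer, which is why it matters for a runtime comparison.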

Torch

Source Code: TorchAlexNet.zip
Usage:

Data: randomly generated in the source code.

Revision History:
1. Remove the dropout operations from the network.

ResNet-50

Caffe

Source Code: CaffeResNet.zip
Usage example: caffe time -model=resnet-b32.prototxt -iterations=4 -gpu=0

Data: the same as Caffe AlexNet data.

CNTK

Source Code: CNTKResNet.zip
Usage example: cntk configFile=resnet.cntk configName=resnet deviceId=0 minibatchSize=32 epochSize=2 maxEpochs=2

Data: the same as CNTK AlexNet data.

Revision History:
1. Add the setting shareNodeValueMatrices=true to the configuration file;
2. Add the setting maxTempMemSizeInSamplesForCNN=1 to the configuration file when there is not enough memory.

TensorFlow

Source Code: There is a bug in our ResNet-50 network configuration. The TensorFlow results on ResNet-50 reported in this Version 3 are not correct. We will release the corrected version very soon.
Usage example: python resnet.py -e 2 -b 32 -i 4 -d 0

Data: randomly generated in the source code.

Revision History:
1. Set dimension of input data to 224x224x3;
2. Set the input data to be generated on the CPU side.
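
For a sense of scale, the sketch below estimates the size of one input minibatch that is generated on the CPU and then handed to the GPU. It assumes 32-bit floating-point values, which the source does not state; the function name is hypothetical.

```python
def input_batch_bytes(batch_size, height=224, width=224, channels=3,
                      bytes_per_value=4):
    """Bytes occupied by one dense input minibatch.

    Assumption: each value is stored as a 4-byte float.
    """
    return batch_size * height * width * channels * bytes_per_value


# Batch size 32, as in the TensorFlow ResNet-50 usage example (-b 32).
size = input_batch_bytes(32)
print(size, size / 2**20)  # 19267584 18.375
```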

Torch

Source Code: TorchResNet.zip
Usage example: th resnet.lua -depth 50 -nGPU 1 -deviceId 0 -batchSize 256 -nEpochs 4 -nIterations 10 -dataset imagenet

Data: randomly generated in the source code.

Revision History:
1. Set dimension of input data to 224x224x3;
2. Set the input data to be generated on the CPU side.

Recurrent Neural Networks (RNNs)

LSTM

CNTK

Source Code: CNTKLSTM.zip
Usage example: cntk configFile=lstm.cntk configName=lstm deviceId=0 minibatchSize=256 epochSize=8192 maxEpochs=1

Data: from CNTK.

Revision History:
1. Remove the extra LSTM classification task from the configuration file;
2. Use a customized BrainScript to perform the LSTM operation.

TensorFlow

Source Code: TensorFlowLSTM.zip
Usage example: python lstm.py --batchsize 256 --iters 10 --seqlen 32 --numlayer 2 --hiddensize 256 --device 0 --data_path ~/data/PennTreeBank

Data: from CNTK.

Torch

Source Code: TorchLSTM.zip
Usage example: th lstm.lua --seqlen 32 --batchsize 256 --iters 10 --hiddensize {256,256} --cuda --lstm --startlr 1 --cutoff 5 --maxepoch 1 --device 0

Data: generated by the source code.

Acknowledgements

Alexey-Kamenev: https://github.com/Alexey-Kamenev/Benchmarks.
Soumith: https://github.com/soumith/convnet-benchmarks.
Glample: https://github.com/glample/rnn-benchmarks.
The CNTK Team, for providing feedback and configuration files.

Previous versions (obsolete)

Version 2