We measured the raw coding throughput of G-CRS. The related code and experimental results are provided in this section.
Section 3.1 Introduction
We evaluate the effectiveness of our optimization strategies by adding them one at a time; the code is in EffectivenessOfOpt.zip. We then identify the dominating factor, i.e., the strategy that contributes most to the performance improvement, by removing one optimization strategy at a time; the code is in DominatedFactor.zip.
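The two experimental designs can be sketched as configuration lists. This is only an illustration: the strategy labels are taken from the row names of the result tables in Sections 3.3 and 3.4, the cumulative order is inferred from the row order of those tables, and the function names are hypothetical and do not come from the zip archives.

```python
# Strategy labels as they appear in the result tables (order inferred from Section 3.3).
OPTS = ["Bitmatrix", "EfficientAccess", "Bandwidth", "DivergenceOpt"]


def cumulative_configs(opts):
    """Section 3.3 design: start with no strategy and add one more each step."""
    return [opts[:i] for i in range(len(opts) + 1)]


def leave_one_out_configs(opts):
    """Section 3.4 design: the full set with exactly one strategy removed."""
    return [[o for o in opts if o != removed] for removed in opts]


if __name__ == "__main__":
    for cfg in cumulative_configs(OPTS):
        print("add-one-by-one:", cfg or ["(none, i.e. Base)"])
    for removed, cfg in zip(OPTS, leave_one_out_configs(OPTS)):
        print(f"-{removed}:", cfg)
```

Under this reading, the last cumulative configuration (all four strategies enabled) corresponds to the G-CRS row, and the configuration with only DivergenceOpt removed corresponds to the Bandwidth row of Section 3.3.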
Section 3.2 Source Code
Source Code: EffectivenessOfOpt.zip
Source Code: DominatedFactor.zip
Please see README.md for instructions on running the program.
Section 3.3 Experimental Results for Effectiveness of Our Optimization Strategies
Results of encoding on Maxwell GTX 980. Throughput is reported in GB/s in every table.
k=10, m=1
w | 4 | 5 | 6 | 7 | 8 |
Base | 21.00 | 16.64 | 16.88 | 16.61 | 16.79 |
Bitmatrix | 108.76 | 106.49 | 108.55 | 103.54 | 111.07 |
EfficientAccess | 144.28 | 142.29 | 142.73 | 138.00 | 137.65 |
Bandwidth | 145.31 | 143.07 | 143.60 | 141.28 | 141.26 |
G-CRS | 145.19 | 144.70 | 145.38 | 145.69 | 145.22 |
k=10, m=2
w | 4 | 5 | 6 | 7 | 8 |
Base | 16.38 | 15.40 | 15.32 | 14.60 | 14.66 |
Bitmatrix | 52.45 | 51.26 | 52.75 | 50.08 | 54.10 |
EfficientAccess | 99.48 | 86.37 | 83.40 | 74.42 | 74.01 |
Bandwidth | 132.22 | 130.96 | 129.12 | 116.08 | 110.60 |
G-CRS | 132.15 | 131.07 | 131.74 | 131.83 | 132.10 |
k=10, m=3
w | 4 | 5 | 6 | 7 | 8 |
Base | 14.78 | 14.11 | 14.01 | 13.24 | 12.95 |
Bitmatrix | 34.50 | 33.18 | 33.97 | 31.80 | 34.81 |
EfficientAccess | 65.25 | 56.92 | 55.04 | 49.15 | 48.77 |
Bandwidth | 121.29 | 107.68 | 99.31 | 84.65 | 77.53 |
G-CRS | 121.18 | 119.94 | 120.69 | 115.97 | 108.48 |
k=10, m=4
w | 4 | 5 | 6 | 7 | 8 |
Base | 13.30 | 12.55 | 11.99 | 11.11 | 8.01 |
Bitmatrix | 25.81 | 24.26 | 24.87 | 23.04 | 25.96 |
EfficientAccess | 49.00 | 42.78 | 41.25 | 36.79 | 36.27 |
Bandwidth | 92.05 | 75.59 | 65.75 | 57.35 | 51.56 |
G-CRS | 111.86 | 110.89 | 104.07 | 88.64 | 82.62 |
Results on Pascal Titan X. Throughput is reported in GB/s in every table.
k=10, m=1
w | 4 | 5 | 6 | 7 | 8 |
Base | 27.30 | 23.24 | 26.13 | 25.75 | 26.19 |
Bitmatrix | 247.81 | 226.05 | 219.92 | 198.12 | 250.64 |
EfficientAccess | 275.92 | 268.05 | 262.76 | 259.54 | 277.43 |
Bandwidth | 276.33 | 269.67 | 263.55 | 260.41 | 277.06 |
G-CRS | 278.77 | 272.38 | 267.65 | 267.04 | 279.52 |
k=10, m=2
w | 4 | 5 | 6 | 7 | 8 |
Base | 25.87 | 24.60 | 24.81 | 24.26 | 24.39 |
Bitmatrix | 142.74 | 129.84 | 129.95 | 119.73 | 134.80 |
EfficientAccess | 214.25 | 189.66 | 184.63 | 173.71 | 174.02 |
Bandwidth | 254.54 | 244.07 | 237.00 | 231.69 | 244.75 |
G-CRS | 256.23 | 249.15 | 245.73 | 243.33 | 257.80 |
k=10, m=3
w | 4 | 5 | 6 | 7 | 8 |
Base | 21.00 | 16.64 | 16.88 | 16.61 | 16.79 |
Bitmatrix | 93.20 | 83.97 | 85.40 | 77.84 | 86.44 |
EfficientAccess | 147.76 | 132.20 | 128.17 | 116.88 | 115.22 |
Bandwidth | 237.42 | 217.59 | 210.35 | 197.95 | 186.95 |
G-CRS | 240.68 | 233.10 | 227.06 | 223.70 | 238.60 |
k=10, m=4
w | 4 | 5 | 6 | 7 | 8 |
Base | 17.87 | 17.06 | 19.35 | 19.72 | 13.17 |
Bitmatrix | 69.89 | 62.69 | 63.26 | 57.95 | 65.93 |
EfficientAccess | 112.99 | 100.39 | 96.80 | 88.05 | 85.91 |
Bandwidth | 216.47 | 185.47 | 163.45 | 143.10 | 128.80 |
G-CRS | 228.91 | 220.57 | 213.31 | 203.72 | 200.66 |
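To read the tables above as speedups rather than raw throughput, each row can be divided by the Base row (overall speedup) or by the row before it (gain of the newly added strategy). The sketch below does this for the Pascal Titan X table with k=10, m=4; the numbers are copied from that table, while the function names are illustrative assumptions.

```python
# Throughput in GB/s, copied from the Pascal Titan X table for k=10, m=4.
W = [4, 5, 6, 7, 8]
STAGES = {
    "Base":            [17.87, 17.06, 19.35, 19.72, 13.17],
    "Bitmatrix":       [69.89, 62.69, 63.26, 57.95, 65.93],
    "EfficientAccess": [112.99, 100.39, 96.80, 88.05, 85.91],
    "Bandwidth":       [216.47, 185.47, 163.45, 143.10, 128.80],
    "G-CRS":           [228.91, 220.57, 213.31, 203.72, 200.66],
}


def speedup_over_base(stages):
    """Ratio of every row to the Base row, column by column."""
    base = stages["Base"]
    return {name: [t / b for t, b in zip(row, base)] for name, row in stages.items()}


def step_gain(stages):
    """Ratio of every row to the row directly above it (gain of the added strategy)."""
    names = list(stages)
    return {
        cur: [c / p for c, p in zip(stages[cur], stages[prev])]
        for prev, cur in zip(names, names[1:])
    }


if __name__ == "__main__":
    print("w".ljust(16) + " ".join(f"{w:>7d}" for w in W))
    for name, row in speedup_over_base(STAGES).items():
        print(name.ljust(16) + " ".join(f"{s:6.2f}x" for s in row))
```

For w=4, for example, the G-CRS row works out to roughly 12.8 times the Base throughput (228.91 / 17.87).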
Section 3.4 Experimental Results for Dominating Factor
Results of encoding on Maxwell GTX 980. Throughput is reported in GB/s in every table.
k=10, m=1
w | 4 | 5 | 6 | 7 | 8 |
-Bitmatrix | 23.67 | 17.91 | 18.01 | 17.91 | 17.82 |
-EfficientAccess | 103.92 | 100.83 | 104.09 | 98.36 | 110.58 |
-Bandwidth | 144.71 | 144.29 | 145.17 | 143.94 | 144.74 |
-DivergenceOpt | 145.31 | 143.07 | 143.60 | 141.28 | 141.26 |
G-CRS | 145.19 | 144.70 | 145.38 | 145.69 | 145.22 |
k=10, m=2
w | 4 | 5 | 6 | 7 | 8 |
-Bitmatrix | 17.95 | 17.73 | 17.89 | 17.62 | 15.66 |
-EfficientAccess | 104.23 | 100.26 | 100.49 | 95.90 | 109.58 |
-Bandwidth | 127.15 | 111.13 | 118.24 | 86.37 | 111.93 |
-DivergenceOpt | 132.22 | 130.96 | 129.12 | 116.08 | 110.60 |
G-CRS | 132.15 | 131.07 | 131.74 | 131.83 | 132.10 |
k=10, m=3
w | 4 | 5 | 6 | 7 | 8 |
-Bitmatrix | 18.02 | 17.74 | 17.79 | 16.37 | 14.44 |
-EfficientAccess | 107.23 | 97.64 | 91.84 | 79.49 | 80.61 |
-Bandwidth | 94.29 | 72.48 | 79.17 | 55.57 | 74.04 |
-DivergenceOpt | 121.29 | 107.68 | 99.31 | 84.65 | 77.53 |
G-CRS | 121.18 | 119.94 | 120.69 | 115.97 | 108.48 |
k=10, m=4
w | 4 | 5 | 6 | 7 | 8 |
-Bitmatrix | 17.19 | 17.55 | 16.33 | 15.93 | 8.89 |
-EfficientAccess | 102.72 | 98.35 | 52.33 | 54.46 | 69.59 |
-Bandwidth | 73.09 | 53.24 | 59.21 | 40.42 | 55.02 |
-DivergenceOpt | 92.05 | 75.59 | 65.75 | 57.35 | 51.56 |
G-CRS | 111.86 | 110.89 | 104.07 | 88.64 | 82.62 |
Results on Pascal Titan X. Throughput is reported in GB/s in every table.
k=10, m=1
w | 4 | 5 | 6 | 7 | 8 |
-Bitmatrix | 30.36 | 27.46 | 27.65 | 27.31 | 27.26 |
-EfficientAccess | 239.29 | 227.47 | 217.89 | 194.79 | 249.79 |
-Bandwidth | 280.78 | 263.64 | 265.05 | 256.51 | 278.27 |
-DivergenceOpt | 276.33 | 269.67 | 263.55 | 260.41 | 277.06 |
G-CRS | 278.77 | 272.38 | 267.65 | 267.04 | 279.52 |
k=10, m=2
w | 4 | 5 | 6 | 7 | 8 |
-Bitmatrix | 26.88 | 27.10 | 27.45 | 27.03 | 23.91 |
-EfficientAccess | 253.88 | 210.52 | 203.05 | 183.86 | 241.61 |
-Bandwidth | 244.14 | 204.11 | 216.51 | 186.12 | 224.33 |
-DivergenceOpt | 254.54 | 244.07 | 237.00 | 231.69 | 244.75 |
G-CRS | 256.23 | 249.15 | 245.73 | 243.33 | 257.80 |
k=10, m=3
w | 4 | 5 | 6 | 7 | 8 |
-Bitmatrix | 27.12 | 26.75 | 27.23 | 25.43 | 22.16 |
-EfficientAccess | 229.26 | 176.00 | 172.87 | 146.08 | 170.87 |
-Bandwidth | 199.13 | 155.94 | 169.03 | 132.75 | 169.40 |
-DivergenceOpt | 237.42 | 217.59 | 210.35 | 197.95 | 186.95 |
G-CRS | 240.68 | 233.10 | 227.06 | 223.70 | 238.60 |
k=10, m=4
w | 4 | 5 | 6 | 7 | 8 |
-Bitmatrix | 26.13 | 26.72 | 25.46 | 24.95 | 23.09 |
-EfficientAccess | 222.01 | 188.70 | 110.48 | 114.58 | 168.92 |
-Bandwidth | 166.10 | 121.27 | 134.42 | 98.53 | 130.52 |
-DivergenceOpt | 216.47 | 185.47 | 163.45 | 143.10 | 128.80 |
G-CRS | 228.91 | 220.57 | 213.31 | 203.72 | 200.66 |
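One way to rank the strategies from these leave-one-out tables is to compute, for each removed strategy, the fraction of full G-CRS throughput that is lost. The sketch below applies this to the Pascal Titan X table with k=10, m=4. The data are copied from that table; using the mean relative loss across w as the ranking criterion is an assumption made for illustration, not necessarily the criterion used by the authors.

```python
# Throughput in GB/s, copied from the Pascal Titan X table for k=10, m=4.
FULL = [228.91, 220.57, 213.31, 203.72, 200.66]  # G-CRS row
ABLATED = {  # row "-X" means strategy X was removed
    "Bitmatrix":       [26.13, 26.72, 25.46, 24.95, 23.09],
    "EfficientAccess": [222.01, 188.70, 110.48, 114.58, 168.92],
    "Bandwidth":       [166.10, 121.27, 134.42, 98.53, 130.52],
    "DivergenceOpt":   [216.47, 185.47, 163.45, 143.10, 128.80],
}


def relative_loss(full, ablated):
    """Fraction of full throughput lost when one strategy is removed."""
    return [(f - a) / f for f, a in zip(full, ablated)]


def rank_by_mean_loss(full, ablated_rows):
    """Strategies sorted by mean relative loss over all w, largest first."""
    mean_loss = {
        name: sum(relative_loss(full, row)) / len(row)
        for name, row in ablated_rows.items()
    }
    return sorted(mean_loss.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for name, loss in rank_by_mean_loss(FULL, ABLATED):
        print(f"-{name:16s} mean loss {loss:6.1%}")
```

On this table the ranking puts Bitmatrix first, since removing it leaves the lowest throughput in every column.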
Section 4.1 Introduction
We measured the coding throughput of pipelined G-CRS. The related code and experimental results are provided in this section.
Section 4.2 Source Code
Source Code: G-CRSPCIE.zip
Please see README.md for instructions on running the program.
Section 4.3 Experimental Results
Results on Maxwell GTX 980
Results on Pascal Titan X