Using optimized settings and tools to speed up your code!
Table of Contents
1 Configure Channels First Image Data Format for CNN
Modify ~/.keras/keras.json so that it contains these settings:
"backend": "mxnet",
"image_data_format": "channels_first"
The channels_first format is optimal for training on NVIDIA GPUs with cuDNN.
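The settings above can also be applied programmatically. The following is a minimal sketch, not an official Keras utility: the helper name set_keras_config is hypothetical, and the demo writes to a temporary directory instead of the real ~/.keras/keras.json so that an existing configuration is not overwritten.

```python
import json
import os
import tempfile


def set_keras_config(config_path, backend="mxnet",
                     image_data_format="channels_first"):
    """Create or update a Keras JSON config, keeping any other keys."""
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as f:
            config = json.load(f)

    config["backend"] = backend
    config["image_data_format"] = image_data_format

    directory = os.path.dirname(config_path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(config_path, "w") as f:
        json.dump(config, f, indent=4)
    return config


# Demo against a temporary location (hypothetical path, for illustration).
demo_path = os.path.join(tempfile.mkdtemp(), ".keras", "keras.json")
demo_config = set_keras_config(demo_path)
print(json.dumps(demo_config, indent=4))
```

To change the real configuration, pass os.path.expanduser("~/.keras/keras.json") as config_path.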
NOTE: It is recommended practice to pass the input_shape based on the shape field of your input tensor (that is, to the first non-input layer).
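One way to keep input_shape consistent with the configured data format is to derive it from the format string. This is a sketch under assumptions: make_input_shape is a hypothetical helper, and in a real Keras program the format string would typically come from keras.backend.image_data_format() rather than being passed by hand.

```python
def make_input_shape(rows, cols, channels, image_data_format):
    """Return an input_shape tuple ordered for the given data format."""
    if image_data_format == "channels_first":
        return (channels, rows, cols)
    if image_data_format == "channels_last":
        return (rows, cols, channels)
    raise ValueError("unknown image_data_format: %r" % (image_data_format,))


# A 224x224 RGB image under each layout.
print(make_input_shape(224, 224, 3, "channels_first"))
print(make_input_shape(224, 224, 3, "channels_last"))
```

The resulting tuple can then be passed as input_shape to the first layer of the model, so the same script works under either setting.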
2 Install Optimized MXNet
$ pip install mxnet-cu80mkl   # for CUDA 8.0; use mxnet-cu90mkl for CUDA 9.0, or mxnet-mkl for CPU only
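Only one of these packages should be installed, chosen to match the local setup. The sketch below shows the selection logic, assuming the cu80 and cu90 suffixes correspond to CUDA 8.0 and 9.0 and that mxnet-mkl is the CPU-only build; mxnet_package_for is a hypothetical helper and covers only the three packages named here.

```python
def mxnet_package_for(cuda_version=None):
    """Map a CUDA version string (or None for CPU only) to a package name."""
    packages = {
        None: "mxnet-mkl",        # CPU only, MKL-enabled
        "8.0": "mxnet-cu80mkl",   # CUDA 8.0 + MKL
        "9.0": "mxnet-cu90mkl",   # CUDA 9.0 + MKL
    }
    if cuda_version not in packages:
        raise ValueError("no package listed for CUDA %s" % cuda_version)
    return packages[cuda_version]


print("pip install " + mxnet_package_for("9.0"))
```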
3 Using MXNet Profiler
MXNet (v0.9.1+) has a built-in profiler that gives detailed information about execution time at the symbol level. This feature complements general profiling tools like nvprof and gprof by summarizing at the operator level rather than at the function, kernel, or instruction level.
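A minimal sketch of wrapping a workload with the profiler follows. It assumes the mx.profiler.set_config / set_state / dump API found in MXNet 1.2 and later; older releases used profiler_set_config and profiler_set_state instead. The wrapper name run_with_mxnet_profiler and the output filename are hypothetical, and the function falls back to running the workload unprofiled when MXNet is not installed.

```python
import importlib


def run_with_mxnet_profiler(workload, filename="profile_output.json"):
    """Run workload() under the MXNet profiler.

    Returns True if profiling was active, False if MXNet is unavailable.
    """
    try:
        mx = importlib.import_module("mxnet")
    except ImportError:
        workload()  # still run the workload, just without profiling
        return False

    # Assumption: MXNet >= 1.2 profiler API.
    mx.profiler.set_config(profile_all=True, filename=filename)
    mx.profiler.set_state("run")
    try:
        workload()
        mx.nd.waitall()  # wait for asynchronous operators to finish
    finally:
        mx.profiler.set_state("stop")
        mx.profiler.dump()  # write the collected trace to `filename`
    return True


# Placeholder workload; in practice this would be a training or inference step.
profiled = run_with_mxnet_profiler(lambda: sum(range(1000)))
print("profiled with MXNet:", profiled)
```

The dumped JSON file is in a trace format that can be loaded in a browser trace viewer such as chrome://tracing to inspect per-operator timings.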