Many popular ML frameworks and libraries, such as TensorFlow, CNTK, Theano, Keras, Caffe, Torch, DL4J, MXNet, Chainer and many more, already support GPU accelerators to speed up the learning process [DLwiki] [Felice 2017] [Kalogeiton 2017]. Some of them also allow the use of optimised libraries such as CUDA (with cuDNN) and OpenCL to improve performance even further. The main feature of many-core accelerators is their massively parallel architecture, which allows them to speed up computations that involve matrix-based operations. Interest in GPGPU computing can also be found in many other large-scale simulation packages under active development.
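The matrix-based workloads these accelerators target can be illustrated with a plain NumPy sketch (an assumption for illustration only: NumPy runs on the CPU here, standing in for the vectorised kernels that frameworks such as TensorFlow or PyTorch dispatch to CUDA/cuDNN on a GPU):

```python
import numpy as np

# Dense matrix multiplication -- the archetypal operation that
# many-core accelerators parallelise. Each output element is an
# independent dot product, so all n*n of them can be computed in
# parallel on a GPU.

n = 128
a = np.random.rand(n, n)
b = np.random.rand(n, n)

# Naive triple loop: O(n^3) scalar operations executed one at a time.
def matmul_naive(a, b):
    rows, inner, cols = a.shape[0], a.shape[1], b.shape[1]
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            s = 0.0
            for k in range(inner):
                s += a[i, k] * b[k, j]
            out[i, j] = s
    return out

c_naive = matmul_naive(a, b)
c_vec = a @ b  # single vectorised kernel call

# Both paths compute the same result; the vectorised (and, on
# accelerators, GPU-offloaded) version is orders of magnitude faster.
assert np.allclose(c_naive, c_vec)
```

The same pattern underlies the layer computations in deep networks, which is why offloading them to massively parallel hardware yields such large speed-ups.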
The popularity and trend measures are subject to change, given the rapid pace of DL framework and tool development; they were estimated from GitHub repository star counts [Jolav 2018].