Abstract

The Effects of Quantized Parameters on the Accuracy and Efficiency of Neural Networks

Artificial neural networks are computational systems belonging to the class of machine learning approaches, itself a subset of artificial intelligence, that are used to recognize patterns and approximate functions from input data. The primary inputs of most artificial neural networks are numbers, usually real numbers represented at high precision. Quantization is the process of reducing the number of bits used to represent these numbers. While quantization significantly reduces the bandwidth and storage required, it also reduces the accuracy of the neural network. This experiment therefore attempted to determine how much quantization can be applied while largely preserving both accuracy and efficiency. The results showed that intermittent quantization, introduced while a data set is being processed, was more effective than back-end quantization, in which quantization is introduced only after processing of a data set is complete.
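
As a minimal illustration of the quantization operation described above, the sketch below maps 32-bit floating-point parameters onto an n-bit uniform grid and back. This is an illustrative assumption of one common quantization scheme, not the specific procedure used in the experiment; the bit width, rounding rule, and example data are hypothetical.

```python
# Sketch of uniform quantization: map 32-bit floats to an n-bit integer
# grid and back. Illustrative only; not the experiment's actual procedure.
import numpy as np

def quantize(values: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Quantize real-valued inputs to num_bits levels, then dequantize to float."""
    levels = 2 ** num_bits - 1                      # number of representable steps
    lo, hi = values.min(), values.max()             # dynamic range of the data
    scale = (hi - lo) / levels if hi > lo else 1.0  # step size between levels
    q = np.round((values - lo) / scale)             # snap to the integer grid
    return q * scale + lo                           # map back to real values

weights = np.random.randn(1000).astype(np.float32)  # example "parameters"
approx = quantize(weights, num_bits=4)
print("mean absolute error at 4 bits:", np.mean(np.abs(weights - approx)))
```

Lowering num_bits shrinks storage and bandwidth requirements but increases the reconstruction error, which is the accuracy-versus-efficiency trade-off the experiment investigates.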