Simple neural network version 1
Simple neural network version 2
Small Visual Geometry Group (SmallVGG)
Visual Geometry Group (VGG)
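For readers unfamiliar with the names above: SmallVGG refers to a compact VGG-style network of stacked 3x3 convolutions. The sketch below is illustrative only; the layer counts, filter sizes, and input shape are assumptions, not the project's exact architecture.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

def small_vgg(width=96, height=96, depth=3, classes=3):
    # two VGG-style blocks of stacked 3x3 convolutions followed by pooling,
    # then a small fully connected head (all sizes here are illustrative)
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', padding='same',
                     input_shape=(height, width, depth)))
    model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
    model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(classes, activation='softmax'))
    return model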
The reason the project focused on continuing to develop the three classes is that this design can be expanded to more languages easily, without major changes to the primary model. If we used the model for two-category classification instead, we would need different methods in Keras and a lot of modification for any future expansion.
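To make the distinction concrete, here is a minimal sketch (the layer sizes and input shape are illustrative assumptions, not the project's configuration): a multi-class model ends in a softmax layer trained with categorical cross-entropy, so adding a language only changes the size of that final layer, whereas a two-category model in Keras would typically use a single sigmoid unit with binary cross-entropy.

from keras.models import Sequential
from keras.layers import Dense, Flatten

NUM_CLASSES = 3  # Arabic, English, Japanese; grows as languages are added

model = Sequential([
    Flatten(input_shape=(96, 96, 3)),          # illustrative input size
    Dense(128, activation='relu'),
    Dense(NUM_CLASSES, activation='softmax')   # only this layer changes per added language
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',  # a binary model would instead use a
              metrics=['accuracy'])             # 1-unit sigmoid + binary_crossentropy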
For example: $ googleimagesdownload --keywords "playground" --limit 20 --color red
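The same download can also be scripted; a minimal sketch, assuming the google_images_download package and its Python interface are installed:

from google_images_download import google_images_download

# mirror the CLI call above: 20 red "playground" images
response = google_images_download.googleimagesdownload()
arguments = {"keywords": "playground", "limit": 20, "color": "red"}
paths = response.download(arguments)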
import os
import numpy as np
import PIL.ImageOps
from PIL import Image

def normapro(folder, f):
    # f is the file name of one image inside folder
    i = Image.open(folder + f)
    # split the image name and extension
    fn, fext = os.path.splitext(f)
    # rotate by 45 degrees and crop the image
    out1 = i.rotate(45)
    box = (350, 0, 650, 400)  # alternative crop: box = (30, 30, 410, 410)
    region1 = out1.crop(box)
    region1.save(folder + '{}_prepro1.jpg'.format(fn))
    # rotate by 90 degrees and crop the image
    out2 = i.transpose(Image.ROTATE_90)
    box = (30, 100, 410, 600)
    region2 = out2.crop(box)
    region2.save(folder + '{}_prepro2.jpg'.format(fn))
    # flip the image left to right
    out3 = i.transpose(Image.FLIP_LEFT_RIGHT)
    out3.save(folder + '{}_prepro3.jpg'.format(fn))
    # flip the image top to bottom
    out4 = i.transpose(Image.FLIP_TOP_BOTTOM)
    out4.save(folder + '{}_prepro4.jpg'.format(fn))
    # invert black to white and vice versa (invert requires a plain RGB image)
    out5 = PIL.ImageOps.invert(i.convert('RGB'))
    out5.save(folder + '{}_prepro5.jpg'.format(fn))
    # recolor the image: replace every non-white pixel with red, green, or blue
    orig_color = (255, 255, 255)
    replacement_color1 = (200, 0, 0)
    replacement_color2 = (0, 200, 0)
    replacement_color3 = (0, 0, 200)
    img = i.convert('RGB')
    data1 = np.array(img)
    data2 = np.array(img)
    data3 = np.array(img)
    # mask pixels that differ from white in every channel, then recolor them
    data1[(data1 != orig_color).all(axis=-1)] = replacement_color1
    data2[(data2 != orig_color).all(axis=-1)] = replacement_color2
    data3[(data3 != orig_color).all(axis=-1)] = replacement_color3
    out6 = Image.fromarray(data1, mode='RGB')
    out7 = Image.fromarray(data2, mode='RGB')
    out8 = Image.fromarray(data3, mode='RGB')
    out6.save(folder + '{}_prepro6.jpg'.format(fn))
    out7.save(folder + '{}_prepro7.jpg'.format(fn))
    out8.save(folder + '{}_prepro8.jpg'.format(fn))
def bnwpro(folder, f):
    # f is the file name of one image inside folder
    i = Image.open(folder + f)
    # split the image name and extension
    fn, fext = os.path.splitext(f)
    # invert black and white, overwriting the original jpg
    inverted_image = PIL.ImageOps.invert(i.convert('RGB'))
    inverted_image.save(folder + '{}.jpg'.format(fn))
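A minimal driver for these two helpers might look like the sketch below. The folder path and the cap value are assumptions; the bk "breaker" counter is inferred from a stray comment in the original code, since the actual calling loop is not shown.

import os

folder = 'dataset/english/'  # hypothetical path; one folder per language
bk = 0                       # breaker counter hinted at in the original comments
for f in os.listdir(folder):
    # only process original jpg files, not ones produced by an earlier run
    if f.lower().endswith('.jpg') and '_prepro' not in f:
        normapro(folder, f)
        bk += 1
        if bk >= 500:  # assumed cap on how many originals to augment
            break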
Dataset version 1 for the Arabic language
Dataset version 1 for the English language
Dataset version 2 for the English language
Dataset version 3 Pillow image processing for the Japanese language
Dataset version 3 with the Keras image data generator applied
Classification report for data with several transformations
Classification report for data with inverted background
The pictures above show the classification report on the validation data during training, which indicates excellent results. However, when testing the simple neural network models on 24 random images never seen by the model, the results are less promising: the model classified 14 of the 24 samples correctly and 10 incorrectly, for an overall accuracy of 58.33%.
Classification report for data with several transformations
Classification report for data with inverted background
Surprisingly, the SNN performed better than VGG, and that could be for several reasons. Pillow was used to modify the data, which was then fed into the VGG model that already included a Keras image data generator; the generator transforms the data a second time and adds noise to it, and this second transformation may have confused VGG. As for the CNN, it seems to classify almost every image as Japanese; perhaps after rotation, the Arabic and English data become similar to the Japanese data, which confused the CNN.
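To illustrate the double-transformation problem, a Keras ImageDataGenerator along the lines below would rotate, shift, and zoom images that Pillow had already rotated, flipped, and recolored; the parameter values here are assumptions, not the project's exact configuration.

from keras.preprocessing.image import ImageDataGenerator

# on-the-fly augmentation applied during training; when the input images
# were already transformed with Pillow, these transforms stack on top of
# that first pass, which may be the source of the confusion described above
datagen = ImageDataGenerator(
    rotation_range=25,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')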
In addition to being inexpensive and easy to work with, being able to run the project on a Raspberry Pi (RPi) helps in combining it with other interesting RPi projects, and that was part of the motivation: the project could be integrated into many other applications, e.g., a Google-style image translator.
It seems that no one has done this project before: classifying different languages' fonts as objects (different language font recognition, DLFR) rather than extracting the text from the image. Overall, this project could be beneficial for others who want to integrate it into their applications and to try different methods or improve the current ones. The DLFR project focused on two deep learning model families: simple neural networks and convolutional neural networks. Moreover, the project's dataset was self-created, with approximately 2,100 images per language. There were three different ways to build the datasets. The first method used completely random pictures with different fonts, positions, and backgrounds. The second and third methods used lists of words taken randomly from articles and news, and these two datasets performed best. To improve the datasets further, they must represent every letter and character of each language's alphabet, which requires a lot of work to accomplish. In the future, we could use a region-based convolutional neural network for instance segmentation (R-CNN). R-CNN would mask each character of each language along with its classification, which should lead to better predictions.