Data Set Statistics
The data set comprises images from 19 different Urdu news channel videos, plus a small collection from non-news channels.
All images are in PNG format.
Image names use a plain numeric format (e.g. 01, 02).
Ground Truth Labeling
Each original images folder contains a ground truth folder named gt_rect.
The gt_rect folder contains one .dat file per image (a .dat file can be opened in WordPad). Each file holds the dimensions of the rectangles drawn around the artificial text content found in the image.
Each .dat file has the same name as the image it describes.
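Because the ground truth file shares its name with the image and sits in a gt_rect folder inside the image folder, its path can be derived from the image path alone. The following is a minimal Python sketch of that lookup. The function name gt_path_for and the folder name channel_01 are hypothetical, chosen only for illustration, and the sketch assumes the .dat file uses exactly the image's base name.

```python
from pathlib import Path


def gt_path_for(image_path):
    """Return the expected ground truth .dat path for an image.

    Assumes the layout described above: a gt_rect folder inside the
    image folder, holding one .dat file named after the image.
    """
    image_path = Path(image_path)
    return image_path.parent / "gt_rect" / (image_path.stem + ".dat")


# "channel_01" is a made-up folder name, not taken from the data set.
print(gt_path_for("channel_01/01.png"))
```

The function only builds the path; it does not check that the file exists, so a caller may want to test Path.exists() before opening it.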
The rectangle dimensions are formatted as follows:
Figure 1: A sample .dat file
Each row in the figure gives the dimensions of one rectangle.
X = x coordinate of the drawn rectangle.
Y = y coordinate of the drawn rectangle.
Width and Height correspond to the actual width and height of the rectangle.
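The fields above could be read into a small structure as sketched below in Python. This is not a definitive parser, because the exact file layout is not spelled out in the text. It assumes each row holds four numbers in the order X, Y, Width, Height, separated by whitespace or commas, and that X and Y refer to the top-left corner of the rectangle. The names Rect, parse_rects and to_corners are hypothetical, and the sample values are made up rather than taken from the data set.

```python
import re
from typing import List, NamedTuple, Tuple


class Rect(NamedTuple):
    x: float
    y: float
    width: float
    height: float


def parse_rects(text: str) -> List[Rect]:
    """Parse rows of 'X Y Width Height' into Rect tuples.

    Assumption: four numeric fields per row, separated by
    whitespace and/or commas. Blank rows are skipped.
    """
    rects = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
        if not fields:
            continue
        if len(fields) != 4:
            raise ValueError(f"row {line_no}: expected 4 fields, got {len(fields)}")
        rects.append(Rect(*(float(f) for f in fields)))
    return rects


def to_corners(rect: Rect) -> Tuple[float, float, float, float]:
    """Return (left, top, right, bottom).

    Assumption: X and Y are the top-left corner of the rectangle.
    """
    return (rect.x, rect.y, rect.x + rect.width, rect.y + rect.height)


# Made-up sample rows, for illustration only.
sample = "10 20 100 30\n5,6,7,8\n"
for rect in parse_rects(sample):
    print(rect, to_corners(rect))
```

If the real files carry a header row or a different field order, the parser raises a ValueError or returns misassigned fields, so it is worth comparing its output against one .dat file opened by hand before relying on it.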