Of the 170 writers, 138 have copied both the Bengali and the Hindi text, each contributing either 9 or 10 pages. Of the remaining 32 writers, 16 have copied 5 pages of the Bengali part of the document, and the other 16 have copied 5 pages of the Hindi part.
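As a quick consistency check on the figures above, the following sketch recomputes the writer total and the range of page counts they imply. The variable names are illustrative only, and the bounds count pages as described here, not image files (a page split into several parts would yield more than one image).

```python
# Writer groups as stated in the text.
both_scripts = 138   # copied Bengali and Hindi, 9 or 10 pages each
bengali_only = 16    # copied 5 Bengali pages each
hindi_only = 16      # copied 5 Hindi pages each

total_writers = both_scripts + bengali_only + hindi_only

# Pages contributed by the single-script writers.
single_script_pages = (bengali_only + hindi_only) * 5

# Lower and upper bounds implied by "9 or 10 pages" per two-script writer.
min_pages = both_scripts * 9 + single_script_pages
max_pages = both_scripts * 10 + single_script_pages

print(total_writers, min_pages, max_pages)
```

The three groups account for all 170 writers, and the implied page total lies between the two printed bounds.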
Each page of the document was scanned at a resolution of 300 dpi with a colour depth of 24 bits. The images are stored as LZW-compressed TIFF files. Each page is labelled as (Writer ID)-(Page ID)_(optional part number)_(B/H). The Writer ID is the same as the document ID, since each document is assigned to a particular writer. The Page ID identifies the page within the document. The part number is optional and is included only when a writer needed more space than was provided to complete the content of a single page. The letters B and H indicate whether the content of the page is in Bengali or Hindi, respectively.
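The labelling scheme can be parsed mechanically. The sketch below is a minimal illustration, not part of the dataset's tooling: it assumes that the Writer ID, Page ID, and part number are written as decimal digits and that files carry a `.tif` or `.tiff` extension, neither of which is stated above. The names `PageLabel` and `parse_label`, and the example file names, are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class PageLabel(NamedTuple):
    writer_id: str        # same as the document ID
    page_id: str          # ID of the page within the document
    part: Optional[int]   # present only when a page needed extra space
    script: str           # "Bengali" or "Hindi"


# Assumption: all IDs are decimal digits; the text does not specify this.
_LABEL_RE = re.compile(
    r"^(?P<writer>\d+)-(?P<page>\d+)(?:_(?P<part>\d+))?_(?P<script>[BH])$"
)
_SCRIPTS = {"B": "Bengali", "H": "Hindi"}


def parse_label(filename: str) -> PageLabel:
    """Split a page label of the form WriterID-PageID[_part]_(B|H)."""
    # Assumption: an optional .tif/.tiff extension may follow the label.
    stem = re.sub(r"\.tiff?$", "", filename, flags=re.IGNORECASE)
    match = _LABEL_RE.match(stem)
    if match is None:
        raise ValueError(f"not a valid page label: {filename!r}")
    part = match.group("part")
    return PageLabel(
        writer_id=match.group("writer"),
        page_id=match.group("page"),
        part=int(part) if part is not None else None,
        script=_SCRIPTS[match.group("script")],
    )
```

Under these assumptions, a hypothetical name such as `012-03_B.tif` would parse to writer `012`, page `03`, no part number, Bengali, while `012-03_2_H.tif` would parse to part 2 of the same page in Hindi. IDs are kept as strings so that any leading zeros survive.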
We have prepared a lite version of the IIEST-Indic-HW dataset consisting of samples from 20 writers.