Writer Identification Dataset


 

The HIT-MW database can be used for research on writer identification. We have labeled hundreds of documents in the HIT-MW database for this purpose. If you are interested in this dataset, please contact hitmwdb AT gmail.com. The dataset is available free of charge. The following is a sample of the writer labels:

Dataset for Writer Identification V1.0
Sample ID   Male (1=true)   Source         Writer ID
04010101    1               Heilongjiang   000023
04010102    1               Heilongjiang   000023
04010103    1               Heilongjiang   000053
04010201    1               Heilongjiang   000053
04010301    1               Heilongjiang   000001
04010302    1               Heilongjiang   000001
04010303    1               Heilongjiang   000010
04010304    1               Heilongjiang   000023
04010305    1               Heilongjiang   000014
04010401    1               Heilongjiang   000014
04010402    1               Heilongjiang   000005
04010501    1               Heilongjiang   000019
04010502    1               Heilongjiang   000049
04010503    0               Anhui          000233
04010601    0               Heilongjiang   000040
04010603    0               Anhui          000191
04010604    0               Anhui          000178
04010605    0               Anhui          000226
04010606    0               Anhui          000240
..          ..              ..
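
The label rows listed above could be read with a short script like the following. This is only a sketch: it assumes the labels arrive as whitespace-separated text with the four columns shown in the sample, which may not match the layout of the distributed files. The names WriterLabel, parse_label_line and group_by_writer are hypothetical helpers introduced for illustration and are not part of the dataset.

```python
from collections import defaultdict
from typing import NamedTuple


class WriterLabel(NamedTuple):
    sample_id: str  # kept as a string so leading zeros survive
    is_male: bool   # True when the "Male" column is 1
    source: str     # value of the "Source" column, e.g. "Heilongjiang"
    writer_id: str  # kept as a string so leading zeros survive


def parse_label_line(line: str) -> WriterLabel:
    """Parse one row; assumes exactly four whitespace-separated fields."""
    sample_id, male_flag, source, writer_id = line.split()
    if male_flag not in ("0", "1"):
        raise ValueError(f"unexpected Male flag: {male_flag!r}")
    return WriterLabel(sample_id, male_flag == "1", source, writer_id)


def group_by_writer(labels):
    """Map each writer ID to the sample IDs attributed to that writer."""
    groups = defaultdict(list)
    for label in labels:
        groups[label.writer_id].append(label.sample_id)
    return dict(groups)


# A few rows copied from the sample listing.
rows = """\
04010101 1 Heilongjiang 000023
04010102 1 Heilongjiang 000023
04010103 1 Heilongjiang 000053
04010503 0 Anhui 000233
"""

labels = [parse_label_line(r) for r in rows.splitlines() if r.strip()]
samples_by_writer = group_by_writer(labels)
```

Grouping samples by Writer ID gives the per-writer sets that a writer identification experiment would typically split into reference and query samples. IDs are stored as strings because converting them to integers would drop the leading zeros. A Source value containing a space would break the four-field split and would need a different delimiter.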

Special Notes:

Thank you for using the dataset!
Any use of the dataset should be authorized by us. Please forward questions to:
hitmwdb AT gmail.com
More information is available at:
http://hitmwdb.googlepages.com
by Tonghua Su
Harbin Institute of Technology (HIT), P. R. China
Feb. 1, 2008