This dataset consists of more than four hundred thousand handwritten names collected through charity projects.
Character recognition uses image processing to convert characters in scanned documents into digital text. It typically performs well on machine-printed fonts. Handwritten characters, however, remain difficult for machines to recognize because individual writing styles vary so widely.
There are 206,799 first names and 207,024 surnames, for 413,823 names in total. The data was divided into a training set (331,059), a testing set (41,382), and a validation set (41,382).
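As a quick sanity check on the figures above, the following minimal sketch confirms that the split sizes add up to the total number of names and shows the share of each split. The variable names are illustrative only and are not part of the dataset.

```python
# Counts exactly as stated in the description above.
first_names = 206_799
surnames = 207_024
splits = {"training": 331_059, "testing": 41_382, "validation": 41_382}

# The three splits should account for every name.
total = first_names + surnames
assert total == sum(splits.values())

# Report each split's share of the whole dataset.
for name, count in splits.items():
    print(f"{name}: {count} ({count / total:.1%})")
```

Under these counts the split works out to roughly 80% training, 10% testing, and 10% validation.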
The input data are hundreds of thousands of images of handwritten names. In the data, you’ll find the transcribed images split into training, test, and validation sets.