A team at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) has developed an AI system that can mimic a person's handwriting style from a few paragraphs of handwritten text. The researchers, who presented initial results at the International Conference on Computer Vision (ICCV) in 2021, recently received a patent for the tool from the US Patent and Trademark Office.
The team presenting “Handwriting Transformers” included Assistant Professor of Computer Vision Rao Muhammad Anwer; Associate Professor of Computer Vision Salman Khan; Fahad Shahbaz Khan, Associate Head of the Department of Computer Vision and Professor of Computer Vision; and Ankan Kumar Bhunia.
Previous research has relied on generative adversarial networks (GANs). However, although these approaches can capture an author's general style, such as the direction of writing or the width of the strokes that make up letters, they encounter two major problems.
First, the connection between style and content is weak: the two are processed separately and then merged, so there is no explicit link between them at the character level. Second, these approaches do not explicitly encode local stylistic patterns such as character shapes and ligatures, which can be found, for example, in the word “heart” or in the Latin phrase “ex aequo”.
To overcome these limitations, the researchers adopted a novel approach using Vision Transformers, neural networks designed for computer vision tasks.
Fahad Khan explains:
“To imitate a person's writing style, we want to look at the entire text, and only then do we begin to understand how the author linked the characters, how the author linked the letters or words with spaces. All of these tasks require some kind of global receptive field, which is not easy with convolutional neural networks. We recognized this gap in existing methods and adopted this transformer-based method.”
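The contrast Khan draws between convolutions and transformers can be illustrated with a toy sketch. The Python below is not the HWT implementation; the function names (`conv1d`, `self_attention`, `influences`) and the sample values are assumptions made purely for illustration. It checks whether changing the last element of a sequence affects the first output of a single layer: with a width-3 convolution it does not, while with one self-attention layer it does.

```python
import math

def conv1d(xs, kernel):
    """Zero-padded 1D convolution: each output sees only len(kernel) neighbours."""
    half = len(kernel) // 2
    out = []
    for i in range(len(xs)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(xs):
                acc += w * xs[j]
        out.append(acc)
    return out

def self_attention(xs):
    """Toy single-head self-attention on scalars (identity projections, an assumption)."""
    out = []
    for q in xs:
        scores = [q * k for k in xs]           # query-key similarity for every position
        m = max(scores)                        # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        out.append(sum(e / total * v for e, v in zip(exps, xs)))
    return out

def influences(layer, xs, src, dst, eps=1e-3):
    """Return True if nudging input position src changes output position dst."""
    bumped = list(xs)
    bumped[src] += eps
    return abs(layer(bumped)[dst] - layer(xs)[dst]) > 1e-12

# Hypothetical 8-step signal standing in for features along a line of handwriting.
signal = [0.2, -0.1, 0.4, 0.3, -0.5, 0.1, 0.6, -0.2]
conv = lambda v: conv1d(v, [0.25, 0.5, 0.25])

print("conv: last input affects first output?", influences(conv, signal, 7, 0))
print("attention: last input affects first output?", influences(self_attention, signal, 7, 0))
```

Stacking many convolutional layers widens the receptive field gradually, whereas a single attention layer already connects every position to every other, which is the property the quote refers to.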
The scientists compared their approach to generating handwritten text images, HWT (Handwriting Transformers), with two other handwriting generation technologies. They asked 100 people to rate the text generated by the different models. In 81% of cases, they preferred HWT over other text generators.
A qualitative comparison of HWT with two other handwriting generators, GANwriting and Davis et al. All three generators were instructed to produce the same text: “No two people can write in precisely the same way, just as two people cannot have the same fingerprints.” All three were tested on handwriting samples from six different writers (far-left column). Davis et al. captures a writer's general style, such as the direction of the text, but has difficulty mimicking character-specific stylistic details. GANwriting is limited in the length of the words it can imitate and was unable to complete the provided text; for example, it generated the word “precise” instead of “precisely”. The MBZUAI researchers' approach better mimics both global and local style patterns, resulting in more realistic handwriting.
The researchers also showed participants the original handwriting alongside the generated text. Participants could not distinguish between the two, confirming the performance of the AI system.
Although this advance paves the way for promising applications, the researchers are aware of the ethical implications of their technology and warn of the potential for forgery and other abuses. They emphasize the need for countermeasures as part of responsible use.
Rao Muhammad Anwer explains:
“We are very cautious about this as there could be misuse. Handwriting represents a person’s identity, so we think about it carefully before using it.”
Article references: MBZUAI blog
Authors:
Rao Muhammad Anwer, Salman Khan, Fahad Shahbaz Khan, Ankan Kumar Bhunia