Handwriting Database, 566 samples of contemporary Latin writing

805 words5 min read
database cover02

This is a database that captures the writing of over 550 contemporary users of the Latin script, all collected during 2022 and 2023. Scroll down to navigate through the samples.

This database is part of a larger research effort launched by Typotheque in early 2022 in order to learn more about handwriting across populations that use the Latin script. We first published an open call for submissions and later actively reached out to participants who would represent the 20 populations we initially aimed to cover. Thanks to the great response to this call, however, we now have data from users living in 46 countries and who are native speakers of 29 different languages. We would like to thank all participants that sent us their samples and took part in the research.

In order to capture the writing of each user representatively, we developed what we call copy-book forms. These are printable documents that prompted the respondent to write in a designated space. We asked the writer to copy out a series of sentences, all pronounceable, natural sounding and in their native language. The copy-book form is composed of three sections, each of which serves a different purpose.

The first of these sections was designed to capture natural handwriting in lower case. Participants were asked to copy a number of pangrams (sentences that include all the letters of the alphabet), mostly obtained from Rutter, 2014. Because each of these sentences contains all the letters of the Latin alphabet, a good sense of the person’s handwriting could be gleaned from simply looking at the first page of a completed copy-book form, with each letter appearing a handful of times within this section.

The second section provided participants with sentences that include proper nouns (both cities and people), enabling us to study the relationship between the upper-case and lower-case letters. This section also provided a long string of numbers. The third and last section repeated the pangrams from the first page, but now in an all-caps setting, to produce a good sense of the person’s printed handwriting.

The copy-book form was originally created in English, Spanish and Catalan, and later adapted to Dutch, French, German, Portuguese, Polish, Italian, Croatian, Romanian, Slovak, Turkish, Indonesian, Czech, Swedish, Hungarian, Finnish, Danish and Vietnamese. The sentences for each language were verified with local consultants, and necessary adaptations were made for each of the alphabets. Each language version included accented characters unique to their orthography.

Contributors to the database were also invited to take part in another in-house psychology-based study about the perceptual implications of familiarity. Participants completed a follow-up survey that measured these biases, and the findings of this study can be found here.

The intention of this research was to create a new knowledge base for the contemporary use of the Latin script. We wanted to construct an accurate idea of how readers of today write in order to inform our design process and produce better font solutions. This new knowledge has in fact already fuelled the design of Typotheque’s latest typeface: Dash, a new script typeface by Petra Dočekalová.

Because we believe in Open Science, this research is available to the public under the Creative Commons BY-SA 4.0 licence. This means you can use the data in this repository as inspiration in your creative process, as well as directly take any of its samples – even for new commercial projects, as long as you provide the appropriate credit and allow others to use your adaptations under the same terms.

A package with all anonymised raw data is available on request. Additionally, the full manuscript for this research, with a more extensive description of the preliminary literature review and methods, is also available. Feel free to reach out at hector@typotheque.com.

Copy-book form adaptations were made possible by Ioana Stănescu and Ada Ardeleanu (Romanian), Marc Simón and Abril Tibau "Xoxo" (Catalan), Marlis Zimmermann (Dutch), Magdalena Jaglińska (Polish), Trung Nguyen (Vietnamese), Inan Bostanci and Aybüke Uytun Yılmaz (Turkish), Sara Sofia Pitkänen (Finnish), Roberto Arista (Italian), Nikola Djurek (Croatian) and Peter Biľak (Czech and Slovak).

1/ Although the copy-book form was designed to be printed, completed, scanned and submitted, for those who could not use a printer, a non-printable version was also made available. Given the voluntary nature of this research, the decision as to whether to print the form was left to participants.

2/ Norwegian participants were surveyed using the English version of the form.

3/ We pre-registered the population background we were aiming to sample, and after collecting voluntary data for a few months, the Prolific recruitment platform was used to collect data from more participants and to ensure the sample matched our expected proportions of population.