A Collaborative Research Project to Overcome Digital Language Support Barriers for Indigenous North American Typography

The 293 First Nations, Inuit, and Métis communities across what is now Canada speak 84 languages and dialects, both in traditional homelands and in urban centres. There is a great wealth of orthographic and typographic variation within the unique writing systems that these communities employ to represent their languages. Our project aims to strengthen the active language revitalization and preservation programmes within Indigenous communities by supporting easier, barrier-free use of their writing systems in digital spaces.

Despite the high number of living languages and rich (ortho)graphic diversity, Indigenous communities continue to face widespread challenges in actively accessing and using their languages on digital platforms. These barriers stem both from questions around how a community’s orthography must perform for accurate reading and comprehension, and from how the typography must appear and shape the orthography in text composition. Even if a community has a stable digital text encoding situation and supportive language tools, a lack of adequate keyboards, accurate font rendering support and glyph representation can restrict a community’s ability to exercise self-determination over the appearance and functionality of their language. Digital text spaces on smartphones, computers and other platforms are an essential space for everyday language engagement, mobilisation and transmission. Challenges of any kind put the success of community-driven language revitalization and preservation programmes at risk, and impede communities’ overall sovereignty over their language.

Map of Indigenous languages in Canada

In order to overcome these challenges, we propose a project to comprehensively research and document the text encoding, keyboard sources, technical issues, and typographic preferences for each individual Indigenous language community in North America, with the goal of removing barriers to access and strengthening the digital and overall vitality of these languages today and into the future. We will do this by working in active collaboration with local Indigenous language keepers in communities. Through direct and highly customised partnerships with each individual community, we will ensure that the information gathered on the completeness and stability of Unicode character sets, keyboard layouts, font support, and Common Locale Data Repository (CLDR) data is accurate and community-centred.

This work is timely and urgent, not only given UNESCO’s International Decade of Indigenous Languages (2022–2032), but also in light of the latest Canadian Census data (2021), which reports 237,420 Indigenous language speakers in Canada. This marks a decline of 10,750 speakers from the 2016 Census, the first decline since comparable data on these languages began to be collected in 1991. On the other hand, the First Peoples’ Cultural Council’s 2022 Report ‘On the status of B.C. First Nations Languages’ documents an encouraging increase of 3,106 new language learners since 2018. Based on our conversations and the 10 existing collaborative partnerships with Indigenous language communities across Canada in which we are already engaged, the success and stability of current language revitalization efforts hinge on how readily the language can be mobilised in digital spaces and on everyday devices. Barriers to using Indigenous languages in digital spaces significantly block the ability of community members to engage with all users, and younger generations of learners in particular. Impediments to digital access further negatively impact a community’s ability to preserve texts and use their language reliably in day-to-day scenarios. We anticipate that our project will begin in the Spring of 2024, with a focus on two Indigenous language communities in the first, pilot phase. From here, we hope to grow the project to include other communities.

About Language Support and Typography Issues for Indigenous Communities in North America

Below are some examples of different, common technical issues that Indigenous language communities in North America face in using their language in digital spaces:

Unicode and Text Encoding

The Unicode Standard is the international standard for how all text is encoded on digital devices. This means that all characters in a language's orthography must be included in Unicode for it to be accessible for typing and exchanging text across devices such as desktop computers, smartphones, and tablets:

Unicode illustration
The above image shows the relationship between the Unicode Standard – how all digital text for the world's languages is encoded – and the tools on our devices which we use to enter text. Ideally, if all of the characters that our language needs are encoded in Unicode, our fonts and keyboards support all of these characters, and our devices support the rendering of the characters, we can freely and reliably use, exchange, and store any text that we compose with any other device in the world.

In order for any language to be fully and accurately supported across all devices, the required characters must be in Unicode and our devices must follow the Unicode Standard, meaning that the language tools (keyboards and fonts) that we use on our devices must be fully Unicode-compliant. If these criteria are met, then we are able to accurately and consistently display text on all of our devices and share texts with anyone else's device.
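As a minimal illustration of what being "in Unicode" means in practice, the following Python sketch lists the code points behind a short string and confirms that the text survives a UTF-8 round trip, which is what allows it to be stored and exchanged between devices. The function names `codepoints` and `survives_round_trip` and the sample string are our own illustrative choices, not part of any project tooling.

```python
def codepoints(text: str) -> list[str]:
    """Return the Unicode code point of each character as a 'U+XXXX' string."""
    return [f"U+{ord(ch):04X}" for ch in text]


def survives_round_trip(text: str) -> bool:
    """Encode to UTF-8 bytes (as for storage or transfer) and decode again."""
    return text.encode("utf-8").decode("utf-8") == text


# Illustrative sample only: U+1403 is CANADIAN SYLLABICS I.
sample = "\u1403"
print(codepoints(sample))
print(survives_round_trip(sample))
```

A listing like this only shows that the characters have code points. Whether they can be typed and displayed still depends on the keyboard and font on each device.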

Proposing Additions or Revisions to the Unicode Standard

Sometimes, characters that a language community requires to use their language on digital text platforms are missing from the Unicode Standard. Similarly, there are sometimes mistakes in the Unicode Standard in how characters for some communities should appear visually. We have worked with several Indigenous communities in Canada to propose new characters and character representation revisions to the Unicode Standard. Below, we share links to these successful proposals:

It is important to note that once new characters are accepted for addition to the Unicode Standard, it can take some time before those characters are published in a new version of the Standard, and before major operating systems (Apple, Microsoft, Android) provide support for those characters on their devices.
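This lag can be observed directly in software. The sketch below, which uses only Python's standard `unicodedata` module, prints the version of the Unicode data bundled with the running Python build and checks whether a given code point is assigned in that data. The helper name `is_assigned` is our own, and the choice of U+11AB0 as a "recently added" example rests on our assumption that its block was introduced in Unicode 14.0.

```python
import unicodedata


def is_assigned(ch: str) -> bool:
    """True if this runtime's Unicode data assigns the code point.

    General category 'Cn' means "not assigned" in the bundled data.
    """
    return unicodedata.category(ch) != "Cn"


# Version of the Unicode Character Database bundled with this Python build.
print("Unicode data version:", unicodedata.unidata_version)

# U+1403 (CANADIAN SYLLABICS I) has been encoded for many years.
print(is_assigned("\u1403"))

# U+11AB0 lies in a block that we assume was added in Unicode 14.0;
# the result here depends on the data version printed above.
print(is_assigned("\U00011AB0"))
```

The same code point can therefore be reported as assigned on one system and unassigned on an older one, which mirrors the delay between a character's publication and its support on devices.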

Local Typographic Preferences

The Unicode Standard specifies character code charts that show a general representation of how each character may appear in text. All of the default, core fonts on our devices follow this visual model. However, a language community may prefer a different graphic form from the standard representation of a given Unicode character. It is possible to support these local preferences in fonts:

Nunavut versus Nunavik angma local form preference
Barred Lambda possible local typographic variant

Diacritic Mark Appearance and Rendering

Some languages require many diacritic marks that many common fonts may not clearly distinguish visually:

diacritic representation

In the above example, the first two fonts adequately render the distinction between the comma above and acute diacritic marks, while the font in the third line renders these two marks as almost visually identical.
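Even when two marks look alike in a particular font, they remain distinct characters in the underlying text. The following Python sketch lists the combining marks in a string by code point and Unicode name, which is one way to confirm which mark was actually typed. The function name `describe_marks` and the two sample strings are hypothetical and chosen only for illustration.

```python
import unicodedata


def describe_marks(text: str) -> list[tuple[str, str]]:
    """List (code point, Unicode name) for every combining mark in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch))
        for ch in text
        if unicodedata.category(ch).startswith("M")
    ]


# Two hypothetical inputs that can look alike in some fonts.
with_comma_above = "k\u0313"
with_acute = "k\u0301"

print(describe_marks(with_comma_above))
print(describe_marks(with_acute))
```

A check like this can separate an input problem (the wrong mark was typed) from a font design problem (the right mark was typed but is drawn ambiguously).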

Furthermore, even if a font provides the correct design for these shapes, the font may not render the dynamic diacritic marks accurately, as in the second line of the example below:

broken diacritic shaping
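One reason that rendering of this kind depends so heavily on the font is that many letter and diacritic combinations have no single precomposed character in Unicode, so the mark must be positioned over its base letter at display time. The sketch below is a rough heuristic, not a rendering test: it normalizes text to NFC and reports whether any combining marks remain afterwards. The function name `relies_on_mark_positioning` is our own.

```python
import unicodedata


def relies_on_mark_positioning(text: str) -> bool:
    """True if combining marks remain after NFC normalization.

    Remaining marks have no precomposed form, so the font must place
    them on the base letter itself when the text is displayed.
    """
    composed = unicodedata.normalize("NFC", text)
    return any(unicodedata.category(ch).startswith("M") for ch in composed)


print(relies_on_mark_positioning("e\u0301"))  # e + acute composes to U+00E9
print(relies_on_mark_positioning("k\u0313"))  # k + comma above stays a sequence
```

Text for which this returns `True` is a good candidate for visual testing in each font, since a font without suitable mark positioning may show the diacritic shifted or colliding with the letter.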

Font Knowledge for Improving Indigenous Font and Keyboard Support

One of the aims of our project is to help communities identify issues in fonts and keyboards that present barriers to language use in digital text. A further aim is to provide public font development documentation, data, and knowledge so that all software providers can ensure that their fonts meet the standards determined by local Indigenous communities, and that each community's language works and is rendered correctly. An example of this is the following public GitHub repository on font development knowledge for the Syllabics writing system and its typography:


Syllabics Knowledge GitHub Repository

Font designers and developers require language data: the Unicode characters that a community uses in their keyboard, information on how diacritic marks should appear and perform in digital text, and some short language text examples that match the list of Unicode characters, so that the font represents the natural language in text harmoniously. Font designers also require knowledge of how each community prefers certain letter shapes to appear, which may differ from the standard way that Unicode represents each character.
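As a sketch of how a character list and sample texts might be checked against each other before they are handed to a font developer, the Python function below compares the two and reports any mismatch in either direction. The function name `check_sample` and the tiny character list and sample are hypothetical placeholders, not real community data.

```python
def check_sample(required: set[str], sample: str) -> tuple[set[str], set[str]]:
    """Compare a declared character list with a sample text.

    Returns (unlisted, unused):
      unlisted - characters in the sample that are missing from the list
      unused   - listed characters that the sample never exercises
    Whitespace in the sample is ignored.
    """
    used = {ch for ch in sample if not ch.isspace()}
    return used - required, required - used


# Hypothetical data for illustration only.
required_chars = {"a", "b", "\u0313"}
unlisted, unused = check_sample(required_chars, "ab c")
print("Not in the character list:", unlisted)
print("Not covered by the sample:", unused)
```

An empty result in both sets suggests that the sample text and the character list match, which is the condition described above.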

Research Team

Leo Vicenti, Assistant Professor and Type Designer, Emily Carr University of Art + Design
Mark Turin, Associate Professor, Anthropologist & Linguist, University of British Columbia
Cris Hernández, Type Designer and Researcher, Que Queda Type
Julia Schillo, Graduate Student, Graphic Designer & Linguist, Simon Fraser University
Kevin King, Type Designer and Researcher, Typotheque