publications | Nina Markl

2026

The Sound of Silencing: Identities and Ideologies in Commercial Text-To-Speech

Alice Ross, Nina Markl, Catherine Lai, and Lauren A. Hall-Lew

In Proceedings of the Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems

Text-to-speech (TTS) technology allows the synthesis of speech that is frequently described as highly ‘natural’ and, in some contexts, indistinguishable from human speech. Voice interfaces using such synthesised speech are increasingly encountered in a wide range of contexts. Recognising that listeners are likely to hear human-like voices as belonging to different demographic/social groups, and that these social judgments exist within ideological frameworks, we note a lack of diversity in popularly used English-speaking TTS voices, and caution that decisions taken in the design and deployment of voice interfaces risk perpetuating, or even exacerbating, existing social biases. Drawing upon sociolinguistic theory, we carry out a novel experiment to investigate these issues in a leading commercial TTS system, concluding that the system’s output disproportionately reproduces white, male, US-accented speech when prompted to convey competence. This work aims to encourage further research applying sociolinguistic knowledge to the study of human-computer interaction with speech technology.
Kind is the Opposite of Competent: Phonetic variation in English TTS Voices

Alice Ross, Lauren Hall-Lew, Catherine Lai, and Nina Markl

Apr 2026

“A natural language interface for everyday life”: The social and political functions of AI capabilities discourses

Justine Zhang, Su Lin Blodgett, and Nina Markl

In Proceedings of the 2026 ACM Conference on Fairness, Accountability, and Transparency

We interpret and historically contextualize AI capabilities discourses: spectacular narratives that ostensibly lay out the capabilities and use cases of the latest generative AI models in marketing and scientific communications. We complement existing work scrutinizing these discourses as components of a wider phenomenon of “AI hype”—which tends to focus on “debunking” them—by exploring how they rationalize wider social projects. Via an analysis of marketing materials produced by companies like OpenAI, we trace how AI capabilities discourses extend longer-running capitalist discourses, which construct aspects of self and social life as segmentable “skills” with economic value, and which responsibilize people for their survival in precarious economic conditions. We then show how AI capabilities discourses build on top of this ideological scaffolding. We suggest that these discourses have power because they reiterate hegemonic ideas about society and personhood that are entwined with material conditions. As such, contesting the power of the AI industry necessitates more capacious forms of political action, beyond interventions aiming to merely correct narratives.

2025

Experiences of Censorship on TikTok Across Marginalised Identities

Eddie L. Ungless, Nina Markl, and Björn Ross

Proceedings of the International AAAI Conference on Web and Social Media, Jun 2025
The predatory fantasy of worker empowerment in AI marketing

Justine Zhang, Nina Markl, and Su Lin Blodgett

In AI x Crisis: Tracing New Directions Beyond Deployment and Use workshop, Aarhus 2025.
Le$bean or lesbian? A survey of marginalised users’ motivations for obfuscation on TikTok

Eddie L. Ungless, Nina Markl, and Björn Ross

Behaviour & Information Technology, Aug 2025

Many TikTok users report the censorship of ‘sensitive’ content by ‘the algorithm’; this is particularly true of marginalised users. In order to evade perceived censorship, users employ a range of linguistic techniques to obfuscate – hide – their intended meaning. This has received significant media attention, and we complement this by conducting a survey to establish users’ motivations for employing these techniques. Our work is informed by linguistic scholarship on self-censorship, anti-languages and platform vernaculars. We conducted a novel survey of 627 UK TikTok users across 2023–2024 (female = 377, male = 224, other = 26) and found that use of obfuscation was relatively low in our sample, and primarily related to the types of content users were posting (historically censored content), rather than acting as a way to establish social identity as we had predicted – though our sample was far from homogeneous on this point. Through a structural equation modelling (SEM) analysis we show that men and people of colour (POC) were significantly more likely to use obfuscation. For POC this is driven partly by positive associations with obfuscation use, suggesting it is seen as a way to be playful with language, as well as evade ‘the algorithm’.
Defining language and managing its use: Language technology as language management

Nina Markl

Language & Communication, Nov 2025

Language technologies such as voice user interfaces, large language models and machine translation tools are embedded in an ever-growing range of digital devices and services used by millions of people every day in contexts as diverse as schools, homes, hospitals, and offices. In this paper, I argue that the way these technologies are used by and used on language workers and other members of language communities can be understood as a type of language management. As social scientists of technology have long pointed out, all technologies are shaped by and expressive of ideologies. In the case of language technologies, some of these are ideologies about language(s) and their speakers. Rather than simply (or only) functioning as linguistic interfaces facilitating interaction between people, language technologies reinforce linguistic ideologies, and contribute to the ideological construction of particular languages and their communities, as well as more abstract notions of ‘language’ and its value and purpose. They are furthermore often directly deployed to manage language work(ers) through surveillance, and partial automation. Understanding the ways in which language technologies reproduce, mediate and shape linguistic behaviours and beliefs as part of “algorithmic language management” allows us to connect them to both the broader sociotechnical and political project of artificial intelligence, and the scholarship on language policy.
Spoken Document Retrieval for an Unwritten Language: A Case Study on Gormati

Sanjay Booshanam, Kelly Chen, Ondrej Klejch, Thomas Reitmaier, and 7 more authors

In Findings of the Association for Computational Linguistics: EMNLP 2025

2024

Language Technologies as If People Mattered: Centering Communities in Language Technology Development

Nina Markl, Lauren Hall-Lew, and Catherine Lai

In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

In this position paper we argue that researchers interested in language and/or language technologies should attend to challenges of linguistic and algorithmic injustice together with language communities. We put forward that this can be done by drawing together diverse scholarly and experiential insights, building strong interdisciplinary teams, and paying close attention to the wider social, cultural and historical contexts of both language communities and the technologies we aim to develop.
Cultivating Spoken Language Technologies for Unwritten Languages

Thomas Reitmaier, Dani Kalarikalayil Raju, Ondrej Klejch, Electra Wallington, and 5 more authors

In Proceedings of the CHI Conference on Human Factors in Computing Systems, 2024

We report on community-centered, collaborative research that weaves together HCI, natural language processing, linguistic, and design insights to develop spoken language technologies for unwritten languages. Across three visits to a Banjara farming community in India, we use participatory, technical, and creative methods to engage community members, collect spoken language photo annotations, and develop an information retrieval (IR) system. Drawing on orality theory, we interrogate assumptions and biases of current speech interfaces and create a simple application that leverages our IR system to match fluidly spoken queries with recorded annotations and surface corresponding photos. In-situ evaluations show how our novel approach returns reliable results and inspired the co-creation of media retrieval use-cases that are more appropriate in oral contexts. The very low (< 4h) spoken data requirements make our approach adaptable to other contexts where languages are unwritten or have no digital language resources available.
Beyond The Binary: Limitations and Possibilities of Gender-Related Speech Technology Research

Ariadna Sanchez, Alice Ross, and Nina Markl

In 2024 IEEE Spoken Language Technology Workshop (SLT)

2023

The Edinburgh International Accents of English Corpus: Towards the Democratization of English ASR

Ramon Sanabria, Nikolay Bogoychev, Nina Markl, Andrea Carmantini, and 2 more authors

In ICASSP 2023

English is the most widely spoken language in the world, used daily by millions of people as a first or second language in many different contexts. As a result, there are many varieties of English. Despite the many advances in English automatic speech recognition (ASR) over the past decades, results are usually reported based on test datasets which fail to represent the diversity of English as spoken today around the globe. We present the first release of The Edinburgh International Accents of English Corpus (EdAcc). This dataset attempts to better represent the wide diversity of English, encompassing almost 40 hours of dyadic video call conversations between friends. Unlike other datasets, EdAcc includes a wide range of first and second-language varieties of English and a linguistic background profile of each speaker. Results on the latest public and commercial models show that EdAcc highlights shortcomings of current English ASR models. The best performing model, trained on 680 thousand hours of transcribed data, obtains an average of 19.7% word error rate (WER) – in contrast to the 2.7% WER obtained when evaluated on US English clean read speech. Across all models, we observe a drop in performance on Indian, Jamaican, and Nigerian English speakers. Recordings, linguistic backgrounds, data statement, and evaluation scripts are released on our website under a CC-BY-SA license. We hope that this work will encourage future research on a wider range of English varieties to create more accessible speech technologies.
"I can’t see myself ever living any[w]ere else": Variation in (HW) in Edinburgh English

Nina Markl

Language Variation and Change, 2023

Sociolinguistic research across Scotland in recent decades has documented an erosion of the phonemic contrast between /ʍ/ (as in which) and /w/ (as in witch). Based on acoustic phonetic analysis of 1,400 realizations produced by eighteen Edinburgh women born between 1938 and 1993, I argue that in the context of Edinburgh this is best understood as a complex sociolinguistic variable (HW) encompassing (at least) six fricated and fricationless variants. Realizations vary in type and relative duration of frication, voicing, and glide quality. Bayesian statistical analysis suggests that choice and realization of variants is conditioned by speaker’s social class, style, and phonetic context. Unlike some prior work, I do not find evidence of ongoing (apparent-time) change or an effect of contact with Southern British English. Fricated variants are most prevalent in formal speech styles and in the speech of middle-class women, while working-class speakers favor fricationless variants.
Everyone has an accent

Nina Markl and Catherine Lai

In Proc. INTERSPEECH 2023
Automatic transcription and (de)standardisation

Nina Markl, Electra Wallington, Ondrej Klejch, Thomas Reitmaier, and 5 more authors

In Proceedings - SIGUL 2023, 2nd Annual Meeting of the Special Interest Group on Under-resourced Languages

In this paper we illustrate the gap between real language use and the language use assumed in ASR development through the example of isiXhosa in Langa, South Africa. Understanding speech and writing practices in context is particularly important when developing speech technologies for minoritised and under-resourced languages, and their communities.
Situating Automatic Speech Recognition Development within Communities of Under-heard Language Speakers

Thomas Reitmaier, Electra Wallington, Ondřej Klejch, Nina Markl, and 5 more authors

In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems

In this paper we develop approaches to automatic speech recognition (ASR) development that suit the needs and functions of under-heard language speakers. Our novel contribution to HCI is to show how community-engagement can surface key technical and social issues and opportunities for more effective speech-based systems. We introduce a bespoke toolkit of technologies and showcase how we utilised the toolkit to engage communities of under-heard language speakers; and, through that engagement process, situate key aspects of ASR development in community contexts. The toolkit consists of (1) an information appliance to facilitate spoken-data collection on topics of community interest, (2) a mobile app to create crowdsourced transcripts of collected data, and (3) demonstrator systems to showcase ASR capabilities and to feed back research results to community members. Drawing on the sensibilities we cultivated through this research, we present a series of challenges to the orthodoxy of state-of-the-art approaches to ASR development.

2022

Mind the data gap(s): Investigating power in speech and language datasets

Nina Markl

In Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, May 2022

Algorithmic oppression is an urgent and persistent problem in speech and language technologies. Considering power relations embedded in datasets before compiling or using them to train or test speech and language technologies is essential to designing less harmful, more just technologies. This paper presents a reflective exercise to recognise and challenge gaps and the power relations they reveal in speech and language datasets by applying principles of Data Feminism and Design Justice, and building on work on dataset documentation and sociolinguistics.
The Lothian Diary Project: Sociolinguistic Methods during the COVID-19 Lockdown

Lauren Hall-Lew, Claire Cowie, Catherine Lai, Nina Markl, and 6 more authors

Linguistics Vanguard, Mar 2022

The Lothian Diary Project is an interdisciplinary effort to collect self-recorded audio or video diaries of people’s experiences of COVID-19 in and around Edinburgh, Scotland. In this paper we describe how the project emerged from a desire to support community members. The diaries have been disseminated through public events, a website, an oral history project, and engagement with policymakers. The data collection method encouraged the participation of people with disabilities, racialized individuals, immigrants, and low-proficiency English/Scots speakers, all of whom are more likely to be negatively affected by COVID-19. This is of interest to sociolinguists, given that these groups have been under-represented in previous studies of linguistic variation in Edinburgh. We detail our programme of partnering with local charities to help ensure that digitally disadvantaged groups and their caregivers are represented. Accompanying survey and demographic data means that this self-recorded speech can be used to complement existing Edinburgh speech corpora. Additional sociolinguistic goals include a narrative analysis and a stylistic analysis, to characterize how different people engage creatively with the act of creating a COVID-19 diary, especially as compared to vlogs and other video diaries.
Language Variation and Algorithmic Bias: Understanding Algorithmic Bias in British English Automatic Speech Recognition

Nina Markl

In 2022 ACM Conference on Fairness, Accountability, and Transparency

All language is characterised by variation which language users employ to construct complex social identities and express social meaning. Like other machine learning technologies, speech and language technologies (re)produce structural oppression when they perform worse for marginalised language communities. Using knowledge and theories from sociolinguistics, I explore why commercial automatic speech recognition systems and other language technologies perform significantly worse for already marginalised populations, such as second-language speakers and speakers of stigmatised varieties of English in the British Isles. Situating language technologies within the broader scholarship around algorithmic bias, I consider the allocative and representational harms they can cause even (and perhaps especially) in systems which do not exhibit predictive bias, narrowly defined as differential performance between groups. This raises the question whether addressing or “fixing” this “bias” is actually always equivalent to mitigating the harms algorithmic systems can cause, in particular to marginalised communities.
Language technology practitioners as language managers: arbitrating data bias and predictive bias in ASR

Nina Markl and Stephen Joseph McNulty

In Proceedings of the Language Resources and Evaluation Conference, Jun 2022

Despite the fact that variation is a fundamental characteristic of natural language, automatic speech recognition systems perform systematically worse on non-standardised and marginalised language varieties. In this paper we use the lens of language policy to analyse how current practices in training and testing ASR systems in industry lead to the data bias giving rise to these systematic error differences. We believe that this is a useful perspective for speech and language technology practitioners to understand the origins and harms of algorithmic bias, and how they can mitigate it. We also propose a re-framing of language resources as (public) infrastructure which should not solely be designed for markets, but for, and with meaningful cooperation of, speech communities.
Imagining the city in lockdown: Place in the COVID-19 self-recordings of the Lothian Diary Project

Claire Cowie, Lauren Hall-Lew, Zuzana Elliott, Anita Klingler, and 2 more authors

Frontiers in Artificial Intelligence, Dec 2022

The COVID-19 pandemic brought about a profound change to the organization of space and time in our daily lives. In this paper we analyze the self-recorded audio/video diaries made by residents of Edinburgh and the Lothian counties during the first national lockdown. We identify three ways in which diarists describe a shift in place-time, or “chronotope”, in lockdown. We argue that the act of making a diary for an audience of the future prompts diarists to contrast different chronotopes, and each of these orientations illuminates the differential impact of the COVID-19 lockdowns across the community.
(Commercial) Automatic Speech Recognition as a Tool in Sociolinguistic Research

Nina Markl

University of Pennsylvania Working Papers in Linguistics, Sep 2022

As speech datasets used in sociolinguistic research increase in size, laborious and time-intensive manual orthographic transcription is a challenge, limiting the amount of (transcribed) data which can be analysed. In this paper, I discuss the use of (commercial) automatic speech recognition (ASR) as a tool in sociolinguistic research in the context of a case study: the Lothian Diary Project. I describe the kinds of errors produced by two commercial ASR systems for British English within the broader context of algorithmic bias in ASR, and suggest some best practices when working with ASR in sociolinguistic work.

2021

Context-sensitive evaluation of automatic speech recognition: considering user experience & language variation

Nina Markl and Catherine Lai

In Proceedings of the First Workshop on Bridging Human–Computer Interaction and Natural Language Processing, Apr 2021

Commercial Automatic Speech Recognition (ASR) systems tend to show systemic predictive bias for marginalised speaker/user groups. We highlight the need for an interdisciplinary and context-sensitive approach to documenting this bias incorporating perspectives and methods from sociolinguistics, speech & language technology and human-computer interaction in the context of a case study. We argue evaluation of ASR systems should be disaggregated by speaker group, include qualitative error analysis, and consider user experience in a broader sociolinguistic and social context.
The Lothian Diary Project: Investigating the Impact of the COVID-19 Pandemic on Edinburgh and Lothian Residents

Lauren Hall-Lew, Claire Cowie, Stephen Joseph McNulty, Nina Markl, and 7 more authors

Journal of Open Humanities Data, Apr 2021

2020

Querent Intent in Multi-Sentence Questions

Laurie Burchell, Jie Chi, Tom Hosking, Nina Markl, and 1 more author

In Proceedings of the 14th Linguistic Annotation Workshop, Dec 2020

Multi-sentence questions (MSQs) are sequences of questions connected by relations which, unlike sequences of standalone questions, need to be answered as a unit. Following Rhetorical Structure Theory (RST), we recognise that different “question discourse relations” between the subparts of MSQs reflect different speaker intents, and consequently elicit different answering strategies. Correctly identifying these relations is therefore a crucial step in automatically answering MSQs. We identify five different types of MSQs in English, and define five novel relations to describe them. We extract over 162,000 MSQs from Stack Exchange to enable future research. Finally, we implement a high-precision baseline classifier based on surface features.