India Is Using AI to Police Identity and Expel Minorities
Suvradip Maitra / Mar 2, 2026

The Indian state of Maharashtra is developing an AI tool that uses “accent, tone and word choices” to identify and deport Bangladeshi Muslims and displaced Rohingyas from Myanmar. The system is intended for use by law enforcement as a preliminary screening mechanism prior to document-based nationality verification.
Framed as objective technology, the tool is in fact grounded in linguistic profiling that risks reinforcing xenophobia, prejudice, and racial discrimination. Its deployment raises serious concerns under international human rights law, including the International Convention on the Elimination of All Forms of Racial Discrimination.
This initiative must be situated within a broader expansion of AI-driven border policing. Agencies such as US Immigration and Customs Enforcement have adopted data-driven enforcement systems, and several Indian police departments are increasingly turning to “carceral AI” to police caste, women and religious minorities.
Nor is this the first time accent and dialect have been used to identify migrants’ origins. As early as 2017, in the wake of the Syrian refugee crisis, the German Bundesamt für Migration und Flüchtlinge (BAMF, the Federal Office for Migration and Refugees) deployed DIAS, an accent- and dialect-recognition tool, to validate asylum claims. Since then, several EU countries and Turkey have tested the technology but determined that it was not “mature enough” for implementation.
Prejudice, xenophobia and discrimination
Indian states ruled by the Bharatiya Janata Party (“BJP”) have labeled many ethnic Bangladeshi Muslims and Rohingya as “illegal migrants” to justify mass deportations, including of Indian citizens. Discriminatory legislation in 2019 provided a pathway to citizenship only for non-Muslim refugees from Afghanistan, Bangladesh and Pakistan.
The Rohingya refugee crisis has been ongoing since Myanmar’s campaign of ethnic cleansing against the minority group. An estimated 1.3 million Rohingya refugees are living in other countries, with more than 1 million in Bangladesh. Over 75% of the refugee population is women and children.
In 2018, the UN issued a letter citing reports that Rohingyas had been the “target of hate speech and violence in India”. In 2021, the Indian Supreme Court affirmed the Indian government’s position that since India was not party to non-refoulement obligations under the Refugee Convention, the Rohingya could be deported back to Myanmar, where they faced persecution. More recently, in July 2024, the UN Committee on the Elimination of Racial Discrimination issued a statement saying that “India must end discrimination against Rohingyas”. Human Rights Watch has reported that an estimated 40,000 Rohingya refugees currently living in India are facing a human rights crisis.
Bangladeshi migrant workers crossing into India have long faced persecution. For instance, Bengalis suspected of being “non-original” inhabitants of Assam have faced citizenship assessments by quasi-judicial tribunals, which had deported 165,992 people as of January 2025. The persecution of these workers has intensified since the BJP first came to power in the 1990s, with regular deportations every year, including of Bengalis from West Bengal. In one recent instance, several Bangladeshi migrant workers were deported from Mumbai despite being Indian citizens.
Harms from using an AI tool for accent and dialect recognition
In this culture of discrimination and violence, the introduction of the AI tool raises several concerns. Presently, the tool’s error rate is 60%, which can have significant consequences for an individual. Reports show that errors, including incorrect spellings or contradictions in testimony, are already used to justify deportations without further inquiry.
How is this different from the previous process of relying on human language experts? Errors were no doubt made then too, especially since dialect recognition is inherently unreliable. But those errors would have been randomly distributed across assessments, varying with the assessor’s expertise, the speaker’s pronunciation, the dialect in question and, perhaps, stereotypes attached to certain dialects. In the standardized outputs of an AI accent- and dialect-recognition tool, by contrast, errors recur consistently: as long as the same algorithmic model is used, they depend largely on the dialect alone.
Even if the tool were to become “foolproof” in six months, as promised by the Chief Minister of Maharashtra, researchers have questioned whether such accent- and dialect-recognition tools are theoretically capable of identifying nationality.
Academic Pedro Oliveira conducted a detailed analysis of DIAS, the German tool used to identify different dialects of Arabic. Oliveira found that datasets contained “similar [power coefficients] extracted from speakers grouped by ethnicity or language.” Power coefficients were treated as correlative with physical characteristics of speech (e.g., vocal tract or mouth size) and behavioral characteristics (e.g., accent). By isolating a power coefficient, DIAS effectively measured the “resonant frequencies of the vocal tract” — something the human ear is incapable of quantifying. The underlying logic was that the physical constitution of the vocal tract is similar among speakers of the same dialect, meaning the accuracy of dialect recognition is essentially proportional to the volume of data available for each dialect. By linking the tool’s output to the validity of asylum claims, DIAS treated timbre as a reliable marker of identity.
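To make that underlying logic concrete, the following is a minimal, purely illustrative sketch of how such a system reduces a voice to a spectral feature and assigns the nearest “dialect” label. This is not DIAS’s actual implementation: the signals, frequencies, group names and single-feature classifier are all simplifying assumptions, chosen to show why a confident label can emerge even from an acoustically ambiguous input.

```python
import numpy as np

rng = np.random.default_rng(0)

def dominant_resonance(signal, sample_rate=8000):
    """Frequency (Hz) of the largest peak in the power spectrum --
    a crude stand-in for the 'power coefficients' such tools extract."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]

def make_speaker(base_freq, sample_rate=8000, duration=1.0):
    """Synthesize a toy 'voice': a sine at the speaker's resonant
    frequency plus noise. Real speech is vastly more complex."""
    t = np.arange(int(sample_rate * duration)) / sample_rate
    return np.sin(2 * np.pi * base_freq * t) + 0.3 * rng.standard_normal(len(t))

# Toy training data: two 'dialect' groups with adjacent, overlapping
# resonance ranges, mirroring dialects that are not cleanly separable.
groups = {
    "dialect_A": [make_speaker(f) for f in (200, 210, 220)],
    "dialect_B": [make_speaker(f) for f in (230, 240, 250)],
}
centroids = {name: np.mean([dominant_resonance(s) for s in sigs])
             for name, sigs in groups.items()}

def classify(signal):
    """Nearest-centroid classification on a single acoustic feature."""
    f = dominant_resonance(signal)
    return min(centroids, key=lambda name: abs(centroids[name] - f))

# A borderline speaker still receives a definite label: the system
# outputs a confident-looking answer even where the feature is ambiguous.
borderline = make_speaker(225)
print(classify(borderline))
```

The sketch illustrates the structural point rather than any real accuracy figure: the classifier always emits a label, and its reliability for any group scales with how much (and how representative) training data that group has.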
However, dialects vary by region, class, and personal background, risking arbitrary mismatches. “Bangladeshi” and “Indian” Bengali are often indistinguishable and appear on both sides of the border. For instance, researchers have found that “linguistic patterns noticed among Bengali Muslims and the words commonly used by Bangladeshis frequently align,” while the dialect spoken by Indian Bengalis tends to resemble that of Bengali Hindus. Given this, Bengali Muslims, including those who are Indian citizens, are more likely to be matched to “Bangladeshi” dialects, increasing their risk of deportation regardless of nationality. Rather than citizenship, the tool will most likely become a proxy for religious identification, exacerbating the ongoing persecution of Muslims.
A key concern is that the tool will become a de facto marker of identity. As seen with previous uses of AI for accent and dialect identification, the practical pressures of border policing often lead to overreliance on the tool. Automation bias compounds this, leading humans to over-rely on machine outputs without critical reflection. While the tool’s outputs are meant to be verified by subsequent documentary analysis, many refugees do not have valid documentation, and even where valid documentation exists, it is often destroyed by police to justify deportations. Indeed, the government has had to allow re-entry to people erroneously deported without due process once their citizenship claims were established.
The Indian Immigration and Foreigners Act 2025 allows the collection of biometric voice data from foreigners. However, data collection of this kind raises constitutional concerns: the Supreme Court has recognized a constitutional right to privacy for all “peoples” in India, which could extend to non-citizens.
As researchers have argued in the context of DIAS, the use of recognition AI tools such as this perpetuates not only legal and social injustice but also an epistemic injustice that denies the capacity of affected peoples as knowing subjects. The tool undermines human dignity and dehumanizes by giving greater credibility to an AI system than to the knowledge of the affected individuals, reifying existing suspicion and disbelief towards these groups. Further, the supposed objectivity of the tool solidifies their identity as “illegal” immigrants once they are identified as Bangladeshi Muslims or Rohingya, potentially shaping how these individuals perceive their own identity. The truth originates not from the knowledge held by the person concerned, but from the output of the tool. These injustices mirror colonial practices in which authorities classified populations through markers such as skin color, posture, and other perceived physical or behavioral traits.
The tool is being developed by the Indian Institute of Technology Bombay and the Maharashtra government. The involvement of IIT — a prestigious academic institution — lends the semblance of legitimacy, objectivity and credibility to the tool. Such a use of AI in an already fraught political environment is dangerous and violent. As Nishtha Sood and Jagpreet Singh write, a “world where algorithms are used to determine citizenship is a world where the ideals of justice have already been forgotten.”
With the India-hosted AI Impact Summit just concluded, the potential uses of AI for social good have, no doubt, dominated headlines. India has predictably positioned itself as a global leader in AI innovation and use cases. Amidst the hype, we must ensure the stories of those who fall prey to the darker side of innovation are not obscured or forgotten.