Abhijeet Pokhriyal

Dec 17, 2019

Citizenship Amendment Act (CAA/CAB) in Numbers

What were the reasons that led to this unprecedented act?

The goal of this research was to stay unbiased and better understand how things got to where they are today with respect to the government’s new act, and how it might impact people.

My approach was to get numbers from different sources on refugees, visas, undocumented immigration and so on, and then try to reason about what is going on.

I am going to list the key findings right up here so that people who are probably too biased or don’t have the capacity to reason can save some time. I appreciate constructive criticism, so if I have misinterpreted or missed something, or if there is another line of reasoning worth looking into, I am open to suggestions.

Key findings:

  1. It’s difficult for me to debate the morality and ethics of the decision, so if you believe the decision is wrong in principle, the discussion can end here. I support you. But here I am not taking that route of reasoning; I am looking at the numbers.

So now let’s start with the analysis.

The first source of data was UNHCR, a somewhat credible source, right?

People who are not interested in the code can skip the grayed-out sections.

Loading Data on Refugees from UNHCR.

This dataset covers the last couple of decades and records both the countries people migrated from and the countries they migrated to. The idea is to look at the numbers and try to understand what really triggered recent developments.

# http://data.un.org/Data.aspx?d=UNHCR&f=indID%3AType-Ref
library(tidyverse) # dplyr, tidyr, ggplot2, stringr
library(janitor)   # clean_names()
seekers <- read.csv("./UNdata_Export_20191216_191045051.csv")
seekers <- seekers %>% clean_names()
coi <- c("Pakistan", "Bangladesh", "Afghanistan", "Sri Lanka", "China")

Above I defined some countries of interest as coi: Pakistan, Afghanistan, Bangladesh, Sri Lanka and China (Tibet). These are the countries that come up in the CAA debate. (The act itself names only Pakistan, Afghanistan and Bangladesh.)
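A misspelled country name in a filter vector fails silently, because `%in%` simply matches nothing for that entry. Here is a minimal sketch of a sanity check, using a small made-up `origin` vector in place of the real UNHCR column. The values and the name `coi_check` are illustrative assumptions, not part of the original analysis.

```r
# Hypothetical stand-in for the origin column of the UNHCR export
origin <- c("Afghanistan", "Bangladesh", "China", "Myanmar", "Pakistan", "Sri Lanka")

# Countries of interest; "Bangalesh" is a deliberate typo for illustration
coi_check <- c("Pakistan", "Bangalesh", "Afghanistan", "Sri Lanka", "China")

# Names in coi_check that never occur in the data
unmatched <- setdiff(coi_check, unique(origin))
print(unmatched)
```

Running a check like this before filtering makes it obvious when a country drops out of a chart because of spelling rather than because of missing data.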

Below I am just wrangling the data to make the column names shorter and more accessible.

cols <- colnames(seekers)
cols[1] <- "residence"
cols[2] <- "origin"
cols[6] <- "total_refugee_like"
cols[7] <- "total_refugee_like_assisted_unhcr"
colnames(seekers) <- cols

Let’s take a look at the data itself.

library(knitr)      # kable()
library(kableExtra) # kable_styling()
colnames(seekers)
kable(head(seekers)) %>%
kable_styling(bootstrap_options = c("striped", "hover"))

Columns available

Data Head

Filtering the data where residence = India, that is, the data on people seeking refuge in India.

forindia <- seekers %>% filter(residence == "India")
forindia$maxcount <- apply(forindia %>% select(contains("refugee")) ,1, function(r) { max(r , na.rm=TRUE) })
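One caveat with a row-wise `max(..., na.rm = TRUE)`: a row in which every refugee column is `NA` yields `-Inf` (with a warning), which can distort later sums. Below is a small sketch on made-up numbers that uses base R’s `pmax` as one possible alternative, since it returns `NA` for such rows. The data frame `toy` and its column names are hypothetical.

```r
# Made-up counts; the third row has no data in either column
toy <- data.frame(
  total_refugee_like = c(100, NA, NA),
  total_refugee_like_assisted = c(80, 50, NA)
)

# Row-wise maximum that ignores NA but stays NA when a row is entirely NA
toy$maxcount <- do.call(pmax, c(toy, na.rm = TRUE))
print(toy$maxcount)

# Summing with na.rm = TRUE then skips rows without data
total <- sum(toy$maxcount, na.rm = TRUE)
print(total)
```

Whether this matters for the charts depends on how many all-`NA` rows the real export contains, which is not checked here.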

Plotting the overall refugee count for India over time

yoyref <- forindia %>% group_by(residence, year) %>% summarize(totals = sum(maxcount))
# lightlinecol and theme are styling objects assumed to be defined earlier in the session
ggplot(data = yoyref, aes(x = year, y = totals)) + geom_line(color = lightlinecol) +
theme +
scale_y_continuous(labels = scales::comma)

We see that the numbers shot up in the early 1990s (reasons below), but over the past few decades things have been pretty constant. No major influx of refugees.

Let’s break the counts down by origin country.

library(directlabels) # geom_dl()
ggplot(data = forindia, aes(x = year, y = maxcount, color = origin, label = origin)) +
geom_line() +
theme(legend.position = "none") +
geom_dl(aes(label = origin), method = list(dl.trans(x = x - 1 ,y = y + 0.3), "last.points", cex = 0.8)) +
geom_dl(aes(label = origin), method = list(dl.trans(x = x - 0.2), "first.points", cex = 0.8)) + theme

Key takeaways from the above chart

  • Sri Lanka, China, Myanmar and Afghanistan stand out as the major contributors.
  • It would have been interesting to see Bangladesh’s trend since the 2000s, because it’s at the core of the ongoing issue. Unfortunately that data is not available, which may itself be indicative of other problems we face with Bangladesh. Later in the article I try to get data from other sources, and some key points emerge for Bangladesh in particular.

Another question that comes to mind: where are people from the COI actually taking refuge? Is India the only country impacted?

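coiseekers <- seekers %>% filter(origin %in% coi)
# maxcount was only added to forindia above, so compute it here as well
coiseekers$maxcount <- apply(coiseekers %>% select(contains("refugee")), 1, function(r) { max(r, na.rm = TRUE) })
coiseekerstotals <- coiseekers %>% group_by(origin, residence) %>% summarise(totals = sum(maxcount)) %>% as.data.frame()
coiseekerstotals <- coiseekerstotals %>% filter(totals > 200000)
# color is a palette vector assumed to be defined earlier in the session
ggparallel::ggparallel(data = coiseekerstotals,
c("origin", "residence"),
method = "parset",
weight = "totals", label.size = 8, text.angle = 0, text.offset = 0, label = TRUE) +
scale_color_manual(values = sample(color, 34)) +
theme + theme(legend.position = "none")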

In the above code I have kept only origin-residence pairs whose total refugee count is over 0.2 million (2 lakh). This makes it easier to assess the impact of mass migrations without getting distracted by the smaller numbers.

It seems that, out of the COIs, Afghanistan is the one worst impacted (by the Taliban), and people in very large numbers have been seeking refuge in the neighboring countries of Pakistan and Iran. Wikipedia: Afghan Refugees

Compared to Pakistan and Iran, India is taking in very small numbers from Afghanistan (a Muslim-majority country). Let’s zoom in on the impact on India alone.

Now what we see is what was sort of expected: Pakistan doesn’t even show up on the chart. People from Pakistan don’t seem to be interested in taking refuge in India, which sounds reasonable; the same argument applies as for Afghanistan. But the government seems to think that the persecuted minorities are coming from all over the subcontinent.

It also becomes clearer that India hasn’t taken in that many refugees from Bangladesh either, at least according to the official numbers. Compared to Tibet and Sri Lanka, which are predominantly Buddhist, India doesn’t have that many migrants from Bangladesh, another Muslim-majority country.

Official numbers from UNHCR show similar results

In these UNHCR figures two new names pop up: Myanmar and Somalia. Let’s expand the list of COIs to include these two countries as well.

We see that for Somalia the refugees, not just Christians, moved to neighboring countries like Kenya and Ethiopia. For Myanmar, it was Bangladesh and Thailand. Therefore it makes some sense to have location as a parameter in the CAA; moving to a neighboring country is just natural and convenient.

Myanmar’s case is somewhat more interesting: the country is predominantly Buddhist, and recent migration has mostly been of persecuted Muslims, a crisis we have come to know as the Rohingya refugee crisis.

Above we see that in the 10-year period from 2007 to 2016, the number of migrants from Myanmar into India went from around 2K to 15K. That’s a whopping 758% increase. There was a sharp rise in 2012, and it peaked around 2015. This is corroborated by the table below.
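For reference, a percentage increase is computed as (new - old) / old × 100. Here is a quick sketch using the rounded figures quoted above. These are approximations, so the result differs from the 758% figure, which presumably comes from the unrounded yearly counts (an assumption, since the exact counts are not shown here). The helper name `pct_increase` is hypothetical.

```r
# Percentage increase from an old value to a new value
pct_increase <- function(old, new) {
  (new - old) / old * 100
}

# Rounded figures from the text (about 2K in 2007, about 15K in 2016)
result <- pct_increase(2000, 15000)
print(result)  # 650
```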

But again, it’s not India that’s worst impacted by the influx.

People from Myanmar have mostly settled in Bangladesh, Thailand and Malaysia, their neighbors.

Now this puzzles me, because it makes it so much harder to understand either stance on the act. When people (especially Muslims) are not even willing to take refuge in India, when the number of refugees has stayed pretty constant over the past few decades, when it’s other countries like Australia and the USA, or UNHCR, paying for the refugees, what’s the big deal? Why such a riot?

North East Story and Role of Bangladesh

Now before we even go there. This is an amazing read by Elena Dabova.

Key Takeaways

  1. One of the reasons for such situation is that the social, economic and other troubles of Muslim population in India are not a result of the oppressive politics of the Indian state. Vast inequalities between the elites and the most of the populations were present in precolonial Indian feudalism, deepened during British rule, and remained the same all along before and after the creation of the independent Indian state in 1948

To assess this threat of illegal immigration, I felt it was reasonable to look into visa allocations by India.

Data is from data.gov.in

visa <- read.csv("./VISA_Details_2010-2013-oct.csv")
visa <- visa %>% clean_names()
visa <- visa %>% gather(key="type" , value="counts" ,-country , -mission , -visa_issue_date)
visa$visa_issue_date_d <- visa$visa_issue_date %>% strptime(format="%d-%m-%y") %>% as.Date()
visa$visa_issue_date_year <- visa$visa_issue_date_d %>% format("%Y")
visa$visa_issue_date_month <- visa$visa_issue_date_d %>% format("%m")
visa

The first step was to take a look at the different visa categories and how many visas India issues for each.

Business visas make up quite a big proportion, but it’s clearly the tourist visa that’s issued the most.

But to whom are they being issued?

ggplot(data= visa %>% group_by(country ,type) %>% summarise(totals = sum(counts))
, aes(x = country %>% reorder(totals) , y = totals, fill = type)) +
geom_bar(stat="identity") +
coord_flip()+
scale_fill_manual(values=visatype_color)+
theme +
theme(axis.text.y = element_text(angle=0), legend.position="top" )

It’s not that clear just yet. Let me zoom in for you.

Well, well. It’s Bangladesh that seems to have the most people looking for a tourist visa, followed by the UK and the USA. Having seen this, I would expect Bangladeshi people to make up most of the foreign tourists that visit India, right? Let’s pull more data in.

Tourist data from data.gov.in

tourists <- read.csv("./InternationToursits2001_2010.csv")
tourists <- tourists %>% clean_names()

Let’s take a look at the number of foreign tourists visiting India over the years.

touristsperyear <- tourists %>% gather(key="year" , value="numberoftourists"  , -name_of_countries)
touristsperyear$year <- touristsperyear$year %>% str_remove("x") %>% strptime(format="%Y") %>% as.Date()
ggplot(data= touristsperyear
, aes(x = year , y= numberoftourists, color=name_of_countries)) +
geom_line() +
geom_dl(aes(label = name_of_countries), method = list(dl.trans(x = x - 1 ,y = y + 0.3), "last.points", cex = 0.8)) +
geom_dl(aes(label = name_of_countries), method = list(dl.trans(x = x - 0.2), "first.points", cex = 0.8)) + theme

Now this seems odd. Even though Bangladesh has had the most visas issued over the years, very few people seem to be using them as tourists. My best guess is that people are moving in and then overstaying their welcome. It’s one of the symptoms behind the whole “illegal immigrant” line of reasoning. But still, the numbers are insignificant. Even if we had more data, including data on unaccounted migration, and even if the actual number of illegal immigrants were many times these numbers, it still seems more like politics than a real concern. National security? Again, I would defer to Elena’s article.