Guest post by Stephan Scheel and Funda Ustek-Spilda. Stephan is currently a post-doctoral researcher in the ERC-funded project “Processing Citizenship” at the University of Twente in the Netherlands. Previously, he worked together with Funda Ustek-Spilda as a post-doctoral researcher on the ERC-funded project “ARITHMUS – How Data Make a People” at Goldsmiths, University of London. Stephan’s first research monograph “Autonomy of Migration? Appropriating Mobility within Biometric Border Regimes” is about to be published by Routledge. Funda is currently a Research Officer at the London School of Economics, Department of Media and Communications. She is part of the Horizon 2020 research project Virt-EU and continues her work at ARITHMUS as a Visiting Researcher. Funda is working on her first research monograph “Choosing to be Invisible, Dying to be Visible” on the invisible work of women workers in the informal sector, which will be published by Routledge. This is the final post of Border Criminologies’ themed series ‘Migrant Digitalities and the Politics of Dispersal’, organised by Glenda Garelli and Martina Tazzioli.

We are witnessing the datafication of mobility and migration management across the world. In Europe, programs like Eurosur use satellite images to surveil the EU’s maritime borders, while the so-called hotspot approach aims to register all newly arriving migrants in biometric databases. Similarly, in the field of asylum, biometric databases are built for purposes of refugee management, while cash cards are distributed to asylum seekers in Greece. These new types and collections of data not only change border and migration management practices. They also reconfigure how human mobility and migration are known and constituted as intelligible objects of government. The crucial innovation driving this datafication is the digitization of information that was previously stored – if at all – in paper files. This information is now available in a range of databases and can – at least in theory – be searched, exchanged, linked, and analysed with unprecedented scope and efficiency.

As a consequence, ‘Big Data’ are promoted as promising alternative sources for producing more reliable statistics on international migration. Several national statistical institutes (NSIs), international organisations and private actors are currently developing alternative methodologies for the production of migration statistics, for instance, by analysing mobile phone data, geotagged social media data from platforms like Twitter or Facebook, or internet searches featuring particular search terms. Likewise, the UNHCR stresses the (potential) role of social media in informing humanitarian response.

Visualization of people in Estonia commuting across municipal borders, based on mobile positioning data. The visualization was produced by the company POSITIUM in Tartu, Estonia.

Much is made of the ‘huge potential of Big Data’ to provide accurate and up-to-date accounts of international migration. Yet the promises driving these efforts are just as big as the data they refer to. In this post, we briefly discuss three reasons why Big Data are unlikely to simply solve the most important known limitations of migration statistics. Each reason relates to a form of politics; taken together, these three forms of politics shape the quantification of migration.

The politics of numbers: policy-driven evidence for evidence-based policy making

The first issue that innovative methodologies are unlikely to solve is the so-called politics of numbers. This politics concerns how the institutional interests and agendas of actors in a particular policy field shape decisions about how migrants are counted and what kinds of numbers are ultimately disseminated in the public sphere. For example, according to a tweet by FRONTEX, ‘more than 710,000 migrants’ ‘entered [the] EU in first 9 months of 2015’. Migration studies scholar Nando Sigona remarked that this number, published at the height of the ‘migration crisis’ in October 2015, was likely to be inflated. After a Twitter exchange, FRONTEX admitted that the figure might be too high since it was based on recorded border crossings rather than on individual people. It is likely to have included double-counts, in particular of the thousands of migrants who had entered the EU in Greece and then, after crossing the ‘Balkan route’, again in Hungary. Although FRONTEX added a clarification to its news release, Nando Sigona concluded ‘that Frontex needs to be made more accountable for its actions, including how & why they “inflate” figures – especially given their expanding mandate & budget.’
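The difference between counting border crossings and counting people can be made concrete with a minimal sketch. The records, person identifiers and border labels below are hypothetical and are not FRONTEX data; the sketch also assumes that each recorded crossing can be linked to a person, which figures based on detections alone may not allow.

```python
from collections import Counter

# Hypothetical records of detected crossings: (person_id, external_border).
# Persons "A" and "B" are recorded twice: once entering in Greece and once
# again in Hungary after travelling along the 'Balkan route'.
crossings = [
    ("A", "Greece"),
    ("A", "Hungary"),
    ("B", "Greece"),
    ("B", "Hungary"),
    ("C", "Italy"),
]


def count_crossings(records):
    """Number of recorded border crossings (what a crossing-based figure reports)."""
    return len(records)


def count_people(records):
    """Number of distinct people behind those crossings."""
    return len({person for person, _ in records})


def double_counted(records):
    """Identifiers of people who appear in more than one crossing record."""
    per_person = Counter(person for person, _ in records)
    return sorted(person for person, n in per_person.items() if n > 1)


print("recorded crossings:", count_crossings(crossings))
print("distinct people:   ", count_people(crossings))
print("double-counted:    ", double_counted(crossings))
```

In this toy example, the crossing-based figure is higher than the number of people who actually entered, and the gap grows with every additional border at which the same person is recorded.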

In late 2017, in the context of a corruption scandal involving refugee aid in Uganda, it emerged that the officially reported number of 1.4 million refugees was probably too high. NGOs accused the Ugandan government of inflating the size of the refugee population to receive more financial aid from international donors. They estimated that Uganda’s refugee population was no more than one million people. The question of who is reporting the numbers is critical in migration statistics. For instance, at the Supporting Syria & the Region meeting held in London in 2016, the number of refugees reportedly hosted by Turkey ranged from 1.5 to 3 million, depending on who was tweeting. These examples demonstrate that migration policy actors may count migrants in particular ways to produce numbers that provide evidence in support of certain policy objectives or institutional agendas. Importantly, these politics of numbers will not cease with alternative Big Data-based methodologies.

The politics of method: chasing the tail of accuracy

The second form of politics that will not simply wither away in the proclaimed ‘Age of Big Data’ is what we call the politics of method. It is interrelated with the politics of numbers insofar as different methods produce different numbers for the object to be quantified. In brief, methodological heterogeneity – the use of different definitions, methods and data sources by different NSIs and other producers of migration statistics – makes cross-country comparison of migration data ‘difficult and confusing’. For example, according to Eurostat figures, the UK reported 42,403 immigrants from Poland in 2015, while Poland reported sending only 11,682 emigrants to the UK. One reason for this divergence lies in the use of different methods for the production of migration statistics across countries.

Number of immigrants from Poland in England based on Google searches featuring the term “polski”, compared with the number of immigrants with Polish citizenship according to the Labour Force Survey (LFS). Source: Williams and Ralphs 2013

In this context, it is important to note that methodological heterogeneity is not necessarily a bad thing. Rather, statisticians can only assess the reliability and accuracy of any method, as well as its strengths and weaknesses, by comparing it with another method. To illustrate, in England and Wales, the International Passenger Survey (IPS) – the principal method used by the Office for National Statistics (ONS) for the production of migration statistics – became a matter of concern after the last census in 2011. According to the census results, the population of England and Wales was 464,000 people larger than what had previously been reported by ONS. The earlier ONS figure was based on the so-called ‘cohort component method’, which adjusts the population size of the previous census on an annual basis by recorded births, deaths and net migration figures. An investigation concluded that the ‘largest single cause’ for the divergence was a ‘substantial underestimation’ of immigration from the eight new Eastern European member states by the IPS in the early 2000s. The questionable reliability of ONS migration statistics became a matter of public debate in the context of the promise of then-Prime Minister David Cameron to reduce net migration to the UK to the ‘tens of thousands each year’, down from an estimated 252,000 in 2010. In light of the inherently probabilistic results of the IPS, a report of the Migration Observatory concludes that ‘efforts to meet the government’s [migration] target lack, for the time being at least, an adequate measure of success.’
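The logic of rolling a census base forward with annual components, and of how an underestimated migration component accumulates into a gap at the next census, can be sketched as follows. This is a deliberately simplified illustration: it works with population totals only, whereas cohort component methods are applied by age and sex cohorts, and all figures below are hypothetical rather than ONS data.

```python
def roll_forward(base_population, annual_components):
    """Roll a census base forward one year at a time.

    annual_components is a sequence of (births, deaths, net_migration)
    tuples, one per year. Returns the estimated population after each year.
    """
    population = base_population
    estimates = []
    for births, deaths, net_migration in annual_components:
        population = population + births - deaths + net_migration
        estimates.append(population)
    return estimates


# Hypothetical figures for a ten-year gap between two censuses.
base = 1_000_000
# Components as recorded, with net migration underestimated by a survey.
recorded = [(12_000, 9_000, 2_000)] * 10
# Components as they actually occurred (unknown until the next census).
actual = [(12_000, 9_000, 5_000)] * 10

estimated_final = roll_forward(base, recorded)[-1]
true_final = roll_forward(base, actual)[-1]

print("rolled-forward estimate:", estimated_final)
print("next census count:      ", true_final)
print("gap revealed by census: ", true_final - estimated_final)
```

Because each year’s estimate builds on the previous one, an annual shortfall in recorded net migration adds up over the decade and only becomes visible when a second method, here the census, provides a point of comparison.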

The availability of established methodologies for evaluating the results of innovative methods is particularly important in the context of Big Data, since these data sources have usually been generated for purposes other than the production of migration statistics. Consequently, the use of alternative data sources like mobile phone or Twitter data raises several methodological issues, such as selection bias. Mobile phones and Twitter are, for instance, not used equally by all groups of migrants. This is why, contrary to what their proponents may claim, Big Data-based methods are unlikely to replace established methodologies for migration statistics any time soon. They might rather complement them, thus adding to the already existing methodological heterogeneity.
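A small worked example may help to show why unequal usage matters. The group names, population sizes and usage rates below are hypothetical; the sketch assumes, unrealistically, that usage rates per group are known exactly, which is precisely the kind of information an established method such as a survey or census would have to supply.

```python
# Hypothetical migrant population made up of two groups that use a
# platform (e.g. a social media service) at different rates.
population = {"group_x": 600, "group_y": 400}
usage_rate = {"group_x": 0.5, "group_y": 0.25}

# What the platform data 'see': only the users in each group.
users = {group: population[group] * usage_rate[group] for group in population}


def share(counts, group):
    """Share of one group in the total of the given counts."""
    return counts[group] / sum(counts.values())


# Naive estimate: treat platform users as if they were the population.
naive_share_x = share(users, "group_x")

# Corrected estimate: weight each group's users by the inverse usage rate.
reweighted = {group: users[group] / usage_rate[group] for group in users}
corrected_share_x = share(reweighted, "group_x")

print("true share of group_x:     ", share(population, "group_x"))
print("naive share from users:    ", naive_share_x)
print("share after reweighting:   ", corrected_share_x)
```

The naive estimate overstates the share of the group that uses the platform more heavily, and the correction is only possible with usage rates obtained from another source, which is one reason why such data tend to complement rather than replace established methods.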

The politics of (national) distinction: quantifying migration, enacting the nation

The politics of method are also intertwined with a politics of (national) distinction. These politics arise because migration concerns a core issue of national sovereignty: the claimed authority of nation-states to decide on the terms and conditions of entry to and stay within their respective jurisdictions. This claimed prerogative results in different migration regimes across nation-states, including different ways of categorising and counting migrants and asylum seekers. Since migration policies are shaped by, and are a source of, national identity and distinction, ‘the harmonization of migration and asylum statistics and policy is controversial as it intervenes in the nation state’s [claimed] sovereign control of who should stay on its territory’, as Marianne Takle rightly notes.

The persistence of these differences can be illustrated through the European Statistical System (ESS), which comprises EU member states as well as associated countries. The ESS represents a ‘hard case’ insofar as it constitutes one of the most advanced, harmonized and robust statistical systems in the world. Principle 14 of the European Statistical Code of Practice stipulates that ‘Statistics are compiled on the basis of common standards with respect to scope, definitions, units and classifications in the different surveys and sources’ to ensure ‘European Statistics are consistent internally, over time and comparable between regions and countries.’

However, our study of the operationalization of the otherwise well-established legal categories of asylum seekers and refugees demonstrates that their conversion into statistical categories entails various moments of adaptation to national contexts. These adaptations, in turn, result in important differences across EU member states. For instance, the harmonised statistical categories for ‘forced migrants’ of the ESS include ‘refugee’ and ‘first time [asylum] applicant’ only, despite the plethora of nationally varying sub-categories. DeStatis, the NSI of Germany, provides an explanatory note on the German asylum regime which distinguishes between asylum seekers whose applications are still pending, those whose applications have been rejected and those who have been granted protection status. Each group comprises further sub-categories. These range from migrants who still have to lodge their asylum application or who are appealing a decision, to five different types of recognised asylum seekers and various types of rejected asylum seekers, including 154,780 people whose presence in Germany is ‘tolerated’ as they are not deportable.

How asylum seekers and refugees are counted in migration statistics and in the overall population also differs between EU member states. DeStatis counts people from all the aforementioned sub-categories in its migration statistics and its population count. Other NSIs in Europe pursue a different policy. For instance, the NSI of Norway excludes all asylum seekers from its population statistics, as they are not included in the national population register on which these statistics are based. This is because asylum seekers are not issued personal registration numbers until their application is granted. Eurostat metadata indicate that in many EU countries, only accepted refugees are included in migration and population statistics. The legal limbo asylum seekers find themselves in is thus reflected in whether and how they are included in migration and population statistics.

Taken together, the three types of politics discussed here demonstrate that Big Data-based methodologies are unlikely to revolutionise migration statistics. Many of the known limitations of migration statistics are related to political issues that cannot be addressed through a technological fix. Rather, the politics of numbers, the politics of method and the politics of national distinction will also shape the development and use of innovative Big Data-based methodologies for migration statistics. So, it is not only the newness of methods per se, but also why and how these methods are developed, and by whom, that requires our attention.


How to cite this blog post (Harvard style) 

Scheel, S. and Ustek-Spilda, F. (2018) Big Data, Big Promises: Revisiting Migration Statistics in Context of the Datafication of Everything. Available at: (Accessed [date]).