Data Migration


At Thesis, we deliver high tech that is human-centered. We understand that change is hard because we’ve worked in higher education, too. When it comes to data migration, we’ll walk you through the process to ensure that each data point is securely transferred to your new SIS.


Smooth transition to a modern SIS 

When it comes to launching your student information system software, a seamless data migration process is the key to success. Our team is here to make the transition hassle-free.

We have an entire data migration strategy to transfer student records, enrollment information, academic history and other crucial data points to your new SIS. Rest assured, we'll maintain data integrity and accuracy throughout the process, so you can focus on the students – not their data.  

Data migration done right: Security and compliance guaranteed 

We take data security seriously, especially when it comes to migrating student information for your SIS software. With years of experience in the higher education industry, we have a deep understanding of industry best practices to ensure security and compliance of student and institutional data.   


Data Migration

Sarmah S.

2018, Scientific & Academic Publishing

This document gives an overview of the processes involved in data migration. Data migration is a multi-step process that begins with an analysis of the legacy data and culminates in the loading and reconciliation of data into new applications. With the rapid growth of data, organizations are in constant need of data migration. The document focuses on the importance of data migration and its various phases. Data migration can be a complex process in which testing must be conducted to ensure the quality of the data. Testing scenarios for data migration and the risks involved are also discussed. Migration can be very expensive if best practices are not followed and hidden costs are not identified at an early stage. The paper outlines these hidden costs and provides strategies for rollback in case of any adversity.
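The loading, reconciliation and rollback steps mentioned in this abstract can be illustrated with a minimal sketch. It assumes a SQLite target with a hypothetical `students` table; the function name and the row-count check are illustrative choices, not the procedure described in the paper.

```python
import sqlite3

def load_with_rollback(conn, legacy_rows):
    """Load legacy rows into the target table; undo everything if a check fails.

    The table name `students` and its columns are hypothetical.
    """
    cur = conn.cursor()
    before = cur.execute("SELECT COUNT(*) FROM students").fetchone()[0]
    try:
        # sqlite3 opens a transaction implicitly before the first INSERT.
        cur.executemany("INSERT INTO students (id, name) VALUES (?, ?)", legacy_rows)
        after = cur.execute("SELECT COUNT(*) FROM students").fetchone()[0]
        # Reconciliation: the target must have grown by exactly the rows sent.
        if after - before != len(legacy_rows):
            raise ValueError("row count mismatch after load")
        conn.commit()
        return True
    except (sqlite3.Error, ValueError):
        conn.rollback()  # roll back: the target returns to its pre-load state
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
print(load_with_rollback(conn, [(1, "Ada"), (2, "Lin")]))  # clean load
print(load_with_rollback(conn, [(3, "Sam"), (1, "Dup")]))  # duplicate key, rolled back
```

Running the whole load inside one transaction means a failed batch leaves no partial data behind, which keeps the rollback strategy trivial for small migrations; larger ones would typically need staged batches.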

Related Papers

International Journal of Recent Research Aspects ISSN 2349-7688

Data migration is one of the vital tasks of the data integration process. It is generally assumed to be among the most tedious, as there is never a systematic, defined procedure. Each migration must be treated as unique because the input data sets differ and the required output format depends on the services provided as well as on user and data handler requirements. In recent years, data migration has become a vital process in various departments of public and private services, owing to technological advancements and to the big data handling requirements caused by the increase in acquired data volume. This paper discusses data migration requirements, the finalization of a data migration strategy, the various stages of the data migration process, and why complete automation of data migration is not feasible.


Publisher ijmra.us UGC Approved

A new system may replace or enhance the functionality currently delivered by a legacy system; regardless of the type of project or application, some data conversion may take place. Difficulties arise when we take the information currently maintained by the legacy system and transform it to fit into the new system. We refer to this process as data migration. Data migration is a common element among most system implementations. It can be performed once, as with a legacy system redesign, or may be an ongoing process, as in the storage of historical data in the form of a data warehouse. Some legacy system migrations require ongoing data conversion if the incoming data requires continuous cleansing. In principle, any two systems that maintain the same sort of data must be doing very similar things and should therefore map from one to the other with ease. In practice, legacy systems have historically proven to be far too lenient with respect to enforcing integrity at the atomic level of data. Another common problem has to do with the theoretical design differences between hierarchical and relational systems. Data migration can be carried out in two ways: automated and manual. This paper explores the steps for migrating data manually, i.e. without the help of any tool built specifically for data migration. Manual data cleaning is commonly performed in migration to improve data quality, eliminate redundant or obsolete information, and match the requirements of the new system correctly and efficiently.

IJESRT Journal

Data migration is the process of moving data from one environment to a new one; it may be used to support migration from one database to another or between major upgrades of a database. The implementation of master data management may also require data migration. Data integration, ETL, ELT and replication, which are primarily concerned with moving data between existing environments, may be used to support the migration process. Data migration is often undertaken as part of a broader application migration (for example, migrating from SAP to Oracle, consolidating SAP environments or migrating from one version of SAP to another). When migrating to SaaS (software as a service) environments, it is important for data migration to be automated as much as possible, especially where these applications have been acquired directly by the business rather than via IT. Data migration projects are undertaken because they support business objectives. There are costs to the business if a project goes wrong or is delayed, and the most important factor in ensuring the success of such projects is close collaboration between the business and IT. Whether this means that the project should be owned by the business (treated as a business project with support from IT) or delegated to IT with the business overseeing the project is debatable, but what is clear is that it must involve close collaboration. This white paper is about the acquisition of the FMG (Fast Moving Goods) business of one company (COM-I) by another company (COM-II), resulting in the merger of the FMG business of COM-I into that of COM-II. It involves the migration of a huge amount of data from one company to the other as a result of a partial M&A between COM-I and COM-II, keeping the following parameters in check: data integrity, time (duration of engagement), cost of technology, man-hours spent, downtime, and application availability.
Huge data migration is not only cumbersome but also requires special tools and techniques for maintaining the integrity of the data. Migrating data from one source (company) to another requires time and effort and has huge cost implications that are not visible on the surface; hence extensive design, planning and funding are needed. Various tools were evaluated for the migration, but the existing system is complex: it involves Open road as the front end, Tuxedo at the middle tier and Oracle 10g at the back end, and many critical business rules applied at all three tiers need to be taken into account while migrating the data. This involved a great deal of study and research to determine the best methodology for migrating data from one landscape to another. Note that the two landscapes may be on two entirely different platforms, involving considerable complexity and contradictions. The application and hardware architecture of both systems had to be studied in detail for the purpose of data migration and integration. When migrating data from one environment to another, all validations (including business validation at the front end and middleware, and referential and integrity validations at the back end) should be considered and cannot be bypassed for the sake of migration.

Proceedings of the 2004 ACM SIGMOD international conference on Management of data - SIGMOD '04

Paulo Carreira , Helena Galhardas

Big Data and Cognitive Computing

Dr.-Ing. Otmane Azeroual

Data migration is required to run data-intensive applications. Legacy data storage systems are not capable of accommodating the changing nature of data. In many companies, data migration projects fail because their importance and complexity are not taken seriously enough. Data migration strategies include storage migration, database migration, application migration, and business process migration. Regardless of which migration strategy a company chooses, there should always be a stronger focus on data cleansing. Complete, correct, and clean data not only reduce the cost, complexity, and risk of the changeover; they also provide a good basis for quick and strategic company decisions and are therefore essential for today’s dynamic business processes. Data quality is an important issue for companies looking at data migration these days and should not be overlooked. In order to determine the relationship between data quality and data migration, an empirical study with 25 large German and Swiss companies was carried out to find out the importance of data quality in companies for data migration. In this paper, we present our findings on how data quality plays an important role in data migration plans and must not be ignored. Without acceptable data quality, data migration is impossible.

International Research Group - IJET JOURNAL

IJMER Journal

Abstract: Data migration has become one of the most demanding proposals for IT company managers. Even though these projects bring high business benefits, such as reduced costs, improved productivity, and data manageability, they are likely to involve a high level of risk due to the huge volume and criticality of the data moved. In order to reduce risk and guarantee that the data has been migrated and transformed successfully, it is essential to employ a thorough Quality Assurance (QA) strategy in migration projects. Testing is a key phase of a migration project for delivering successfully migrated data and addressing any issues before and after the migration process. Manual testing for data validation is time-consuming and inaccurate, so automated data validation assures data quality at greatly reduced time and cost. The paper proposes automation of the data migration validation testing process for quality assurance and risk control across industries.
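One common way to automate this kind of validation is to compare per-record checksums between source and target. The sketch below is only an illustration under that assumption; the function names and the record layout are hypothetical and are not taken from the paper.

```python
import hashlib

def row_digest(row):
    """Stable SHA-256 digest of one record (a tuple of field values).

    None is rendered as an empty string, so None and "" compare as equal.
    """
    joined = "\x1f".join("" if value is None else str(value) for value in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def validate_migration(source, target):
    """Compare {key: row} mappings and report keys that are missing from the
    target, unexpected in the target, or present in both with altered content."""
    missing = sorted(source.keys() - target.keys())
    unexpected = sorted(target.keys() - source.keys())
    altered = sorted(
        key
        for key in source.keys() & target.keys()
        if row_digest(source[key]) != row_digest(target[key])
    )
    return {"missing": missing, "unexpected": unexpected, "altered": altered}

# Toy records, invented for illustration only.
source = {1: ("Ada", "CS"), 2: ("Lin", "Math"), 3: ("Sam", None)}
target = {1: ("Ada", "CS"), 2: ("Lin", "Maths"), 4: ("Zed", "Art")}
print(validate_migration(source, target))
```

Comparing digests rather than full rows keeps the check cheap when the two sides live in different systems, since only keys and hashes need to be exchanged.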

Proceedings of the 17th …

Deirdre Lawless

International Journal of Advance Research in Computer Science and Management Studies [IJARCSMS] ijarcsms.com

The main objective of this paper is the migration of a database from a source database server to a target database server, including any transformation or manipulation that has to be applied before the data are moved to the target. In this process, based on the client requirements, the business analyst develops the business requirement specification, entity-relationship and mapping documents, whereas developers design and develop the high-level and low-level design documents as well as the code, and testers have to identify bottlenecks and the process for resolving issues in data warehouse testing. The testing team designs and develops the test scenarios, test cases and traceability matrix documents. Testers execute the test cases on the test environment of the database (source and target) to make sure the source and target data match as expected. This document details the testing process involved in BI, that is, ETL and reporting testing against the requirements, with the aim of increasing test coverage. Following this approach and process will provide a defect-free product with quality deliverables within the timeline.




Toronto Metropolitan University

Data migration: relational RDBMS to non-relational NoSQL

  • Master of Science
  • Computer Science


Computer Science (Theses)

  • Database management


Human migration: the big data perspective

  • Regular Paper
  • Open access
  • Published: 23 March 2020
  • Volume 11, pages 341–360 (2021)


  • Alina Sîrbu 1 ,
  • Gennady Andrienko 2 , 3 ,
  • Natalia Andrienko 2 , 3 ,
  • Chiara Boldrini 4 ,
  • Marco Conti 4 ,
  • Fosca Giannotti 5 ,
  • Riccardo Guidotti 5 ,
  • Simone Bertoli 6 ,
  • Jisu Kim 7 ,
  • Cristina Ioana Muntean 5 ,
  • Luca Pappalardo 5 ,
  • Andrea Passarella 4 ,
  • Dino Pedreschi 1 ,
  • Laura Pollacci 1 ,
  • Francesca Pratesi 1 &
  • Rajesh Sharma 8  


A Correction to this article was published on 21 May 2021

This article has been updated

How can big data help to understand the migration phenomenon? In this paper, we try to answer this question through an analysis of various phases of migration, comparing traditional and novel data sources and models at each phase. We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country. We explore various data sets and models that can be used to quantify and understand migrant integration, with the final aim of providing the basis for the construction of a novel multi-level integration index. The last phase is related to the effects of migration on the source countries and the return of migrants.


1 Introduction

The phenomenon of human migration has been a constant of human history, from the earliest ages until now. As such, the study of migration spans various research fields, including anthropology, sociology, economics, statistics and more recently physics and computer science. We are at a moment where various types of data not typically used to study migration are becoming increasingly available. These include so-called social big data: digital traces of humans generated by using mobile phones, online services, online social networks (OSNs), devices within the internet of things. At the same time, new technologies are able to extract valuable information from these large data sets. Both traditional and novel models and data are currently being employed to understand different questions on migration, including monitoring migration flows and the economic and cultural effects on the migrants and also on the source and destination communities. In this paper, we provide a survey of existing approaches, both traditional and data-rich, and we propose new methods and data sets that could contribute significantly to the study of human migration. We concentrate on three different phases of migration: the journey—analysing migration flows and stocks; the stay—studying migrant integration and changes in the communities involved; the return—the study of migrants returning to the origin country.

1.1 The journey

At the moment, information about migration flows and stocks comes from official statistics obtained either from national censuses or from the population registries. Given that migration intrinsically involves various nations, data are often inconsistent across databases and offer poor time resolution. With the availability of social big data, we believe it should be possible to estimate flows and stocks from available data in real time, by building models that map observed measures extracted from these unconventional data sources to official data, i.e. now-casting stocks and flows. We also look at migration phenomena within smaller communities, such as scientific migration, where even prediction of migration events can be possible. An important step in understanding migration flows is suitable visualization, which we also explore.

1.2 The stay

Migration might generate cultural changes with both long- and short-term effects on the local and incoming population. Migrant integration is generally measured through indicators related to the labour market, economic status or social ties. Again, these statistics are available with low resolution and not for all countries. A new direction is that of observing integration and the perception of migration through big data. For instance, OSN sentiment analysis specific to immigration topics can allow us to evaluate the perception of immigration. Analysis of retail data can enable us to understand whether immigrants are integrated economically but also whether they change their habits during their stay. Scientific data can help us understand how migration benefits both the host countries and the migrants themselves. Through these data, we can derive novel integration indices that take into account the traces of human activity observed.

1.3 The return

Besides effects on the receiving communities, the source communities may also see effects of migration. In fact, migrants can maintain a strong attachment to their home countries and eventually return there. This can bring multiple benefits: economic growth, new skills, entrepreneurship, better healthcare, different participation in governance issues and many others. We discuss various approaches to analysing these cases based on existing data.

Both traditional and new methods to analyse migration depend highly on the availability of data. Hence, infrastructures that can catalogue the various data sets and make them available to the community, ensuring privacy and ethical use, are very useful. At the same time, with new methods being developed, means of facilitating their use by the research community are necessary. An example of a framework that aims to meet these requirements is the SoBigData infrastructure [ 78 ] (www.sobigdata.eu). This includes a catalogue of methods, data sets and training material, grouped in so-called exploratories. Virtual research environments allow users to use some of the data and methods directly in the SoBigData engine. The exploratory on migration studies includes many of the methods and data sets presented below.

The rest of the paper is organised as follows: The study of migration flows and stocks is discussed in Sect.  2 . This compares traditional data (Sect.  2.1 ) with social big data (Sect.  2.2 ) including scientific migration (Sect.  2.2.1 ), providing also a review of tools for visualization of migration data (Sect.  2.3 ). Section  3 concentrates on migrant integration and perception of migration. We start by looking at approaches based on traditional data sources (Sect.  3.1 ) and move on to social big data including retail data (Sect.  3.2.1 ), mobile data (Sect.  3.2.2 ), language and sentiment in OSNs (Sects.  3.2.3 and 3.2.4 ), ego networks (Sect.  3.2.5 ). The return of migrants is discussed in Sect.  4 , while Sect.  5 concludes the paper with a summary and a discussion on ethical issues.

2 The journey: migration flows and stocks

In this section, we discuss various means of analysing migration flows and stocks. We start with traditional approaches and data types and then move to new data sets that can be employed for the task, underlining advantages and disadvantages of each approach.

2.1 Traditional data sources and challenges

Tracking international migrants’ flows and stocks is an important but challenging task. At the moment, many researchers and policy makers rely on traditional data sources to study the journey of migrants. Such data sources come from either official statistics or administrative data. Studying the journey of migrants with these traditional data sources, however, comes with various limitations, as migration intrinsically involves various nations. For instance, the data are often inconsistent across databases because different countries employ different definitions of a migrant. Considerable efforts have been made so far by both researchers and international organizations to improve the quality of traditional data sources and to harmonize them [ 50 , 148 , 171 ]. International organizations such as the United Nations also provide guidelines and suggestions Footnote 1 which countries should employ when dealing with migration statistics. In this section, each type of data source is described in detail and evaluated.

Census data and surveys are official statistics collected by institutions. They provide socio-demographic information on the population, including immigrants. However, the two types of data have different focuses. Census data are collected once every five or ten years, depending on the country. For example, the most recent data available in the USA is the 2010 census data, while in Europe the last census was performed in 2011. According to the recommendation of the United Nations, Footnote 2 countries should collect the data in years ending in zero in order to establish consistency across different migration data sets. But as the process of collecting data is expensive and time-consuming, some developing countries do not collect the data as recommended, creating inconsistency across different countries’ databases. The high cost is due to the fact that the majority of countries carry out door-to-door or phone interviews with a randomly selected sample of the population to collect the data. For instance, the Chinese population is almost 1.4 billion, Footnote 3 so about 6 million enumerators are needed to conduct all the interviews. On the other hand, most European countries retrieve the data from administrative registries, which makes the procedure faster [ 62 , 149 ].

In the census data, the migration-related information collected is the following: citizenship, country of birth, last place of residence and length of stay. However, depending on the characteristics of their immigrants and on their immigration system, Footnote 4 countries do not use the same information to count the number of immigrants. In Europe, for example, attention is also given to different migrant groups depending on whether they are from the European Member States or from third countries. Footnote 5 On the other hand, the United States counts everyone born outside of its territory as an immigrant. Yet, the recommendation of the United Nations defines an international migrant as “a person who moves to a country other than that of his or her usual residence for a period of at least a year”. The difference in the definition of immigrants makes different migration data incomparable. Furthermore, information about returning migrants is not well captured through the census data, because returning migrants are not obliged to declare their departure. They simply drop out of the data of the country they leave, meaning that information about these migrants is difficult to track.
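To make the incomparability concrete, the following sketch applies two of the definitions mentioned above to the same individuals: the United Nations recommendation (a move away from the country of usual residence lasting at least a year) and a country-of-birth rule. The `Person` record and its field names are assumptions made for illustration; real census records are structured differently.

```python
from dataclasses import dataclass

@dataclass
class Person:
    # Hypothetical record layout, not an actual census schema.
    country_of_birth: str
    usual_residence: str      # country of usual residence before the move
    current_country: str
    months_in_current: int

def is_migrant_un(p):
    """UN recommendation: moved to a country other than that of usual
    residence for a period of at least a year."""
    return p.current_country != p.usual_residence and p.months_in_current >= 12

def is_migrant_birthplace(p):
    """Birthplace rule: anyone born outside the current country counts."""
    return p.country_of_birth != p.current_country

short_stay = Person("IT", "IT", "US", 6)    # arrived six months ago
returnee = Person("FR", "DE", "FR", 24)     # returned to the country of birth
for person in (short_stay, returnee):
    print(is_migrant_un(person), is_migrant_birthplace(person))
```

The two rules disagree on both toy cases, which is exactly the kind of mismatch that makes stocks counted under different definitions hard to compare.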

Census data are usually published in aggregated form by the authorities that organized the census. Typically, immigration rates are made available at country or at most regional level. For instance, historical immigration data can be found on the websites of Eurostat [ 63 ], the WorldBank [ 165 ], OECD [ 164 ] and other local authorities and research institutions [ 61 , 67 , 95 , 96 , 97 ]. However, in certain situations having data with higher spatial resolution can be useful. Recently, the Joint Research Centre of the European Union published a data challenge Footnote 6 where they make available for research high-resolution immigration data from the 2011 census, for selected European countries. However, similar data are more difficult to obtain for other regions.

Surveys also collect information about the flows and stocks of immigrants, and they are retrieved more often than the census data. Unlike the census data, they are generally conducted to collect information on households, labour market or community, depending on their main purpose. As a result, there are very few questions related to migration. For instance, in the employment survey in France, there are two questions which are about country of origin and date of arrival. With these two details, it is difficult to infer the immigrants’ journey since a clear definition of immigrants cannot be established. As a consequence, it has low accuracy level in capturing immigrants’ flows and stocks and real-time observation cannot be done. In addition, information retrieved from surveys refers to a small subset of the entire population.

Administrative data are retrieved from registries. They can come from health insurance, residence permits, labour permits or border statistics, which also gather information about immigrants. Registry data can provide more detail and are less costly than official statistics as the information is intrinsically and directly given by the individuals. For instance, data collected from the residence permits include details about intention and length of stay. They also require specific details on place of origin and address in the country of stay. The same applies to labour permit data. Nevertheless, in Europe, where the freedom of movement and work is established, it is difficult to know flows and stocks of EU immigrants using these administrative data unless all the individuals are registered. An alternative is to use health insurance data. With these, it is possible to infer the stocks more accurately, provided the immigrants register for health insurance. In addition, registries can also collect information about asylum seekers Footnote 7 and refugees. Footnote 8 However, this information is not always present in all migration data. In some countries, such as France, Italy and the UK, asylum seekers residing at least 12 months in a country are included in the data. In other countries, like Belgium, Sweden and Finland, they are excluded [ 62 ]. Again, the application of different definitions makes it difficult to compare data across different countries. Caution is therefore needed when inferring the immigrants’ journey from administrative data, as it is difficult to identify the true movements of immigrants.

Traditional data are definitely useful for studying the journey of immigrants. They can be used for building models of migration [ 144 ] and understanding the determinants of migration. But for the reasons discussed above, several drawbacks have to be taken into account. To improve data quality, institutions provide estimates to impute the gaps between years, or use the double-entry matrix Footnote 9 first introduced by UNECE Footnote 10 to establish comparability across different nations’ data (see, for instance, [ 50 , 142 , 143 ]). Nevertheless, despite the efforts, the data still appear inconsistent and unreliable. With the availability of social big data sources, researchers hope not only to overcome the limitations of traditional data, but also to be able to conduct real-time analyses at a higher accuracy level.
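The idea behind a double-entry matrix can be sketched as follows, assuming its usual reading: each origin-to-destination flow is recorded twice, once as emigration reported by the sending country and once as immigration reported by the receiving country, so that the two figures can be compared. The function name, the dictionary layout and the numbers are illustrative assumptions, not data from the cited works.

```python
def double_entry(emigration_reports, immigration_reports):
    """Pair the two reports of each origin -> destination flow.

    emigration_reports[(o, d)]: flow o -> d as reported by sending country o
    immigration_reports[(o, d)]: the same flow as reported by receiving country d
    A gap of None means one of the two countries did not report the flow.
    """
    table = {}
    for pair in sorted(emigration_reports.keys() | immigration_reports.keys()):
        sent = emigration_reports.get(pair)
        received = immigration_reports.get(pair)
        gap = None if sent is None or received is None else received - sent
        table[pair] = {"sent": sent, "received": received, "gap": gap}
    return table

# Toy figures, invented for illustration only.
emigration = {("A", "B"): 100, ("B", "A"): 40}
immigration = {("A", "B"): 130, ("C", "A"): 10}
for pair, entry in double_entry(emigration, immigration).items():
    print(pair, entry)
```

A large gap for a pair signals that the two countries count the same movement differently, which is where harmonization or imputation would be applied.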

2.2 Alternative data sources: Is now-casting possible?

In recent studies, the use of social big data in the study of immigrants’ journey is increasing. A variety of data types fall under this category: data from social media, internet services, mobile phones, supermarket transactions and more. These data sets contain detailed information about their users. Furthermore, they cover larger sets of the population than some of the traditional data sources, which are limited in terms of sample size. Yet, the literature points out that the data may be biased because of the characteristics of the users in the sample. For instance, the majority of Twitter users are known to be young, so Twitter data cannot represent the whole population. Nevertheless, various studies state that the estimates of immigrants’ flows and stocks extracted from these unconventional data sources can still improve the understanding of migration patterns (see, for instance, [ 87 , 126 , 183 ]).

Big data allows researchers to study immigrants’ movements in real time. Twitter data, for instance, provide geolocated time-stamped messages. Geolocated messages are often the key variable in estimating the flows and stocks, but not the only one. In the work of [ 183 ], the authors infer migration patterns from Twitter data by looking at where the tweets were posted. Other studies, like [ 126 ], infer the origins of immigrants from the language used in tweets, i.e. whether or not the local language was used. These studies conclude that Twitter data allow researchers to localize the flows and stocks of immigrants and to observe recent trends even before the official statistics are published. The results of these studies are validated by matching the big data results to official data.
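A simplified version of location-based inference can be sketched as follows: a user's country of residence in each period is taken to be the country from which most of their tweets were posted, and a change between consecutive periods is flagged as a possible move. This is a deliberately naive illustration with hypothetical function names and toy data; it is not the method of [ 183 ] or [ 126 ].

```python
from collections import Counter, defaultdict

def residence_by_period(tweets):
    """tweets: iterable of (user, period, country) tuples.
    Returns {user: {period: most frequent country in that period}}."""
    counts = defaultdict(lambda: defaultdict(Counter))
    for user, period, country in tweets:
        counts[user][period][country] += 1
    return {
        user: {period: c.most_common(1)[0][0] for period, c in periods.items()}
        for user, periods in counts.items()
    }

def detect_moves(residence):
    """Return (user, origin, destination, period) whenever the inferred
    residence changes between two consecutive observed periods."""
    moves = []
    for user, periods in residence.items():
        ordered = sorted(periods)
        for prev, cur in zip(ordered, ordered[1:]):
            if periods[prev] != periods[cur]:
                moves.append((user, periods[prev], periods[cur], cur))
    return moves

# Toy geolocated tweets, invented for illustration only.
tweets = [
    ("u1", "2019", "IT"), ("u1", "2019", "IT"), ("u1", "2019", "FR"),
    ("u1", "2020", "DE"), ("u1", "2020", "DE"),
    ("u2", "2019", "ES"), ("u2", "2020", "ES"),
]
print(detect_moves(residence_by_period(tweets)))
```

Using the most frequent country per period filters out short trips (the single tweet from FR above), at the cost of missing moves shorter than one period.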

In one of our recent works, we have also analysed geolocalized Twitter data, with the aim of quantifying diversity in communities, by computing a superdiversity index [ 139 ] (see also Sect.  3 ). This index correlates very well with migration stocks; hence, we believe it can become an important feature in a now-casting model. A different line of work we are pursuing is that of estimating user nationality from Twitter data. As seen above, language can be important in understanding nationality; however, we believe that this can be refined by also employing the connections among users. The model can be validated with data collected through monitoring frameworks such as that presented in [ 21 ]. Once users are assigned a nationality, we can use these for a now-casting model of migration stocks. Additionally, we can define communities on Twitter based on nationality and study the flow of ideas among communities, and the role of migrants in the spreading of information. Furthermore, these data could enable analysis of ego networks of migrants (Sect.  3.2.5 ).
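As a rough illustration of what a now-casting model could look like, the sketch below fits a straight line from an observed index (for example a diversity index computed from social data) to official stock figures for regions where both are known, then applies it where only the index is available. The linear form, the function names and the numbers are assumptions made for illustration; the text does not specify the model.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def nowcast(index_value, slope, intercept):
    """Estimate the official figure from a fresh index value."""
    return slope * index_value + intercept

# Toy calibration data: index values and official stocks for known regions.
index_values = [1, 2, 3, 4]
official_stocks = [3, 5, 7, 9]
slope, intercept = fit_line(index_values, official_stocks)
print(nowcast(5, slope, intercept))
```

In practice such a model would be calibrated on past official releases and would need several features and proper validation; the single-feature line only shows the mapping from an observed measure to an official one.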

Skype ego networks data can also be used to explain international migration patterns [ 101 ]. In this case, the IP addresses that appear when users log in to their account can be used to infer the place of residence. More precisely, the authors look at how often users log in from a given IP address, which allows them to label the corresponding location as the users’ place of residence. The users’ place of residence can then be used to observe whether migration took place or not.

Big data can also be used to study movements of individuals in times of crisis. For instance, [ 30 ] propose to use mobile phone data to trace individuals’ movements after the earthquake in Haiti. With these data, the authors are able to trace users because the phone towers provide information about their locations. They conclude that big data can be used to observe movements in real time, which cannot be done with traditional data.

Another limitation of traditional data sources is that they make it difficult to anticipate immigrants’ movements. The authors of [ 36 ] study whether the GTI (footnote 11) can now-cast the immigrants’ journey. However, as the authors point out, not every search means that the searcher intends to migrate. To address this issue, they compare Gallup World Poll data (footnote 12) with the results obtained with GTI data. The Gallup data come from a survey conducted in more than 160 countries, which asks whether individuals are planning to move to another country and, if so, whether the move is planned within 12 months and whether they have taken any action towards it, such as visa applications or searching for information. The comparison validates that the GTI data can indeed now-cast the “genuine migration intention”.

Unconventional big data have their limitations like traditional data. Nevertheless, new big data methods are developing in order to address the newly arising issues. In addition, big data cover worldwide users with very fine granularity of information on immigrants’ journey. The hope is that by merging knowledge from both traditional and novel data sets we will be able to overcome some of the issues and build accurate models for now-casting immigrant journeys and immigration rates.

2.2.1 Scientific migration

Given its importance to scientific productivity and education, the study of scientific migration has attracted growing interest in recent years, fostered by the availability of massive data describing the publications and the careers of scientists in several disciplines [ 47 , 129 , 138 , 155 ]. Understanding the mechanisms driving scientists’ decision to relocate can help institutions and governments manage scientific mobility and implement policies to attract the best scientists or prevent their departure, hence improving the quality of research. At the same time, predictive models explaining when, and where, scientists migrate can facilitate the design of job recommender systems for scientists based on their profile [ 156 ], or help search committees seek successful candidates for their research jobs.

The studies proposed in the literature on scientific migration can be grouped into three main strands of research. A first group of studies focuses on country-level movements or on movements between universities [ 20 , 125 , 131 ]. Relying on a large-scale survey, Appelt et al. find that geographic distance, as well as socio-economic disparities and scientific proximity, negatively correlates with the mobility of scientists between two countries [ 14 ]. By investigating the professional and personal determinants of the decision to relocate to a new institution, Azoulay et al. [ 22 ] find that scientists are more likely to move when they are highly productive and their local collaborators are fewer and less accomplished than their distant collaborators, while they find it costly to disrupt the social networks of their children. Gargiulo and Carletti [ 68 ] investigate the movements of scientists between universities and find that starting from a lower-rank institution lowers the probability of reaching a top-rank institution and raises the probability of remaining in a low-rank one; conversely, starting from a high-ranked university strongly lowers the probability of ending up in a low-ranked one.

A second strand of research focuses on understanding the effect of a scientist’s relocation on their scientific impact. In this context, it has been discovered that while moves from elite to lower-rank institutions lead to a moderate decrease in scientific performance, moves to elite institutions do not necessarily result in a subsequent performance gain [ 52 ]. Sugimoto [ 161 ] analyses the migration traces of scientists extracted from Web of Science and reveals that, regardless of the nation of origin, scientists who relocate are more highly cited than their non-moving counterparts.

In the context of studying labour mobility, the availability of massive data sets of individuals’ career paths has fostered works on predicting individuals’ next jobs (outside academia) [ 156 ]. Paparrizos et al. [ 135 ] build a system to recommend new jobs to people who are seeking a job, using all their past job transitions as well as their employees data. They train a predictive model to show that job transitions can be accurately predicted, significantly improving over a baseline that always predicts the most frequent institution in the data. More recently, Li et al. [ 111 ] propose a system to predict next career moves based on profile context matching and career path mining from a real-world LinkedIn data set. They show that their system can predict future career moves, revealing interesting insights into micro-level labour mobility.

Our recent work, conducted within the SoBigData project, lies at the intersection of the aforementioned strands of research. In particular, we investigate how a scientist’s scientific profile influences the decision to move, based on a massive data set consisting of all the publications in journals of the American Physical Society (APS) from 1950 to 2009 (360,000 publications, 3500 institutions and 60,000 scientists) [ 98 ]. We approach the problem by constructing a two-stage predictive model. We first predict, using data mining, which scientists will change institution in the next year. We describe a scientist’s profile as a multidimensional vector of variables describing three aspects: the recent scientific career, the quality of the scientific environment and the structure of the scientific collaboration network. From the constructed predictive model, we identify the main factors influencing scientific migration. Secondly, for those scientists who are predicted to move, we predict which institution they will choose using the performance-social-gravity model, an adaptation of the gravity model of human mobility to include the above-mentioned factors.
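For readers unfamiliar with gravity models, the sketch below shows only the classic form, in which the attractiveness of a destination grows with its "mass" and decays with distance. It is not the performance-social-gravity model of [ 98 ]: the performance and social terms are omitted, and the function name, the use of institution size as mass and the exponent value are assumptions made for illustration.

```python
def gravity_probabilities(masses, distances, beta=2.0):
    """Probability of choosing each destination under a plain gravity model.

    masses:    {destination: attractiveness}, e.g. institution size
    distances: {destination: distance from the origin}, all > 0
    beta:      distance-decay exponent (an assumed value, not a fitted one)
    """
    # Weight of destination j is m_j / d_j**beta; normalise to probabilities.
    weights = {j: masses[j] / distances[j] ** beta for j in masses}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

# Hypothetical candidate institutions for one moving scientist.
probs = gravity_probabilities(
    masses={"A": 100.0, "B": 400.0, "C": 100.0},
    distances={"A": 10.0, "B": 20.0, "C": 20.0},
)
```

With these toy values, the larger but more distant institution B is as likely a destination as the smaller, closer institution A, while C (small and distant) is the least likely.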

A different recent line of work in the SoBigData project is to understand, by using ORCID data, what was the effect of the Brexit referendum on scientific migration in and out of the UK. Preliminary results (still unpublished) show an increase in UK researchers moving from the EU to the UK and an increase in EU researchers moving out of the UK.

2.3 Visual analytics of migration data

The phenomenon of migration is strongly associated with human movement. Analysis of movement data is one of the active topics in Visual Analytics research. The monograph [ 11 ] systematically considers a variety of possible representations of movement data. Frequently used representations are trajectories (sequences of time-stamped positions of individuals), time series (e.g. counts of departing, arriving or transit visitors over time), and events (e.g. movement with abnormal speed or unusual concentration of moving objects). A special case is a trajectory consisting of only two time-stamped positions, the origin and the destination of a trip. This representation is frequently used in migration studies, since more detailed information is often not available.

The following three main classes of techniques are applied for visualization of origin-destination (OD) flows: OD matrix [ 84 ], OD flow map [ 167 ] and a hybrid of a matrix and a map called OD map [ 182 ]. In an OD matrix, the rows and columns correspond to locations and the cells contain flow magnitudes represented by colour shades. The rows and columns can be automatically or interactively reordered for uncovering connectivity patterns. Disadvantages of the matrix display are the lack of spatial context and the limited number of different locations that can be represented. In OD flow maps, links between locations are represented by straight or curved linear symbols analogously to node-link diagrams. Various possible representations of directed links are discussed and evaluated by Holten et al. [ 92 ]. Flow magnitudes are shown by proportional line widths or by colour shades. OD maps [ 182 ] use a map-like grid layout with embedded maps that represent movement from/to selected locations to/from all other locations that correspond to remaining maps.
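As a concrete illustration of the OD matrix, the following sketch builds the matrix of flow counts from a list of origin-destination pairs and reorders rows and columns. Reordering by total outflow is only one simple choice, assumed here for the example; the function names and the toy data are likewise hypothetical.

```python
from collections import Counter

def od_matrix(trips, locations):
    """Build an origin-destination matrix of flow counts.

    trips:     iterable of (origin, destination) pairs
    locations: ordered list of location names; fixes row/column order
    Returns a list of rows; matrix[i][j] is the flow from i to j.
    """
    flows = Counter(trips)
    return [[flows[(o, d)] for d in locations] for o in locations]

def reorder_by_outflow(matrix, locations):
    """Reorder rows and columns by decreasing total outflow."""
    order = sorted(range(len(locations)), key=lambda i: -sum(matrix[i]))
    new_locations = [locations[i] for i in order]
    new_matrix = [[matrix[i][j] for j in order] for i in order]
    return new_matrix, new_locations

locations = ["IT", "FR", "RO"]
trips = [("RO", "IT"), ("RO", "IT"), ("RO", "FR"), ("IT", "FR")]
matrix = od_matrix(trips, locations)
sorted_matrix, sorted_locations = reorder_by_outflow(matrix, locations)
```

The cell values are what a matrix display would encode as colour shades; the spatial context is lost, which is the disadvantage noted above.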

A straightforward approach to showing time-variant flows is to use multiple displays (e.g. OD matrices, OD flow maps or OD maps) arranged either temporally in animation or spatially in a small multiple display. Map animation is not effective [ 170 ] because the user cannot memorize and mentally compare multiple spatial situations. In small multiples, a limited number of spatial situations can be shown simultaneously; hence, this approach is not suitable for long time series. Clustering of spatial situations [ 13 ] can be used to reduce the number of distinct situations that need to be shown. A completely different approach is to show the time series of flow magnitudes separately from maps, for instance, as it is done in FlowStrates [ 38 ]; however, the spatial situations and their changes over time cannot be seen.

The paper [ 12 ] defines a workflow for analysis of long-term origin–destination data. The approach starts with aggregation of flows by origin or destination regions, directions and distances of move, and time intervals. Next, time intervals are clustered according to feature vectors composed from descriptors of all origins representing magnitudes of flows in all considered directions and distances. The proposed system enables exploration and continuous refinement of clustering results. The process is supported by space- (flow maps, diagram maps) and time-based (calendar showing temporal dynamics of situations by colours of dates) visualizations.

The techniques described in this section have been successfully applied or are potentially applicable to analysis of long-term migration data, for detecting patterns and changes of migration.

3 The stay: effects on communities, immigrant integration

The study of the effects of migration on the communities involved includes various traditional lines of study. Immigrant integration is a complex process that can reflect a progressive adoption of the norms that prevail in the destination country or a return to the habits of the home countries. Integration has been analysed from multiple viewpoints. Here, we outline some of these lines of work, with some recent examples, and we provide a few directions for development using big data. However, this section is not intended to be a complete survey of methods, since the complexity of the issue would require much more than a few pages to describe. For more comprehensive reviews on migrant integration, please see, e.g. [ 41 ].

3.1 Current practices

In general, immigrant integration and cultural changes have been traditionally analysed using census data, administrative registries and surveys. In this section, we describe the different criteria used for analysis. We start with a discussion of research studying social integration (social network, mixed marriages), and then, we move on to labour market integration and language adoption of immigrants. We conclude the section with a discussion of the effects of immigration on educational attitudes, on economic prosperity and on political attitudes.

The effect of the social network on migration was analysed by [ 118 ] using survey data on Mexican migrants to the USA. The richness of the social networks is shown to promote migration of low-skill migrants, while for communities where the social networks are weak, high-skill migrants are present. In terms of migrant integration, social networks in schools were analysed in European countries by [ 158 ]. They show that homophilic attitudes develop differently for immigrants and natives, with the former being positively influenced by multi-ethnicity in class. Ego networks of Turks and Moroccans in the Netherlands are studied in [ 172 ], using survey data. The authors show that in general closest friends come from the same ethnic group. The effect is stronger for women and those that are culturally more dissimilar to the natives.

In terms of marriage relationships, in the USA, marriage with whites is analysed for different ethnicities and education levels [ 145 , 146 ]. Divorce rates are shown to be higher for mixed than for non-mixed couples in the Netherlands, particularly for couples coming from very distant cultures [ 157 ]. The relation between mixed marriages and the immigration rate in Italian communities was studied by [ 4 ]. The authors show that there are differences between large cities and smaller municipalities, and they argue, based on probabilistic interaction models, that this is due to the structure of the social network, which is disconnected in large communities. The presence of female immigrants was found to increase the risk of separation of native couples in Italy, using survey data and official statistics [ 177 ].

Integration in the labour market has been analysed for various western and non-western countries by [ 159 ]. They show that general patterns of integration and the factors affecting it are very similar between western and non-western countries. Factors that affect the probability of finding a job are language exposure, cultural distance and the economic advancement of the origin country. Recent work shows that language training has an important effect on labour market integration of immigrants in France [ 112 ]. The effect of education on employment is analysed in [ 132 ] for Mexican immigrants in the USA. Integration in the labour market can also depend on the location where immigrants settle. In some cases, such as refugee situations, locations are assigned centrally. Recent work [ 24 ] has used data on past employment success to provide better matches between locations and refugees, showing that the probability of being employed can be increased by 40 to 70%.

Both mixed marriages and labour market integration were analysed using official data from Spain by [ 26 ]. They show using insights from statistical physics that while mixed marriages seem to be driven by peer interaction, this is missing when it comes to labour market integration. The same approach can be used to forecast integration from the two points of view [ 49 ].

Language adoption is a very important factor contributing to the success of an immigrant in the host country, since it provides opportunities for education, employment and social interaction. Integration in the USA was analysed by [ 6 ], looking at the language spoken at home by third-generation immigrants. The study shows that while Asians and Europeans adopted the language at a similar pace, Spanish-speaking families were still preserving some of their mother language. A different study [ 173 ] looks at the dynamics of language adoption in the USA and shows that education is an important factor positively influencing speed of adoption, while group size has a negative influence. A related issue is that of naming children [ 1 ]. A recent study of early US census data shows that people coming from families where children were given foreign names were less successful in terms of education and earnings, and were more likely to marry foreign spouses. Bilingual settings were studied in [ 174 ], which considers language adoption of immigrants in Belgium. The study shows that immigrants adopt the more international language faster.

The above-mentioned works study integration by looking mostly at the immigrant population. However, effects on the local population due to integration of migrants exist too. For instance, educational expectations of middle school children were shown to change in children both from native and immigrant communities, in Italy, based on survey data [ 122 , 123 ]. Immigrant children increased their expectations in the presence of native children with high expectations. Native children studying in multiethnic classes seemed more prone to high expectations. The effect of school class composition and ethnic attitudes was analysed in [ 39 ], showing that a balanced composition is beneficial for all ethnic groups involved.

A different effect that can be studied is related to economic prosperity of the target society. Diversity of birthplace was shown to increase economic prosperity [ 7 ], especially in the case of high-skill migrants moving to rich countries. The cultural diversity of the origin country was also analysed, showing that there is an optimal cultural distance for immigrants to maximize the beneficial economic effects. At the same time, however, [ 25 ] show that competition in the labour market and public services, together with cultural differences, generates a shift in political inclination. For instance, a shift of votes towards the left-wing parties was observed in Italy. Similar changes were observed in Austria, where one factor was the concern about the quality of the neighbourhood [ 86 ].

Fig. 1 Association to Italian supermarket chain. Trends of the number of customers with fidelity card for Albania, France and Romania

3.2 Towards a novel integration index using alternative data sources

While the types of studies exemplified in the previous section have been instrumental in understanding the effects of migration, the fact that they are based on traditional data makes them inherit the disadvantages of these data. Big data can help to analyse the issues above, and others, with the advantage of producing real-time results and enabling analysis at higher spatial resolution. For instance, retail data can help understand how immigrants adopt habits and values of the new community they live in. Mobile call data records (CDR) can be used to describe social interaction and mobility patterns of immigrants, and understand segregation. OSN data can help study various topics, such as social integration, language adoption, changes in the local language and sentiment towards immigrants. All these data types can also be combined to build a novel multi-level integration index that takes into account all of these criteria. In the following, we will exemplify some of these topics, including existing results from our project and new directions to pursue.

3.2.1 Retail data: tell me what you eat, I will tell you who you are?

The measures for immigrant integration discussed in Sect.  3.1 capture choices that can be easily observed and potentially exposed to social sanctions. Moreover, they are usually measured at one point in time, while integration is a dynamic phenomenon. The analysis of retail data from a supermarket chain can enable us to understand whether immigrants are converging to or diverging from the norms and habits of the destination country. By observing immigrants’ food consumption baskets, we can estimate the degree of integration and how this varies in time. This behaviour is less prone to social sanction, since the food basket is not generally known to people outside a family. Furthermore, we can identify which are the most relevant factors for the integration. The degree of integration can be considered both with respect to economic aspects but also based on how immigrant customers change their habits during their stay in terms of purchased products.

Market basket analysis and the study of food consumption have been widely used in the literature for different purposes, such as defining individual indicators of customer predictability [ 79 ], studying GDP trends [ 80 ], analysing customers with respect to their temporal purchasing patterns [ 82 ] and classifying them as residents or tourists according to their shopping profile [ 81 ]. Exploiting retail data to study the migration phenomenon from an individual and collective point of view, one that is not exposed to social sanctions and offers multiple observations in time, can bring to light novel results useful for better understanding the migration phenomenon and also for developing well-being policies.

Our project owns a key data source for these analyses, composed of scanner data from a large Italian retail market chain, which have been available since January 2007 for more than 1.1 million customers holding a fidelity card. The data set includes the price, quantity, promotional sales (if any) and the name of the good purchased out of a set of around 600,000 products. Besides this information, the country of birth and the date on which the fidelity card was obtained are available for each customer. About 7% of the customers are foreign-born, while the immigration rate in Italy is currently around 8.5%. On average, a foreign customer is observed 5 times per month, with a mean monthly food expenditure of about EUR 150. In Fig.  1 , we report the cumulative number of customers joining the fidelity club for Albania, Romania and France. We observe that the trend is stable for Albania, while the number of customers with a fidelity card is growing for Romania and decreasing for France. These indices are in line with the immigration trends from European official statistics, indicating that these data could be representative of the migrant population. In the following, we discuss research directions that our project is pursuing.

To understand whether there is a convergence in food consumption choices of immigrants (by country of birth), two orthogonal approaches can be followed. A top-down approach analyses aggregated variables over the items purchased: for each foreign-born customer, it takes the difference between the normalized amount spent in a specific period and the mean amount spent in that period by Italian customers. In this way, for each foreign-born customer we obtain a time series indicating whether that customer is converging to or diverging from the Italian norms. Hence, we can find foreign countries whose customers have homogeneous behaviours, but also countries with different integration behaviours.
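The top-down idea can be sketched as follows. The sketch is only illustrative and rests on several assumptions that are not part of our data description: spending is grouped into product categories, normalization is done as budget shares, and the difference from the reference is summarized as an L1 distance per period.

```python
def normalize(spend):
    """Turn absolute spend per category into budget shares."""
    total = sum(spend.values())
    return {cat: amount / total for cat, amount in spend.items()}

def divergence_series(customer_periods, reference_periods):
    """Per-period distance between a customer's shares and the reference.

    Both arguments are lists (one entry per period) of {category: spend}.
    The distance is the sum of absolute share differences (L1 distance),
    an assumed choice made for illustration.
    """
    series = []
    for cust, ref in zip(customer_periods, reference_periods):
        c, r = normalize(cust), normalize(ref)
        cats = set(c) | set(r)
        series.append(sum(abs(c.get(k, 0.0) - r.get(k, 0.0)) for k in cats))
    return series

# Hypothetical mean spend of Italian customers, constant over three periods.
reference = [{"pasta": 50, "rice": 10, "meat": 40}] * 3
# Hypothetical foreign-born customer whose basket shifts over time.
customer = [
    {"pasta": 10, "rice": 50, "meat": 40},
    {"pasta": 30, "rice": 30, "meat": 40},
    {"pasta": 50, "rice": 10, "meat": 40},
]
series = divergence_series(customer, reference)
# A strictly decreasing series is read here as convergence to the reference.
converging = all(a > b for a, b in zip(series, series[1:]))
```

A single number per period hides which categories drive the change, which is the weakness that motivates the bottom-up analysis.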

A weakness of the top-down approach is that it is not easy to understand which products lead to the convergence or divergence. A bottom-up approach analysing the basket composition can provide this kind of information. In particular, our idea is to extract, for different periods and for each customer, their individual representative baskets using the algorithm defined in [ 83 ]. We then re-cluster, for each country, the representative baskets of the customers and develop national collective representative baskets. Through a set-based distance measure, this allows us to develop an indicator of shopping divergence or convergence with respect to the Italians’ typical baskets.
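A set-based distance of this kind can be sketched with the Jaccard distance. The choice of Jaccard, the way national baskets are matched to the closest Italian basket, and the toy baskets are assumptions for illustration; the extraction of representative baskets with the algorithm of [ 83 ] is not reproduced and the baskets are taken as given.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two item sets: 1 - |a & b| / |a | b|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

def basket_divergence(national_baskets, reference_baskets):
    """Mean distance from each national basket to its closest reference basket."""
    return sum(
        min(jaccard_distance(nb, rb) for rb in reference_baskets)
        for nb in national_baskets
    ) / len(national_baskets)

# Hypothetical collective representative baskets.
italian = [{"pasta", "tomato", "olive oil"}, {"bread", "cheese", "ham"}]
country_x = [{"pasta", "tomato", "rice"}, {"bread", "cheese", "ham"}]
score = basket_divergence(country_x, italian)
```

Because the distance is computed on item sets, the items in the symmetric difference of two matched baskets (here "rice" versus "olive oil") directly identify the products responsible for the divergence.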

Finally, we underline that 14 per cent of the foreign-born customers disappear from the data set after some activity. The purchases of these customers could also be used for studying the return to the origin country.

3.2.2 Call data records

A large amount of work has been done using call data records (CDRs) in understanding individual [ 70 , 75 , 137 , 181 ] as well as group mobility [ 89 , 114 , 136 , 168 ]. These range from empirical analyses of large CDR data sets [ 70 , 75 , 136 , 137 , 181 ] to proposal of theoretical mobility models [ 154 ]. Initiatives to motivate researchers to analyse CDR data have also appeared, through data challenges such as the Data for Development (D4D) challenge in Senegal [ 35 ] or the recent Data for Refugees (D4R) challenge in Turkey [ 152 , 169 ]. Readers can refer to [ 34 ] for a survey of works related to using CDR data for individual mobility studies and models.

A recent example is the study of the flocking and mobility behaviour of the population after the Haiti earthquake using CDR data [ 113 ]. The researchers found that mobility patterns of the population after natural calamities are predictable: people tend to move to destinations where they made more calls before the disaster. In another study of a natural calamity, the Christchurch earthquake that struck New Zealand in February 2011 [ 2 ], the researchers found that people moved either to big cities like Auckland or to small towns. However, no correlation between the mobile phone calls before and after the disaster has been reported. In all cases, this is an important outcome, as it can support timely and effective infrastructural decisions in times of emergency or natural disaster [ 51 ].

In a different dimension, mobility patterns have also been studied with respect to socio-economic development [ 136 ]. The authors found a strong correlation between human mobility patterns and socio-economic indicators. It has also been shown that mobility patterns can be used to create detailed maps of population distribution that are more accurate and more recent. This approach is particularly useful for poor countries. This in turn can help in creating proper socio-economic policies for the population [ 51 ].

However, while mobility analyses are abundant, not much work has been done to analyse the international migration phenomenon using CDR. This is due to several reasons. First, CDR data sets typically span only one nation. Secondly, in general, due to privacy reasons, no information on the nationality of the customer is provided. Without these pieces of information, studying migration with these data is difficult. One exception is the above-mentioned D4R challenge, where the refugee status of customers is made available. Our project has participated in this challenge, together with several other teams, concentrating on five different aspects: health, integration, unemployment, safety and security, and education. For details on the results obtained by other teams, please see the published collection of articles [ 151 ]. Our objective was to analyse integration and combine the Turktelekom data with other data sets [ 31 ]. We observed that integration seemed to increase in time for refugees and also that the presence of refugees influenced the housing market in Turkey, decreasing housing prices.

Another recent example where CDR data were used to analyse transnational mobility is [ 5 ], using CDR data that includes mobile roaming events. Transnational population mobility can be defined as living and working in two or more countries. Understanding this phenomenon with traditional statistics and register-based data is impossible. The authors show that roaming data can enable the analysis of travel behaviour and social profile of visitors. They can differentiate between tourists, cross-border commuters, foreign workers and transnationals.

3.2.3 Language in online social networks

Language allows us to express needs and feelings and to achieve our communication goals. Society changes and grows more complex over time; thus, language must evolve and adapt itself to the new needs of its population. As a consequence, this evolution leads to the change, creation and vanishing of expressions, dialects and even whole languages [ 74 ]. Over the past two decades, globalization has changed the social, cultural and linguistic panorama of societies all over the world. The earlier multiculturalism, understood since the 1990s as the ethnic minorities paradigm, turned into what Vertovec [ 176 ] calls Superdiversity. The concept aims to capture the increasingly complex and less predictable set of relationships between ethnicity, citizenship, residence, origin and language. Thanks to the influence of pioneering works of linguistic anthropologists, mixing, mobility patterns and historical framework became key issues in the study of languages and of language groups [ 33 ]. Over time, linguists and sociologists analysed variation and changes in both oral [ 105 ] and written [ 29 ] languages by exploiting surveys, corpora and records [ 74 ]. In the last decade, the pervasive use of online social networking and micro-blogging services has led to an unprecedented availability of freely produced content. This wealth of written data allows us to recover a detailed picture of language evolution from both the geographical and the temporal points of view [ 130 ].

The literature regarding language in social networks applied to migration studies is wide and involves several research fields, including but not limited to mobility patterns, migration stocks and flows, well-being and sentiment analysis. Even though some works focus more on metadata than on the actual content, the text bears a wealth of information, starting from the language in which it is written [ 107 ]. For instance, Kulkarni et al. [ 104 ] have proposed a novel method to detect English linguistic variation and quantify its significance among geographic regions; Ibrahim et al. [ 94 ] have combined different data to present a sentiment analysis system for standard Arabic and Egyptian dialectal Arabic; language has also been investigated in terms of the spatial distribution and the spatial extension of dialects. In [ 116 ], geolocated tweets are exploited to identify localized patterns in language usage and to analyse the language diversity over different countries; Mocanu et al. [ 124 ] have characterized the worldwide linguistic geography by aggregating multi-scale OSN data; Jurdak et al. [ 99 ] have compared Twitter mobility patterns with patterns observed through other technologies, e.g. CDRs, by using individuals’ spatial orbit as the measure of how far they move; Gonçalves et al. [ 74 ] have found two global super-dialects in modern-day Spanish; and Doyle [ 54 ] has proposed a Bayesian method to build conditional probability distributions of the spatial extension of English dialects.

Fig. 2 Superdiversity index (left) and immigration levels (right) across UK regions at NUTS2 level [ 139 ]

Within the SoBigData project, we have analysed the concept of Superdiversity theorized by Vertovec (2007) and proposed a measure to quantify it [ 139 ]. We focus on the joint analysis of the language and geographic dimensions, starting from a Twitter data set. Our ground hypothesis is built on the idea that different cultures use language in different ways and, in consequence, the emotional value associated with words changes depending on the culture of the person who writes a tweet. We introduced a Superdiversity Index (SI), which is based on the diversity of the emotional content expressed in texts of different communities. Specifically, we extract the emotional valences of words used by a community from Twitter data produced by that community. We compare the obtained valences with a standard dictionary tagged with sentiment. The distance between the community and the standard valences is a measure of superdiversity for the community. This SI measure is computed at different geographical scales based on the Classification of Territorial Units for Statistics (NUTS) for two different nations, Italy and the UK, and validated with data from the above-mentioned D4I challenge (Sect.  2 ). We observe a very high correlation with immigration rates at all geographical levels. Figure  2 shows the case of the UK, where we observe that the geographical distribution of the proposed SI matches very well that of official immigration rates. Thus, we believe that, besides quantifying the cultural changes that migrants instil in the community, our SI can also become a key measure in a now-casting model for migration stocks.
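The comparison step can be illustrated with a toy sketch. It does not reproduce the SI computation of [ 139 ]: the community valences are taken as given rather than extracted from tweets, and the distance used (mean absolute difference over shared words), the function name and all numeric values are assumptions chosen for simplicity.

```python
def valence_distance(community_valence, standard_valence):
    """Mean absolute difference between community and standard word valences.

    Only words present in both dictionaries are compared. Returns None when
    the two vocabularies do not overlap.
    """
    shared = set(community_valence) & set(standard_valence)
    if not shared:
        return None
    return sum(
        abs(community_valence[w] - standard_valence[w]) for w in shared
    ) / len(shared)

# Hypothetical valences in [-1, 1]; the words and values are made up.
standard = {"home": 0.8, "rain": -0.2, "work": 0.1}
region_a = {"home": 0.7, "rain": -0.2, "work": 0.1}   # close to standard usage
region_b = {"home": 0.2, "rain": 0.4, "work": -0.5}   # words used differently

si_a = valence_distance(region_a, standard)
si_b = valence_distance(region_b, standard)
```

Under this simplified distance, the region whose emotional use of words departs more from the standard dictionary (region_b) receives the higher score.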

3.2.4 Migration and sentiment

One way of studying migrant integration is by analysing the opinions of the locals related to migration topics and different migrant groups. While performing targeted surveys is one way of collecting such opinions, using online social networks (OSNs) is a novel direction that can overcome some limitations of survey data. Using Twitter for opinion mining and to study sentiment and user polarization is a vast subject [ 134 ]. The existence of polarization in social media was first studied by Adamic et al. [ 3 ], who identified a clear separation in the hyperlink structure of political blogs. Conover et al. [ 48 ] afterwards studied the same phenomenon on Twitter, evaluating the polarization based on the retweets. Most of the studies on polarization are still based on sentiment analysis of the content. The sentiment analysis methods proposed are numerous, and they are mainly based on dictionaries and on learning techniques through unsupervised [ 133 ] and supervised methods (lexicon-based method [ 163 ]) and combinations [ 103 ]. Opinion mining techniques are widely used, in particular in the political context [ 3 ] and on Twitter [ 45 ]. Recently, new approaches based on polarization, controversy and topic tracking in time have been proposed [ 46 , 69 ]. The idea of these approaches is to divide users of a social network into groups based on their opinion on a particular topic and to track their behaviour over time. These approaches are based on network measures and clustering [ 69 ] or hashtag classification through probabilistic models [ 46 ], with no use of dictionary-based techniques.

Regarding the migration topic, in Coletto et al. [ 44 ] we propose an analytical framework for investigating different views in the discussions of polarized topics that occur in OSNs. The framework supports analysis along multiple dimensions, i.e. time, space and sentiment, of the opposing views about a controversial topic emerging in an OSN, and is applied to the perception of the refugee crisis in Europe and Brexit. The sentiment analysis method adopted is efficient in tracking polarization over Twitter compared to other methods. Unlike other approaches for studying social phenomena, we do not base our analyses on the change of location of Twitter users to measure the flow of individuals through space; rather, we aim at understanding the impact of migrants’ movements on EU citizens’ perception and their resulting decision to vote for Brexit.

Fig. 3 Sentiment related to the refugee crisis across European countries (from [ 44 ]): red (dark grey in print) corresponds to a higher predominance of positive sentiment, yellow (light grey in print) indicates lower positive sentiment. a The whole data set. b Limited to users mentioning locations in their own country. c Limited to users mentioning locations in other countries (colour figure online)

The framework, initially presented in [ 43 ], makes it possible to monitor the raw stream of relevant tweets in a scalable way and to automatically enrich them with location information (user and mentioned locations) and sentiment polarity (positive vs. negative). The analyses we conducted show how the framework captures the differences in positive and negative user sentiment over time and space. The resulting knowledge supports the understanding of complex dynamics by identifying variations in the perception of specific events and locations.

We used the Twitter Streaming API under the Gardenhose agreement (granting access to 10% of all tweets) to collect the English tweets posted in two periods: from mid-August to mid-September 2015 for the refugees data set, and from mid-June to the beginning of July 2016 for the Brexit data set. We filtered out the tweets not related to the specific events analysed. The first data set refers to the refugee crisis and contains about 1.2 M tweets, while the second one refers to the Brexit referendum and contains about 4.3 M tweets. The data sets (see footnote 13) are available for use through Transnational Access in the SoBigData project infrastructure.

In our study, we try to answer the following analytical questions: What is the evolution of the discussions about refugee migration on Twitter? What is the sentiment of users across Europe in relation to the refugee crisis? What is the evolution of the perception in the countries affected by the phenomenon? Are users more polarized in the countries that are most impacted by the migration flow? Is the polarization of the users about refugees and the Brexit referendum somehow correlated? For this purpose, we analyse the ratio between pro- and against-refugee users across Europe. For example, Fig. 3 shows the geographical distribution of this ratio considering all users residing in a country, as well as the internal and external perception (the perception of the users residing inside/outside a country C related to the refugees in C). We observe that Eastern countries are in general less positive than Western countries. We also note that, for internal perception, Russia, France and Turkey show very low sentiment. We conjecture that the sentiment of a person could be more negative when the problem directly involves his/her own country, since we are generally more critical when issues are closer to ourselves. External perception is generally higher in countries most affected by the refugee crisis, such as France, Russia and Turkey, with the exception of Germany, where the decision to open borders seems to have produced positive internal sentiment.
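The ratio between pro- and against-refugee users, split into overall, internal and external perception, can be sketched as follows. The record layout, stance labels and country codes are hypothetical, and each record is assumed to stand for one classified user; the actual framework in [ 44 ] may aggregate differently.

```python
from collections import Counter


def pro_against_ratio(users):
    """Return {country: pro_count / against_count} for (country, stance) pairs.

    Countries with no 'against' users are omitted to avoid division by zero.
    """
    pro, against = Counter(), Counter()
    for country, stance in users:
        if stance == "pro":
            pro[country] += 1
        elif stance == "against":
            against[country] += 1
    return {c: pro[c] / against[c] for c in against if against[c] > 0}


# Made-up records: (user_country, mentioned_country, stance)
records = [
    ("DE", "DE", "pro"), ("DE", "DE", "pro"), ("DE", "DE", "against"),
    ("DE", "FR", "pro"), ("DE", "FR", "against"),
    ("FR", "FR", "pro"), ("FR", "FR", "against"), ("FR", "FR", "against"),
    ("FR", "DE", "pro"), ("FR", "DE", "against"),
]

# All users residing in a country, regardless of the location they mention.
overall = pro_against_ratio((u, s) for u, _, s in records)
# Internal perception of C: users residing in C who mention C.
internal = pro_against_ratio((u, s) for u, m, s in records if u == m)
# External perception of C: users residing outside C who mention C.
external = pro_against_ratio((m, s) for u, m, s in records if u != m)
```

A ratio above 1 indicates a predominance of pro-refugee users, and comparing `internal` with `external` for the same country gives the contrast discussed above.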

3.2.5 Ego networks and their effect on migration

Personal networks of migrants have been shown to play a strategic role in the destination country chosen by the migrant, in the well-being of the migrant (once settled in), and in the professional outcome [ 10 , 65 , 72 , 175 , 180 ]. For this reason, studying the properties of migrants’ personal network is a particularly promising avenue of research in digital demography, in order to characterize both the journey and the stay. In this section, we review the basic concepts of ego networks and some existing applications, and we argue that studying ego networks from OSN platforms can be a powerful tool in the analysis of migration.

It is a well-established result from sociology that personal networks, i.e. the ensemble of social relationships that an individual entertains with other people, have a significant influence on the quality of life of the individual in terms of, for example, job opportunities [ 76 , 77 ], social support [ 100 ], and power and influence in organizations and communities [ 108 , 121 , 128 , 153 ]. Personal networks are also closely related to the concept of social capital, i.e. the network of connections, loyalties and mutual obligations [ 72 ] that translates into favours and preferential treatment. In this perspective, studying the evolution of personal networks over time is the ideal approach to characterize the modification of migrants’ social structures (or lack thereof) due to the migration process. This is related to one of the main subjects of study in this area, i.e. the characterization of the integration of migrants. Integration is typically measured in terms of assimilation and transnationalism. Assimilation is defined as the gradual adoption of customs and traditions from the receiving country by the migrant and can be full [ 8 , 9 ], partial [ 71 ] or segmented [ 141 ]. As a consequence of assimilation, the composition of a migrant’s personal network is expected to change significantly over time. At the opposite end from assimilation, there is the phenomenon of transnationalism, whereby migrants continue to participate in the political, economic and cultural life of origin societies and of fellow migrants from the same country [ 140 ]. Many researchers have postulated that the widespread availability of Internet connectivity and OSNs has made it easier to keep these transnational links with the origin country alive [ 109 ]. Again, this should be reflected in the personal network of migrants, in terms of the number and relationship strength of links towards migrants and non-migrants from the same origin area. These changes can be studied using traditional data coming from targeted surveys, but also using OSN data, which can fill some of the gaps present in survey data.

While most migration studies of personal networks are qualitative, quantitative studies are available in the literature on generic social networks. Quantitative studies often explore the graph-theoretical concept of ego networks. An ego network is the graph-based abstraction that models the personal network of an individual (called ego). Besides the ego, the nodes in the ego network correspond to the people the ego entertains social relationships with. These people are referred to as alters. The ego and each alter are connected by an edge, whose weight corresponds to the strength of their social relationship (often referred to as emotional closeness). Depending on the ego network model used, ties between alters can also be included [ 64 ]. More rarely, only the alter–alter ties are considered for extracting ego network properties [ 117 ]. Several structural properties of ego networks can be derived [ 85 ].
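As an illustration of this abstraction, the following sketch stores an ego network as a weighted star around the ego plus an optional set of alter–alter ties. The class, its method names and the closeness scale are hypothetical choices made for this example and do not correspond to any specific library or to the models in the cited works.

```python
from dataclasses import dataclass, field


@dataclass
class EgoNetwork:
    ego: str
    # alter -> tie strength (emotional closeness); the scale is an assumption
    ties: dict = field(default_factory=dict)
    # optional alter-alter ties, stored as unordered pairs
    alter_ties: set = field(default_factory=set)

    def add_alter(self, alter, strength):
        """Connect the ego to an alter with the given tie strength."""
        self.ties[alter] = strength

    def connect_alters(self, a, b):
        """Record a tie between two existing alters."""
        if a not in self.ties or b not in self.ties:
            raise KeyError("both endpoints must already be alters of the ego")
        self.alter_ties.add(frozenset((a, b)))

    def strongest(self, k):
        """Return the k alters with the highest tie strength."""
        return sorted(self.ties, key=self.ties.get, reverse=True)[:k]


net = EgoNetwork("ego")
net.add_alter("ana", 0.9)
net.add_alter("bo", 0.4)
net.add_alter("cy", 0.7)
net.connect_alters("ana", "cy")
top_two = net.strongest(2)
```

Keeping the ego–alter weights separate from the alter–alter ties makes it easy to switch between models that include or ignore the latter.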

Ego network models have been used in the literature to characterize human cognitive constraints and their impact on social processes. In particular, evolutionary anthropology has studied the structure of ego networks (as a representation of human personal networks) in terms of the cognitive investment required from the ego to actively maintain it. Dunbar [ 55 ] found that human neocortex size places an upper limit on the number of meaningful relationships that can be maintained. Specifically, the group size predicted by the human neocortex size is around 150 alters, and it has been validated by studying tribal, traditional and modern societies [ 58 , 90 ]. This limit on the size of the ego network, determined by the cognitive effort required to maintain active social relationships, is known as the social brain hypothesis [ 57 ]. Additional investigations of this cognitive constraint have shown that the alters in ego networks are organized into concentric circles around the ego, where the emotional closeness decreases and the number of alters increases as we move from the ego outwards [ 90 , 186 ]. When looking at the size of the circles, a typical scaling ratio of around 3 between the sizes of consecutive circles has been observed [ 186 ], with the sizes of individual circles concentrating around the values of 5, 15, 50 and 150, respectively.
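The scaling ratio can be checked directly from the reported circle sizes. The sketch below also assigns an alter to a circle from its closeness rank, assuming that the circles are inclusive (each circle contains the inner ones) and that alters are ranked from closest to most distant; both are simplifying assumptions, since empirical studies typically derive the circles by clustering contact frequencies rather than from fixed cut-offs.

```python
CIRCLE_SIZES = [5, 15, 50, 150]  # typical circle sizes reported in the text


def scaling_ratios(sizes):
    """Ratio between the sizes of each pair of consecutive circles."""
    return [outer / inner for inner, outer in zip(sizes, sizes[1:])]


def circle_of(rank, sizes=CIRCLE_SIZES):
    """Return the 1-based index of the innermost circle containing the alter
    at the given 1-based closeness rank, or None beyond the outermost circle."""
    for index, size in enumerate(sizes, start=1):
        if rank <= size:
            return index
    return None


ratios = scaling_ratios(CIRCLE_SIZES)  # 15/5, 50/15, 150/50
```

The three ratios are 3, about 3.33 and 3, which is consistent with the scaling ratio of around 3 mentioned above.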

Quite interestingly, ego networks formed through many means of interaction, including face-to-face contacts [ 57 ], letters [ 90 , 186 ], phone calls [ 115 ], co-authorships [ 17 ] and, remarkably, also OSNs, are well aligned with the above model. Specifically, very similar properties have also been found in Facebook and Twitter ego networks [ 19 , 56 ]. In this sense, OSNs become one of the outlets that take up the brain capacity of humans; they are thus subject to the same limitations that have been measured for more traditional social interactions and are not capable of “breaking” the limits imposed by cognitive constraints on our social capacity [ 59 ]. Tie strengths and how they determine ego network structures have been the subject of several additional works. For example, in [ 73 ] the authors provide some of the first evidence of the existence of an ego network size comparable to Dunbar’s number in Twitter. The relationship between ego network structures and the role of users in Twitter was analysed in [ 147 ]. In general, ego network structures are also known to significantly impact the way information spreads in OSNs and the diversity of information that can be acquired by users [ 15 ]. More generally, many traits of human social behaviour (resource sharing, collaboration, diffusion of information) are chiefly determined by the structural properties of ego networks [ 162 ]. Less studied (typically due to the lack of data) but equally important are the dynamic properties of ego networks, which characterize the evolution of personal networks over time. Arnaboldi et al. [ 16 , 18 ] found that, unexpectedly, the strongest social relations in Twitter change frequently for the majority of generic users and also for the special class of politicians. This is a marked difference with respect to offline networks, where high-frequency relationships correspond to stable and intimate ties [ 90 ].

While data from OSNs have been recently used for migration studies, as detailed in previous sections, the graph-theoretical perspective has been rarely taken into account. The only exceptions are [ 88 , 91 ], and [ 107 ]. In [ 88 ], community-centric metrics are used to study cultural assimilation as a function of the number of social ties between migrant communities and local people using the set of friendship links extracted from Facebook. The graph in this case is unweighted, i.e. the effect of different emotional closeness between node pairs is not taken into account. Lamanna et al. [ 107 ] again focus on cultural assimilation but from the spatial segregation standpoint. In this case, they use a bipartite graph structure, connecting tweet languages and cities. In [ 91 ], Facebook is used to study the network of teenagers in the Netherlands, concentrating on ethnicity and gender. The analysis shows that ethnicity plays a stronger role in link formation. However, the extended Facebook networks are less segregated, in general, compared to core ego networks.

To the best of our knowledge, ego networks of migrants built from OSN data have never been investigated in the related literature. This is quite surprising, as it is well known that many facets of human behaviour chiefly depend on the ego network structure. This includes features intrinsically related to migration and integration, such as willingness to cooperate with alters, resilience to problems and the possibility of seeking assistance from trusted alters [ 160 ]. As discussed before, migrants’ ego networks have been studied previously in the sociology literature, but only traditional data sources have been considered, and the approach to the analysis is typically more qualitative than quantitative. Here we advocate, along the lines of digital demography, that it is crucial to integrate traditional and innovative data sources to provide a timely and deeper understanding of personal networks and their impact on migratory phenomena. For non-migrant users, the integration of OSN data has already proven successful and has highlighted properties that would have been impossible to extract from offline data alone [ 56 ]. Given the role played by personal networks in migration flows and integration, we believe it is crucial to fill this gap. OSNs are particularly appealing for accomplishing this task. In fact, they make it possible to reach scales far beyond what can be obtained from traditional data sources, and they also allow researchers to easily analyse temporal variations in the ego networks, ultimately enabling forms of now-casting of migration phenomena.

Two research questions are particularly pressing: understanding and quantifying the relationship between the migrant’s online ego network and their migration choices, and measuring cultural assimilation and transnationalism through the evolution of online ego networks over time. With respect to the first question, it would be important to study the influence that alters in the different layers of migrants’ ego networks exert on the ego’s migration choices, distinguishing between the role played by weak and strong ties. These results can then be used to attempt predictions of the future migration choices of people, similarly to what is discussed in [ 98 ] for scientists. With respect to the second question, online ego networks can be a strategic asset for studying cultural assimilation, as they are typically easy to monitor for a prolonged amount of time, going beyond the single snapshot problem mentioned in [ 150 ]. As the migrant “moves” into the receiving society, we expect to observe a turnover in the ego network layers, reflecting the changes in his/her social relationships. This turnover can be measured in terms of the similarity between layers across different temporal snapshots and by observing the jumps that alters perform in the ego’s network (similar to what [ 18 , 37 ] do for the ego networks of politicians and journalists on Twitter). Special attention should be paid to the movements, inside the ego network, of co-nationals vs natives of the receiving country. Cultural assimilation predicts that the former class of ties should weaken progressively, while the latter should thrive. As a result, we expect to observe outward movements for co-nationals and inward movements for natives inside the ego network. If this is not the case, we can postulate poor or imperfect assimilation and/or strong transnational ties linking migrants to their origin country.
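The turnover measurement described above can be sketched with two small helpers. Jaccard similarity is used here as one possible similarity between layers; the text does not prescribe a specific measure, so this choice, the snapshot layout and the alter labels are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of alters."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def layer(snapshot, index):
    """Alters placed in the given layer (1 = innermost) in a snapshot."""
    return {alter for alter, l in snapshot.items() if l == index}


def layer_jumps(before, after):
    """Layer change for each alter present in both snapshots.

    Negative values mean the alter moved inward (closer to the ego),
    positive values mean it moved outward.
    """
    return {a: after[a] - before[a] for a in before if a in after}


# Hypothetical snapshots of one migrant's ego network: alter -> layer index.
t0 = {"co_national_1": 1, "co_national_2": 1, "native_1": 3, "native_2": 4}
t1 = {"co_national_1": 2, "co_national_2": 1, "native_1": 1, "native_2": 4}

inner_similarity = jaccard(layer(t0, 1), layer(t1, 1))
jumps = layer_jumps(t0, t1)
```

In this toy case the innermost layer keeps only one of its three distinct members, a co-national moves one layer outward and a native moves two layers inward, which is the pattern that cultural assimilation would predict.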

4 The return: migrants returning to the country of origin

Migration is commonly seen as a permanent change in residence habits. However, when considered as a temporary phenomenon, several implications arise. Return migration is increasing in several countries, e.g. Mexico [ 40 ], China [ 185 ], Jamaica [ 166 ], Tunisia [ 119 , 120 ] and Mali [ 42 ], with several effects observed. The most recent literature almost unanimously underlines the benefits brought by returning migrants. These advantages span a very wide range of fields and include the rise of business activity and wage increases [ 178 , 179 ], the improvement of educational attainment and health conditions, the increase in electoral participation [ 42 ], and the decrease in violence [ 40 ].

The origin country can benefit economically from temporary migration in at least two ways [ 119 , 120 ]. First, taking the example of Tunisia, the authors show that money transfers from abroad to migrant families are a sizeable income. Second, newly learned skills and savings can enable return migrants to start their own business in the origin country. The SoBigData project also performed research in this field, with an approach based on data journalism that resulted in a documentary on return migration in Senegal: “Demal Te Niew” [ 23 ]. Zhao [ 185 ] has analysed the determinants of return migration and the economic behaviour of return migrants in China. The findings are partially in mild contrast with those already discussed. The author found that out-migration is still dominant, while return migration, driven by both push and pull factors, is limited in scale. Regarding employment, the results show that return migrants invest more in productive farm activity, but they do not show higher tendencies to engage in local non-farm activities than natives and migrants. Like most of the literature, Zhao’s findings testify to the key role of return migrants in the modernization process of developing or less rich countries.

A lot of research has focused on the “brain gain” provided by the return of high-skilled individuals, such as scientists returning to their country of birth. Scholars found that even if migration leads to a brain drain in the short term, return migration can contribute to brain gain [ 53 , 179 ]. Moreover, the most recent research shows that return migrants contribute to their own community’s long-term well-being independently of the skills they have gained abroad [ 40 ].

Regarding the health field, Levitt et al. [ 110 ] have investigated dynamics between social practices gained abroad and health care. They show that social practices introduced by return migrants positively affect health care. These results seem related to the better social conditions of households with links to migrants and return migrants [ 60 ]. A different aspect relates to family-related decisions of return migrants. A recent study shows that Egyptian males returning from other Arab countries have more children than average [ 32 ], which could be due to the effect of the foreign culture on the decisions of the migrant.

The impact of return migrants on the governance of their origin country has been examined in [ 28 , 42 ]. Results show that local policies are positively affected by returning migrants, since they help increase political participation and enhance political accountability. The political orientation of the home community can also be affected by the migration phenomenon. For instance, for Moldova, a recent study [ 27 ] shows how West-bound migration slowly changed voting behaviour, leading to the fall of the communist government in 2009.

Concerning education, research results agree that return migration can be associated with increases in and improvement of educational attainment. Taking the example of Mexico, Montoya et al. [ 127 ] have found an increase of 26% in school attendance in households linked to at least one return migrant. This could mean that return migrants give higher priority to education.

Although the study of return migration is a long-standing area, most, if not all, analyses are based on traditional data. There is, however, great potential in employing novel data types such as mobile data or OSN to study return migration, and it remains an open research area.

5 Discussion and conclusions

We have discussed three lines of research where social big data can complement existing approaches to provide small-area, high-temporal-resolution methods for the analysis of migration. In terms of estimating flows and stocks, some research already exists that tries to use social big data to now-cast immigration. However, models still need to be refined and validated. An important issue here is that a proper gold standard does not exist: exact current immigration rates are unknown, and those in the past can be noisy, so validation of now-casting models is not straightforward. Finding the relations between policies and immigration could be a step forward in finding means to validate model output. Another big data type that has not been included here, and that can help make predictions of climate-related migration, is satellite data. To measure migrant integration, we believe that several new data types can be used to introduce novel integration indices, based on retail consumer behaviour, mobile data, OSN language, sentiment and network analysis. Research in this direction is slightly less developed, mostly due to the low availability of ready-to-use data sets. Our consortium is taking steps in this direction, using existing data sets, participating in data challenges or collecting new data. For the return of migrants, research is again limited, although potential exists in data such as retail, mobile or OSN data.

In all three dimensions, research has to carefully consider the issues with the data being used. It is important that each study includes a well-planned data collection phase where available data are analysed to identify gaps and to devise strategies to fill the gaps by integrating other types of data. This ensures that the problem being studied is thoroughly covered by the data used. In this process, research infrastructures such as SoBigData can be of great help. On the one hand, they can provide means to catalogue data, so that new data sets are available to the community for integration. On the other hand, they enable the community to share methods and experiences, so that the gaps identified and the solutions adopted to fill them can be reused. This applies not only to traditional data sources, but also to social big data. The complexity of digital demography implies that there is no free lunch with digital traces either [ 106 ]. One problem relates to the representativeness of the collected samples. For example, Facebook and Twitter penetration rates differ worldwide and tend to vary with the age of users [ 184 ]. Being unable to track specific categories of users can steer policies on migration in a direction that unwittingly perpetuates discrimination or neglects the needs of the invisible groups. For the above reasons, the analytical and technical challenges of extracting meaning from this kind of data, in synergy with more traditional data sources, remain an open and very important research area, with some recent efforts made in this direction [ 93 ]. Model validation using existing statistics and the relation to migration policies is important. Furthermore, careful data integration could help in overcoming some of the selection bias, resulting in novel, multi-level indices based on big data.

A different issue is that related to the ethics dimension of processing personal data, including sensitive personal data, describing human individuals and activities. As also stated in [ 187 ], the first rule that a researcher must follow is to acknowledge that data are people and can do harm. In particular, the context of migration is very sensitive to this problem, since individuals described in the data are often particularly vulnerable: refugees and their families might be persecuted in their home countries, so avoiding their re-identification is a critical matter. Moreover, mass media and social media impact our society and integration itself since a negative tone systematically relates to lower acceptance rates of asylum practices [ 102 ], so extreme care has to be taken in publishing results. Nevertheless, migration studies can have a significant impact to improve our society and to help the inclusion process of migrants; thus, encouraging data sharing is one of our main goals for achieving public good.

For all these reasons, it is essential that legal requirements and constraints are complemented by a solid understanding of ethical and legal views and values such as privacy and data protection, composing an actual ethical and legal framework. To this end, a number of infrastructural, organizational and methodological principles have been developed by the SoBigData Project, in order to establish a Responsible Research Infrastructure, allowing users to make full use of the functionalities and capabilities that big data can offer to help us solve our problems, while at the same time allowing them to respect fundamental rights and accommodate shared values, such as privacy, security, safety, fairness, equality, human dignity and autonomy [ 66 ]. In particular, we strongly rely on Value Sensitive Design and Privacy-by-Design methodologies, in order to develop privacy-enhancing technologies, privacy-aware social data mining processes and privacy risk assessment methodologies. These methods are developed mainly in the fields of mobility data (such as GPS trajectories), mobile and retail data, which are some of the (unconventional) big data used in our migration studies. Moreover, some other general tools have been implemented to assist researchers in their activities, create a new class of responsible data scientists and inform the data subjects and the society about our work and our goals, such as an online course, ethics briefs and public information documents.

Change history

21 May 2021

A Correction to this paper has been published: https://doi.org/10.1007/s41060-021-00260-6

Recommendations on Statistics of International Migration, Revision1(p.113). United Nations, 1998.

“The Statistics Portal.” Statista. Retrieved from www.statista.com .

“Sources and comparability of migration statistics”. OECD, Retrieved from https://www.oecd.org/migration/mig/43180015.pdf .

Those born outside of Europe.

Data Challenge on Integration of Migrants in Cities (D4I), https://bluehub.jrc.ec.europa.eu/datachallenge/ .

Asylum seekers are individuals who seek to obtain refugee status.

Individuals with subsidiary protection are also referred to as refugees.

It compares statistics of both immigrants and emigrants across a set of countries. The degree of underestimation of the number of emigrants can be inferred by doing so.

United Nations Economic Commission for Europe.

Google Trend Index, https://trends.google.com/trends/ .

http://gallup.com .

https://sobigdata.d4science.org/group/resourcecatalogue/data-catalogue?path=/dataset/hpc_twitter_dumps .

Abramitzky, R., Boustan, L.P., Eriksson, K.: Cultural assimilation during the age of mass migration. Technical Report, National Bureau of Economic Research (2016)

ACAPS: Call detail records: the use of mobile phone data to track and predict population displacement in disasters (2013)

Adamic, L.A., Glance, N.: The political blogosphere and the 2004 US election: divided they blog. In: Proceedings of the 3rd International Workshop on Link Discovery, ACM, pp. 36–43 (2005)

Agliari, E., Barra, A., Contucci, P., Pizzoferrato, A., Vernia, C.: Social interaction effects on immigrant integration. Palgrave Commun. 4 (1), 55 (2018)

Ahas, R., Silm, S., Tiru, M.: Measuring transnational migration with roaming datasets. In: Kiefer, P., Huang, H., Van de Weghe, N., Raubal, M. (eds.) Adjunct Proceedings of the 14th International Conference on Location Based Services (LBS 2018), Zurich, Switzerland, January 15-17, pp. 105–108. ETH Zurich (2018). https://doi.org/10.3929/ethz-b-000225599

Alba, R., Logan, J., Lutz, A., Stults, B.: Only English by the third generation? Loss and preservation of the mother tongue among the grandchildren of contemporary immigrants. Demography 39 (3), 467–484 (2002)

Alesina, A., Harnoss, J., Rapoport, H.: Birthplace diversity and economic prosperity. J. Econ. Growth 21 (2), 101–138 (2016)

Allport, G.W.: The Nature of Prejudice. Addison-Wesley, Boston (1954)

Amir, Y.: Contact hypothesis in ethnic relations. Psychol. Bull. 71 (5), 319–342 (1969)

Amuedo-Dorantes, C., Mundra, K.: Social networks and their impact on the earnings of Mexican migrants. Demography 44 (4), 849–863 (2007)

Andrienko, G., Andrienko, N., Bak, P., Keim, D., Wrobel, S.: Visual Analytics of Movement. Springer, Berlin (2013). https://doi.org/10.1007/978-3-642-37583-5

Andrienko, G., Andrienko, N., Fuchs, G., Wood, J.: Revealing patterns and trends of mass mobility through spatial and temporal abstraction of origin-destination movement data. IEEE Trans. Vis. Comput. Gr. 23 (9), 2120–2136 (2017). https://doi.org/10.1109/TVCG.2016.2616404

Andrienko, N., Andrienko, G., Stange, H., Liebig, T., Hecker, D.: Visual analytics for understanding spatial situations from episodic movement data. KI Künstliche Intel. 26 (3), 241–251 (2012). https://doi.org/10.1007/s13218-012-0177-4

Appelt, S., van Beuzekom, B., Galindo-Rueda, F., de Pinho, R.: Which factors influence the international mobility of research scientists? OECD STI Working Papers (2015). https://doi.org/10.1787/5js1tmrr2233-en

Aral, S., Alstyne, M.V.: The diversity-bandwidth trade-off source. Am. J. Sociol. 117 (1), 90–171 (2011)

Arnaboldi, V., Conti, M., Passarella, A., Dunbar, R.: Dynamics of personal social relationships in online social networks. In: Proceedings of the First ACM Conference on Online Social Networks—COSN ’13, ACM Press, New York, pp. 15–26 (2013)

Arnaboldi, V., Dunbar, R.I.M., Passarella, A., Conti, M.: Analysis of co-authorship ego networks. In: LNCS-Advances in Network Science, Springer, Cham, pp. 82–96 (2016)

Arnaboldi, V., Passarella, A., Conti, M., Dunbar, R.: Structure of ego-alter relationships of politicians in twitter. J. Comput. Mediat. Commun. 22 (5), 231–247 (2017)

Arnaboldi, V., Passarella, A., Conti, M., Dunbar, R.I.M.: Online Social Networks: Human Cognitive Constraints in Facebook and Twitter Personal Graphs. Elsevier, Amsterdam (2015)

Auriol, L.: Careers of doctorate holders. OECD STI Working Papers 4 (2010). https://doi.org/10.1787/5kmh8phxvvf5-en

Avvenuti, M., Bellomo, S., Cresci, S., La Polla, M.N., Tesconi, M.: Hybrid crowdsensing: a novel paradigm to combine the strengths of opportunistic and participatory crowdsensing. In: Proceedings of the 26th International Conference on World Wide Web Companion, International World Wide Web Conferences Steering Committee, pp. 1413–1421 (2017)

Azoulay, P., Ganguli, I., Zivin, J.G.: The mobility of elite life scientists: professional and personal determinants. Res. Policy 46 (3), 573–590 (2017). https://doi.org/10.1016/j.respol.2017.01.002 . http://www.sciencedirect.com/science/article/pii/S0048733317300021

Bachini, V et al.: Demal te niew (go and come back), documentary (2016). http://speciali.espresso.repubblica.it/interattivi-2016/va-e-torna/index.html

Bansak, K., Ferwerda, J., Hainmueller, J., Dillon, A., Hangartner, D., Lawrence, D., Weinstein, J.: Improving refugee integration through data-driven algorithmic assignment. Science 359 (6373), 325–329 (2018). https://doi.org/10.1126/science.aao4408 . http://science.sciencemag.org/content/359/6373/325

Barone, G., D’Ignazio, A., de Blasio, G., Naticchioni, P.: Mr. Rossi, Mr. Hu and politics: the role of immigration in shaping natives’ voting behavior. J. Public Econ. 136 , 1–13 (2016)

Barra, A., Contucci, P., Sandell, R., Vernia, C.: An analysis of a large dataset on immigrant integration in Spain. The statistical mechanics perspective on social action. Sci. Rep. 4 , 4174 (2014)

Barsbai, T., Rapoport, H., Steinmayr, A., Trebesch, C.: The effect of labor migration on the diffusion of democracy: evidence from a former soviet republic. Am. Econ. J. Appl. Econ. 9 (3), 36–69 (2017)

Batista, C., Vicente, P.C.: Do migrants improve governance at home? Evidence from a voting experiment. World Bank Econ. Rev. 25 (1), 77–104 (2011)

Bauer, L.: Inferring variation and change from public corpora. In: The Handbook of Language Variation and Change, pp. 97–114 (2002)

Bengtsson, L., Lu, X., Thorson, A., Garfield, R., Von Schreeb, J.: Improved response to disasters and outbreaks by tracking population movements with mobile phone network data: a post-earthquake geospatial study in Haiti. PLoS Med. 8 (8), e1001083 (2011)

Bertoli, S., Cintia, P., Giannotti, F., Madinier, E., Özden, Ç., Packard, M., Pedreschi, D., Rapoport, H., Sîrbu, A., Speciale, B.: Integration of Syrian refugees: insights from D4R, media events and housing market data. In: Guide to Mobile Data Analytics in Refugee Scenarios, Springer (2019)

Bertoli, S., Marchetta, F.: Bringing it all back home-return migration and fertility choices. World Dev. 65 , 27–40 (2015)

Blommaert, J., Arnaut, K., Rampton, B., Spotti, M.: Language and Superdiversity. Routledge, Abingdon (2016)

Blondel, V.D., Decuyper, A., Krings, G.: A survey of results on mobile phone datasets analysis. EPJ Data Sci. 4 (1), 10 (2015)

Blondel, V.D., Esch, M., Chan, C., Clérot, F., Deville, P., Huens, E., Morlot, F., Smoreda, Z., Ziemlicki, C.: Data for development: the D4D challenge on mobile phone data. CoRR abs/1210.0137 (2012)

Böhme, M.H., Gröger, A., Stöhr, T.: Searching for a better life: predicting international migration with online search keywords. J. Dev. Econ. 102347 (2019)

Boldrini, C., Toprak, M., Conti, M., Passarella, A.: Twitter and the press. In: Companion of the Web Conference 2018 on The Web Conference 2018—WWW’18, ACM Press, New York, pp. 1471–1478 (2018)

Boyandin, I., Bertini, E., Bak, P., Lalanne, D.: Flowstrates: An approach for visual exploration of temporal origin-destination data. Comput. Gr. Forum 30 (3), 971–980 (2011). https://doi.org/10.1111/j.1467-8659.2011.01946.x . https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-8659.2011.01946.x

Bubritzki, S., van Tubergen, F., Weesie, J., Smith, S.: Ethnic composition of the school class and interethnic attitudes: a multi-group perspective. J. Ethnic Migr. Stud. 44 (3), 482–502 (2018)

Bucheli, J.R., Fontenla, M., Waddell, B.J.: Return migration and violence. World Dev. 116 , 113–124 (2019)

Carmon, N.: Immigration and Integration in Post-industrial Societies: Theoretical Analysis and Policy-related Research. Springer, Berlin (2016)

Chauvet, L., Mercier, M.: Do return migrants transfer political norms to their origin country? Evidence from Mali. J. Comp. Econ. 42 (3), 630–651 (2014)

Coletto, M., Esuli, A., Lucchese, C., Muntean, C.I., Nardini, F.M., Perego, R., Renso, C.: Sentiment-enhanced multidimensional analysis of online social networks: perception of the mediterranean refugees crisis. In: 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2016, San Francisco, CA, USA, August 18–21, 2016, pp. 1270–1277 (2016)

Coletto, M., Esuli, A., Lucchese, C., Muntean, C.I., Nardini, F.M., Perego, R., Renso, C.: Perception of social phenomena through the multidimensional analysis of online social networks. Online Soc. Netw. Media 1 , 14–32 (2017). https://doi.org/10.1016/j.osnem.2017.03.001 . http://www.sciencedirect.com/science/article/pii/S246869641630009X

Coletto, M., Lucchese, C., Orlando, S., Perego, R.: Electoral predictions with Twitter: a machine-learning approach. In: IIR 2015, Cagliari, Italy (2015)

Coletto, M., Lucchese, C., Orlando, S., Perego, R.: Polarized user and topic tracking in twitter. In: SIGIR 2016, Pisa, Italy (2016)

Colizza, V., Flammini, A., Serrano, M.A., Vespignani, A.: Detecting rich-club ordering in complex networks. Nat. Phys. 2 (2), 110–115 (2006)

Conover, M., Ratkiewicz, J., Francisco, M.R., Gonçalves, B., Menczer, F., Flammini, A.: Political polarization on Twitter. ICWSM 133 , 89–96 (2011)

Contucci, P., Sandell, R., Seyedi, S.: Forecasting the integration of immigrants. J. Math. Sociol. 41 (2), 127–137 (2017)

MathSciNet   MATH   Google Scholar  

De Beer, J., Raymer, J., Van der Erf, R., Van Wissen, L.: Overcoming the problems of inconsistent international migration data: a new method applied to flows in europe. Eur. J. Popul. 26 (4), 459–481 (2010)

Deville, P., Linard, C., Martin, S., Gilbert, M., Stevens, F.R., Gaughan, A.E., Blondel, V.D., Tatem, A.J.: Dynamic population mapping using mobile phone data. In: Proceedings of the National Academy of Sciences 111 (45), 15888–15893 (2014). https://doi.org/10.1073/pnas.1408439111 . http://www.pnas.org/content/111/45/15888

Deville, P., Wang, D., Sinatra, R., Song, C., Blondel, V.D., Barabási, A.L.: Career on the move: geography, stratification, and scientific impact. Sci. Rep. 4 , 4770 EP (2014). https://doi.org/10.1038/srep04770

Docquier, F., Rapoport, H.: Globalization, brain drain, and development. J. Econ. Lit. 50 (3), 681–730 (2012)

Doyle, G.: Mapping dialectal variation by querying social media. In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pp. 98–106 (2014)

Dunbar, R.: Neocortex size as a constraint on group size in primates. J. Hum. Evol. 22 (6), 469–493 (1992)

Dunbar, R., Arnaboldi, V., Conti, M., Passarella, A.: The structure of online social networks mirrors those in the offline world. Soc. Netw. 43 , 39–47 (2015)

Dunbar, R.I.: The social brain hypothesis. Evolut. Anthropol. 6 (5), 178–190 (1998)

Dunbar, R.I.M.: Coevolution of neocortical size, group size and language in humans. Behav. Brain Sci. 16 (04), 681 (1993)

Dunbar, R.I.M.: Do online social media cut through the constraints that limit the size of offline social networks? R. Soc. Open Sci. 3 (1), 150292 (2016)

MathSciNet   Google Scholar  

Duryea, S., López-Córdova, E., Olmedo, A.: Migrant Remittances and Infant Mortality: Evidence from Mexico. Inter-American Development Bank, Mimeo (2005)

EU Knowledge Centre on Migration and Demography: KCMD Data Catalogue (Accessed July 2019). https://bluehub.jrc.ec.europa.eu/catalogues/data/

Eurostat: Migration and migrant population statistics (2018). http://ec.europa.eu/eurostat/statistics-explained/index.php/Migration_and_migrant_population_statistics#Migration_flows

EUROSTAT: Asylum and managed migration data. Accessed July 2019. https://ec.europa.eu/eurostat/web/asylum-and-managed-migration/data/database

Everett, M., Borgatti, S.P.: Ego network betweenness. Soc. Netw. 27 (1), 31–38 (2005)

Faist, T.: The volume and dynamics of international migration and transnational social spaces. Refugee Surv. Q. 20 (1) (2001)

Forgó, N., Hänold, S., van den Hoven, J., Krügel, T., Lishchuk, I., Mahieu, R., Monreale, A., Pedreschi, D., Pratesi, F., van Putten, D.: SoBigData: a research infrastructure for ethical and legal data science. submitted to JDSA special issue (2019)

FRONTEX: Illegal border crossing. Accessed July 2019. https://www.asktheeu.org/en/request/illegal_boarder_crossing#incoming-10314

Gargiulo, F., Carletti, T.: Driving forces of researchers mobility. Sci. Rep. 4 (4860) (2014). https://doi.org/10.1038/srep04860

Garimella, K., De Francisci Morales, G., Gionis, A., Mathioudakis, M.: Quantifying controversy in social media. In: ACM International Conference on Web Search and Data Mining, WSDM ’16 (2016)

Giannotti, F., Nanni, M., Pedreschi, D., Pinelli, F., Renso, C., Rinzivillo, S., Trasarti, R.: Unveiling the complexity of human mobility by querying and mining massive trajectory data. VLDB J. Int. J. Very Large Data Bases 20 (5), 695–719 (2011)

Glazer, N.: Is assimilation dead? Ann. Am. Acad. Polit. Soc. Sci. 530 (1), 122–136 (1993)

Gold, S.J.: Migrant networks: a summary and critique of relational approaches to international migration. In: The Blackwell companion to social inequalities, Blackwell Publishing Ltd, Oxford, UK, pp. 257–285 (2007)

Gonçalves, B., Perra, N., Vespignani, A.: Modeling users’ activity on twitter networks: validation of dunbar’s number. PLoS ONE 6 (8) (2011)

Gonçalves, B., Sánchez, D.: Crowdsourcing dialect characterization through Twitter. PLoS ONE 9 (11), e112074 (2014)

Gonzalez, M.C., Hidalgo, C.A., Barabasi, A.L.: Understanding individual human mobility patterns. Nature 453 (7196), 779–782 (2008)

Granovetter, M.: The strength of weak ties: a network theory revisited. Sociol. Theory 1 , 201 (1983)

Granovetter, M.S.: Getting a Job: A Study of Contacts and Careers, vol. 25. University of Chicago press, Chicago (2018)

Grossi, V., Rapisarda, B., Giannotti, F., Pedreschi, D.: Data science at SoBigData: the European research infrastructure for social mining and big data analytics. Int. J. Data Sci. Anal. 6 (3), 205–216 (2018)

Guidotti, R., Coscia, M., Pedreschi, D., Pennacchioli, D.: Behavioral entropy and profitability in retail. In: IEEE International Conference on Data Science and Advanced Analytics (DSAA), 36678, IEEE, pp. 1–10 (2015)

Guidotti, R., Coscia, M., Pedreschi, D., Pennacchioli, D.: Going beyond GDP to nowcast well-being using retail market data. In: International Conference and School on Network Science, Springer, pp. 29–42 (2016)

Guidotti, R., Gabrielli, L.: Recognizing residents and tourists with retail data using shopping profiles. In: International Conference on Smart Objects and Technologies for Social Good, Springer, pp. 353–363 (2017)

Guidotti, R., Gabrielli, L., Monreale, A., Pedreschi, D., Giannotti, F.: Discovering temporal regularities in retail customers’ shopping behavior. EPJ Data Sci. 7 (1), 6 (2018)

Guidotti, R., Monreale, A., Nanni, M., Giannotti, F., Pedreschi, D.: Clustering individual transactional data for masses of users. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, pp. 195–204 (2017)

Guo, D.: Visual analytics of spatial interaction patterns for pandemic decision support. Int. J. Geogr. Inf. Sci. 21 (8), 859–877 (2007). https://doi.org/10.1080/13658810701349037

Gupta, S., Yan, X., Lerman, K.: Structural properties of ego networks. In: International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction, Springer, pp. 55–64 (2015)

Halla, M., Wagner, A.F., Zweimüller, J.: Immigration and voting for the far right. J. Eur. Econ. Assoc. 15 (6), 1341–1385 (2017)

Hawelka, B., Sitko, I., Beinat, E., Sobolevsky, S., Kazakopoulos, P., Ratti, C.: Geo-located Twitter as proxy for global mobility patterns. Cartogr. Geogr. Inf. Sci. 41 (3), 260–271 (2014)

Herdagdelen, A., State, B., Adamic, L., Mason, W.: The social ties of immigrant communities in the United States. In: WebSci (2016)

Hiir, H., Sharma, R., Aasa, A., Saluveer, E.: Impact of natural and social events on mobile call data records—an estonian case study. In: International Conference on Complex Networks and their Applications, Springer (2019)

Hill, R.A., Dunbar, R.I.M.: Social network size in humans. Hum. Nat. 14 (1), 53–72 (2003)

Hofstra, B., Corten, R., van Tubergen, F., Ellison, N.B.: Sources of segregation in social networks: a novel approach using Facebook. Am. Sociol. Rev. 82 (3), 625–656 (2017)

Holten, D., Isenberg, P., van Wijk, J.J., Fekete, J.D.: An extended evaluation of the readability of tapered, animated, and textured directed-edge representations in node-link graphs. In: 2011 IEEE Pacific Visualization Symposium, pp. 195–202 (2011). https://doi.org/10.1109/PACIFICVIS.2011.5742390

Iacus, S.M., Porro, G., Salini, S., Siletti, E.: A proposal to deal with sampling bias in social network big data. In: 2nd International Conference on Advanced Reserach Methods and Analytics (CARMA 2018), Editorial Universitat Politècnica de València, pp. 29–37 (2018)

Ibrahim, H.S., Abdou, S.M., Gheith, M.: Sentiment analysis for modern standard Arabic and colloquial. arXiv preprint arXiv:1505.03105 (2015)

Instituto Nacional de Estadistica: Ine microdata. Accessed July 2019. https://www.ine.es/en/prodyser/microdatos_en.htm

IPUMS: IPUMS census and survey data. Accessed July 2019. https://ipums.org/

Istituto Nazionale di Statistica: Immigrati.stat: Dati e indicatori su immigranti e nuovi cittadini. Accessed July 2019. http://stra-dati.istat.it/

James, C., Pappalardo, L., Sirbu, A., Simini, F.: Prediction of next career moves from scientific profiles. ArXiv e-prints (2018)

Jurdak, R., Zhao, K., Liu, J., AbouJaoude, M., Cameron, M., Newth, D.: Understanding human mobility from Twitter. PLoS ONE 10 (7), e0131469 (2015)

Kadushin, C.: Social density and mental health. In: Social Structure and Network Analysis, pp. 147–158 (1982)

Kikas, R., Dumas, M., Saabas, A.: Explaining international migration in the skype network: the role of social network features. In: Proceedings of the 1st ACM Workshop on Social Media World Sensors, ACM, pp. 17–22 (2015)

Koch, C.M., Moise, I., Donnay, K., Boudemagh, E., Helbing, D.: Dynamics between mass media and asylum acceptance rates. SSRN Electron. J. (2017). https://doi.org/10.2139/ssrn.2957362

Kolchyna, O., Souza, T.T., Treleaven, P., Aste, T.: Twitter sentiment analysis: lexicon method, machine learning method and their combination. Handbook of Sentiment Analysis in Finance (2015)

Kulkarni, V., Perozzi, B., Skiena, S.: Freshman or fresher? Quantifying the geographic variation of language in online social media. In: ICWSM, pp. 615–618 (2016)

Labov, W., Ash, S., Boberg, C.: The Atlas of North American English: Phonetics, Phonology and Sound Change. Walter de Gruyter, Berlin (2005)

Laczko, F.: Improving data on international migration and development: towards a global action plan? Improving data on international migration-towards agenda 2030 and the global compact on migration (2015)

Lamanna, F., Lenormand, M., Salas-Olmedo, M.H., Romanillos, G., Gonçalves, B., Ramasco, J.J.: Immigrant community integration in world cities. PLoS ONE 13 (3), e0191612 (2018)

Laumann, E.O., Pappi, F.U.: Networks of Collective Action : A Perspective on Community Influence Systems. Academic Press, Cambridge (1976)

Levitt, P., Jaworsky, B.N.: Transnational migration studies: past developments and future trends. Ann. Rev. Sociol. 33 (1), 129–156 (2007)

Levitt, P., Lamba-Nieves, D.: Social remittances revisited. J. Ethnic Migr. Stud. 37 (1), 1–22 (2011)

Li, L., Jing, H., Tong, H., Yang, J., He, Q., Chen, B.C.: Nemo: next career move prediction with contextual embedding. In: Proceedings of the 26th International Conference on World Wide Web Companion, WWW ’17 Companion, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, pp. 505–513 (2017). https://doi.org/10.1145/3041021.3054200

Lochmann, A., Rapoport, H., Speciale, B.: The effect of language training on immigrants’ economic integration-empirical evidence from France. Eur. Econ. Rev. 113 , 265–296 (2019)

Lu, X., Bengtsson, L., Holme, P.: Predictability of population displacement after the 2010 Haiti earthquake. Proc. Natl. Acad. Sci. 109 (29), 11576–11581 (2012). http://www.pnas.org/content/109/29/11576

Lulli, A., Gabrielli, L., Dazzi, P., Dell’Amico, M., Michiardi, P., Nanni, M., Ricci, L.: Scalable and flexible clustering solutions for mobile phone-based population indicators. Int. J. Data Sci. Anal. 4 (4), 285–299 (2017)

Mac Carron, P., Kaski, K., Dunbar, R.: Calling Dunbar’s numbers. Soc. Netw. 47 , 151–155 (2016)

Magdy, A., Ghanem, T.M., Musleh, M., Mokbel, M.F.: Exploiting geo-tagged tweets to understand localized language diversity. In: Proceedings of Workshop on Managing and Mining Enriched Geo-Spatial Data, ACM, p. 2 (2014)

McCarty, C.: Structure in personal networks. J. Soc. Struct. 3 , 1–29 (2002)

McKenzie, D., Rapoport, H.: Self-selection patterns in Mexico-US migration: the role of migration networks. Rev. Econ. Stat. 92 (4), 811–821 (2010)

Mesnard, A.: Temporary migration and capital market imperfections. Oxford Econ. Pap. 56 (2), 242–262 (2004)

Mesnard, A., et al.: Temporary migration and self-employment: evidence from tunisia. Brussels Econ. Rev. 47 (1), 119–138 (2004)

Miller, J.: Pathways in the Workplace: The Effects of Gender and Race on Access to Organizational Resources. Cambridge University Press, Cambridge (1986)

Minello, A.: The educational expectations of Italian children: the role of social interactions with the children of immigrants. Int. Stud. Sociol. Educ. 24 (2), 127–147 (2014)

Minello, A., Barban, N.: The educational expectations of children of immigrants in Italy. Ann. Am. Acad. Polit. Soc. Sci. 643 (1), 78–103 (2012)

Mocanu, D., Baronchelli, A., Perra, N., Gonçalves, B., Zhang, Q., Vespignani, A.: The Twitter of babel: mapping world languages through microblogging platforms. PLoS ONE 8 (4), e61981 (2013)

Moed, H.F., Aisati, M., Plume, A.: Studying scientific migration in scopus. Scientometrics 94 , 929–942 (2013)

Moise, I., Gaere, E., Merz, R., Koch, S., Pournaras, E.: Tracking language mobility in the Twitter landscape. In: 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW), IEEE, pp. 663–670 (2016)

Montoya Arce, J., Salas Alfaro, R., Soberón Mora, J.A.: La migración de retorno desde Estados Unidos hacia el Estado de México: oportunidades y retos. Cuadernos Geográficos (2011)

Moore, G.: The structure of a national elite network. Am. Sociol. Rev. 44 (5), 673 (1979)

Newman, M.E.: The structure of scientific collaboration networks. Proc. Natl. Acad. Sci. 98 (2), 404–409 (2001)

Nguyen, D., Doğruöz, A.S., Rosé, C.P., de Jong, F.: Computational sociolinguistics: a survey. Comput. Linguist. 42 (3), 537–593 (2016)

Noorden, R.V.: Global mobility: science on the move. Nature 490 , 326–329 (2012)

Oropesa, R.S., Landale, N.S.: Why do immigrant youths who never enroll in us schools matter? School enrollment among mexicans and non-hispanic whites. Sociol. Educ. 82 (3), 240–266 (2009)

Pak, A., Paroubek, P.: Twitter as a corpus for sentiment analysis and opinion mining. LREc 10 , 1320–1326 (2010)

Pang, B., Lee, L.: Opinion mining and sentiment analysis. Found. Trends Inf. Retr. 2 (1–2), 1–135 (2008)

Paparrizos, I., Cambazoglu, B.B., Gionis, A.: Machine learned job recommendation. In: Proceedings of the Fifth ACM Conference on Recommender Systems, RecSys ’11, ACM, New York, NY, pp. 325–328 (2011). https://doi.org/10.1145/2043932.2043994

Pappalardo, L., Pedreschi, D., Smoreda, Z., Giannotti, F.: Using big data to study the link between human mobility and socio-economic development. In: 2015 IEEE International Conference on Big Data (Big Data), pp. 871–878 (2015). https://doi.org/10.1109/BigData.2015.7363835

Pappalardo, L., Simini, F., Rinzivillo, S., Pedreschi, D., Giannotti, F., Barabási, A.L.: Returners and explorers dichotomy in human mobility. Nat. Commun. 6 , 8166 EP (2015). https://doi.org/10.1038/ncomms9166

Perra, N., Gonçalves, B., Pastor-Satorras, R., Vespignani, A.: Activity driven modeling of time varying networks. Sci. Rep. 2 (2012)

Pollacci, L., Sîrbu, A., Giannotti, F., Pedreschi, D.: Measuring the Salad Bowl: superdiversity on Twitter (Submitted) (2019)

Portes, A., Guarnizo, L.E., Landolt, P.: The study of transnationalism: Pitfalls and promise of an emergent research field. Ethnic Racial Stud. 22 (2), 217–237 (1999)

Portes, A., Zhou, M.: The new second generation: segmented assimilation and its variants. Ann. Am. Acad. Polit. Soc. Sci. 530 (1), 74–96 (1993)

Poulain, M.: Confrontation des statistiques de migrations intra-européennes: Vers plus d’harmonisation? Eur. J. Populat. 9 (4), 353–381 (1993)

Poulain, M., Herm, A., Depledge, R.: Central population registers as a source of demographic statistics in europe. Population 68 (2), 183–212 (2013)

Prieto Curiel, R., Pappalardo, L., Gabrielli, L., Bishop, S.R.: Gravity and scaling laws of city to city migration. PLoS ONE 13 (7), 1–19 (2018). https://doi.org/10.1371/journal.pone.0199892

Qian, Z., Glick, J.E., Batson, C.D.: Crossing boundaries: nativity, ethnicity, and mate selection. Demography 49 (2), 651–675 (2012)

Qian, Z., Lichter, D.T.: Social boundaries and marital assimilation: Interpreting trends in racial and ethnic intermarriage. Am. Sociol. Rev. 72 (1), 68–94 (2007)

Quercia, D., Capra, L., Crowcroft, J.: The social world of Twitter: topics, geography, and emotions. ICWSM 12 , 298–305 (2012)

Raymer, J., Wiilekens, F.: Obtaining an overall picture of population movement in the European union. In: International Migration in Europe: Data, Models and Estimates, pp. 209–234 (2008)

Ruotsalainen, K.: A census of the world population is taken every ten years (2016). http://www.stat.fi/tup/vl2010/art_2011-05-17_001_en.html

Ryan, L., D’Angelo, A.: Changing times: migrants’ social network analysis and the challenges of longitudinal research. Soc. Netw. 53 , 148–158 (2018)

Salah, A.A., Pentland, A., Lepri, B., Letouze, E. (eds.): Guide to Mobile Data Analytics in Refugee Scenarios. Springer, Berlin (2019)

Salah, A.A., Pentland, A., Lepri, B., Letouzé, E., de Montjoye, Y.A., Dong, X., Dağdelen, Ö., Vinck, P.: Introduction to the data for refugees challenge on mobility of Syrian refugees in Turkey. In: Guide to Mobile Data Analytics in Refugee Scenarios, Springer, pp. 3–27 (2019)

Scott, W.R., Laumann, E.O., Knoke, D.: The organizational state: social choice in national policy domains (1989)

Simini, F., Gonzalez, M.C., Maritan, A., Barabasi, A.L.: A universal model for mobility and migration patterns. Nature 484 (7392), 96–100 (2012)

Sinatra, R., Wang, D., Deville, P., Song, C., Barabási, A.L.: Quantifying the evolution of individual scientific impact. Science (2016). https://doi.org/10.1126/science.aaf5239 . http://science.sciencemag.org/content/354/6312/aaf5239

Siting, Z., Wenxing, H., Ning, Z., Fan, Y.: Job recommender systems: a survey. In: 2012 7th International Conference on Computer Science Education (ICCSE), pp. 920–924 (2012). https://doi.org/10.1109/ICCSE.2012.6295216

Smith, S., Maas, I., van Tubergen, F.: Irreconcilable differences? Ethnic intermarriage and divorce in the Netherlands, 1995–2008. Soc. Sci. Res. 41 (5), 1126–1137 (2012)

Smith, S., Van Tubergen, F., Maas, I., McFarland, D.A.: Ethnic composition and friendship segregation: differential effects for adolescent natives and immigrants. Am. J. Sociol. 121 (4), 1223–1272 (2016)

Spörlein, C., van Tubergen, F.: The occupational status of immigrants in western and non-western societies. Int. J. Comp. Sociol. 55 (2), 119–143 (2014)

State, B., Rodriguez, M., Helbing, D., Zagheni, E.: Migration of professionals to the U.S. In: SocInfo, Springer, Cham, pp. 531–543 (2014)

Sugimoto, C.R.: Scientists have most impact when they’re free to move. Nature 550 , 29–31 (2017). https://doi.org/10.1038/550029a

Sutcliffe, A., Dunbar, R., Binder, J., Arrow, H.: Relationships and the social brain: integrating psychological and evolutionary perspectives. Br. J. Psychol. 103 (2), 149–168 (2012)

Taboada, M., Brooke, J., Tofiloski, M., Voll, K., Stede, M.: Lexicon-based methods for sentiment analysis. Comput. Linguist. 37 (2), 267–307 (2011)

The OECD: Database on immigrants in OECD and non-OECD countries: Dioc. Accessed July 2019. http://www.oecd.org/els/mig/dioc.htm

The Worldbank: Migration and remittances data. Accessed July 2019. https://www.worldbank.org/en/topic/migrationremittancesdiasporaissues/brief/migration-remittances-data

Thomas-Hope, E.: Return migration to Jamaica and its development potential. Int. Migrat. 37 (1), 183–207 (1999)

Tobler, W.R.: Experiments in migration mapping by computer. Am. Cartogr. 14 (2), 155–163 (1987). https://doi.org/10.1559/152304087783875273

Tosi, D.: Cell phone big data to compute mobility scenarios for future smart cities. Int. J. Data Sci. Anal. 4 (4), 265–284 (2017)

Turktelekom: Data for refugees turkey (2018). http://d4r.turktelekom.com.tr/

Tversky, B., Morrison, J.B., Betrancourt, M.: Animation: can it facilitate? Int. J. Hum. Comput. Stud. 57 (4), 247–262 (2002). https://doi.org/10.1006/ijhc.2002.1017 . http://www.sciencedirect.com/science/article/pii/S1071581902910177

United Nations: Recommendations on statistics of international migration. Department of Economic and Social Affairs, Statistics Division, United Nations, New York (1998)

Van Tubergen, F.: Ethnic boundaries in core discussion networks: a multilevel social network study of Turks and Moroccans in the Netherlands. J. Ethnic Migrat. Stud. 41 (1), 101–116 (2015)

Van Tubergen, F., Kalmijn, M.: A dynamic approach to the determinants of immigrants’ language proficiency: the United States, 1980–2000. Int. Migr. Rev. 43 (3), 519–543 (2009)

Van Tubergen, F., Wierenga, M.: The language acquisition of male immigrants in a multilingual destination: Turks and Moroccans in Belgium. J. Ethnic Migr. Stud. 37 (7), 1039–1057 (2011)

Verdery, A.M., Mouw, T., Edelblute, H., Chavez, S.: Communication flows and the durability of a transnational social field. Soc. Netw. 53 , 57–71 (2018)

Vertovec, S.: Super-diversity and its implications. Ethnic Racial Stud. 30 (6), 1024–1054 (2007)

Vignoli, D., Pirani, E., Venturini, A.: Female migration and native marital stability: insights from Italy. J. Family Econ Issues 38 (1), 118–128 (2017)

Wahba, J.: Selection, selection, selection: the impact of return migration. J. Popul. Econ. 28 (3), 535–563 (2015)

Wahba, J., Zenou, Y.: Out of sight, out of mind: migration, entrepreneurship and social capital. Reg. Sci. Urban Econ. 42 (5), 890–903 (2012)

Waldinger, R.: 12 networks and niches: the continuing significance of ethnic connections. Ethnicity, social mobility, and public policy: Comparing the USA and UK. p. 342 (2005)

Wang, D., Pedreschi, D., Song, C., Giannotti, F., Barabasi, A.L.: Human mobility, social ties, and link prediction. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, pp. 1100–1108 (2011)

Wood, J., Dykes, J., Slingsby, A.: Visualisation of origins, destinations and flows with od maps. Cartogr. J. 47 (2), 117–129 (2010). https://doi.org/10.1179/000870410X12658023467367

Zagheni, E., Garimella, V.R.K., Weber, I., et al.: Inferring international and internal migration patterns from Twitter data. In: Proceedings of the 23rd International Conference on World Wide Web, ACM, pp. 439–444 (2014)

Zagheni, E., Weber, I., Gummadi, K.: Leveraging facebook’s advertising platform to monitor stocks of migrants. Popul. Dev. Rev. 43 (4), 721–734 (2017)

Zhao, Y.: Causes and consequences of return migration: recent evidence from china. J. Comp. Econ. 30 (2), 376–394 (2002)

Zhou, W.X., Sornette, D., Hill, R.A., Dunbar, R.I.M.: Discrete hierarchical organization of social group sizes. Proc. Biol. Scie. R. Soc. 272 (1561), 439–444 (2005)

Zook, M., Barocas, S., Boyd, D., Crawford, K., Keller, E., Gangadharan, SPn, Goodman, A., Hollander, R., Koenig, B.A., Metcalf, J., Narayanan, A., Nelson, A., Pasquale, F.: Ten simple rules for responsible big data research. PLoS Comput. Biol. (2017). https://doi.org/10.1371/journal.pcbi.1005399

Download references

Acknowledgements

This work was supported by the European Commission through the Horizon2020 European project “SoBigData Research Infrastructure—Big Data and Social Mining Ecosystem” (Grant Agreement 654024). The funders had no role in developing the research and writing the manuscript.

Author information

Authors and affiliations.

Department of Computer Science, University of Pisa, Pisa, Italy

Alina Sîrbu, Dino Pedreschi, Laura Pollacci & Francesca Pratesi

Fraunhofer Institute IAIS, Sankt Augustin, Germany

Gennady Andrienko & Natalia Andrienko

City, University of London, London, UK

IIT - CNR, Pisa, Italy

Chiara Boldrini, Marco Conti & Andrea Passarella

ISTI-CNR, Pisa, Italy

Fosca Giannotti, Riccardo Guidotti, Cristina Ioana Muntean & Luca Pappalardo

CNRS, IRD, CERDI, Université Clermont Auvergne, Clermont-Ferrand, France

Simone Bertoli

Scuola Normale Superiore, Pisa, Italy

University of Tartu, Tartu, Estonia

Rajesh Sharma

Corresponding author

Correspondence to Alina Sîrbu.

Ethics declarations

Conflict of interest.

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Sîrbu, A., Andrienko, G., Andrienko, N. et al. Human migration: the big data perspective. Int J Data Sci Anal 11, 341–360 (2021). https://doi.org/10.1007/s41060-020-00213-5

Received: 23 July 2019

Accepted: 11 March 2020

Published: 23 March 2020

Issue Date: May 2021

DOI: https://doi.org/10.1007/s41060-020-00213-5

  • Human migration
  • Migration flows
  • Migration stocks
  • Integration
  • Return of migrants

BSc thesis: System for migrating social network data from heterogeneous sources to a graph database. Authors: Gabriel Kępka, Piotr Makarewicz

PiotrMakarewicz/social-network-data-migration

Documentation available here .

Francesco Lelli

Where to get data: a collection of resources for your thesis.

If you are wondering where to get data for your thesis this article is for you.

Data come in all shapes and forms. If you are doing your thesis, you are in search of a proof of concept. In other words, you are attempting to prove the validity of an idea or concept, not to produce an industry-ready solution. Therefore, most of the time, you do not need particularly large datasets. However, you want to be sure that they are of sufficient quality. After all, you need to process them without introducing too much noise.

What follows is a selected set of resources. It is not exhaustive and I am pretty sure that you will be able to find more by searching the Internet and asking your supervisor. However, this list is a good starting point in case you are trying to build a research/thesis proposal or you are stuck and in search of an idea.

Google datasets search:

  • Link: https://datasetsearch.research.google.com/

You guessed right. Google has a dedicated search engine for datasets. It is freely available and indexes data that are described using a particular schema.org format. Some of the data may be behind paywalls. However, academics and students can usually contact the data provider to get access to them, or to a selected portion that is sufficient for their study.

EU Open Data Portal:

  • Link: https://data.europa.eu/euodp/en/data/

Once again, you guessed right. The European Union is committed to fostering a strong data transparency policy. At the time of writing this article, I was able to find 15399 different datasets that are freely available. They cover a large variety of topics and relate to the various matters under the jurisdiction of the EU.

Yahoo Finance:

  • Link: https://finance.yahoo.com/
  • Link to the help center: https://help.yahoo.com/kb/download-historical-data-yahoo-finance-sln2311.html

There are plenty of financial databases on the web. Yahoo Finance is the best known and most straightforward. In case you are looking for daily quotations of various financial assets, this is the place for you. The second link explains how to download historical data.
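Once a historical-data file is on disk, a few lines of Python are enough to turn it into something usable for a thesis, such as daily returns. The sketch below is only an illustration: it assumes the file is a CSV with `Date` and `Close` columns (check the header of your own download), and the sample values are made up.

```python
import csv
import io

# Made-up sample standing in for a downloaded historical-data file.
# Assumption: the file has "Date" and "Close" columns; check your own header.
SAMPLE_CSV = """Date,Close
2020-01-02,100.0
2020-01-03,102.0
2020-01-06,99.96
"""

def daily_returns(csv_text):
    """Return a list of (date, simple return) pairs from consecutive closes."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    returns = []
    for prev, curr in zip(rows, rows[1:]):
        prev_close = float(prev["Close"])
        curr_close = float(curr["Close"])
        returns.append((curr["Date"], (curr_close - prev_close) / prev_close))
    return returns

for date, r in daily_returns(SAMPLE_CSV):
    print(date, f"{r:+.2%}")
```

To use it on a real file, read the file's text with `open(path).read()` and pass it to `daily_returns`.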

More on financial data:

  • Link Thomson Reuters API: https://customers.reuters.com/developer/apis_tech.aspx

Some programming knowledge is required. An Application Programming Interface (API) lets you access the data of a website programmatically. Reuters has a dataset that includes financial news and press releases. All you have to do is write a few lines of code to access this information.

Kaggle.com, a community approach:

  • Link: https://www.kaggle.com/datasets

Kaggle offers its members the possibility to access and share data. In addition, it offers a set of demo code for accessing and manipulating the data. The community is data-science driven. However, some of the datasets do not require particular programming skills.

Get your data by yourself:

  • link: https://www.programmableweb.com/

There is an ever-growing number of websites that offer programmatic access. Programmableweb is a directory that tries to list them. In addition, it links to the proper resources for using them. Note that using the APIs will require some programming. However, you may end up creating original datasets, and this is already a valuable outcome of your thesis.
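To give a feeling for how little programming this can take, here is a minimal sketch that turns JSON records from an API into a CSV dataset. The URL in the comment, the field names, and the sample payload are all hypothetical; a real API will have its own endpoints, authentication, and response format.

```python
import csv
import io
import json
import urllib.request

def fetch_json(url):
    """Download a URL and decode its body as JSON (defined for reference, not called below)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))

def records_to_csv(records, fields):
    """Flatten a list of JSON objects into CSV text, keeping only `fields`."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow({f: record.get(f, "") for f in fields})
    return buffer.getvalue()

# Made-up payload standing in for what a call such as
# fetch_json("https://api.example.org/items") might return.
payload = json.loads('[{"id": 1, "title": "First", "extra": true}, {"id": 2, "title": "Second"}]')
dataset = records_to_csv(payload, ["id", "title"])
print(dataset)
```

Keeping the download step and the conversion step separate makes it easy to test the conversion on saved responses before querying the live service.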

Where to get data for your thesis? In summary:

I hope that by now you have realized that datasets are far from being a scarce resource. This collection of resources can serve as a starting point for finding what you are looking for or for refining and developing your research idea.

This article (Where to get Data: a collection of resources for your thesis) is part of the miniseries on how to do a good thesis. You can see the full list of posts at the following link:

How to Do a Good Thesis: the Miniseries



Data Migration: Process, Types, and Golden Rules to Follow

  • 10 min read
  • Engineering
  • Last updated: 23 Nov, 2020

In our daily lives, moving information from one location to another is no more than a simple copy-and-paste operation. Everything gets far more complicated when it comes to transferring millions of data units into a new system. However, many companies treat even a massive data migration as a low-level, two-clicks task. Such an initial underestimation translates to spending extra time and money. Recent studies revealed that 55 percent of data migration projects went over budget and 62 percent appeared to be harder than expected or actually failed. How to avoid falling into the same trap? The answer lies in understanding the essentials of the data migration process, from its triggers to final phases. If you are already familiar with theoretical aspects of the problem, you may jump to the section Data Migration Process where we give practical recommendations. Otherwise, let’s start from the most basic question: What is data migration?

What is data migration?


What makes companies migrate their data assets.

Usually, data migration comes as a part of a larger project such as

  • legacy software modernization or replacement,
  • the expansion of system and storage capacities,
  • the introduction of an additional system working alongside the existing application,
  • the shift to a centralized database to eliminate data silos and achieve interoperability,
  • moving IT infrastructure to the cloud, or
  • merger and acquisition (M&A) activities when IT landscapes must be consolidated into a single system.


Data migration is sometimes confused with other processes involving massive data movements. Before we go any further, it’s important to clear up the differences between data migration, data integration, and data replication.

Data migration vs data integration

Unlike migration, which deals with the company’s internal information, integration is about combining data from multiple sources outside and inside the company into a single view. It is an essential element of the data management strategy that enables connectivity between systems and gives access to the content across a wide array of subjects. Consolidated datasets are a prerequisite for accurate analysis, extracting business insights, and reporting. Data migration is a one-way journey that ends once all the information is transported to a target location. Integration, by contrast, can be a continuous process that involves streaming real-time data and sharing information across systems.

Data migration vs data replication

In data migration, after the data is completely transferred to a new location, you eventually abandon the old system or database. In replication, you periodically transport data to a target location, without deleting or discarding its source. So, it has a starting point, but no defined completion time. Data replication can be a part of the data integration process. Also, it may turn into data migration — provided that the source storage is decommissioned. Now, we’ll discuss only data migration — a one-time and one-way process of moving to a new house, leaving an old one empty.

Main types of data migration


Six major types of data migration.

Storage migration

Storage migration occurs when a business acquires modern technologies discarding out-of-date equipment. This entails the transportation of data from one physical medium to another or from a physical to a virtual environment. Examples of such migrations are when you move data

  • from paper to digital documents,
  • from hard disk drives (HDDs) to faster and more durable solid-state drives (SSDs), or
  • from mainframe computers to cloud storage.


Many big enterprises still rely on mainframes to run their business processes. Source: TechRepublic

The primary reason for this shift is a pressing need for technology upgrades rather than a lack of storage space. When it comes to large-scale systems, the migration process can take years. Say, Sabre, the second-largest global distribution system (GDS), has been moving its software and data from mainframe computers to virtual servers for over a decade. Its Migration Period is expected to be entirely completed in 2023.

Database migration

A database is not just a place to store data. It provides a structure to organize information in a specific way and is typically controlled via a database management system (DBMS). So, most of the time, database migration means

  • an upgrade to the latest version of DBMS (so-called homogeneous migration), or
  • a switch to a new DBMS from a different provider, for example, from MySQL to PostgreSQL or from Oracle to MSSQL (so-called heterogeneous migration).

The latter case is tougher than the former, especially if target and source databases support different data structures. It makes the task still more challenging when you have to move data from legacy databases — like Adabas, IMS, or IDMS.
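To make the difference in data structures concrete, the sketch below translates a few MySQL column types into commonly used PostgreSQL equivalents and flags everything it does not recognize for manual review. The mapping table is an illustrative, deliberately incomplete assumption, and the column names are invented; a real project would verify every type against both systems' documentation.

```python
# Illustrative, incomplete mapping of MySQL column types to PostgreSQL ones.
# Assumption: these are common equivalents; verify each one for your schema.
TYPE_MAP = {
    "TINYINT(1)": "BOOLEAN",
    "DATETIME": "TIMESTAMP",
    "DOUBLE": "DOUBLE PRECISION",
    "LONGTEXT": "TEXT",
    "BLOB": "BYTEA",
}

def translate_columns(columns):
    """Translate (name, mysql_type) pairs; collect types with no known mapping."""
    translated, unmapped = [], []
    for name, mysql_type in columns:
        key = mysql_type.upper()
        if key in TYPE_MAP:
            translated.append((name, TYPE_MAP[key]))
        else:
            unmapped.append((name, mysql_type))
    return translated, unmapped

# Hypothetical source columns.
source_columns = [("is_active", "tinyint(1)"), ("created_at", "datetime"), ("shape", "geometry")]
translated, unmapped = translate_columns(source_columns)
print(translated)
print(unmapped)
```

Anything that lands in `unmapped` is a candidate for a hand-written conversion rule, which is where much of the extra effort in a heterogeneous migration tends to go.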

Application migration

When a company changes an enterprise software vendor — for instance, a hotel implements a new property management system or a hospital replaces its legacy EHR system — this requires moving data from one computing environment to another. The key challenge here is that old and new infrastructures may have unique data models and work with different data formats.

Data center migration

A data center is a physical infrastructure used by organizations to keep their critical applications and data. Put more precisely, it’s the very dark room with servers, networks, switches, and other IT equipment. So, data center migration can mean different things: from relocating existing computers and wires to other premises to moving all digital assets, including data and business applications, to new servers and storage.

Business process migration

This type of migration is driven by mergers and acquisitions, business optimization, or reorganization to address competitive challenges or enter new markets. All these changes may require the transfer of business applications and databases with data on customers, products, and operations to the new environment.

Cloud migration

Cloud migration is a popular term that embraces all the above-mentioned cases, if they involve moving data from on-premises to the cloud or between different cloud environments. Gartner expects that by 2024 the cloud will attract over 45 percent of IT spending and dominate ever-growing numbers of IT decisions. Depending on volumes of data and differences between source and target locations, migration can take from some 30 minutes to months and even years. The complexity of the project and the cost of downtime will define how exactly to unwrap the process.

Approaches to data migration

Choosing the right approach to migration is the first step to ensure that the project will run smoothly, with no severe delays.

Big bang data migration

Advantages: less costly, less complex, takes less time, all changes happen once

Disadvantages: a high risk of expensive failure, requires downtime

In a big bang scenario, you move all data assets from source to target environment in one operation, within a relatively short time window. Systems are down and unavailable for users so long as data moves and undergoes transformations to meet the requirements of a target infrastructure. The migration is typically executed during a legal holiday or weekend when customers presumably don’t use the application.

The big bang approach allows you to complete migration in the shortest possible time and saves the hassle of working across the old and new systems simultaneously. However, in the era of Big Data, even midsize companies accumulate huge volumes of information while the throughput of networks and API gateways is not endless. This constraint must be considered from the start.

Verdict. The big bang approach fits small companies or businesses working with small amounts of data. It doesn’t work for mission-critical applications that must be available 24/7.
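The throughput constraint can be checked with simple arithmetic before committing to a big bang window. The sketch below is a rough estimate only: the data volume, link speed, and the 70 percent efficiency factor are illustrative assumptions, and it ignores transformation time entirely.

```python
def transfer_hours(data_gb, throughput_mbps, efficiency=0.7):
    """Estimate hours needed to move data_gb gigabytes over a throughput_mbps link.

    efficiency is an assumed fraction of nominal throughput actually achieved.
    """
    data_megabits = data_gb * 8 * 1000  # 1 GB = 8000 megabits (decimal units)
    seconds = data_megabits / (throughput_mbps * efficiency)
    return seconds / 3600

window_hours = 48  # e.g. a weekend
needed = transfer_hours(data_gb=20_000, throughput_mbps=1_000)
print(f"{needed:.1f} h needed, fits in window: {needed <= window_hours}")
```

With these assumed numbers, 20 TB over a 1 Gbit/s link needs roughly 63 hours, so a 48-hour weekend window would not be enough and either the scope, the link, or the approach would have to change.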

Trickle data migration

Advantages: less prone to unexpected failures, zero downtime required

Disadvantages: more expensive, takes more time, needs extra efforts and resources to keep two systems running

Also known as a phased or iterative migration, this approach brings Agile experience to data transfer. It breaks down the entire process into sub-migrations, each with its own goals, timelines, scope, and quality checks.

Trickle migration involves parallel running of the old and new systems and transferring data in small increments. As a result, you take advantage of zero downtime and your customers are happy because of the 24/7 application availability. On the dark side, the iterative strategy takes much more time and adds complexity to the project. Your migration team must track which data has been already transported and ensure that users can switch between two systems to access the required information.

Another way to perform trickle migration is to keep the old application entirely operational until the end of the migration. As a result, your clients will use the old system as usual and switch to the new application only when all data is successfully loaded to the target environment. However, this scenario doesn’t make things easier for your engineers. They have to make sure that data is synchronized in real time across two platforms once it is created or changed. In other words, any changes in the source system must trigger updates in the target system.

Verdict. Trickle migration is the right choice for medium and large enterprises that can’t afford long downtime but have enough expertise to face technological challenges.
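One way to track which data has already been transported is a checkpoint that is updated in the same transaction as each batch. The following is a minimal sketch using in-memory SQLite databases; the `customers` and `checkpoint` tables, the batch size, and the sample rows are all hypothetical, and it only picks up new rows with increasing ids, not later changes to rows that were already copied.

```python
import sqlite3

def migrate_batch(source, target, batch_size):
    """Copy the next batch of rows past the stored checkpoint; return rows copied."""
    last_id = target.execute("SELECT last_id FROM checkpoint").fetchone()[0]
    rows = source.execute(
        "SELECT id, name FROM customers WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, batch_size),
    ).fetchall()
    if rows:
        with target:  # one transaction: data and checkpoint move together
            target.executemany("INSERT INTO customers (id, name) VALUES (?, ?)", rows)
            target.execute("UPDATE checkpoint SET last_id = ?", (rows[-1][0],))
    return len(rows)

# Hypothetical source system with seven rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO customers VALUES (?, ?)", [(i, f"customer-{i}") for i in range(1, 8)])
source.commit()

# Hypothetical target system with an empty table and a checkpoint at 0.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
target.execute("CREATE TABLE checkpoint (last_id INTEGER NOT NULL)")
target.execute("INSERT INTO checkpoint VALUES (0)")
target.commit()

batches = 0
while migrate_batch(source, target, batch_size=3):
    batches += 1
migrated = target.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(batches, migrated)
```

Because the checkpoint lives in the target and is committed together with the data, an interrupted run can simply be restarted and will resume after the last completed batch.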

Data migration process

No matter the approach, the data migration project goes through the same key phases — namely

  • data auditing and profiling,
  • data backup,
  • migration design,
  • testing, and
  • post-migration audit.


Key phases of the data migration process.

Below, we’ll outline what you should do at each phase to transfer your data to a new location without losses, extensive delays, or ruinous budget overruns.

Planning: create a data migration plan and stick to it

Data migration is a complex process, and it starts with the evaluation of existing data assets and careful designing of a migration plan. The planning stage can be divided into four steps.

Step 1: refine the scope. The key goal of this step is to filter out any excess data and to define the smallest amount of information required to run the system effectively. So, you need to perform a high-level analysis of source and target systems, in consultation with data users who will be directly impacted by the upcoming changes.

Step 2: assess source and target systems. A migration plan should include a thorough assessment of the current system’s operational requirements and how they can be adapted to the new environment.

Step 3: set data standards. This will allow your team to spot problem areas across each phase of the migration process and avoid unexpected issues at the post-migration stage.

Step 4: estimate budget and set realistic timelines. After the scope is refined and systems are evaluated, it’s easier to select the approach (big bang or trickle), estimate resources needed for the project, and set schedules and deadlines. According to Oracle estimations, an enterprise-scale data migration project lasts six months to two years on average.

Data auditing and profiling: employ digital tools

This stage is for examining and cleansing the full scope of data to be migrated. It aims at detecting possible conflicts, identifying data quality issues, and eliminating duplications and anomalies prior to the migration. Auditing and profiling are tedious, time-consuming, and labor-intensive activities, so in large projects, automation tools should be employed. Among popular solutions are Open Studio for Data Quality, Data Ladder, SAS Data Quality, Informatica Data Quality, and IBM InfoSphere QualityStage, to name a few.
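For small datasets, the basic idea behind profiling can be sketched in plain Python before reaching for a dedicated tool. The example below counts missing values per field and lists duplicated keys; the records and field names are made up, and it is not meant to reflect how any of the products above work.

```python
from collections import Counter

def profile(records, key_field):
    """Count missing values per field and list key values that occur more than once."""
    missing = Counter()
    for record in records:
        for field, value in record.items():
            if value is None or str(value).strip() == "":
                missing[field] += 1
    key_counts = Counter(r[key_field] for r in records)
    duplicates = sorted(k for k, n in key_counts.items() if n > 1)
    return dict(missing), duplicates

# Made-up records with one empty email, one missing country, and a repeated id.
records = [
    {"id": 1, "email": "a@example.com", "country": "PL"},
    {"id": 2, "email": "", "country": "PL"},
    {"id": 2, "email": "b@example.com", "country": None},
]
missing, duplicates = profile(records, key_field="id")
print(missing, duplicates)
```

The output of such a pass is a list of concrete problems to resolve before migration, rather than surprises discovered in the target system.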

Data backup: protect your content before moving it

Technically, this stage is not mandatory. However, best practices of data migration dictate the creation of a full backup of the content you plan to move — before executing the actual migration. As a result, you’ll get an extra layer of protection in the event of unexpected migration failures and data losses.
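As a small illustration of this step, Python's standard `sqlite3` module can copy a whole database with `Connection.backup` and the copy can be checked before the migration starts. The table and rows below are invented, and in-memory databases are used only to keep the sketch self-contained; in practice the backup would go to a file or separate storage.

```python
import sqlite3

# Hypothetical source database with a couple of rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO records VALUES (?, ?)", [(1, "Ada"), (2, "Alan")])
source.commit()

backup = sqlite3.connect(":memory:")  # in practice, a file path
source.backup(backup)  # copies the whole database into the backup connection

def row_count(conn):
    """Number of rows in the records table of the given connection."""
    return conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]

backup_ok = row_count(source) == row_count(backup)
print(backup_ok)
```

Verifying the backup right away, even with a check as simple as a row count, avoids discovering an unusable copy at the moment a rollback is needed.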

Migration design: hire an ETL specialist

The migration design specifies migration and testing rules, clarifies acceptance criteria, and assigns roles and responsibilities across the migration team members. Though several technologies can be used for data migration, extract, transform, and load (ETL) is the preferred one. It makes sense to hire an ETL developer — or a dedicated software engineer with deep expertise in ETL processes, especially if your project deals with large data volumes and complex data flow. At this phase, ETL developers or data engineers create scripts for data transition or choose and customize third-party ETL tools. An integral part of ETL is data mapping. In the ideal scenario, it involves not only an ETL developer, but also a system analyst knowing both source and target system, and a business analyst who understands the value of data to be moved. The duration of this stage depends mainly on the time needed to write scripts for ETL procedures or to acquire appropriate automation tools. If all required software is in place and you only have to customize it, migration design will take a few weeks. Otherwise, it may span a few months.
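A minimal sketch of what a data mapping looks like inside an ETL script is shown below. The source and target field names, the date format, and the sample row are all hypothetical; real mappings are usually far larger and are agreed between the ETL developer and the analysts mentioned above.

```python
from datetime import datetime

# Hypothetical mapping: target field -> (source field, transform function).
FIELD_MAP = {
    "full_name": ("NAME", str.strip),
    "email": ("EMAIL_ADDR", str.lower),
    "enrolled_on": ("ENROLL_DT", lambda s: datetime.strptime(s, "%d/%m/%Y").date().isoformat()),
}

def extract():
    """Stand-in for reading the legacy system; returns made-up rows."""
    return [{"NAME": "  Ada Lovelace ", "EMAIL_ADDR": "ADA@Example.org", "ENROLL_DT": "23/11/2020"}]

def transform(row):
    """Apply the mapping to one source row and return a target-shaped row."""
    return {target: fn(row[source]) for target, (source, fn) in FIELD_MAP.items()}

def load(rows, target_store):
    """Stand-in for writing to the target system; appends to a list."""
    target_store.extend(rows)

target_store = []
load([transform(r) for r in extract()], target_store)
print(target_store)
```

Keeping the mapping in one data structure, separate from the extract and load code, makes it something a business analyst can review without reading the rest of the script.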

Execution: focus on business goals and customer satisfaction

This is when migration — or data extraction, transformation, and loading — actually happens. In the big bang scenario, it will last no more than a couple of days. Alternatively, if data is transferred in trickles, execution will take much longer but, as we mentioned before, with zero downtime and the lowest possible risk of critical failures. If you’ve chosen a phased approach, make sure that migration activities don’t hinder usual system operations. Besides, your migration team must communicate with business units to refine when each sub-migration is to be rolled out and to which group of users.

Data migration testing: check data quality across phases

In fact, testing is not a separate phase, as it is carried out across the design, execution, and post-migration phases. If you have taken a trickle approach, you should test each portion of migrated data to fix problems in a timely manner. Frequent testing ensures the safe transit of data elements and their high quality and congruence with requirements when entering the target infrastructure. You may learn more about the details of testing the ETL process from our dedicated article.
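One common kind of migration test is reconciliation: comparing row counts and content fingerprints between source and target. The sketch below is an illustrative assumption about how such a check could look, using made-up rows held in memory; on real tables the rows would be streamed from each database.

```python
import hashlib

def table_fingerprint(rows):
    """Return (row count, order-independent SHA-256 digest) for a list of row tuples."""
    digests = sorted(hashlib.sha256(repr(row).encode("utf-8")).hexdigest() for row in rows)
    combined = hashlib.sha256("".join(digests).encode("utf-8")).hexdigest()
    return len(rows), combined

source_rows = [(1, "Ada"), (2, "Alan"), (3, "Grace")]
target_rows = [(3, "Grace"), (1, "Ada"), (2, "Alan")]  # same data, different order
broken_rows = [(1, "Ada"), (2, "Alan"), (3, "Grac")]   # one truncated value

matches = table_fingerprint(source_rows) == table_fingerprint(target_rows)
detects_damage = table_fingerprint(source_rows) != table_fingerprint(broken_rows)
print(matches, detects_damage)
```

Sorting the per-row digests makes the fingerprint independent of row order, so the check passes when the target returns the same rows in a different sequence and fails when any value has been altered or lost.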

Post-migration audit: validate results with key clients

Before launching migrated data in production, results should be validated with key business users. This stage ensures that information has been correctly transported and logged. After a post-migration audit, the old system can be retired.

Golden rules of data migration

While each data migration project is unique and presents its own challenges, some common golden rules may help companies safely transit their valuable data assets, avoiding critical delays.

  • Use data migration as an opportunity to reveal and fix data quality issues. Set high standards to improve data and metadata as you migrate them.
  • Hire data migration specialists and assign a dedicated migration team to run the project.
  • Minimize the amount of data to be migrated.
  • Profile all source data before writing mapping scripts.
  • Allocate considerable time to the design phase as it has a high impact on project success.
  • Don’t be in a hurry to switch off the old platform. Sometimes, the first attempt at data migration fails, demanding a rollback and another try.

Data migration is often viewed as a necessary evil rather than a value-adding process. And this seems to be the key root of many if not all difficulties. Considering migration an important innovation project worthy of special focus is half the battle won.


Former Thesis Topics


Master in Migration Studies

Best master thesis in migration studies award.

Every year we select the most outstanding Master thesis, which is offered the opportunity to be published in our GRITIM-UPF Working Paper Series.

Master Thesis Details of the 13th Edition of the Master´s program in Migration Studies (2021-2022)

 

Student | Thesis title | Supervisor
Iris Egea Quijada | A case study on the revitalisation of shrinking Spain: Migrant reception in rural areas | Martin Lundsteen
Marta Egidi | The Externalization of Borders: a Postcolonial Analysis of the Bilateral Agreements between Italy and Libya and the Political Discours | Oriol Puig Cepero
Aida Casanovas i Oliveres | The External Relations of Mediterranean Cities with Civil Society Organisations in Migration Governance | Ricard Zapata-Barrero
Eréndira León Salvador | The Holy Trinity of SAR NGOs images in the Mediterranean: Colonialism, Racialization, and Gender biases | Lorenzo Gabrielli
Anna Maria Oms Graells | Examining the role of gender in Unaccompanied Foreign Minors’ socioeconomic transition to adulthood in Catalonia | Evren Yalaz
Theresa Rappold | Between Colonialism, Clientelism and Migration: How Malta’s Past and Present Influence Civil Society Organisations Working with Migrants | Juan Carlos Triviño
Camille de Sélys | Asylum-seeking women’s perception of empowerment The case study of the Red Cross center of Fraipont, Belgium | Evren Yalaz
Nuria Pedros Barnils | Covid-19 short-term socioeconomic impacts on the migrant population in pain | Evren Yalaz
Zelal Yekbun Kiraz | Kurdish film festivals as political participation activity | Lorenzo Gabrielli
Marc Borràs | ‘I don't want to have poor migrants around, move’em all to the neighboring town!’: The municipal register of inhabitants as a tool for selected resident population. | Dirk Gebhardt
Lillie Stephens | The Law of Historical Memory as Racialized: A Postcolonial Approach to Spanish Citizenship Policy | Zenia Hellgren
Henry Manning | Title 42 as exception and routine: Analyzing pandemic-era US border policy as securitization | Lorenzo Gabrielli
Mathilde Eiksund | Partial Colorblindness in the Norwegian Parliament A Discourse Analysis of Norwegian Parliamentary Debates | Zenia Hellgren
Liana Wool | Rethinking Refugee Camps and Integration: Assessing the Needs for Workforce Training in Refugee Camps | John Palmer
Tatiana Lucas Rodrigues | Interculturalism in practice: an analysis of the Catalog of Anti-rumor Activities adopted by the Anti-rumor Strategy promoted by the Barcelona Interculturality Plan | Dirk Gebhardt
Roosmarijn Sybesma | ‘THE SOUND OF SILENCE’ Violating Border Control Practices in the case of Gran Canaria | Silvia Morgades Gil
Alessandra Amati | The psycho-social phenomena of migration: How integration policies impact the acculturation and mental health of undocumented immigrants living in the United States - A Research Proposal | Verónica Benet
Pauline Habraken | Watching from the sideline. On the Emergency Trust fund in Niger and a lack of transparency within the European Union | Silvia Morgades Gil
Ruby Knight | From Changing Climates to Changing Homes. Exploring how adaptation policies are used to reduce vulnerabilities for internal climate migrants in South-West Kenya during periods of heavy rain. | Oriol Puig Cepero
María del Mar Torreblanca Rodríguez | State of the Art: The instrumentalization of migration in the frame of the European Union externalization policy | Oriol Puig Cepero
Michela Messina | Refugees Crisis 2015: the continuum of the emergency approach in the Italian reception system, the case of CAS (Extraordinary reception centers) | Zenia Hellgren
Marie-Ève Lacroix | Unaccompanied Minors Who Transitioned to Adulthood’s Experience of Liminal Legality: A Case Study of Barcelona | Juan Carlos Triviño
Heléne DeKorne | Institutional bias: Salvadoran derivative refugee applicants and the material support bar | John Palmer

Master Thesis Details of the 12th Edition of the Master´s program in Migration Studies (2020-2021)

 

Student | Thesis title | Supervisor
Federica Peloso | The consequences of climate migration with a focus on gender and intersectionality | Zenia Hellgren
Victor Tretter | Conspiracy Theories and migration: how the conspirational discourse against immigrant shapes political and violent action from the far-right | Martin Lundsteen
Giulia Marasca | The role of feminist organization for the protections of immigrants rights during the Covid-19 pandemic: the case of Barcelona | Juan Carlos Treviño
Philine Linh Matzen | The differences between foreign and national criminals represented in media coverage | Evren Yalaz
Mireia Ros | Canarian narratives after the migration phenomenon of 2020 | Juan Carlos Treviño
Isabel Clifford | Barriers to Homing and Integration for Asylum-Seeker and Refugee Children in British Primary Schools: A Research Proposal | Zenia Hellgren
Lina Montoya | Rural Migration in Catalunya | Juan Carlos Treviño
Nuria Pedros Barnils | Covid-19 short-term socioeconomic impacts on the migrant population in pain | Evren Yalaz
Gharam Al Yousef | Refugee Return to Syria: The case of Lebanon | Lorenzo Gabrielli
Silvia Caraballo | Transnationalism and the Cuban Diaspora's Political Participation in Spain and the United States of America | Evren Yalaz
Afroditi Konstantopoulou | How violence against female migrants in the border is depicted in photojounalism | Evren Yalaz
Hania Eid | The commodification of Migrants and Refugees: Modern-day Slavery in Libya and Lebanon | Lorenzo Gabrielli
Vasiliki Iliopoulou | The integration policies about immigrants in Grecee | Martin Lundsteen
Federica Rossi | “Decreto Sicurezza”: a new hope for the recognition of environmental migrants in Italy | Lorenzo Gabrielli
Mary Carmen Loor Cañarte | Sociolaboral effects of COVID19 irregular migrants the (in)existance measures of Spanish goverment to alleviate them | Aida Torrez Pérez
Laura Valerie Fritz | How could the situation of forcibly displaced unaccompanied children and youth be improved? A case study taking into consideration the impact of psychosocial programs on the lives of unaccompanied children and youth | Dirk Gebhardt
Cansu Segur | Intersectionality of Identity and Migration: A case study on Dom Refugees in Turkey | Dirk Gebhardt

Master Thesis Details of the 11th Edition of the Master´s program in Migration Studies (2019-2020)

 

 

Student | Thesis title | Supervisor
Anna Porta Pi-Sunyer | The Role of Interculturalism in Catalan Universities. | Gemma Pinyol-Jiménez
Aurelia Eleonora Tolloy | The Five Pillars of Identity: Integration of Syrian Refugees in Austria. | Veronica Benet-Martinez
Bianca Steffenhagen | Recognizing Super-Diversity? How the Offices of Interculturality in Munich and Barcelona Meet the Challenges of a Diverse Population. | Dirk Gebhardt
Chigozie Ruth Ogbonna | Economic Migration and Brain Drain in Nigeria. | Lorenzo Gabrielli
Emre Sepici | Transnational Practices of Syrian Refugees in Turkey. | Lorenzo Gabrielli
Irene Rocchi | Discourse on Islam in Italy: The Socio-political Effects of the Politization of Religion by the Far-right. | Evren Yalaz
Janis Janowsky | Knowledge Exchange in the Spanish Network of Intercultural Cities. | Daniel de Torres Barderi
Jelena Luyts | The Representation of Migrants in the Belgian Press: Before and after the Terrorist Attacks of March 22nd 2016. | Evren Yalaz
Malin Johnsson | A Change of Heart? A Comparative Study of the Framing of Immigration in Swedish Newspapers in 2015. | Zenia Hellgren
Maria del Rosario Perea Garcés | The Rap of the Outcasted: A Discourse Analysis of Spanish Rap Music and its Role in Migrants’ Political Participation in Spain. | Martin Lundsteen
Michelle Crijns | Queer Asylum: Homonationalism, Orientalist Narratives, and the Fight for Identity Recognition. | Ricard Zapata-Barrero
Mostafa El Kordy | Rafah Border: Terrorism and Border Control throughout Different Regimes. | Lorenzo Gabrielli
Niki Pyrovolaki | Teitiota´s Case and its Impact on the International Legal Framework on Climate-forced Displacement. | Daniel de Torres Barderi
Rebecca Massaro | The (In)Effectiveness of the U.S. Immigration Policy. | Aida Torres Pérez
Rose Mirene Mouansie Mapiemfou | Making the Invisible of Migration Visible. Highly Skilled Migrant Women: How to Enforce their Agencies? | Zenia Hellgren
Shannon Gouppy | Identity (De-)Construction of Muslim Artists in French-speaking Belgium: A gender Comparison. | Marco Martiniello
Sonay Barazesh | The Syrian Refugee Labor Supply Shock in Turkey, Jordan and Lebanon: Literature on the impacts on labor markets, economies and policies | Ivan Martín
Tarek Saliba Rodriguez | Interculturalism and the Catalan Pro-independence Movement: The End of Catalan Nationalism? | Gemma Pinyol-Jiménez

Master Thesis Details of the 10th Edition of the Master´s program in Migration Studies (2018-2019)

Student | Thesis title | Supervisor
Adnane Derj | Football Supporterism´s Influence on Migrants´Integration: The Case of Standard de Liège. | Marco Martiniello
Ahmed Kadiri | How did Canada Become One of the Most Popular Destination Countries for Immigration in the World? An Analysis of the “Canadian Exceptionalism” regarding Immigration. | Lorenzo Gabrielli
Arife Demir | How do Racist Crimes against Immigrants have Repercussions in Society? The Analysis of the NSU Case in Germany | Martin Lundsteen
Aylin Huri Kuyucu | Highly Skilled Turkish Migrants in Barcelona and Berlin: Negotiating Boundaries of Turkishness. | Evren Yalaz
Joelle Nicole Spahni | EU´s Responsibility on Libya´s Detention Centers. | Silvia Morgades Gil
Philippa Sophie Fraas | The Impact of the “Burka Ban” in Denmark The Veiled Women’s Lived Experiences | Zenia Hellgren
Jordan Astyn Kaye | A Crisis in the Making: The Latinx Threat Narrative and U.S. Border Enforcement Spiral. | Daniel de Torres Barderi
Maria Claret Campana | At the Intersection of Security and Religious Management: The Case of Salafism and Salafi Imams in Catalonia. | Ricard Zapata-Barrero
Marieke A.H. Ekenhorst | “Don´t Touch My Hair” – Unheard Voices of Afrofeminisim in Spain. | Gemma Pinyol-Jiménez
Michéle Foege | Building Peace by Distance: Taking the Example of Palestinians and Israelis Living in Barcelona. | Martin Lundsteen
Micol Montesano | Social Capital within the Camp: Italy and the Case of Asylum Seekers in ´Extraordinary Reception Centers´. | Dirk Gebhardt
Pablo Martínez Roca | Nowhere over the Rainbow: Discrimination, Migratory Syndrome and Stigma in LGBT + Migrants and Refugees. | Veronica Benet-Martinez
Pablo André Viteri Moreira | The Quito Process: The First Step towards a Regional Agreement on Migration in South America. | Dirk Gebhardt
Stephanie Halperin | A Reinterpretation of Spanish Identity: Dual Citizenship for Sephardic Jews. | Zenia Hellgren

Master Thesis Details of the 9th Edition of the Master´s program in Migration Studies (2017-2018)

Alejandra Chávez Tristancho

How Could We Take Advantage of Diversity? An Analysis from the Private to the Public Sector.

Gemma Pinyol-Jiménez

Ana Calvo Sierra

Offshoring Asylum in the EU: An Analysis of the Limits Imposed by the European Standards of Human Rights.

Silvia Morgades Gil

Chiara Scalera

Residential Segregation and Islamic Radicalisation: The Case of Second-generation Muslim Immigrants in Catalonia.

Zenia Hellgren

Dino Islamagic

National and Cultural Identity among Second Generation Immigrants Case Study: Second Generation Bosnians in Norway.

Dirk Gebhardt

Fernanda Honesko

The Lack of International Protection for Environmental Migrants.

Aida Torres Pérez

Giulia Dagonnier

Access to Healthcare among Migrant Women in Brussels: Residential Segregation and Intersectionality.

Zenia Hellgren, Jean-Michel Lafleur, Daniela Vintila

Gülce Şafak Özdemir

Solidarity Building in Practice: The Case Study of Barcelona.

Ricard Zapata-Barrero

Gulperi Destina Eryigit

Motivations behind the Study of Catalan by Immigrants in Barcelona.

Evren Yalaz

Julia Koopmans

Local Integration Policies for Temporary Migrants in the European Union: Filling the Gap between the Integration Needs of Transient Migrants and Settlement-oriented Policies.

Dirk Gebhardt

Juni Van Kleef

The Discourse of ‘Dutchness’: A Case Study about the Segregation and Discrimination in Amsterdam.

John Rossman Bertholf Palmer

Karina Melkonian

A Study of the Prevalence of Compassion Fatigue Among Humanitarian Workers.

John Rossman Bertholf Palmer

Kristina Rumenova Stankova

Bulgarian Elderly Population's Perception of Immigrants: The Case of the “Migrant Hunters”.

Juan Carlos Triviño Salazar

Natasha Tavares

Different Immigrant Groups, Varying Threats and Distinct Emotions.

Verónica Benet-Martínez

Paola Aiello

Narratives of Migration through Political Discourses. The Italian Case of Salvini: 2014 European Parliament Election - 2018 National Political Election.

Marco Martiniello

Saskia Natalia Basa

Is the Cooption of LGBTI Claims Fuelling Racism and Islamophobia? Reflections on the Rise of Right-Wing Homonationalism in Europe.

Gemma Pinyol-Jiménez

Shaden Anwar Masri

EU Remote Control Policies and the Implications on Migrants' and Asylum Seekers' Rights: A Security-Based Approach. Turning a Blind Eye on Human Rights of Migrants.

Silvia Morgades Gil

Steffy Dubois

Political Mobilization of Irregular Street Vendors: The Case of Barcelona.

Marco Martiniello

Stéphanie Monique Martin

Mobilisation Contre les Centres de Retention pour Migrants: Comparaison entre la Belgique et l’Espagne.

Christophe Dubois

Stephen Bolmain

Immigrants and Nationalists: Political Participation of Immigrants in the Contemporary Catalan Nationalist Movement.

Marco Martiniello

Yuri Yu

Right to Work vs Self-reliance: A Critical Analysis of Economic Integration of Refugees in the Segmented Labour Markets in Europe.

Iván Martín

IMAGES

  1. Legacy Data Migration: A Comprehensive Guide

  2. Critical Steps in a Data Migration Strategy

  3. Data Migration: Process, Strategy, Types, and Key Steps

  4. The basics of data migration: A comprehensive guide

  5. Data Migration Layers shows Data Migration Layers architecture. This

  6. Configuration Data and Data Migration

VIDEO

  1. HOW TO DO DATA INTERPRETATION IN THESIS EASILY (UNDER 30 MINS)

  2. Tesis Doctoral Ph.D. Doctor Internacional en Estudios Migratorios Doctoral Thesis Migration Studies

  3. Data Migration Demo: Mapping & Transformation

  4. What People Don't Tell you about Self-Funded PhD from New Zealand!

  5. Seamless On-Premises MySQL Migration to Azure Database

  6. CHAPTER-4 OF A THESIS

COMMENTS

  1. PDF Data migration, a practical example from the business world

    There are several different types of data migration; the data migration process is simply the process of moving data between different storage units, computer systems or formats. This thesis focuses on a practical example of data migration between database systems and the work that is needed to make a migration possible.

  2. PDF Developing a Structured Approach or Framework for Evaluating the

    Developing a structured approach or framework for evaluating the efficiency of data migration process. Master's thesis, 2024. 67 pages, 19 figures, 5 tables. Examiners: Professor Aki Mikkola and Ilkka Donoghue, D.Sc. (Tech.). Keywords: data migration, product life cycle, product data management, engineering change

  3. PDF DATA MIGRATION FROM STANDARD SQL TO NoSQL A Thesis Submitted to the

    research in this document is to suggest a methodology for data migration from the RDBMS databases to the document-based NoSQL databases. Data migration between the RDBMS and the NoSQL systems is anticipated because both systems are currently in use by many industry leaders. This thesis presents a Graphical User

  4. PDF Development of a General Data Migration Framework in a Case ...

    The outcome of this thesis was a general data migration framework. 1.3 Thesis Outline: The first section introduces the business challenge as well as the objective of the thesis. The second section describes the research approach, data collection and project plan. In the third section, the current state analysis is described.

  5. PDF Improving master data quality in data migration of ERP implementation

    This thesis studied the data from ERP implementation project perspective. The main interest was to understand how the data quality can be improved in the data migration of an ERP system implementation. Firstly, this chapter provides a short background for the study. Secondly, the research problem, questions, objectives and focus are introduced.

  6. Data Migration

    Thesis is a cloud-based student information system for higher education. Learn how we can help migrate your student data accurately and securely. ... We have an entire data migration strategy to transfer student records, enrollment information, academic history and other crucial data points to your new SIS. Rest assured, we'll maintain data ...

  7. PDF Automating Code and Data Migration After Schema Refactorings

    This thesis describes two techniques for automating web application developer tasks created when the application's underlying database schema is refactored. These schemas are generally refactored to improve performance or maintainability, but doing so creates two programmer tasks: code migration and data migration. My research with the UT ...

  8. PDF A MapReduce based Algorithm for Data Migration in a Private Cloud

    For my thesis, I have proposed a novel architecture and algorithm to study this phenomenon. I have used MapReduce data processing software framework within a private Cloud environment to determine the data loss. I have proposed metrics such as efficiency, speed, computation time and the cost of data migration and formulae for these metrics to

  9. (PDF) Data Migration

    Sarmah S. 2018, Scientific & Academic Publishing. This document gives the overview of all the process involved in Data Migration. Data Migration is a multi-step process that begins with an analysis of the legacy data and culminates in the loading and reconciliation of data into new applications. With the rapid growth of data, organizations are ...

  10. Data migration: relational RDBMS to non-relational NoSQL

    Data migration: relational RDBMS to non-relational NoSQL. As part of achieving specific targets, business decision making involves processing and analyzing large volumes of data, which leads to enterprise databases growing day by day. Considering the size and complexity of the databases used in today's enterprises, it is a major challenge for ...

  11. Data Migration from Legacy Systems to Modern Database

    Data migration is the process of transferring data from one system to another. Data migration differs from data movement, data integration and conventional ETL (extract, transform and load) respectively by migrating to a new environment, by being a one-time process and by maintaining the consistency of usage.

  12. A Review on Database Migration Strategies, Techniques and Tools

    This thesis contributes a solution for migrating RDBs into object-based and XML databases. ... this paper reports some early results from a long-term project to provide support for the migration ...

  13. (PDF) Data Migration

    Abstract: This document gives the overview of all the process involved in Data Migration. Data Migration is a multi-step process that begins with an analysis of the legacy data and culminates in ...

  14. PDF Dynamic Computation Migration in Distributed Shared Memory Systems

    to pure data migration; always using data migration for reads only improves performance by 5%. Keywords: computation migration, data migration, distributed shared memory, parallel programming. Thesis Supervisor: M. Frans Kaashoek. Title: Assistant Professor of Computer Science and Engineering. Thesis Supervisor: William E. Weihl

  15. Human migration: the big data perspective

    We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country.

  16. PDF Big Data and Environmental Migration: Can Google Trends Explain

    BIG DATA AND ENVIRONMENTAL MIGRATION: CAN GOOGLE TRENDS EXPLAIN MIGRATION DRIVERS? A Thesis submitted to the Faculty of the Graduate School of Arts and Sciences of Georgetown University in partial fulfillment of the requirements for the degree of Master of Public Policy By Lilith Tromblay, B.A. Washington, D.C. April 22, 2021

  17. GitHub

    About. BSc thesis: System for migrating social network data from heterogeneous sources to a graph database. Authors: Gabriel Kępka, Piotr Makarewicz

  18. PDF A Data-Driven Analysis of Environmental Migration By Kelsea Best

    A Data-Driven Analysis of Environmental Migration in Coastal Bangladesh By Kelsea Best Thesis Submitted to the Faculty of the Graduate School of Vanderbilt University in fulfillment of the requirements for the degree of MASTER OF SCIENCE in Earth and Environmental Sciences August 9, 2019 Nashville, TN Approved: Jonathan Gilligan, Ph.D.

  19. Where to get Data: a collection of resources for your thesis

    Google has a dedicated search engine for datasets. It is freely available and indexes data that implements a particular schema.org format. Some of the data may be behind paywalls. However, academics and students can usually contact the data provider to get access to it, or to a selected portion that is sufficient for their study.

  20. Data Migration: Process, Strategy, Types, and Key Steps

    data auditing and profiling, data backup, migration design, execution, testing, and post-migration audit. Key phases of the data migration process. Below, we'll outline what you should do at each phase to transfer your data to a new location without losses, extensive delays, or ruinous budget overruns.

  21. A Complete Data Migration Checklist For 2024

    Determine backup procedures in case of data loss or errors during migration. Establish a rollback plan in case any issues arise, allowing for a quick reversal to the previous state if needed. Determine how long the old data/system needs to be archived. Define the methods and storage solutions for archiving.

  22. Former Thesis Topics

    Master Thesis Details of the 12th Edition of the Master's program in Migration Studies (2020-2021). Master Students, Thesis Topics and Supervisors. Name of Student: Federica Peloso. Topic: The consequences of climate migration with a focus on gender and intersectionality. Supervisor: Zenia Hellgren.

  23. PDF Jørgen Carling and Mathilde Bålsrud Mjelva Survey instruments and

    Data on migration aspirations have in part been used in attempts to predict or forecast migration flows. Even though most prospective migrants face daunting obstacles and end up staying, variations in the incidence of migration can shed light on the evolution of migration flows. Moreover, there are additional reasons for