IOM-Microsoft release public dataset on victims and perpetrators of trafficking
First-ever publicly available dataset will improve the production of privacy-preserving data and accelerate evidence-based policy in the fight against human trafficking.
The International Organization for Migration (IOM) on 05 December released the first publicly available dataset linking the profiles of trafficking victims and perpetrators whilst preserving the anonymity and privacy of survivors. The dataset was produced via state-of-the-art technology developed in partnership with Microsoft Research.
The nature of the victim-perpetrator relationships represents a valuable source of insight to better assist survivors and prosecute offenders. By making this information openly and safely available for the first time, IOM and Microsoft Research aim to share this technique with humanitarian actors worldwide to improve the production of privacy-preserving data and accelerate evidence-based policy in the fight against human trafficking.
The Global Victim-Perpetrator Synthetic Dataset is available on the Counter Trafficking Data Collaborative (CTDC) data hub – the first global data portal on human trafficking. The dataset includes IOM case data from over 17,000 victims and survivors of trafficking identified across 123 countries and territories, and their accounts of over 37,000 perpetrators who facilitated the trafficking process from 2005 to 2022.
“Making data on human trafficking widely available to stakeholders while protecting the safety and privacy of victims in a sustainable manner is crucial to developing evidence-based responses,” said Monica Goracci, IOM’s Director of Programme Support and Migration Management. “IOM is delighted to work with Microsoft Research in overcoming the challenge of sharing administrative data for analysis.”
Since the 2019 Tech Against Trafficking Accelerator Program, IOM and Microsoft Research have worked together to develop and refine an approach to generate synthetic data from CTDC’s sensitive victim case records. The resulting synthetic case records accurately preserve the statistical properties of the original victim data without representing actual victims.
A new extension to this approach, which incorporates the gold standard of “differential privacy”, generates synthetic data with quantifiable privacy guarantees against any privacy attacks, even across multiple data releases.
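For readers unfamiliar with differential privacy, the general idea can be illustrated with a minimal sketch. This is not the method used by IOM and Microsoft Research, which the text above does not detail; it is the textbook Laplace mechanism applied to category counts, one common building block of differentially private data releases. The function names, the category labels and the toy records are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_counts(records, epsilon, rng, categories):
    """Return noisy per-category counts (illustrative sketch only).

    Assumes each record falls into exactly one category, so adding or
    removing one record changes one count by 1 (sensitivity 1). The
    category list must be fixed in advance, not derived from the data.
    """
    true_counts = Counter(records)
    scale = 1.0 / epsilon  # smaller epsilon = more noise = stronger privacy
    return {
        c: true_counts.get(c, 0) + laplace_noise(scale, rng)
        for c in categories
    }


if __name__ == "__main__":
    # Toy records with placeholder labels; not real case data.
    toy_records = ["A", "A", "B", "A", "C", "B"]
    noisy = dp_counts(toy_records, epsilon=1.0, rng=random.Random(),
                      categories=["A", "B", "C", "D"])
    for label, value in noisy.items():
        print(label, round(value, 2))
```

The parameter epsilon is what makes the guarantee quantifiable: it bounds how much any single record can influence the published numbers, and the bounds of successive releases add up, which is why the total privacy loss across multiple releases can be tracked.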
It’s an approach that promotes sustainable and long-term contributions to a shared evidence base in the collective fight against trafficking, making it possible to share more data and conduct more rigorous research while protecting privacy and civil liberties.
“At Microsoft, we believe everyone can benefit from collaborating around open data to make better decisions and tackle some of the world’s most pressing societal challenges,” said Darren Edge, Director at Microsoft Research and project lead.
“By protecting the privacy and safety of victims with synthetic data, and empowering policymakers to view, explore, and make sense of data through rich interactive dashboards, we are illustrating one of the many ways in which research technology can help to coordinate and amplify the efforts of anti-trafficking organizations around the world – or any organizations working to tackle human rights issues using open data.”
The new synthetic data solution is available as both open-source software and a free-to-use web application that enables creation of synthetic datasets interactively in the web browser.