Collecting “Big Data” to Understand the Impact of Global Internet Censorship and Surveillance
March 24, 2017

On Friday, March 24, 2017 at 11:00 am in Cramer 221, Dr. Jed Crandall of UNM will give a talk on Collecting “Big Data” to Understand the Impact of Global Internet Censorship and Surveillance.
Abstract
Censorship and surveillance on the Internet are a global phenomenon with far-reaching
and transformative effects on society, yet research on this phenomenon is still
nascent and limited in scope (e.g., to a single country or a short timeframe).
Important questions go unanswered. For example, how commonly are support websites
made inaccessible to at-risk populations (such as domestic abuse victims) because
they are mis-categorized as pornography? What role do software and Internet media
companies, either intentionally or unwittingly, play in state surveillance in various
parts of the world?
Who decides which keywords trigger censorship or surveillance in different market
segments for different countries? How are the national-scale firewalls that limit
Internet traffic evolving?
Longitudinal datasets that are global in scope are needed to truly understand the
impact and nature of Internet censorship and surveillance, but how do you collect
large datasets about a phenomenon that is clouded in secrecy? In this talk I’ll discuss
two research thrusts that my group is pioneering, each of which has the potential
to scale to truly “big data”.
One research thrust is TCP/IP side channels, which make it possible to measure network
conditions between any two points in the world without having any infrastructure
at either point or in between. In other words, using a single Linux machine here in
North America, we can, for example, determine if an IP address in Zimbabwe can
communicate with another IP address in Saudi Arabia
or if a firewall restricts their communications. It sounds like magic, but I’ll explain
how this is made possible through spoofed return IP addresses and careful monitoring
of remote machines’ network stack state. Our goal is to measure Internet censorship
everywhere, all the time.
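To make the side-channel idea concrete, here is a minimal offline sketch in Python. It assumes one classic mechanism of this kind: a remote machine that stamps every outgoing packet from a single, globally incrementing 16-bit IP identification (IPID) counter. The abstract does not say which mechanism the talk covers, so the counter model, the class `SimulatedHost`, and both function names are illustrative assumptions rather than the group's actual method. The code sends no real packets; it only simulates how a measurement machine could count replies it never sees directly.

```python
IPID_MODULUS = 2 ** 16  # the IPv4 identification field is 16 bits wide


class SimulatedHost:
    """Toy model of a machine with one global IPID counter (an assumption)."""

    def __init__(self, start_ipid=0):
        self.ipid = start_ipid % IPID_MODULUS

    def send_packet(self):
        """Send any packet; return the IPID it carried and advance the counter."""
        current = self.ipid
        self.ipid = (self.ipid + 1) % IPID_MODULUS
        return current


def packets_sent_between_probes(host, spoofed_probes, path_blocked):
    """Count packets the host sent between two of our own probes.

    We probe the host, send `spoofed_probes` SYNs to a server with the host's
    address as the spoofed return address, then probe the host again. If the
    path is open, the server's SYN-ACKs reach the host, which answers each
    unexpected SYN-ACK with a RST and so advances its counter.
    """
    first = host.send_packet()      # reply to our first probe
    if not path_blocked:
        for _ in range(spoofed_probes):
            host.send_packet()      # RST triggered by an unexpected SYN-ACK
    second = host.send_packet()     # reply to our second probe
    # Modular subtraction handles counter wraparound from 65535 to 0.
    return (second - first - 1) % IPID_MODULUS


def infer_blocked(extra_packets, spoofed_probes):
    """Guess that the path is blocked if fewer than half the RSTs showed up."""
    return extra_packets < spoofed_probes / 2


if __name__ == "__main__":
    for blocked in (False, True):
        host = SimulatedHost(start_ipid=65534)  # start near wraparound
        extra = packets_sent_between_probes(host, 5, blocked)
        print(f"blocked={blocked}: extra={extra}, inferred={infer_blocked(extra, 5)}")
```

This toy model ignores everything that makes real measurement hard: background traffic from the host, packet loss, and the direction in which a firewall drops packets. In practice those would have to be separated from the signal statistically.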
The second research thrust is reverse engineering. We are collaborating with the Citizen Lab at the University of Toronto to reverse engineer closed-source software and reveal its secrets. Some companies implement censorship and surveillance within their software, while others make claims about privacy and cryptography that aren’t true and thereby put the communications of journalists, activists, ethnic minorities, and many others at risk. The large amount of software that’s out there and is being used by at-risk populations makes this an essentially “big data” problem.