These data doors include facial recognition sensors, ID scanners and sensors for collecting electronic device information.
At the Aq Mosque in Urumqi, 2018.
China’s ‘data doors’ scoop up information straight from your phone
The security screeners scan more than your face, picking up MAC addresses and IMEI numbers
Facial recognition devices have become ubiquitous across China. But what you probably didn’t know is that some of these machines can snatch up information straight from your smartphone.
While they look like regular metal detectors on the outside, they’re much more than that. Aside from facial recognition and ID card verification, the so-called “three-dimensional portrait and integrated data doors” vacuum up MAC addresses, IMEI numbers and other identifying information from electronic devices. This data is unique to a user’s hardware, and it could potentially be used to track people.
A new report from Human Rights Watch uncovered the use of these data doors at certain checkpoints in Xinjiang, where the government is using heavy surveillance to monitor the local Uyghur Muslim minority.
“People that went through it only knew that they were going through facial recognition, but they didn’t know identifying information from their electronic devices was also being collected to be logged and tracked,” said Maya Wang, a senior researcher for China at Human Rights Watch.
China already has the biggest video surveillance network in the world, called Skynet. It’s also trialing facial recognition blacklists, such as those that shame jaywalkers, unlicensed drivers and even bad tourists.
Collecting this kind of information from electronic devices is a new level of privacy invasion. A data door maker called Pingtech explains in a patent that in addition to IMEIs, the devices can pick up mobile phone Wi-Fi MAC addresses, IMSI and ESN numbers for identification and location tracking.
Data doors, however, are not the only way it’s happening. According to TechCrunch, a smart city system with facial recognition cameras in one Beijing district has also been equipped with sensors that monitor Wi-Fi enabled devices, suggesting it can collect IMEI and IMSI numbers. The system was discovered by Condition:Black security researcher John Wethington after the database was left accessible without a password.
What exactly this information is being used for remains an open question. Numbers such as IMEI are unique identifiers assigned to SIM-capable devices like mobile phones. Independent cyber security expert Greg Walton, who worked on the HRW report, said that aside from identification, mass transit systems might want to harvest unique identifiers from devices to measure traffic.
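To make the identifier concrete: an IMEI is a 15-digit number made up of an 8-digit Type Allocation Code (identifying the device model), a 6-digit serial number and a final check digit computed with the Luhn algorithm. The sketch below splits an IMEI into those parts and verifies the check digit. The function names are illustrative and the sample number is a widely cited example value, not one taken from the reports discussed here.

```python
def luhn_check_digit(body: str) -> int:
    """Compute the Luhn check digit for the first 14 digits of an IMEI."""
    total = 0
    # Walk the body right to left, doubling every other digit
    # starting with the rightmost one.
    for i, ch in enumerate(reversed(body)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return (10 - total % 10) % 10


def parse_imei(imei: str) -> dict:
    """Split an IMEI into its fields and report whether the check digit matches."""
    digits = imei.replace("-", "").replace(" ", "")
    if len(digits) != 15 or not all(c in "0123456789" for c in digits):
        raise ValueError("an IMEI has exactly 15 decimal digits")
    return {
        "tac": digits[:8],        # Type Allocation Code: device model
        "serial": digits[8:14],   # serial number within that model
        "check_digit": int(digits[14]),
        "valid": luhn_check_digit(digits[:14]) == int(digits[14]),
    }


sample = parse_imei("490154203237518")
```

Because the number is tied to the hardware rather than to an account or SIM card, it stays the same when the owner changes phone numbers, which is what makes it useful as a long-lived identifier.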
But this kind of information can be used to track people physically. In many countries, IMEIs and other information from phones are used by the police to track down stolen phones, missing people or suspects (they still need a warrant, at least in the US).
A device’s identifying numbers such as IMEI and IMSI could serve as a beacon for authorities. When this is combined with other data from facial recognition, surveillance cameras, license plates, or even phone records and social media posts, a clearer picture of a person’s life emerges.
“Now I can see who you talk to, on what devices, when you physically met with them,” Wethington explained.
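How device sightings could reveal meetings can be sketched in a few lines. The example below is a hypothetical illustration, not a description of any deployed system: it assumes each sensor logs a tuple of device identifier, checkpoint name and Unix timestamp, and it flags pairs of devices seen at the same checkpoint within a short time window. The record layout, the function name `co_presence` and the five-minute window are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def co_presence(sightings, window_seconds=300):
    """Find device pairs seen at the same checkpoint within a time window.

    sightings: iterable of (device_id, checkpoint, unix_time) tuples.
    Returns {(device_a, device_b): [(checkpoint, first_seen_time), ...]}.
    """
    by_checkpoint = defaultdict(list)
    for device, checkpoint, ts in sightings:
        by_checkpoint[checkpoint].append((ts, device))

    meetings = defaultdict(list)
    for checkpoint, seen in by_checkpoint.items():
        seen.sort()  # chronological order, so t2 >= t1 below
        for (t1, d1), (t2, d2) in combinations(seen, 2):
            if d1 != d2 and t2 - t1 <= window_seconds:
                pair = tuple(sorted((d1, d2)))
                meetings[pair].append((checkpoint, t1))
    return dict(meetings)


log = [
    ("aa", "gate-1", 1000),
    ("bb", "gate-1", 1100),
    ("cc", "gate-1", 5000),
    ("aa", "gate-2", 9000),
    ("bb", "gate-2", 9200),
]
pairs = co_presence(log)
```

In this made-up log, devices "aa" and "bb" pass two checkpoints together while "cc" passes alone. Repeated co-occurrence of this kind is what would let an analyst infer a relationship between two device owners.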
There’s currently no evidence of physical tracking. However, in Xinjiang, where authorities are monitoring and incarcerating the local Uyghur population on a massive scale, the data picked up from electronic devices is being logged in the Integrated Joint Operations Platform (IJOP). This platform is being used by local police to track suspicious behavior, which can be interpreted pretty broadly. Things like not using front doors, not talking to your neighbors or using Virtual Private Networks (VPN) can all be seen as suspicious behavior, according to the report.
The smart city system uncovered in Beijing also used its facial recognition capabilities to identify Uyghurs and individuals with criminal convictions and known drug abuse, TechCrunch’s analysis showed.
Researchers at HRW suggest that the Chinese police are using all this data to develop capabilities for something called reality mining. This is a term for machines collecting and analyzing data on human social behavior to predict patterns of behavior and map social relationships.
This isn’t inherently negative. According to MIT, reality mining could be used for things like stopping the spread of infectious diseases.
Wethington, however, describes it as behavioral surveillance. It relies on spotting anomalies and changes in people’s behaviors that could indicate a threat such as building a bomb or becoming a terrorist.
“The problem is that it’s subject to interpretation and rife for abuse,” Wethington said.
Other countries are also performing surveillance, he added, but none on the scale of China.
Xinjiang and Beijing are likely not the only places in China using the technology. Dilusense, another company that sells data doors, explains on its website that Yiwu city uses its systems to monitor train stations and other public spaces, especially those used by the Muslim population. The company’s systems are also being used at the Hong Kong-Zhuhai-Macao bridge, although it’s not clear whether all of these locations also collect electronic device information.
Reality mining is the collection and analysis of machine-sensed environmental data pertaining to human social behavior, with the goal of identifying predictable patterns of behavior. In 2008, MIT Technology Review called it one of the “10 technologies most likely to change the way we live.”
Reality mining studies human interactions through the use of wireless devices such as mobile phones and GPS systems, which give a more accurate picture of what people do, where they go, and with whom they communicate than more subjective sources such as a person’s own account. Reality mining is one aspect of digital footprint analysis.
Reality mining uses big data to research and analyze how people interact with technology every day, in order to build systems that allow for positive change from the individual to the global community. Reality mining also deals with data exhaust.
Individual Scale (1 person)
Individuals use mobile phones, tablets, laptops, cameras, and other internet-connected devices for a variety of purposes, creating a variety of data, from GPS locations to frequently asked questions on Google. Mobile phones carry so much data about the individual that they can now suggest restaurants based on our searches, visited places, and book preferences, and even guess the ends of the sentences we type. A simple application of reality mining is listening to voices and analyzing speech patterns to diagnose medical problems ranging from the flu to early-onset Parkinson’s. More powerful phones also allow for calendar customization and event tracking, which reveal individual behaviors: what a person deems important enough to track. Social websites also allow researchers to view snapshots of a person’s life by following status updates on Facebook or tweets on Twitter. Even more specifically, a recent app called Snapchat allows users to post videos, pictures, or even live streams of exactly what they’re doing as they’re doing it, all strong indicators of behaviors and interactions with the world.

In 2004, MIT conducted the Reality Mining Project, which gave 100 MIT students a Nokia 6600 that the researchers tracked in several ways: the cell tower IDs (a very cheap and unobtrusive way to measure location), the status of the phone (charging or idle), and any use of the phone’s applications (games, web surfing, and so on). They found that by collecting this kind of data, they could predict the students’ behavior with high accuracy. For example, if one of the students woke up at 10 AM on a Saturday, the researchers could predict what they were going to do that day using “eigenbehaviors”. This new way of understanding data opened doors for new research, and possibly for even larger survey research with detailed and accurate statistics.
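The text does not define eigenbehaviors; the term generally refers to the principal components of a person’s daily behavior data. The sketch below is a minimal illustration under simplifying assumptions: each day is reduced to a short vector of 0/1 flags (here, “at work” per time slot), and only the first principal component is extracted, using power iteration in plain Python. The MIT study used richer location labels and several components; the toy week and the function names are invented for the example.

```python
import math


def first_eigenbehavior(days, iterations=200):
    """Return (mean_day, first principal component) of a list of day vectors."""
    n = len(days)
    mean = [sum(col) / n for col in zip(*days)]
    centered = [[x - m for x, m in zip(day, mean)] for day in days]
    dim = len(mean)

    # Deterministic, non-uniform start vector. Power iteration fails only if
    # the start is exactly orthogonal to the leading component.
    v = [float(j + 1) for j in range(dim)]
    for _ in range(iterations):
        # w = X^T (X v): multiply by the scatter matrix without building it.
        scores = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered)) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance in the data
        v = [x / norm for x in w]
    return mean, v


def projection(day, mean, component):
    """Score of one day along a component (how strongly it shows the pattern)."""
    return sum((x - m) * c for x, m, c in zip(day, mean, component))


# Toy week: six time slots per day, 1 = at work. Five workdays, two days off.
workday = [0, 1, 1, 1, 0, 0]
day_off = [0, 0, 0, 0, 0, 0]
week = [workday] * 5 + [day_off] * 2
mean_day, component = first_eigenbehavior(week)
```

Projecting a day onto the component gives a single score that separates workdays from days off in this toy data. With real logs, a partially observed day can be scored the same way to estimate how the rest of the day is likely to unfold.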
There are hundreds of websites offering software for mobile phones that will track just about everything the phone does, useful for worried parents or people who want to increase their personal productivity. This data is then uploaded to a server and can be accessed at any time.
Although a lot of data can already be collected from personal devices, those devices make up only part of a person’s life. Reality miners can also use biometric devices to measure physical health and activity. There are many such devices, including the Fitbit, Nike+, and Polar and Garmin GPS watches. There is even an app called Sleep Cycle for iPhone and Android users that measures sleep quality, including the amount of sleep, and suggests optimal alarm settings. Using this data, reality miners may be able to measure a person’s actual health and the processes that allow us to function (or dysfunction). Heart attacks generally don’t have any longitudinal indicators, but this data, or the records of a person who engages in lifelogging, can be useful to the medical field: tracking the lifestyles of those who suffer heart attacks could help create preventative guidelines. There are several ways to start lifelogging. For instance, Google has its own device called Google Glass that has a heads-up display (HUD), a microphone, a processor, and a camera. These are all ways to log information in specific directories.
Community Scale (10 to 1,000 people)
Researchers began observing and recording behavior in large groups by using RFID badges. Data is also recorded in workplaces using knowledge management systems that try to improve worker productivity and efficiency, although a shortcoming of these systems is their inability to bring together the social and technological cultures of the workplace, so the behavioral data they provide is incomplete.

Another way to measure larger groups of people in a community is through conference attendance. This data tells researchers where participants are from, their ethnic demographics, and the actual number of people attending the event. Some conferences use smart badges with more functions than standard RFID badges. Companies like Microsoft and IBM have used them to record the number of people each attendee interacts with during the conference and to let people answer survey questions. The smart badges also record vocal interactions and when attendees are at certain booths, and can even alert booth workers when certain profiles come within a certain range of the booth. Smart badges have obvious advantages for reality miners gathering data. In 2009, nTag, a company that was then acquired by Alliance Technology, offered technology that could even tell users whom to talk to and exchange business cards electronically.

Another type of data reality miners are looking at is climate and environmental information. They collect data from neighborhoods using air-quality sensors, which record carbon dioxide and nitrogen oxides as well as the general climate. Information like this could help policy makers decide whether to act, or let them see progress. Another way to collect data about one’s surroundings is through Project Noah, an effort to collect data on plant species by geotagging the pictures of plants and fungi that people upload, allowing users to see the kinds of ecosystems they live in. This helps schools and students who want to collect data for projects, and it also lets bird-watchers know what kinds of birds are in the area.
City Scale (1,000 to 1,000,000 people)
For this section, a city is defined as 1,000 to 1,000,000 people. One way data is collected on a city scale is by monitoring traffic with traffic signals and speed cameras. Data can also be collected from police reports and road scanners, as well as GPS from mobile applications. Using this kind of traffic data, cities can create routes that best allow for the efficient movement and flow of traffic. A company called Inrix, started in 2010, has been compiling data on traffic and buys data from bridge operators and other transportation systems. It uses this data to predict traffic routes and times of congestion. Traffic can also be monitored through Bluetooth, a technology that Inrix does not consider. The University of Maryland completed a project in 2012 demonstrating that two Bluetooth sensors permanently placed two miles apart could accurately detect traffic speeds. All of this can be combined into route-suggestion algorithms that help people get to and from places in an efficient manner, with routes that update themselves in real time using these types of sensors and information. Waze, a notable start-up that is now a subsidiary of Google, also collected data (anonymously) from users who reported accidents, which earned them in-app currency and rewards.

For crime on the city scale, the first way to collect and view data is through historical research of previous reports within an area. Now, more complex algorithms automatically place officers in areas with high crime rates before any actual crime has been committed. Since 2005, the Memphis Police Department has been using a program called Blue CRUSH (Crime Reduction Utilizing Statistical History), which uses police reports and heat maps to distinguish between areas of high and low crime. The program updates itself weekly and allows the police department to change tactics accordingly. Using this kind of data will allow police departments to interact with society in a much more meaningful way, and will allow preventative rather than rehabilitative work.
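The heat-map idea can be illustrated with a simple grid count. The sketch below is a generic example and does not describe how Blue CRUSH itself works: it assumes incident reports have already been reduced to planar coordinates (in kilometers, for instance), bins them into square cells, and ranks the cells by incident count. The function names, the cell size and the sample points are made up for illustration.

```python
import math
from collections import Counter


def crime_heat_map(incidents, cell_size=1.0):
    """Count incidents per square grid cell.

    incidents: iterable of (x, y) positions in the same unit as cell_size.
    Returns a Counter keyed by (column, row) cell indices.
    """
    counts = Counter()
    for x, y in incidents:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        counts[cell] += 1
    return counts


def hot_cells(counts, top_n=3):
    """Return the top_n busiest cells as (cell, count) pairs."""
    return counts.most_common(top_n)


# Made-up report locations for one week.
reports = [(0.2, 0.3), (0.8, 0.9), (0.5, 0.1), (3.4, 2.2), (3.6, 2.9), (7.0, 7.0)]
grid = crime_heat_map(reports)
busiest = hot_cells(grid, top_n=2)
```

Recomputing the counts each week from fresh reports would mirror the weekly update cycle described above, with the busiest cells suggesting where patrols might be concentrated.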
National Scale (1,000,000 to 100,000,000 people)
On the national scale, governments play a much larger role. Census data are by far the easiest to acquire. Many nations make their census findings public via websites from which data can be downloaded and visualized for further analysis. “In addition, the World Bank conducts international surveys and compiles census data from all participating nations, a sort of one-stop shop for information on its member countries. These data are publicly accessible: they can be downloaded and independently sorted and analyzed. Importantly, the World Bank offers an open API that allows programmers to integrate various data into software applications. Using World Bank data, Google has integrated a simple visualization tool into its search results; a search query on the population of Botswana will pull up the number, the dated World Bank source, and a graph showing population change over decades.” Another source of data is the call data record (or call detail record, CDR), which is simply a log of phone calls and texts with information such as the time and the locations of both the caller or sender and the recipient. CDRs allow phone companies to view human mobility trends. Major data companies like Google, Facebook, and Twitter also allow researchers to track cultural trends and even when and where resources are allocated in times of natural disaster.
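As a rough illustration of the open API mentioned in the quotation, the sketch below builds a request for Botswana’s total population from the World Bank Indicators API and turns the JSON response into a year-ordered series. It assumes the v2 endpoint layout and the `SP.POP.TOTL` indicator code, and that the response is a two-element list of page metadata followed by records; the helper names are illustrative, and the network call is wrapped in a function so nothing is fetched until it is called.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://api.worldbank.org/v2"


def indicator_url(country, indicator, per_page=100):
    """Build the request URL for one country and one indicator, as JSON."""
    return (
        f"{BASE_URL}/country/{quote(country)}/indicator/{quote(indicator)}"
        f"?format=json&per_page={per_page}"
    )


def parse_series(payload):
    """Convert a decoded response into a sorted list of (year, value) pairs.

    Assumes payload looks like [page_metadata, [record, ...]], where each
    record has a "date" (year as a string) and a "value" (may be null).
    """
    if not isinstance(payload, list) or len(payload) < 2 or payload[1] is None:
        return []
    series = [
        (int(record["date"]), record["value"])
        for record in payload[1]
        if record.get("value") is not None
    ]
    return sorted(series)


def fetch_population(country="BWA"):
    """Download and parse total population for a country (needs network access)."""
    with urlopen(indicator_url(country, "SP.POP.TOTL"), timeout=30) as response:
        return parse_series(json.load(response))
```

Calling `fetch_population("BWA")` should return the kind of population-by-year series that the search-result graph described above is drawn from.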
Global Scale (100,000,000 to 7,000,000,000 people)
The biggest worry at the global scale is the spread of disease, and tracking it is one of reality mining’s best applications. With globalization, the ability to travel is unprecedented. The United Nations has created an agenda called the Millennium Development Goals (MDGs), eight goals that aim to improve the world. Collecting population data is the first step toward policy making on disease control; nations must also collect data on air travel, as billions of people travel by air each year, and on sea travel. Air travel carries more people each year than sea travel, but the primary reason for collecting data on shipments is that they often carry pests that carry diseases, food-borne illnesses, and sometimes invasive species of plants and animals. The task of collecting and managing this data seems monumental, but the World Bank has already started, supporting statistical efforts such as MAPS, the Marrakech Action Plan for Statistics. MAPS aims to complete six objectives, which include these three:
- Planning statistical systems and preparing national statistical development strategies for all low-income countries
- Ensuring full participation of developing countries
- Setting up the International Household Survey Network, a global collection of household-based socioeconomic data sets
For people traveling on flights, one source of data is the International Air Transport Association (IATA), which has been collecting data on about 90% of global air traffic on a monthly basis since 2000. This data has allowed researchers and professionals to assess how disease could spread from particular locations on Earth. Ships carry about 90% of global trade; in 2001, the Automatic Identification System was implemented to record the “comings and goings of sea traffic”.
From idea to business: reality mining
What began as one of many grandiose research ideas is increasingly turning into a commercial data service with real punch: “reality mining.” From the movement data of mobile phone users, together with general demographic and economic data, companies such as Sense Networks of New York and Path Intelligence of Portsmouth now build detailed behavioral profiles of consumers, Technology Review reports in its current issue 5/09 (on newsstands since April 17, or available to order online with free shipping).
Sense Networks, for example, has amassed several billion records over the past three years thanks to agreements with network operators, data brokers and taxi companies. A basic record consists of location coordinates, a date and an identification number that the company says is anonymous and that is mapped to a specific telephone number. These records can be linked with further data. Using specially developed algorithms, the movement patterns of millions of people, mainly in the US metropolitan areas of New York, Houston, Chicago and San Francisco, are first compared across the records, and so-called mobility graphs for consumers and for particular places are then computed from them.
Alex Pentland, a computer scientist at the Massachusetts Institute of Technology and one of the founders of Sense Networks, compares reality mining to an X-ray analysis of society. “We need to know next to nothing about a person, but if we observe the signals they send to their surroundings, we can predict with astonishingly high accuracy whether someone has more authority in a group of colleagues, whether they will buy something, or how negotiations over a pay raise will turn out,” Pentland says. Both Sense Networks and Path Intelligence, which can use less data than Sense because of stricter British data protection rules, now offer their analyses as a service to banks, railway companies and airport operators.
According to Pentland, the long-term goal is a kind of super phone book for the society of tomorrow, in which everyone is sorted by “behavioral ZIP codes.” With the help of such behavioral profiles, businesses from banks and retailers to bars and restaurants should soon be able to automate their marketing down to the individual customer.