Foreign intelligence and Strava’s ‘global heatmap’

Data thefts are nothing new. In the United States, several incidents attributed to China have involved vast amounts of personal information.

In 2014, the US health insurance company Anthem had data on almost 80 million customers stolen, including dates of birth, addresses, emails and employment information. The following year, United Airlines, the world’s second-largest airline, had flight manifests and other data stolen, including passenger names, travel origins and destinations.

Most strikingly, over the course of 2014 and 2015, the US Office of Personnel Management (OPM), the government agency that manages security clearances, had data from 21.5 million personnel records stolen, including social security numbers, residency, education, employment, health, and criminal and financial history. The fingerprint records of 5.6 million federal employees were also stolen.

All three cases are suspected to be the work of state-sponsored Chinese hackers. The Chinese government denied that it was involved in the OPM hack and later arrested hackers connected to the breach. Curiously for a criminal action, the stolen OPM data hasn’t been used elsewhere for financial gain. Nor has the stolen Anthem data been used to make money.

The OPM breach involved a mass of information that would be of enormous value to a foreign intelligence organisation; it was described as a ‘treasure trove’ by former FBI director James Comey. US intelligence officials I’ve met were viscerally affected by the OPM breach, alarmed by both the personal nature of the information stolen and the sheer volume of the data taken. One prominent theory is that the hackers intend to use the information to build a massive database on US intelligence personnel that will be mined to advance China’s intelligence interests.

When combined with the other large volumes of information taken, the OPM data can be cross-referenced to reveal even more valuable information that doesn’t exist in any single dataset. Employment history combined with travel, financial and health data, for example, would allow Chinese intelligence agencies not only to identify people of intelligence interest, but also to immediately develop targeted profiles and identify patterns linking otherwise separate people or locations. An individual’s employment history combined with their travels might reveal the relationships between otherwise covert facilities. And cross-referencing employment data with financial and medical records might reveal possible avenues for recruitment or blackmail.

China’s push to become a world leader in artificial intelligence (AI) will make the incentives to collect this kind of comprehensive data greater than ever. Chinese intelligence agencies will be motivated to use new techniques and algorithms to mine their data. Part of the strength of AI is in its processing power and algorithms, but many AI algorithms rely on large amounts of data. Gathering more complementary large-scale datasets will mean better results.

It’s worth noting that in September 2015, China’s president Xi Jinping and US president Barack Obama reached an agreement that neither country would ‘conduct or knowingly support cyber-enabled theft of intellectual property, including trade secrets or other confidential business information, with the intent of providing competitive advantages to companies or commercial sectors’. That agreement, however, covers only theft for commercial advantage, so it will make no difference whatsoever to this kind of large-scale data theft, which serves intelligence purposes. Cyber espionage is still very much on the table as a tool of statecraft.

Any organisation that holds large amounts of information is a target, but one source that would be of high interest is Strava’s activity-tracking data. Strava, a fitness activity tracking service, recently made headlines because its ‘global heatmap’ could be used to identify and profile military and intelligence bases. But Strava also holds private data that would be invaluable when combined with the other stolen datasets I’ve mentioned. If Strava’s systems were breached, even sensible users with strict privacy settings would have their activity exposed to the hackers. If that data is combined with previously stolen employment, financial, medical and travel data, it could be used not only to identify people of intelligence interest, but also to provide information about their patterns of life, movements and exercise interests over potentially many years.

Recent large-scale hacks point to China’s voracious appetite for data to enhance its intelligence-gathering efforts. That information will be fed into big-data programs that will be part of China’s investment in becoming an AI powerhouse. Given that China has no compunction about collecting personal data on its own citizens, it is certain that it will aggressively seek out and use data about foreigners to advance its interests.

Individually, we all need to make sensible decisions about what data we share and how much of it we allow to be collected. And organisations that collect and store large amounts of personal information need to be aware that they are legitimate foreign intelligence targets and they will be pursued.