On the internet, you're not always as anonymous as you think. To demonstrate this, two researchers from Germany set up a fake website and were able to acquire vast swathes of data from "clickstreams", detailed records of every website you visit while surfing the web.

Presenting their findings at Defcon 2017, the hacking conference held annually in Las Vegas, the pair showed how clickstream data, which is typically anonymised, could be used to expose the porn habits of a judge, the drug-taking activities of a politician and details of a cybercrime investigation.

Journalist Svea Eckert and data scientist Andreas Dewes set up a fake advertising company and asked a series of data brokers for access to citizens' browsing data, which one provided free of charge.

According to The Guardian, the pair operated under a made-up brand, complete with a fake company website, and claimed they needed the browsing data to improve a machine learning algorithm for marketing purposes.

What they ended up doing was exposing a shady world where citizens' internet browsing is collected, retained, analysed and traded – often via browser plugins that fail to advertise exactly where your personal information will end up.

Companies typically use clickstream analysis for market research, software testing or to understand how people navigate the web. Marketing firms, for example, can use it to work out where someone is most likely to click on an advertisement.

The data is anonymised, but the Defcon research showed that it could easily be tied to individual citizens because it's often linked directly to social media accounts or online services. These included personal Twitter profiles, YouTube channels and online shopping accounts.
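To illustrate how this kind of linking can work in principle, here is a minimal Python sketch. It is not the researchers' method: the URL patterns, the example.com domains, the pseudonymous IDs and the function names are all hypothetical, chosen only to show that a single visited URL containing an account name can be enough to attach a real identity to an otherwise anonymous browsing history.

```python
import re
from collections import defaultdict

# Hypothetical URL patterns in which the path itself names the account holder.
# These are illustrative assumptions, not patterns taken from the research.
IDENTIFYING_PATTERNS = [
    re.compile(r"^https?://social\.example\.com/(?P<handle>[A-Za-z0-9_]+)/settings"),
    re.compile(r"^https?://shop\.example\.com/account/(?P<handle>[A-Za-z0-9_.-]+)"),
]


def link_pseudonyms(clickstream):
    """Map each pseudonymous ID to the account handles found in its URLs."""
    identities = defaultdict(set)
    for pseudonym, url in clickstream:
        for pattern in IDENTIFYING_PATTERNS:
            match = pattern.match(url)
            if match:
                identities[pseudonym].add(match.group("handle"))
    return dict(identities)


def history_for(clickstream, pseudonym):
    """Return every URL recorded under one pseudonymous ID."""
    return [url for pid, url in clickstream if pid == pseudonym]


# Made-up sample records: (pseudonymous ID, visited URL).
SAMPLE = [
    ("u-7f3a", "https://news.example.org/politics"),
    ("u-7f3a", "https://social.example.com/jane_doe/settings"),
    ("u-7f3a", "https://pharmacy.example.net/search?q=sleeping+pills"),
    ("u-91bc", "https://video.example.com/watch?v=abc"),
]

if __name__ == "__main__":
    for pseudonym, handles in link_pseudonyms(SAMPLE).items():
        print(pseudonym, sorted(handles), history_for(SAMPLE, pseudonym))
```

Once one URL gives the pseudonym a name, every other page recorded under the same ID is effectively attributed to that person, which is why removing names from a dataset does little if the URLs themselves are left intact.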

In one case, data was collected from Google Translate, which stores individual queries. The researchers linked the activity to a cybercrime probe taking place in Germany because law enforcement had used the popular service to translate letters to foreign police departments, The Guardian reported.

In total, they amassed a database covering roughly 3 million German internet users. They said that 95% of the data originated from 10 popular browser extensions.

"What these companies are doing is illegal in Europe but they do not care," Eckert told Defcon attendees, as reported by the BBC. "This could be so creepy to abuse," she continued. "You could have an address book and just look up people by their names and see everything they did."

The pair warned that the consequences of collecting clickstream data without strong protections could be severe. "After the research project we deleted the data because we did not want to have it close to our hands any more," Eckert noted. "We were scared that we would be hacked."

Not at Defcon? The full presentation slides are available online.