While probing the internet earlier this month, a cybersecurity researcher discovered unsecured databases maintained by Deep Root Analytics, a marketing and big data firm linked to the US Republican Party, containing personal information of nearly 200 million voters.
The researcher, Chris Vickery, came across the huge cache of personal records on 12 June and worked with US authorities to secure the database within 48 hours. It is believed the data was also compiled by two other firms, Target Point Consulting and Data Trust.
"Currently downloading what is, basically, the home address of every Trump supporter," Vickery tweeted on 12 June. "Understand the cloud before you upload to it."
He later added: "Apparently the RNC has a number assigned to every US voter (regardless of party)."
The data reportedly contained records from the presidential campaigns of 2008, 2012 and 2016. The folder for 2016, however, included files only for Ohio and Florida.
In total, the files contained more than 25 terabytes of personal data linked to between 150 million and 198 million people. Such information would typically be used to predict the behaviour of potential voters.
To put that into perspective, Politico reported that 200 million people had registered to vote in the US election in 2016.
According to UpGuard, the cybersecurity firm Vickery works for, each record contained names, home addresses, dates of birth, registration information, ethnicity, party affiliation and more. At the time of writing, it remains unclear whether this sensitive data was accessed by anyone else.
"The RNC data repository would ultimately acquire roughly 9.5 billion data points regarding three out of every five Americans," the firm's security analyst Dan O'Sullivan wrote in a blog post.
He continued: "UpGuard's discovery — of perhaps the largest known exposure of voter information in history—is corroborated by technical evidence, as well as by the public statements of the responsible firms and political staffers.
"Spreadsheets containing this accumulated data—last updated around the January 2017 presidential inauguration—constitute a treasure trove of political data and modelled preferences used by the Trump campaign. This data was also exposed in the misconfigured database."
Data on citizens can be gleaned from a variety of sources, and can sometimes even be purchased from individual US states. The sheer scale of this national voter database indicates that information was collected and collated from a slew of companies and online services.
On its website, Deep Root Analytics says it "prides itself on presenting large-linked data sets in a useful, easy-to-use and compelling way" and that the team is "the most experienced group of targeters in Republican politics." The RNC paid the firm $983,000 between 2015 and 2016.
The data exposed by Deep Root Analytics appeared to have been altered specifically for the Republican National Committee (RNC) and was likely used to "create models for turnout and voter preferences," one source with experience in political strategy told Gizmodo.
"We take full responsibility for this situation," the founder of Deep Root Analytics, Alex Lundry, said in a statement. "Since this event has come to our attention, we have updated the access settings and put protocols in place to prevent further access," he added.
This is not the first time Vickery has found troves of voter information on the internet without adequate password protection. In June 2016, he found 154 million voter profiles of US citizens exposed online and more than 93 million voter profiles from the Mexican government.
"This exposure raises significant questions about the privacy and security Americans can expect for their most privileged information," O'Sullivan continued.
"That such an enormous national database could be created and hosted online, missing even the simplest of protections against the data being publicly accessible, is troubling.
"The ability to collect such information and store it insecurely further calls into question the responsibilities owed by private corporations and political campaigns to those citizens targeted by increasingly high-powered data analytics operations."