GeorgiaLeaks: Data from 2011 causes confusion in 2020



Personal data of millions of Georgians surfaced in an online leaks forum

(Source: EtoBuziashvili/DFRLab via @underthebreach/archive)

This research is published as part of #ElectionWatch Georgia 2020, a collaboration between DFRLab and ON.ge supported by the EWMI/USAID. You can also read it in Georgian.

Seven months before the parliamentary elections in Georgia, the personal data of more than 4 million Georgians appeared in an online hacking forum. The DFRLab’s analysis confirms, however, that the data dates from 2011 and has only now resurfaced.

The leaks come during a turbulent period for Georgia, with the country facing a tense situation over the global COVID-19 pandemic on the one hand, and a lack of clarity regarding the procedures for the fall parliamentary elections on the other.

Cyber-attacks and data breaches are an ongoing concern for the country. In October 2019, the country witnessed a widespread cyber-attack that disrupted several websites, including that of the Office of the President. While the attack didn’t result in a data breach, worries about such an incident have grown ever since that cyber intrusion.

According to ZDNet, which was the first to report on the breach, personally identifiable information of millions of Georgian citizens, including the deceased, was shared online at the end of March 2020. The report suggested that the leaked data included full names, ID numbers, home addresses, dates of birth, and mobile phone numbers.

(Source: ZDNet/archive)

The ZDNet article was picked up quickly in Georgia, though engagement with the article was moderate, mostly on Facebook and Twitter.

Engagement with ZDNet’s original article. (Source: BuzzSumo)
Engagement with Georgian translations of the article. (Source: BuzzSumo)

The leaked data was found by Under the Breach, a data breach monitoring and prevention service that shared the data with ZDNet and the DFRLab.

New Story, Old Database

On March 28, 2020, user TheHarleyQueen uploaded the leaked database to Raidforums, a discussion board dedicated to online leaks. The user seems to have joined the forum in March 2020, as shown in the user information section in the bottom left of their profile page.

Image of TheHarleyQueen user page on Raidforums, with the join date listed as March 2020. (Source: raidforums.com)

The DFRLab analyzed the dataset to determine its veracity, as well as the accuracy of the reporting surrounding it. The data was scanned for malware and opened on a virtual machine, as content shared on Raidforums sometimes contains malicious code. For instance, one WikiLeaks release concerning Turkey’s ruling Justice and Development Party was found to include more than 3,000 files infected with malicious code.

Because the file was an .mdb database, opening it required an MDB viewer: .mdb is a legacy format that was used prior to the release of the database management software Microsoft Access 2007. The file format was itself a giveaway that the database appears to have been developed using technology that would be fairly old for a circa-2020 election database.
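For readers who want to check a file’s format without opening it in a viewer, a minimal Python sketch follows. It is an illustration, not the DFRLab’s procedure. It assumes the header layout described in community documentation of the Jet format (an ASCII format ID starting at byte 4 and a version code at byte 0x14), and the function names are hypothetical.

```python
from typing import Optional

# Assumed layout, based on community documentation of the Jet file format:
# bytes 4..18 hold an ASCII format ID, byte 0x14 holds a version code.
JET_ID = b"Standard Jet DB"   # legacy .mdb files
ACE_ID = b"Standard ACE DB"   # newer .accdb files
JET_VERSIONS = {
    0: "Jet 3 (Access 95/97 era)",
    1: "Jet 4 (Access 2000-2003 era)",
}


def describe_access_header(header: bytes) -> Optional[str]:
    """Return a rough format label for the first bytes of a file, or None."""
    if len(header) < 0x15:
        return None  # too short to contain the ID and the version byte
    format_id = header[4:4 + len(JET_ID)]
    if format_id == JET_ID:
        return JET_VERSIONS.get(header[0x14], "Jet (unrecognized version code)")
    if format_id == ACE_ID:
        return "ACE (.accdb, Access 2007 or later)"
    return None


def describe_access_file(path: str) -> Optional[str]:
    """Read only the first 32 bytes of a file and classify them."""
    with open(path, "rb") as fh:
        return describe_access_header(fh.read(32))
```

Reading only the header means the check never parses the database contents, which fits the cautious handling of files from a leaks forum. A result of `None` means only that the header does not look like an Access database under these assumptions.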

One of the key findings is visible on the file itself: it was created in August 2011. Under the Breach also confirmed that the database appears to have been leaked around 2011 but said it had not seen the data surface prior to 2020.

Image of the file with creation date implying it was made in 2011. (Source: DFRLab)

The file name “reestri” suggested that the data is a registry database. The lead document contained Georgian citizens’ ID number, last name, first name, father’s name, date of birth, registration date, “DMONAC” (the DFRLab couldn’t confirm the meaning of this acronym), sex, card number, address, and region.

The DFRLab verified the authenticity of the database by going to the Georgian voter registration verification site voters.cec.gov.ge and searching for random people listed in the data set via the site’s search engine.
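The spot-check described above could be sketched as follows. This is an illustration under assumptions rather than the DFRLab’s actual tooling: the function name, the record layout, and the placeholder values are all hypothetical, and the lookup on the CEC site would still be done by hand.

```python
import random


def pick_spot_check_sample(records, sample_size, seed=None):
    """Pick records at random, without replacement, for manual verification.

    `records` is expected to be a list. Passing a seed makes the selection
    reproducible, so another analyst can re-check the same rows.
    """
    rng = random.Random(seed)
    k = min(sample_size, len(records))  # never ask for more than exists
    return rng.sample(records, k)


# Placeholder rows only; no real personal data is used in this example.
rows = [{"row": i, "id_number": f"PLACEHOLDER-{i:03d}"} for i in range(1, 101)]

to_verify = pick_spot_check_sample(rows, sample_size=5, seed=2020)
for row in to_verify:
    print(row["row"], row["id_number"])
```

Sampling at random, rather than checking the first few rows, reduces the chance that a handful of genuine records at the top of a file masks fabricated entries further down.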

The data set includes people born in 1880. This supports the idea that it includes information about family descent, given that those born in 1880 would have been 131 years old when the file was created in 2011.

Image of the leaked database showing that the document includes information about people born in 1880. (Source: EtoBuziashvili/DFRLab)

Another finding was that the data set includes the personal information of underage Georgians. The data ends with people born in 2011, who would still be too young to vote in a 2020 election.

Image of the leaked database showing that the document includes information about people born in 2011. (Source: EtoBuziashvili/DFRLab)
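The age arithmetic behind these two observations can be checked with a short Python sketch. It is a simplification that works at year level and ignores birthdays within the year. The function names are hypothetical, and the voting age of 18 is stated here as an assumption of the example.

```python
VOTING_AGE = 18  # assumed minimum voting age for this example


def age_in_year(birth_year, reference_year):
    """Age reached during reference_year (year-level approximation)."""
    return reference_year - birth_year


def old_enough_to_vote(birth_year, election_year, voting_age=VOTING_AGE):
    """True if the person reaches the voting age by the election year."""
    return age_in_year(birth_year, election_year) >= voting_age


# Earliest and latest birth years observed in the leaked file.
for birth_year in (1880, 2011):
    print(
        birth_year,
        age_in_year(birth_year, 2011),         # age when the file was created
        age_in_year(birth_year, 2020),         # age in the 2020 election year
        old_enough_to_vote(birth_year, 2020),  # age check only, not eligibility
    )
```

The check covers age alone. It says nothing about whether a person is alive or registered, which is why records for people born in 1880 point to a civil registry rather than a voter list.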

After the leaked data surfaced, the Central Election Commission of Georgia (CEC) stated that the database uploaded to Raidforums doesn’t match its official database: the CEC election database contains only 3.5 million Georgian citizens and does not include data regarding family descent.

While the leaked dataset dates to 2011, it could be used for various nefarious purposes, including privacy breaches, identity theft, and election-related intimidation. Even if the leaked data in itself is useless for influencing elections, the fact that it was circulated in the first place may raise serious perception problems and impact Georgians’ trust in democratic processes.


Eto Buziashvili is a Research Associate, Caucasus, with the Digital Forensic Research Lab.

Kanishk Karan is a Research Associate with the Digital Forensic Research Lab.
