16-billion credential exposure
Researchers found roughly 16 billion login credentials, aggregated from infostealer logs, phishing kits and prior breaches, in the largest credential exposure ever disclosed.
- Target: 16-billion credential exposure
- Date public: 19 June 2025
- Sector: Technology
- Attack type: Data Breach
- Threat actor: Aggregated infostealer operators (multiple)
- Severity: Critical
- Region: Global
In June 2025 researchers disclosed what is now believed to be the largest collection of stolen username-password pairs ever assembled: sixteen billion records, drawn primarily from years of infostealer malware infections on individual users' personal devices. The dump doesn't represent a single breach. It's a curated rollup of countless smaller compromises, ready for criminal use against any service where the affected users reused their passwords. Anyone who reuses one password across multiple services has a realistic chance of appearing somewhere in this dataset. The dump fuels the ongoing wave of account-takeover attacks against email, banking, retail and crypto-exchange accounts that defines this period of cybercrime.
What happened
In June 2025, researchers at Cybernews disclosed the existence of around thirty unsecured datasets, the largest exceeding 3.5 billion records, collectively containing approximately sixteen billion username-and-password pairs. The compilation had been openly accessible on the public internet for several weeks before the hosting infrastructure was taken down.
It is the largest credential exposure ever disclosed. It is not, however, a breach in the conventional sense. The sixteen billion records did not come out of a single intrusion. They were the product of years of aggregated infostealer activity, prior breach dumps, and phishing-kit harvests, stitched together by criminal data brokers and re-released as a single super-dataset. Have I Been Pwned’s analysis confirmed that a substantial fraction of the records overlapped with previously known breaches, but enough were novel, particularly fresh infostealer captures from the preceding eighteen months, to make the disclosure materially significant.
The headline number was the story. Sixteen billion records is roughly two for every human alive. The aggregation is, in effect, a receipt for what cybersecurity researchers had been documenting for the better part of a decade: the credential-reuse problem is real, the infostealer industry is industrial-scale, and the long tail of unmanaged personal devices feeds it continuously.
How it worked
The pipeline that produces a sixteen-billion-credential aggregate has three stages: capture, aggregation, and resale.
Capture happens primarily through commodity infostealer malware — RedLine, Lumma, Vidar, StealC, and the dozen or so successor families that filled the gap when each previous generation was disrupted. Infostealers are not particularly sophisticated; they target browser-stored credentials, password manager exports, cookie jars, cryptocurrency wallet files, and Discord or Steam tokens, and exfiltrate everything in a single burst. They are typically distributed through cracked-software lures, malicious advertising, fake software updates, and SEO-poisoned download pages. Most of the captures come from personal or unmanaged devices that are not within the visibility of any corporate security control, but corporate credentials end up in the haul whenever an employee uses a work email and password on a personal machine.
Aggregation is the work of criminal data brokers. Stealer logs are sold in bulk on Telegram and on a handful of underground forums. Brokers buy logs in volume, deduplicate, parse, and re-list them by domain, by industry, or by country, depending on the buyer they have lined up. The 2025 compilation was an aggregation of aggregations: one broker, or a few, had assembled the deduplicated output of years of stealer activity into a handful of monolithic datasets and stored them, briefly, on infrastructure that was not adequately access-controlled.
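The deduplicate, parse and group-by-domain step has a defensive mirror image: a team watching a leak-monitoring feed can apply the same grouping to see which of its own domains are exposed. The sketch below is illustrative only. It assumes records arrive as `url|login|password` lines, which is a hypothetical format (real stealer logs vary), and the function names are invented for this example. Only an unsalted digest of each password is kept, purely so duplicates can be detected without storing plaintext.

```python
import hashlib
from collections import defaultdict


def parse_record(line: str):
    """Split one hypothetical 'url|login|password' line; return None if malformed."""
    parts = line.strip().split("|", 2)
    if len(parts) != 3 or "@" not in parts[1]:
        return None
    url, login, password = parts
    # Keep only a digest so the plaintext password is never retained.
    digest = hashlib.sha256(password.encode("utf-8")).hexdigest()
    return url, login.lower(), digest


def group_by_email_domain(lines):
    """Deduplicate records, then group exposed logins by their email domain."""
    seen = set()
    by_domain = defaultdict(set)
    for line in lines:
        record = parse_record(line)
        if record is None or record in seen:
            continue
        seen.add(record)
        _, login, _ = record
        by_domain[login.rsplit("@", 1)[1]].add(login)
    return {domain: sorted(logins) for domain, logins in by_domain.items()}


sample = [
    "https://mail.example.com|Alice@example.com|hunter2",
    "https://mail.example.com|alice@example.com|hunter2",  # duplicate once normalised
    "https://shop.example.net|bob@example.org|pa55word",
    "not a record",                                        # malformed, skipped
]
exposure = group_by_email_domain(sample)
```

A defender would typically filter `exposure` down to the domains the organisation owns and feed the matching logins into a forced-reset workflow.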
Resale or, as in this case, accidental disclosure closes the loop. The Cybernews researchers did not steal the data; they discovered it on misconfigured cloud storage, indexable by public scanners. That the largest credential exposure on record resulted from negligent storage by a criminal aggregator is one of the more persistent ironies of the case.
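The same class of misconfiguration is worth auditing on the defender's own storage. The source does not say which storage product was involved, so the sketch below assumes an AWS-S3-style bucket policy document purely for illustration; the function names and the sample policy are hypothetical. It flags `Allow` statements whose principal is a wildcard and that carry no `Condition` block.

```python
import json


def _is_wildcard(principal) -> bool:
    """True when the principal is "*" or an AWS principal entry containing "*"."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS", [])
        values = [aws] if isinstance(aws, str) else list(aws)
        return "*" in values
    return False


def public_statements(policy_json: str):
    """Return unconditioned Allow statements that apply to any principal."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # tolerate a single statement object
        statements = [statements]
    findings = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        # A Condition may narrow access, so only unconditioned grants are flagged.
        if _is_wildcard(stmt.get("Principal")) and "Condition" not in stmt:
            findings.append(stmt)
    return findings


# Hypothetical policy used only to exercise the check.
sample_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Sid": "TeamRead", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
         "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
})
flagged_sids = [s["Sid"] for s in public_statements(sample_policy)]
```

This is a static check of one policy document; it does not account for ACLs, account-level public-access settings, or other providers' permission models.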
Timeline
- 2023–2025 — Sustained infostealer campaigns capture credentials from millions of personal and unmanaged devices.
- Early 2025 — Criminal aggregators begin assembling deduplicated super-datasets from years of stealer logs.
- May–June 2025 — Aggregated datasets surface on misconfigured public cloud storage.
- 19 June 2025 — Cybernews discloses the discovery; coverage by mainstream outlets follows within hours.
- 20–25 June 2025 — Hosting infrastructure taken offline; Have I Been Pwned ingests the new portion.
- July–August 2025 — Sustained spike in credential-stuffing activity against major consumer services as buyers operationalise the dataset.
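The credential-stuffing spike in the last timeline entry tends to show up in login telemetry as a single source trying many distinct usernames and mostly failing. The following is a minimal detection sketch under stated assumptions: the `LoginEvent` shape, the fixed (non-sliding) time windows, and the thresholds are all hypothetical and would need tuning against real traffic.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginEvent:
    timestamp: float  # seconds since epoch
    source_ip: str
    username: str
    success: bool


def suspected_stuffing_sources(events, window_seconds=300,
                               min_usernames=20, min_failure_ratio=0.9):
    """Flag IPs that try many distinct usernames, mostly failing, in one fixed window."""
    buckets = defaultdict(list)
    for ev in events:
        window = int(ev.timestamp // window_seconds)
        buckets[(ev.source_ip, window)].append(ev)

    flagged = set()
    for (ip, _window), evs in buckets.items():
        usernames = {e.username for e in evs}
        failures = sum(1 for e in evs if not e.success)
        if len(usernames) >= min_usernames and failures / len(evs) >= min_failure_ratio:
            flagged.add(ip)
    return flagged


# Synthetic events: one noisy source, one ordinary user who mistypes twice.
events = [LoginEvent(1000.0 + i, "203.0.113.7", f"user{i}@example.com", False)
          for i in range(25)]
events += [LoginEvent(1000.0 + i, "198.51.100.4", "carol@example.com", i == 2)
           for i in range(3)]
flagged = suspected_stuffing_sources(events)
```

Attackers commonly spread attempts across many IP addresses, so a per-IP rule like this catches only the least careful campaigns; it is a starting signal, not a complete defence.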
What defenders should learn
The sixteen billion is a noisy headline; the more useful number for defenders is the proportion of the dataset that came from infostealer logs less than eighteen months old. That is, the credentials your employees and customers are typing today are showing up in criminal datasets within months, not years. Any control that depends on attackers not knowing a password is now operating on a clock measured in months.
Two operational lessons follow. The first is that single-factor authentication is a liability, not a baseline; if your service still allows it for any user category, this dataset has materially raised the cost of that decision. The second is that the boundary between corporate and personal devices, which most organisations have been treating as a soft policy line, has hardened into an attack surface. Employees who reuse passwords between work systems and personal services on personal devices are, in effect, leaking corporate credentials through an environment outside corporate control.
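Both lessons can be expressed as a conditional-access rule: refuse password-only sign-ins, and treat unmanaged devices or leaked credentials as a reason to demand a stronger factor. The sketch below is one illustrative policy, not a recommendation from the source; the `Factor`, `Decision` and `SignIn` types and the specific outcomes are assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Factor(Enum):
    PASSWORD_ONLY = "password_only"
    OTP = "otp"      # SMS or TOTP codes: a second factor, but phishable
    FIDO2 = "fido2"  # WebAuthn / passkey: phishing-resistant


class Decision(Enum):
    ALLOW = "allow"
    STEP_UP = "step_up"  # ask for a phishing-resistant factor before continuing
    DENY = "deny"


@dataclass(frozen=True)
class SignIn:
    factor: Factor
    device_managed: bool
    credential_seen_in_leak: bool


def evaluate(sign_in: SignIn) -> Decision:
    """Apply the illustrative policy to one sign-in attempt."""
    if sign_in.factor is Factor.PASSWORD_ONLY:
        return Decision.DENY  # single-factor is treated as a liability
    if sign_in.factor is Factor.FIDO2:
        return Decision.ALLOW  # does not depend on the password staying secret
    # Remaining case is OTP: acceptable only from a managed device with no leak signal.
    if sign_in.credential_seen_in_leak or not sign_in.device_managed:
        return Decision.STEP_UP
    return Decision.ALLOW
```

For example, `evaluate(SignIn(Factor.OTP, device_managed=False, credential_seen_in_leak=False))` returns `Decision.STEP_UP`, reflecting the point that a personal device is outside corporate control.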
The defender’s response is not novel. Phishing-resistant MFA, conditional access, credential-leak monitoring, and a clear position on what work credentials may be entered on what devices have been the right answer for years. The 2025 aggregation simply removes any remaining excuse for not having implemented them. Andy will layer the segmentation and identity-architecture lens in over time; the underlying observation stands on its own.
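Credential-leak monitoring can start with something as small as checking candidate passwords against a known-breached corpus without ever sending the password itself. The sketch below uses the k-anonymity range lookup offered by the Have I Been Pwned Pwned Passwords service: only the first five hex characters of the SHA-1 digest leave the machine, and the response is a list of `SUFFIX:COUNT` lines. The function names and the `User-Agent` string are hypothetical, and whether this particular corpus contains the June 2025 records is not something the source establishes.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"


def split_sha1(password: str):
    """Return (first 5 hex chars, remaining 35) of the uppercase SHA-1 digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(range_body: str, suffix: str) -> int:
    """Find the suffix in a 'SUFFIX:COUNT' response body; 0 if absent."""
    for line in range_body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count.strip())
    return 0


def fetch_range(prefix: str) -> str:
    """Network call: download every known suffix for a 5-character prefix."""
    request = urllib.request.Request(
        RANGE_URL.format(prefix=prefix),
        headers={"User-Agent": "leak-check-sketch"},  # hypothetical client name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def times_seen(password: str, fetch=fetch_range) -> int:
    """How many times the password appears in the corpus, per the range lookup."""
    prefix, suffix = split_sha1(password)
    return count_in_range(fetch(prefix), suffix)
```

Passing `fetch` as a parameter keeps the parsing logic testable offline; a sign-up or password-change flow would call `times_seen(candidate)` and reject or warn on any non-zero result.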
Controls that would have helped
Defender controls catalogued in the Controls Desk that would have changed the outcome of this incident, or limited its blast radius. Sourced from regulator and framework guidance — never vendors.