Mastodon

Have I Been Pwned

A 200-post collection

Soft-Launching and Open Sourcing the Have I Been Pwned Rebrand

Designing the first logo for Have I Been Pwned was easy: I took a SQL injection pattern, wrote "have i been pwned?" after it and then, just to give it a touch of class, put a rectangle with rounded corners around it: Job done! I mean really, what more did I need for a pet project with a stupid name that would likely only add to the litany of failed nerdy ideas I'd had before that? And then, to compress 11 and a bit years into a single sentence: it immediately became unexpectedly popular, I adde...

We're Backfilling and Cleaning Stealer Logs in Have I Been Pwned

I think I've finally caught my breath after dealing with those 23 billion rows of stealer logs last week. That was a bit intense, as is usually the way after any large incident goes into HIBP. But the confusing nature of stealer logs coupled with an overtly long blog post explaining them and the conflation of which services needed a subscription versus which were easily accessible by anyone made for a very intense last 6 days. And there were the issues around source data integrity on top of ever...

Processing 23 Billion Rows of ALIEN TXTBASE Stealer Logs

I like to start long blog posts with a tl;dr, so here it is: We've ingested a corpus of 1.5TB worth of stealer logs known as "ALIEN TXTBASE" into Have I Been Pwned. They contain 23 billion rows with 493 million unique website and email address pairs, affecting 284M unique email addresses. We've also added 244M passwords we've never seen before to Pwned Passwords and updated the counts against another 199M that were already in there. Finally, we now have a way for domain owners to query their en...

Experimenting with Stealer Logs in Have I Been Pwned

TL;DR — Email addresses in stealer logs can now be queried in HIBP to discover which websites they've had credentials exposed against. Individuals can see this by verifying their address using the notification service and organisations monitoring domains can pull a list back via a new API. Nasty stuff, stealer logs. I've written about them and loaded them into Have I Been Pwned (HIBP) before but just as a recap, we're talking about the logs created by malware running on infected machines. You k...

"Pwned", The Book, Is Now Available for Free

Nearly four years ago now, I set out to write a book with Charlotte and RobIt was the stories behind the stories, the things that drove me to write my most important blog posts, and then the things that happened afterwards. It's almost like a collection of meta posts, each one adding behind-the-scenes commentary that most people reading my material didn't know about at the time. It was a strange time for all of us back then. I didn't leave the country for the first time in over a decade. I bare...

Closer to the Edge: Hyperscaling Have I Been Pwned with Cloudflare Workers and Caching

I've spent more than a decade now writing about how to make Have I Been Pwned (HIBP) fast. Really fast. Fast to the extent that sometimes, it was even too fast: The response from each search was coming back so quickly that the user wasn’t sure if it was legitimately checking subsequent addresses they entered or if there was a glitch. Over the years, the service has evolved to use emerging new techniques to not just make things fast, but make them scale more under load, increase availability an...

Inside the DemandScience by Pure Incubation Data Breach

Apparently, before a child reaches the age of 13, advertisers will have gathered more 72 million data points on them. I knew I'd seen a metric about this sometime recently, so I went looking for "7,000", which perfectly illustrates how unaware we are of the extent of data collection on all of us. I started Have I Been Pwned (HIBP) in the first place because I was surprised at where my data had turned up in breaches. 11 years and 14 billion breached records later, I'm still surprised! Jason (not...

The Data Breach Disclosure Conundrum

The conundrum I refer to in the title of this post is the one faced by a breached organisation: disclose or suppress? And let me be even more specific: should they disclose to impacted individuals, or simply never let them know? I'm writing this after many recent such discussions with breached organisations where I've found myself wishing I had this blog post to point them to, so, here it is. Let's start with tackling what is often a fundamental misunderstanding about disclosure obligations, an...

Begging for Bounties and More Info Stealer Logs

TL;DR — Tens of millions of credentials obtained from info stealer logs populated by malware were posted to Telegram channels last month and used to shake down companies for bug bounties under the misrepresentation the data originated from their service. How many attempted scams do you get each day? I woke up to yet another "redeem your points" SMS this morning, I'll probably receive a phone call from "my bank" today (edit: I was close, it was "Amazon Prime" 🤷‍♂️) and don't even get me started...

Telegram Combolists and 361M Email Addresses

Last week, a security researcher sent me 122GB of data scraped out of thousands of Telegram channels. It contained 1.7k files with 2B lines and 361M unique email addresses of which 151M had never been seen in HIBP before. Alongside those addresses were passwords and, in many cases, the website the data pertains to. I've loaded it into Have I Been Pwned (HIBP) today because there's a huge amount of previously unseen email addresses and based on all the checks I've done, it's legitimate data. That...