
It feels like there are some missing dots and connections here: I can easily see how a concurrency or memory safety bug could accidentally leak a private key into a debugging artifact, but presumably the attacker here had to know about the crash, and the layout of the crash dump, and also have been ready and waiting in Microsoft's corporate network? Those seem like big questions. "Assume breach" is a good network defense strategy, but you don't literally just accept the notion that you're breached.


> but presumably the attacker here had to know about the crash, and the layout of the crash dump

If I were an advanced persistent threat attacker working for China who had compromised Microsoft's internal network via employee credentials (and I'm not), the first thing I'd do is figure out where they keep the crash logs and quietly exfil them, alongside the debugging symbols.

Often, these are not stored securely enough relative to their actual value. Having spent some time at a FAANG, I can say that every single new hire, with the exception of those who have worked in finance or corporate regulation, assumes you can just glue crash data onto the bugtracker (that's what bugtrackers are for, tracking bugs, which includes reproducing them, right?). You have to detrain them of that, and you have to have a vault for things like crashdumps that is so easy to use that people don't get lazy and start circumventing your protections because their job is to fix bugs and you've made their job harder.

With a compromised engineer's account, we can assume the attacker at least has access to the bugtracker and probably the ability to acquire or generate debug symbols for a binary. All that's left then is to wait for one engineer to get sloppy and paste a crashdump as an attachment on a bug, then slurp it before someone notices and deletes it (assuming they do; even at my big scary "We really care about user privacy" corp, individual engineers were loath to make a bug harder to understand by stripping crashlogs off of it unless someone in security came in and whipped them. Proper internal opsec can really slow down development here).


Your statement:

> but presumably the attacker here had to know about the crash, and the layout of the crash dump

Another statement from the article:

> Our credential scanning methods did not detect its presence (this issue has been corrected).

The article does not give any timeline for when things happened.

Imagine the following timeline:

- Hacker gets the coredump in 2021, not knowing that it contains valuable credentials.

- For data retention policy reasons, Microsoft deletes its copy of the coredump, but the hacker just keeps theirs.

- Microsoft updates its credential scanning methods.

- Microsoft runs the updated credential scanner over its reduced (per retention policy) archive of coredumps. As that particular coredump no longer exists at Microsoft, they are not aware of the issue.

- Hacker gets the updated scanner.

- Hacker runs the updated credential scanner over their own archive of coredumps. Jackpot.


>... you have to have a vault for things like crashdumps that is so easy to use that people don't get lazy...

Let's assume a crash dump can be anywhere from megabytes up to gigabytes in size.

How could a vault handle this securely?

The moment it is copied from the vault to the developer's computer, you introduce data remanence (undeletion from the file system).

Keeping such a coredump purely in RAM makes it accessible on a compromised developer machine (GNU Debugger), and if the developer machine crashes, its own coredump contains/wraps the sensitive coredump.

A vault that doesn't allow direct/full coredump download, but allows queries (think "SQL queries against a vault REST API") could still be queried for e.g. "select * from coredump where string like '%secret_key%'".

So without more insight, a coredump vault sounds like security theater that makes the intended use tremendously more difficult.


Everything is imperfect, but where I work crashdumps are uploaded straight to a secure vault and then deleted from the origin system. The dumps are processed, and non-sensitive data is extracted and published with relatively lenient access controls. Sensitive data, such as raw memory dumps, requires a higher tier of permissions. In order to be eligible for that higher tier, your developer machine has to be more locked down than that of people who are not in the secure group. (You also need to have a reason to need more access.)

Given that stack traces, crash addresses, and most register contents are not considered security sensitive, most people don't really need access to the raw dumps.

It's far from perfect, but it would be unfair to call it "security theater". It seems like a pretty decent balance in practice. Admittedly, we have the slight advantage of several hundred million installs, so the actual bugs that are causing crashes are likely to happen quite a few times and statistical analysis will often provide better clues than diving deep into an individual crash dump.


> Everything is imperfect, but where I work crashdumps are uploaded straight to a secure vault and then deleted from the origin system. The dumps are processed, and non-sensitive data is extracted and published with relatively lenient access controls. Sensitive data, such as raw memory dumps, requires a higher tier of permissions. In order to be eligible for that higher tier, your developer machine has to be more locked down than that of people who are not in the secure group. (You also need to have a reason to need more access.)

From my understanding, this is more or less how the Microsoft system was designed, with credential scanning and redaction over coredumps, but a chain of bugs and negligence broke through those safeguards.


While your points are all valid theoretically, keeping stuff off of developer filesystems can still help a lot practically.

This attacker probably (it's unclear, since the write-up doesn't tell us) scanned compromised machines for key material using some kind of dragnet scanning tool. If the data wasn't on the compromised filesystem, they wouldn't have found it, even though in theory they could have sat on a machine with debug access (depending on the nature of the compromise, this is a stretch: reading another process's RAM usually requires much higher privilege than filesystem access) and obtained a core dump from RAM.

Security is always a tension between the theoretical and the practical and I think "putting crash dumps in an easy-to-use place that isn't a developer's Downloads folder" isn't a bad idea.


Ephemeral compute/VM/debug environment with restricted access.

Tear down the environment after the debugging is done.

Keeping the crash dumps in a vault presumably allows more permissions/control than an internal issue tracker (usually anyone can access the issue tracker). At least a vault can apply RBAC or even time-based policies so these things aren't lying around forever.


Oh, yeah, the number of times I see HAR files with sensitive things in them just floating around in JIRA is not fun.


Why does an attacker need symbols?


They're not needed, but they will speed up comprehension of the contents of the program.

... but they're definitely not strictly necessary.


The article says that the employee compromise happened some time after the crash dump had been moved to the corporate network. It says that MS don't have evidence of exfil, but my reading is that they do have some evidence of the compromise.

The article also says that Microsoft's credential scanning tools failed to find the key, and that issue has now been corrected. This makes me think that the key was detectable by scanning.

Overall, my reading of this is that the engineer moved the dump containing the key into their account at some point, and it just sat there for a time. At a later point, the attacker compromised the account and pulled all available files. They then scanned for keys (with better tooling than MS had; maybe it needed something more sophisticated than looking for BEGIN PRIVATE KEY), and hit the jackpot.
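
To illustrate the kind of scan being speculated about here, a minimal hypothetical sketch (the file handling and byte patterns are my own simplifications, not anyone's actual tooling): it walks a raw dump looking for PEM markers and for the DER prefix that PKCS#1/PKCS#8 private keys start with.

    /* Hypothetical dragnet scan over a raw dump file. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static void scan(const unsigned char *buf, size_t len)
    {
        static const char pem[] = "-----BEGIN";   /* PRIVATE KEY, RSA PRIVATE KEY, ... */

        for (size_t i = 0; i + 10 < len; i++) {
            if (memcmp(buf + i, pem, sizeof(pem) - 1) == 0)
                printf("PEM marker at offset 0x%zx\n", i);
            /* DER: 30 82 LL LL 02 01 00 is how RSAPrivateKey/PKCS#8 blobs start */
            if (buf[i] == 0x30 && buf[i + 1] == 0x82 &&
                buf[i + 4] == 0x02 && buf[i + 5] == 0x01 && buf[i + 6] == 0x00)
                printf("possible DER private key at offset 0x%zx\n", i);
        }
    }

    int main(int argc, char **argv)
    {
        if (argc < 2) return 1;
        FILE *f = fopen(argv[1], "rb");
        if (!f) return 1;
        fseek(f, 0, SEEK_END);
        long sz = ftell(f);
        rewind(f);
        unsigned char *buf = malloc((size_t)sz);
        if (!buf || fread(buf, 1, (size_t)sz, f) != (size_t)sz) return 1;
        scan(buf, (size_t)sz);
        free(buf);
        fclose(f);
        return 0;
    }

Off-the-shelf secret scanners layer entropy heuristics and many more patterns on top of this sort of thing, which is presumably what "more sophisticated" looks like in practice.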


This article says the lack of exfil evidence is "because of log retention policies", i.e., the logs had been deleted in the time since the exfil happened.


> and hit the jackpot.

And how often do you hit the jackpot? For larger lotteries, the odds are less than one in a million. So that leads to two equally unpleasant alternatives:

1. The attacker was informed where to find the key.

2. The attackers have compromised a large part of Microsoft engineering and routinely scan all their files.


Red teams and malicious actors have plenty of tools which automate the looting and look for juicy things: crash dumps, logs, and many others... The bottom line is that if there is a secret stored on disk somewhere, it won't take long for a proper actor to find it.


Oh, "jackpot" was just a figure of speech, I didn't intend to imply any particular probability. Not sure what the chance of finding sensitive information in the private files of an engineer is, but I would guess a lot better than one in a million. One in a hundred, maybe? One in ten?

I think the most likely explanation is that this actor routinely attempts to compromise big-tech engineers using low-sophistication means, then grabs whatever they can get. Keep doing that often enough, for long enough, and you get something valuable -- that's the "persistent" in APT.


It brings a lot of questions to the table about which employee knew what, and when. A real question is: under a "zero trust" environment, how many motivated insiders have they accumulated through their IT employment and contracting?


Was having the same thought. Also, does anyone know what happened to the employee?


Yeah:

'After April 2021, when the key was leaked to the corporate environment in the crash dump, the Storm-0558 actor was able to successfully compromise a Microsoft engineer’s corporate account. This account had access to the debugging environment containing the crash dump which incorrectly contained the key.'

So either the attacker was already in the network and happened to find the dump while doing some kind of scanning that wasn't detected, or they knew to go after this specific person's account.


Or they knew/discovered that there was a repository of crash dumps - likely a widely known piece of information - and just grabbed as much as they could. Nothing in the write-up indicates any connection between the compromised engineer and this particular crash dump, other than they had access.


I believe there are somewhat standard tools for scanning memory dumps for cryptographic material, which have been around since the cold boot attack era. And I can imagine attackers opportunistically looking for crash dumps with that in mind. But it does seem like an awfully lucky (for the attacker) sequence of events...


If you infiltrate a system and see sshd.core in the filesystem, you're not going to snarf that up?


I just checked the source and openssh doesn't appear to call madvise(MADV_DONTDUMP) anywhere :-( That seems like an oversight? For comparison, openssl has a set of "secure malloc" functions (for keys etc.) which use MADV_DONTDUMP amongst other mitigations.
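
For reference, a minimal sketch of that mitigation, assuming Linux/glibc (this is not OpenSSH's or OpenSSL's actual code): give key material its own anonymous mapping, lock it, and mark it MADV_DONTDUMP so the kernel skips it when writing a core.

    #define _DEFAULT_SOURCE
    #include <string.h>
    #include <sys/mman.h>

    /* Allocate a buffer for secrets that stays out of swap and core dumps. */
    void *alloc_secret(size_t len)
    {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        mlock(p, len);                   /* keep it out of swap */
        madvise(p, len, MADV_DONTDUMP);  /* keep it out of core dumps */
        return p;
    }

    void free_secret(void *p, size_t len)
    {
        explicit_bzero(p, len);          /* scrub before the pages are recycled */
        munmap(p, len);
    }

None of this stops a root attacker reading live memory, but it keeps keys out of exactly the kind of artifact this incident started with.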


On OpenBSD, you'd be looking for MAP_CONCEAL, though it's not used in many places, either.
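
Roughly the OpenBSD counterpart of the Linux sketch above, as I understand the flag; MAP_CONCEAL is requested at mmap time and excludes the mapping from core dumps:

    #include <stddef.h>
    #include <sys/mman.h>

    /* OpenBSD: an anonymous mapping excluded from core dumps. */
    void *alloc_concealed(size_t len)
    {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANON | MAP_CONCEAL, -1, 0);
        return (p == MAP_FAILED) ? NULL : p;
    }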


Interestingly, it looks like ssh-agent disables core dumps[1], but I don't see similar usage for sshd.

1: https://github.com/openssh/openssh-portable/blob/694150ad927...
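
If I'm reading that code right, it boils down to zeroing the core-dump resource limit. A minimal sketch of the same idea (not the actual OpenSSH code):

    #include <sys/resource.h>

    /* Ask the kernel to write 0-byte core files, i.e. none at all. */
    static int deny_core_dumps(void)
    {
        struct rlimit rl = { .rlim_cur = 0, .rlim_max = 0 };
        return setrlimit(RLIMIT_CORE, &rl);
    }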


sshd runs as root, so the core dumps would be readable as root-only, no? If you have root access already, you could dump it even while it's still running with ptrace anyway.
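
To illustrate: a hypothetical sketch, assuming Linux and root (or CAP_SYS_PTRACE), with error handling stripped. Stop the target with ptrace, then read its memory straight out of /proc/<pid>/mem; no core file needed.

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Usage: ./readmem <pid> <hex address from /proc/<pid>/maps> */
    int main(int argc, char **argv)
    {
        if (argc < 3) return 1;
        pid_t pid = (pid_t)atoi(argv[1]);
        off_t addr = (off_t)strtoull(argv[2], NULL, 16);

        ptrace(PTRACE_ATTACH, pid, NULL, NULL);
        waitpid(pid, NULL, 0);                  /* target is now stopped */

        char path[64];
        snprintf(path, sizeof(path), "/proc/%d/mem", (int)pid);
        int fd = open(path, O_RDONLY);

        unsigned char page[4096];
        ssize_t n = pread(fd, page, sizeof(page), addr);  /* one page of live memory */
        printf("read %zd bytes\n", n);
        /* ...scan 'page' for key material, same as you would a dump... */

        close(fd);
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
        return 0;
    }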


>sshd runs as root, so the core dumps would be readable as root-only, no

Yes, although the article we're discussing shows that you can't rely on that: the dump could subsequently be moved to a developer machine for investigation, and unencrypted key material left in it could be compromised that way... defense in depth would make sense here.


Secret material for ssh keys won't be in sshd; it stays client side. Granted, host keys could be compromised, so you could impersonate a server, but an sshd key leak won't give direct access.


It could leak passwords, though, unless you're very certain that your overwriting of them after validation always works.
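
The reason that caveat matters: a plain memset at the end of a function is a dead store the compiler is allowed to delete. A minimal sketch of the safer pattern, assuming glibc or a BSD that provides explicit_bzero (the function names here are illustrative):

    #define _DEFAULT_SOURCE
    #include <string.h>

    int check_password(char *password, size_t len)
    {
        int ok = 0;
        /* ...authenticate using the plaintext password... */

        /* memset(password, 0, len) could be optimized away here;
         * explicit_bzero is guaranteed not to be. */
        explicit_bzero(password, len);
        return ok;
    }

C11's memset_s and Windows' SecureZeroMemory exist for the same reason.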


Yes, though it does have the host key which you could use to do a MITM attack.


Neither MADV_DONTDUMP nor MAP_CONCEAL appears anywhere in the source, client or server (with the exception of the seccomp filter, where they're just used to filter potential system calls).


Key material aside, such a coredump could give some hints towards someone else’s capabilities, and point you in an interesting direction for finding new and exciting ways to own more shit.


Since they can supposedly redact the key from crash dumps, the key must have some identifiable footprint in memory.

The redact_key_from_crash_dump function would be a good place to look for juicy bugs. Any case it misses tells you what to look for.


Another article from Microsoft in this affair that barely (if at all) answers more questions than it raises.



