Meltdown fix impact on Redis performances in virtualized environments (gist.github.com)
178 points by ABS on Jan 5, 2018 | 73 comments


Breakdown from post:

~8% pipelined SET performance reduction after patch

~15% pipelined GET performance reduction after patch

~29% non-pipelined SET performance reduction after patch

~25% non-pipelined GET performance reduction after patch


Note also that the host OS (the one running the hypervisor) was already patched in all cases. These tests only reflect the difference between a patched and unpatched guest OS. The total performance reduction is likely greater.
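
For anyone repeating the experiment, it's worth confirming what a given guest is actually running. A rough sketch (exact messages and paths vary by kernel version and distro backport, and the sysfs file only exists on newer kernels):

  # Does the running kernel report KPTI as active?
  dmesg | grep -i isolation
  # On kernels new enough to expose it (4.15+ and some backports):
  cat /sys/devices/system/cpu/vulnerabilities/meltdown
  # The 'pti' CPU flag also shows up in the flags line on patched kernels:
  grep -qw pti /proc/cpuinfo && echo 'pti enabled'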


Good point - I guess it's more or less impossible to do this test on the public cloud, now (at least on the big three).

It'd be fun to set up the hardware and do a full matrix benchmark of patched/unpatched host and patched/unpatched guest.


Holy shit. Isn't that enough?


And this all seems in line with what the expected slowdown would be. Non-pipelined calls will have more syscalls and therefore a bigger slowdown. Glad antirez took the time to do the benchmarks.
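
If you want to sanity-check the "more syscalls" intuition yourself, something like this (a rough sketch against a throwaway instance with a single redis-server process) gives a per-syscall count for the server while a benchmark runs:

  # Attach to the running redis-server, let redis-benchmark finish, then Ctrl-C
  # for the summary -- read/write should dominate, far more so without -P.
  # Note strace adds large overhead of its own, so use it to count calls, not to time them.
  strace -c -p $(pidof redis-server)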


https://www.phoronix.com/scan.php?page=article&item=linux-kp... shows that the kernel version has more of an impact.


That's super informative. As I wrote, my methodology was flawed and indeed... tests to redo. Btw, what's the reason for this terrible regression?


The regression being the drop in performance you saw? I don't know.. I'd be interested in seeing what the performance looks like on multiple "identical" C4.X8LARGE instances, regardless of KPTI.

If you mean their results.. they actually show an increase in perf in later kernel versions.

Based on

https://github.com/phoronix-test-suite/test-profiles/tree/ma...

it looks like they just use the default configuration and run the benchmark with

  -n 1000000 -P 32 -q -c 50 --csv
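
Presumably that means running the stock redis-benchmark tool against a local instance, i.e. something like:

  redis-benchmark -n 1000000 -P 32 -q -c 50 --csv

i.e. 1,000,000 requests, pipeline depth 32, 50 parallel clients, quiet CSV output -- so those numbers would be for the pipelined case.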


Yep, seeing a much more severe hit on all our elasticache instances than anywhere else. CPU graph from two underused ones: https://twitter.com/Adys/status/949432228727218177


We also saw a huge hit on our memcached elasticache nodes. Eyeballing from our CPU graph, seems ~30%.


I've noticed a big slowdown in PV EC2. One machine used to be fine running apache and mysql. I had to add a 2nd cpu and split the mysql off onto a different instance to keep load to manageable levels.


Hydra/Gamepedia got hit by the patch on its Elasticache instances around the same time. Redis is used heavily in its caching infrastructure and the web nodes started thrashing due to the sudden performance loss (until the resources were increased).


The two graphs have different y-axis-scales.


Obviously? They're two different instances.


Here's hoping your infrastructure scaling plan has more than 30% CPU headroom...


Is this good news for AWS? Some customers will be forced to upgrade to a larger instance.


Unless customers absolutely raise hell, this is going to be very good for AWS. Suddenly all of these autoscaling setups are going to consume 30% more resources in order to achieve the same throughput.

My expectation is we may see AWS apply some significant discounting to AWS services as a result, though that doesn't help all of us who utilize mostly prepaid reserved instances.


> My expectation is we may see AWS apply some significant discounting to AWS services as a result, though that doesn't help all of us who utilize mostly prepaid reserved instances.

It'll happen once one of the 3 main clouds breaks ranks and discounts their rates to match.


Are the majority of aws revenues from casual users that probably wouldn't care? If so, I think you're going to be proven right.

If, however, major customers make up the majority of revenue, then the answer seems less clear to me.


Major customers are absolutely the primary revenue source. AWS revenue follows the pareto principle.


Reddit has been running like garbage the last few days - I bet they're getting slammed with their instances needing mass reboots + big performance decreases.


Are they running in a virtualized environment? They could just disable it as all the code is their own and is trusted?


There’s a case to be made that AWS is suddenly providing you 30% less than what you paid for. Imagine having reserved instances and they get 30% slower.


Not sure if you saw this a few days back but this exact issue is already impacting some folks and they're obviously not happy.

https://forums.aws.amazon.com/thread.jspa?threadID=269858


Maybe, except that it makes running in non-virtualized environments more attractive.


Because you can run without security patches?


I hope so for this one. It would suck to have to pay the performance penalty when you don't need to.


Or they’ll go bankrupt.


Is there a real reason to set `pti=on` in virtualized kernels, other than an abundance of caution? I'm not familiar with the internals of VT-x or kvm-intel.ko, but as far as I know, this is just doubling the hit for no real reason (unless you're doing nested KVM or similar?).


Do you expect privilege isolation between users and kernel in your virtualized environment? If so, you have to enable PTI. If your virtualized environment just has a single root user, there's no need.
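
And if you decide you don't need it inside the guest, it can be toggled at boot rather than rebuilt -- a sketch for a GRUB-based distro (file locations and the regenerate command vary by distro):

  # /etc/default/grub -- add pti=off (or nopti) to the kernel command line,
  # keeping whatever parameters are already there in place of "..."
  GRUB_CMDLINE_LINUX="... pti=off"
  # regenerate the grub config and reboot (Debian/Ubuntu shown):
  update-grub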


Are the timing semantics within a virtual environment predictable enough to pull off the attack within the limits of the guest context? I guess at this point we could just test for ourselves.

EDIT: Actually, thinking about this some more ... if the host has retpoline and the microcode update, the indirect jump technique can't be used to deduce the state of the BHB, so the kernel addresses never leak. If KVM's _internal_ kernel space, which isn't actually protected by the CPU's ring 0 (I think, unless this is also provided by VT-x?), is guarded by whichever of IBRS/IBPB is relevant, then would those mitigations prevent guest exploitation as well regardless of timing semantics?


The mitigations you mentioned are all for Spectre, aka variants 1 and 2, aka tricking code running in kernel mode into doing something weird. But KPTI is specifically for Meltdown, aka variant 3, aka having user mode code speculatively access kernel data directly (data which is mapped in its address space but marked kernel-only - thus the ‘fix’ is to separate the address spaces). Protecting kernel code with retpolines or by flushing the branch predictor won’t help with Meltdown, because it doesn’t rely on kernel code at all.


I have this question as well. I can understand docker containers being vulnerable to other containers reading kernel memory across the boundary, but with VMs you would have to read into an entirely different kernel, right?

I can understand how the attack works for JavaScript to read from the parent process / I.e. steal session cookies, but how could this attack work across VM boundaries?


The Project Zero article [0] demonstrates breaking KVM this way by using indirect jumps to deduce the state of the CPU's internal branch history buffer:

>The branch history buffer state is leaked in steps of 2 bits by measuring misprediction rates of an indirect call with two targets.

That leaks out the physical address of the KVM driver, against which more code is speculatively executed, with a side channel being used to make these results deducible within the guest's memory area.

Technically speaking, the guest can't fully unsandbox (i.e., they can't run commands on the physical host OS) with Spectre/Meltdown alone, but they can read the memory content of the entire physical host in the case of Meltdown, or all of userspace in the case of a Meltdown-less Spectre.

This is being mitigated by retpoline + microcode update + new kernel parameters IBRS and IBPB, which will make it so that users can't mislead the CPU with speculative indirection (by killing the functionality in risky situations, another layer of slowdowns we will all have to confront soon; impact is not seen until major distros recompile everything with retpoline). PTI itself wouldn't mitigate that leakage, it just mitigates its exploitation to read back the host memory.

However, this is a hardware thing that affects the host OS. If the host already has these features enabled on the real kernel, what value do they provide within the guest itself? Is timing _within the guest kernel_ accurate enough for sandboxed exploitation, making this a universal sandbox-constrained privilege escalation? I don't know. It'd be nice to avoid the redundant overhead, of course.

[0] https://googleprojectzero.blogspot.com/2018/01/reading-privi...


Fairly substantial slowdown BUT still a phenomenal set of performance numbers. 800k GETs per second is way above my pay grade.


Would be curious to see this also done on the Google cloud.


Google Cloud does not have Redis as a service. You can deploy your own Redis to a GCE VM and try.


Yes deploy your own and do benchmarks before and after they implemented the patch.


Check Container-optimized OS release notes to see which version has the fix. If I am not mistaken stable-63 has the latest set of fixes.

https://cloud.google.com/container-optimized-os/docs/release...


Rather shocked that Amazon and the other cloud providers are not screaming harder at Intel. Maybe they are behind the scenes.

I believe the increased cost of I/O is going to hit Amazon and cannot be passed to the customer, as what you pay is contractual.

Then with compute instances the customer is paying more for ultimately the same amount of compute, and that is a PR issue for Amazon and the other cloud providers.

Seems like Amazon would not be OK with that and will want to pass it to Intel somehow.


And this is only going to be for installations that actually care about this level of security, so mostly just shared infrastructure/serverless/cloud. Internal systems aren't going to need the mitigation.

I think people are blowing this way out of proportion.


I personally think "the price of cloud computing just went up 5-10%" is "a big deal" and not some niche scenario.


Thing is I/O fees are contractual so Amazon has to suck up the increased cost or try to pass it somehow to Intel.

But with compute instances the extra cost will be on the customer, and it is hard to see Amazon taking the PR hit and not somehow trying to put it on Intel.


Like "all installations on AWS, GCE, DO and so on" are affected? Even semi-private clouds might need mitigations, since people rely on VMs as a security boundary. So yes, this is a big deal.


These test results do not show the real effect of PTI; they only show the effect of enabling PTI within a virtualized kernel. That's still useful because a lot of people are going to be enabling it (even though the value of this seems dubious), but real tests showing the approximate performance loss would need to happen on host hardware.


Meltdown on Linux allows reading all physical memory, not only kernel data. The defence against it is only dubious if one does not rely at all on process isolation as an extra layer of defence.


I understand that, but I don't know that either Spectre or Meltdown needs to be mitigated within the context of a KVM guest running on a host with mitigations. They are timing attacks, and virtualization may incur enough overhead/latency to make them infeasible (I don't know if it does or not, I haven't tested).

Also, theoretically, the KVM driver and/or QEMU could be updated to block speculative execution attacks specific to guest passthrough, though admittedly I haven't done any KVM or VT-x hacking so I don't know its intricacies; maybe this wouldn't work?

With a host that is using new vendor microcode to disable branch prediction within unsafe contexts (IBRS as a global alternative to retpoline, or IBPB), speculative execution attacks may be sufficiently mitigated at the host level without necessitating additional mitigation at the guest level.

I'm not sure if anyone who doesn't work at Intel knows that for sure yet, since it seems that some people are still trying to figure out how to get the new microcodes...


Host measures should stop the attacks between VMs, but they will not stop attacks within a VM. A busy VM's threads may run on their physical cores/threads uninterrupted by the host. Flushing branch prediction by the host on a context switch then does not happen, so this variant of Spectre still works.

The host measures will not protect against another variant of Spectre or Meltdown within the VM even if the host performs context switches. The frequency of switches is just too low to disrupt a single cycle of the attack, so the attack will just be slowed down. In addition, as processes within the VM have access to both a high-resolution timer and CPU performance counters, the attack can detect context switches, allowing for simpler recovery from them.


Sure, a VM running on a patched host might not need extra mitigation, but the host mitigation won’t magically be free, so running tests in a VM to get a rough idea of the impact is still a good idea.


Yeah I agree, but these tests can't show the cost without the base layer of mitigation (since at least PTI should be in effect across all instances at Amazon), so they're incomplete. The "unmitigated" version still has at least the host-level mitigations. I'd love to know more definitively whether host-only mitigation is enough.


But since when is a VM used as a workstation? AFAIK this vulnerability can be executed via JavaScript. If the browser is safe, then we are safe without patching the kernel.


Here's an example, though -- a former employer has dealt with scaling by staying just this side of what top-of-the-line hardware can support, in lieu of going distributed, because it massively simplifies ops: the possible failure modes of distributed systems simply aren't present in single-node systems. I'm still on a Slack with them and they're very unhappy. They don't have this kind of overhead available on their pg or redis instances.


Are they running on bare metal? I'd say don't patch and only run pg and redis on the machine. They should still be secure, no?


I agree. For most cases, the fix is worse than the problem.


I don't quite understand what "pipelining" means. Can anyone TL;DR?


Please don't bring lazyweb to HN. We can already see the duplicate replies (which are on the edge of karma whoring, for lack of a better term) filling the comment thread.

There is redis documentation for this, which is literally the first result for searching "pipelining redis":

https://redis.io/topics/pipelining

HN has a standard which at minimum is above LMGTFY.


Woah chill out. I don't think it's that big of a deal to ask a question, especially for newbies who find documentation intimidating and tough to get through.


To be fair, if getting through the documentation for a feature is tough, then engaging in the discussion about its performance and the potential impacts of these recent mitigations will be much, much more difficult.


That's fair. Although I usually look to HN to hopefully find someone who provides clarity over situations like this. Performance impacts are really easy to interpret incorrectly.


I can see from your profile that your account is "half troll", and there is not really room for troll accounts on HN, while we do value legitimate "devil's advocate" or contrary positions. This is how we maintain our level of discourse.

In the old days of the internet, asking for anything searchable was the action of last resort, and generally considered quite rude unless it was truly unfindable or incomprehensible. Taking action on one's own intellectual curiosity is invaluable for education.


Not to play devil's advocate here but there's no harm in labeling oneself as a troll unless his actions meet the criteria. I certainly don't believe he's trolling in this instance nor do his previous comments suggest past behavior of such. Anyways, just wanted to point out that this is a straw man argument.


[flagged]


While the delivery leaves something to be desired, I second what he's saying. You're essentially telling people "please do the work for me since I'm too lazy to". It's the kind of low effort post that plagues reddit. I love reddit, but appreciate that there's an alternative for higher quality discussion. Part of keeping discussion high quality is signaling to others what's acceptable and not acceptable. Simply ignoring your question wouldn't communicate to others that questions like yours should be avoided.


I get that. I'm happy to be downvoted to oblivion :)

Although, in a lower comment, talking through how the performance benchmarks were about using Redis for persistence was really helpful to me. I don't think I would have ever had that realization had I not asked the "dumb question" at first.

My hesitation with just asking my real question on using Redis for caching was that I didn't really understand what "pipelining" is. I've misinterpreted documentation so many times that I've found it's useful to ask dumb questions, even if the only value is to get validation that you were right.

Lower comment: https://news.ycombinator.com/item?id=16081482


Fairly certain this is talking about Redis pipelining (https://redis.io/topics/pipelining). It lets you send multiple commands to Redis without waiting for a response between each one.


Pipelined operations don't wait for the preceding operation to complete before they start. So this will be multiple gets being issued simultaneously.

This is often used to reduce latency but only works with operations that don't depend on the preceding operation.


It's great if you want to run something like "Here's a bunch of SET operations. Do them and get back to me when they're all finished."


It's basically just consolidating multiple requests into one larger request.
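
A quick way to see the difference locally (a sketch, assuming a local redis-server and the bundled redis-benchmark tool):

  redis-benchmark -t set,get -n 100000 -q          # one round trip per command
  redis-benchmark -t set,get -n 100000 -P 16 -q    # 16 commands per round trip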


So applications that use Redis mainly for caching won't be hit as hard then?


Applications that don’t use Redis as a persistent store, and enable pipelining, will be hit less hard, yes. Applications that connect to Redis remotely will also be hit less hard — loopback makes this somewhat of a worst-case scenario.


Ahhhh that's what this line was about:

> Test performed with AOF enabled

AOF is a type of persistence: https://redis.io/topics/persistence
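
Roughly, both persistence modes are just redis.conf directives (a sketch; defaults vary by version):

  # RDB: periodic point-in-time snapshots (snapshot if >=1 change in 900s)
  save 900 1
  # AOF: log every write; the fsync policy drives the extra syscall cost
  appendonly yes
  appendfsync everysec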

Most of the time, I've used Redis with (I think) RDB persistence, just in case. Usually we've used it as an in-memory cache.

Ok, that makes a lot of sense, thanks!


Pipelining is buffering multiple commands into one network packet, otherwise each command is a separate packet.


Not necessarily. The relationship to packets is pretty thin. Pipeline mode just avoids the wait after each command before beginning the next, the same thing SMTP can do.


Well, not necessarily a single packet depending on the amount of data, but one send+receive operation versus one for every command.

https://redis.io/topics/pipelining



