> The RAM trade off is excellent for normal sizes but if you scale enormously the trade off eventually reverses
I don't think you've watched the talk. The minimal RAM-per-core is quite high, and often sits there unused even though it could be used to reduce the usage of the more expensive CPU. You pay for RAM that you could use to reduce CPU utilisation and then don't use it. What you want to aim for is a RAM/CPU usage that matches the RAM/CPU ratio on the machine, as that's what you pay for. Doubling the CPU often doubles your cost, but doubling RAM costs much less than that (5-15%).
If two implementations of an algorithm use different amounts of memory (assuming they're reasonable implementations), then the one using less memory has to use more CPU (e.g. it could be compressing the memory or freeing and reusing it more frequently). Using more CPU to save on memory that you've already paid for is just wasteful.
Another way to think about it is to consider the extreme case (although it works for any interim value) where a program, say a short-running one, uses 100% of the CPU. While that program runs, no other program can use the machine, anyway, so if you don't use up to 100% of the machine's RAM to reduce the program's duration, then you're wasting it.
As the talk says, it's hard to find less than 1GB per core, so if a program uses computational resources that correspond to a full core yet uses less than 1GB, it's wasteful in the sense that it's spending more of a more expensive resource to save on a less expensive one. The same applies if it uses 50% of a core and less than 500MB of RAM.
Of course, if you're looking at kernels or drivers or VMs or some sorts of agents - things that are effectively pure overhead (rather than direct business value) - then their economics could be different.
> Second thing though: Unpredictability. GC means you can't be sure when reclamation happens.
What you say may have been true with older generations of GCs (or even something like Go's GC, which is basically Java's old CMS, recently removed after two newer GC generations). OpenJDK's current GCs, like ZGC, do zero work in stop-the-world pauses. Their work is more evenly spread out and predictable, and even their latency is more predictable than what you'd get with something like Rust's reference-counting GC. C#'s GC isn't that stellar either, but most important server-side software uses Java, anyway.
The one area where manual memory management still beats the efficiency of a modern tracing GC (although maybe not for long) is when there's a very regular memory usage pattern through the use of arenas, which is another reason why I find Zig particularly interesting - it's most powerful where modern GCs are weakest.
By design or happy accident, Zig is very focused on where the problems are: the biggest security issue for low-level languages is out-of-bounds access, and Zig focuses on that; the biggest shortcoming of modern tracing GCs is arena-like memory usage, and Zig focuses on that. When it comes to the importance of UAF, compilation times, and language complexity, I think the jury is still out, and Rust and Zig obviously make very different tradeoffs here. Zig's bottom-line impact, like that of Rust, may still be too low for widespread adoption, but at least I find it more interesting.
> As I understand it this one is why Microsoft are rewriting Office backend stuff in Rust after writing it originally in C#
The rate at which MS is doing that is nowhere near where it would be if there were some significant economic value. You can compare that to the rate of adoption of other programming languages or even techniques like unit testing or code review. With any new product, you can expect some noise and experimentation, but the adoption of products that offer a big economic value is usually very, very fast, even in programming.
> What you want to aim for is a RAM/CPU usage that matches the RAM/CPU ratio on the machine, as that's what you pay for.
This totally ignores the role of memory bandwidth, which is often the key bottleneck on multicore workloads. It turns out that using more RAM costs you more CPU, too, because the CPU time is being wasted waiting for DRAM transfers. Manual memory management (augmented with optional reference counting and "borrowed" references - not the pervasive refcounting of Swift, which performs less well than modern tracing GC) still wins unless you're dealing with the messy kind of workload where your reference graphs are totally unpredictable and spaghetti-like. That's the kind of problem that GC was really meant for. It's no coincidence that tracing GC was originally developed in combination with LISP, the language of graph-intensive GOFAI.
> It turns out that using more RAM costs you more CPU
Yes, memory bandwidth adds another layer of complication, but it doesn't matter so much once your live set is much larger than your L3 cache. I.e. a 200MB live set and a 100GB live set are likely to require the same bandwidth. Add to that the fact that tracing GCs' compaction can also help (with prefetching) and the situation isn't so clear.
> That's the kind of problem that GC was really meant for.
Given the huge strides in tracing GCs over the past ten and even five years, and their incredible performance today, I don't think it matters what those of 40+ years ago were meant for, but I agree there are still some workloads - not anything that isn't spaghetti-like, but specifically arena-style ones - where manual management is more efficient than tracing GCs (young-gen works a little like an arena but not quite), which is why GCs are now turning their attention to that kind of workload, too. The point remains that it's very useful to have a memory management approach that can turn the RAM you've already paid for to reduce CPU consumption.
Indeed, we're not seeing any kind of abandonment of tracing GC at a rate that is even close to suggesting some significant economic value in abandoning them (outside of very RAM-constrained hardware, at least).
> The point remains that it's very useful to have a memory management approach that can turn the RAM you've already paid for to reduce CPU consumption.
That approach is specifically arenas: if you can put useful bounds on the maximum size of your "dead" data, it can pay to allocate everything in an arena and free it all in one go. This saves you the memory traffic of both manual management and tracing GC. But coming up with such bounds involves manual choices, of course.
It goes without saying that memory compaction involves a whole lot of extra traffic on the memory subsystem, so it's unlikely to help when memory bandwidth is the key bottleneck. Your claim that a 200MB working set is probably the same as a 100GB working set (or, for that matter, a 500MB or 1GB working set, which is more in the ballpark of real-world comparisons) when it comes to how it's impacted by the memory bottleneck is one that I have some trouble understanding also - especially since you've been arguing for using up more memory for the exact same workload.
Your broader claim wrt. memory makes a whole lot of sense in the context of how to tune an existing tracing GC when that's a forced choice anyway (which, AIUI, is also what the talk is about!) but it just doesn't seem all that relevant to the merits of tracing GC vs. manual memory management.
> we're not seeing any kind of abandonment of tracing GC at a rate that is even close to suggesting some significant economic value in abandoning them
We're certainly seeing a lot of "economic value" being put on modern concurrent GCs that can at least perform tolerably well even without a lot of memory headroom. That's how the Golang GC works, after all.
> It goes without saying that memory compaction involves a whole lot of extra traffic on the memory subsystem
It doesn't go without saying that compaction involves a lot of memory traffic, because memory is utilised to reduce the frequency of GC cycles and only live objects are copied. The whole point of tracing collection is that extra RAM is used to reduce the total amount of memory management work. If we ignore the old generation (which the talk covers separately), the idea is that you allocate more and more in the young gen, and when it's exhausted you compact only the remaining live objects (which is a constant for the app); the more memory you assign to the young gen, the less frequently you need to do even that work. There is no work for dead objects.
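A toy model of that argument, as a rough sketch only: it assumes a single copying young generation, a constant live set, a collection whenever the free space fills up, and a cost equal to the live bytes copied. The function name and the numbers are made up for illustration and are not from the talk.

```python
def young_gen_copy_work(total_allocated_mb, live_set_mb, young_gen_mb):
    """Approximate MB copied by a copying young-gen collector.

    Assumptions (illustrative only): the live set is constant, each
    collection copies exactly the live set, and a collection is triggered
    whenever the free space (young gen minus live set) has been filled.
    """
    if young_gen_mb <= live_set_mb:
        raise ValueError("young gen must be larger than the live set")
    free_per_cycle = young_gen_mb - live_set_mb
    cycles = total_allocated_mb // free_per_cycle  # completed collections
    return cycles * live_set_mb                    # dead objects cost nothing


# Same program (same allocation volume, same live set), two young-gen sizes.
small = young_gen_copy_work(total_allocated_mb=100_000, live_set_mb=50, young_gen_mb=150)
large = young_gen_copy_work(total_allocated_mb=100_000, live_set_mb=50, young_gen_mb=1050)
print(small, large)
```

In this model, growing the young gen from 150 MB to 1050 MB cuts the copying work by a factor of ten, because the live set is copied a tenth as often.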
> when it comes to how it's impacted by the memory bottleneck is one that I have some trouble understanding also - especially since you've been arguing for using up more memory for the exact same workload.
Memory bandwidth - at least as far as latency is concerned - is used when you have a cache miss. Once your live set is much bigger than your L3 cache, you get cache misses even when you want to read it. If you have good temporal locality (few cache misses), it doesn't matter how big your live set is, and the same is true if you have bad temporal locality (many cache misses).
> which, AIUI, is also what the talk is about
The talk focuses on tracing GCs, but it applies equally to manual memory management (as discussed in the Q&A; using less memory for the same algorithm requires CPU work regardless of whether it's manual or automatic).
> when that's a forced choice
I don't think tracing GCs are ever a forced choice. They keep getting chosen over and over for heavy workloads on machines with >= 1GB/core because they offer a more attractive tradeoff than other approaches for some of the most popular application domains. There's little reason for that to change unless the economics of DRAM/CPU change significantly.
> It doesn't go without saying that compaction involves a lot of memory traffic
It definitely tracks with my experience. Did you see Chrome on AMD EPYC with 2TB of memory? It reached like 10% of Mem utility but over 46% of CPU around 6000 tabs. Mem usage climbed steeply at first but got overtaken by CPU usage.
I have no idea what it's using its CPU on, whether it has anything to do with memory management, or what memory management algorithm is in use. Obviously, the GC doesn't need to do any compaction if the program isn't allocating, and the program can only allocate if it's actually doing some computation. Also, I don't know the ratio of live set to total heap. A tracing GC needs to do very little work if most of the heap is garbage (i.e. the ratio of live set to the total memory is low), but any form of memory management - tracing or manual - needs to do a lot of work if the ratio is high. Remember, a tracing-moving GC doesn't spend any cycles on garbage; it spends cycles on live objects only. Giving it more heap (assuming the same allocation rate and live set) means more garbage and less CPU consumption (as GC cycles are less frequent).
All you know is that CPU is exhausted before the RAM is, which, if anything, means that it may have been useful for Chrome to use more RAM (and reduce the liveset-to-heap ratio) to reduce CPU utilisation, assuming this CPU consumption has anything to do with memory management.
There is no situation in which, given the same allocation rate and live set, adding more heap to a tracing GC makes it work more. That's why in the talk he says that a DIMM is a hardware accelerator for memory management if you have a tracing-moving collector: increase the heap and voila, less CPU is spent on memory management.
That's why tracing-moving garbage collection is a great choice for any program that spends a significant amount of CPU on memory management, because then you can reduce that work by adding more RAM, which is cheaper than adding more CPU (assuming you're running on a machine that isn't RAM-constrained, like small embedded devices).
> (outside of very RAM-constrained hardware, at least)
I've spent much of my career working on desktop software, especially on Windows, and especially programs that run continuously in the background. I've become convinced that it's my responsibility to treat my user's machines as RAM-constrained, and, outside of any long-running compute-heavy loops, to value RAM over CPU as long as the program has no noticeable lag. My latest desktop product was Electron-based, and I think it's pretty light as Electron apps go, but I wish I'd had the luxury of writing it all in Rust so it could be as light as possible (at least one background service is in Rust). My next planned desktop project will be in Rust.
A recent anecdote has reinforced my conviction on this. One of my employees has a PC with 16 GB of RAM, and he couldn't run a VMware VM with 4 GB of guest RAM on that machine. My old laptop, also with 16 GB of RAM, had plenty of room for two such VMs. I didn't dig into this with him, but I'm guessing that his machine is infested with crap software, much of which is probably using Electron these days, each program assuming it can use as much RAM as it wants. I want to fight that trend.
It's perfectly valid to choose RAM over CPU. What isn't valid is believing that this tradeoff doesn't exist. However, cloud deployments are usually more CPU-constrained than RAM constrained, so it's important to know that more RAM can be used to save CPU when significant processing is spent on memory management.
That talk is mainly a GC person assuring you and perhaps themselves that all this churn is actually desirable. While "Actually this was a terrible idea and I regret it" is a topic sometimes (e.g. Tony Hoare has done this more than once) the vast majority of such talks exist to assure us that the speaker was correct, or at worst that they made some brief mistake and have now corrected it. So there's nothing unexpected here, I would not expect something else from the GC maintainer.
The part you've mistaken for being somehow general and relevant to our conversation is about trade-offs within GC. So it's completely irrelevant rather than being, as you seemed to imagine, an important insight. It actually reminds me of the early 1940s British feedback on intelligence for the V2 rocket. British and American Scientists were quite sure that Germany could not develop such a rocket, toy rockets work but at scale this rocket cannot work.
You can buy those toys today, a model store or similar will sell you the basics so you can see for yourself, and indeed if you scale that up it won't make an effective weapon. However intelligence sources eventually revealed how the German V2 was actually fuelled. To us today having seen space rockets it's obvious, the fuel is a liquid not a solid like the toy, which makes fuel loading rather difficult but delivers enormously more thrust. The experts furiously recalculated and discovered that of course these rockets would work unlike the scaled up toy they'd been assessing before. The weapon was very real.
Anyway. The speaker is assuming that we're discussing how much RAM to use on garbage. Because they're assuming a garbage collector, because this is a talk about GC. But a language like Rust isn't using any RAM for this. "The fuel is liquid" isn't one of the options they're looking at, it's not what their talk is even about so of course they don't cover it.
> With any new product, you can expect some noise and experimentation, but the adoption of products that offer a big economic value is usually very, very fast, even in programming.
What you've got here is the perfect market fallacy. This is very, very silly. If you're young enough to actually believe it due to lack of first hand experience then you're going to have a rude awakening, but I think sadly it probably just means you don't like the reality here and are trying to explain it away.
> That talk is mainly a GC person assuring you and perhaps themselves that all this churn is actually desirable. While "Actually this was a terrible idea and I regret it" ...
The speaker is one of the world's leading experts on memory management, and the "mistake" is one of the biggest breakthroughs in software history, which is, today, the leading chosen memory management approach. Tony Hoare has done this when his mistakes became apparent; it's hard to find people who say "it was a terrible idea and I regret it" when the idea has won spectacularly and is widely recognised to be quite good.
> Anyway. The speaker is assuming that we're discussing how much RAM to use on garbage. Because they're assuming a garbage collector, because this is a talk about GC. But a language like Rust isn't using any RAM for this. "The fuel is liquid" isn't one of the options they're looking at, it's not what their talk is even about so of course they don't cover it.
Hmm, so you didn't really understand the talk, then. You can reduce the amount of garbage to zero, and the point still holds. To consume less RAM - by whatever means - you have to spend CPU. After programming in C++ for about 25 years, this is obvious to me, as it should be to anyone who does low-level programming.
The point is that a tracing-moving algorithm can use otherwise unused RAM to reduce the amount of CPU spent on memory management. And yes, you're right, usually in languages like C++, Zig, or Rust we typically do not use extra RAM to reduce the amount of CPU we spend on memory management, but that doesn't mean that we couldn't or shouldn't.
> What you've got here is the perfect market fallacy.
A market failure/fallacy isn't something you can say to justify any opinion you have that isn't supported by empirical economics. A claim that the market is irrational may well be true, but it is recognised, even by those who make it, as something that requires a lot of evidence. Saying that you know what the most important thing is and that you know the best way to get it, and then using the fact that most experts and practitioners don't support your position as evidence that it is correct, is the Galileo complex: what proves I'm right is that people say I'm wrong. Anyway, a market failure isn't something that's merely stated; it's something that's demonstrated.
BTW, one of those times Tony Hoare said he was wrong? It was when he claimed the industry doesn't value correctness or won't be able to achieve it without formal proofs. One of the lessons from that in the software correctness community is to stop claiming or believing we have found the universally best path to correctness, and that's why we stopped doing that in the nineties. Today it's well accepted there can be many effective paths to correctness, and the research is more varied as well.
I started programming in 1988-9, I think, and there's been a clear improvement in quality even since then (despite the growth in size and complexity of software). Rust makes me nostalgic because it looks and feels and behaves like a 1980s programming language - ML meets C++ - and I get its retro charm (and Zig has it, too, in its Scheme meets C sort of way), but what we've learnt, and what Tony Hoare has learnt, is that there are many valid approaches to correctness. Rust offers one approach, which may be attractive to some, but there are others that may be attractive to more.
> You can reduce the amount of garbage to zero, and the point still holds.
Nah, in the model you're imagining now the program takes infinite time. But we can observe that our garbage free Rust program doesn't take infinite time, it's actually very fast. That's because your model is of a GC system - where ensuring no garbage really would need infinite time and a language without GC isn't a GC with zero garbage, it's entirely different, that's the whole point.
More generally, a GC-less solution may be more CPU intensive, or it may not, and although there are rules of thumb it's difficult to have any general conclusions. If you work on Java this is irrelevant, your language requires a GC, so this isn't even a question and thus isn't worth mentioning in a talk about, again, the Java GC.
> A claim that the market is irrational may well be true
Which makes your entire thrust stupid. You depend upon the perfect market fallacy for your point there, the claim that if this was a good idea people would necessarily already be doing it - once you accept that's a fallacy you have nothing.
> Nah, in the model you're imagining now the program takes infinite time.
Wat? Where did you get that?
> That's because your model is of a GC system - where ensuring no garbage really would need infinite time and a language without GC isn't a GC with zero garbage, it's entirely different, that's the whole point.
Except it's not different, and working "without garbage" doesn't take infinite time even with a garbage collector. For example, a reference counting GC also has zero garbage (in fact, that's the GC Rust uses to manage Rc/Arc) and doesn't take infinite time. It does, however, sacrifice CPU for lower RAM footprint. Have you studied the theory of memory management? The CPU/memory tradeoff exists regardless of what algorithm you use for memory management, it's just that some algorithms allow you to control that tradeoff within the algorithms and others require you to commit to a tradeoff upfront.
For example, using an arena in C++/Rust/Zig is exactly about exploiting that tradeoff (Zig in particular is very big on arenas) to reduce the CPU spent on memory management by using more RAM: the arena isn't cleared until the transaction is done, which isn't minimal in terms of RAM, but requires less CPU. Oh, and if you use an arena (or even a pool), you have garbage, BTW. Garbage means an object that isn't used by the program, but whose memory cannot yet be reused for the next allocation.
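As a rough sketch of that tradeoff, here is a toy bump-allocating arena. It is not how any particular Zig, Rust, or C++ arena is implemented; the `Arena` class and its sizes are hypothetical and only meant to show where the garbage sits and how cheap the bulk free is.

```python
class Arena:
    """Toy bump allocator: hands out slices of one buffer, frees all at once."""

    def __init__(self, capacity):
        self.buffer = bytearray(capacity)
        self.offset = 0   # next free byte
        self.peak = 0     # high-water mark, i.e. the RAM price paid

    def alloc(self, size):
        if self.offset + size > len(self.buffer):
            raise MemoryError("arena exhausted")
        start = self.offset
        self.offset += size                      # allocation is one pointer bump
        self.peak = max(self.peak, self.offset)
        return memoryview(self.buffer)[start:start + size]

    def reset(self):
        self.offset = 0                          # one O(1) "free" for every object


arena = Arena(capacity=1024)
for _ in range(10):                  # one "transaction" with 10 short-lived objects
    chunk = arena.alloc(64)
    chunk[0] = 1                     # use the memory; it is dead after this iteration
garbage_before_reset = arena.offset  # bytes held although nothing needs them any more
arena.reset()
```

Just before `reset()`, the arena is holding 640 bytes of dead objects that cannot be reused yet, which is exactly the garbage described above; in exchange, there were no per-object frees at all.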
If you think low-level languages don't have garbage, then you haven't done enough low-level programming (and learn about arenas; they're great). There are many pros and many cons to the Rust approach, and it sure is a good tradeoff in some situations, but the biggest Rust zealots - those who believe it's universally superior - are those who haven't done much low-level programming and don't understand the tradeoffs it involves. It's also them who think that those of us who were there, who first picked up C++ and later abandoned it for some use-cases, did so only because it wasn't memory-safe. That was one reason, but there were many others, at least equally decisive. Rust fixes some of C++'s shortcomings but certainly not all; Zig fixes others (those that happen to be more important to me) but certainly not all. They're both useful, each in their own way, and neither comes close to being "the right way to program". Knowledgeable, experienced people don't even claim that, and are careful to point out that they may genuinely believe some universal superiority but that they don't actually have the proof.
Whether you use a GC (tracing-moving as in Java, refcounting as in Rust's Rc or Swift, or tracing-sweeping as in Go), use arenas, or manually do malloc/free, the same principles and tradeoffs of memory management apply. That's because abstract models of computation - Turing machines, the lambda calculus, or the game of life, pick your favourite - have infinite memory, but real machines don't, which means we have to reuse memory, and doing that requires computational work. That's what memory management means. Some algorithms, like Rust's primitive refcounting GC, aim to reuse memory very aggressively, which means they do some work (such as updating free-lists) as soon as some object is unused so that its memory can be reused immediately, while other approaches postpone the reuse of memory to do less work. That's what tracing collectors or arenas do, and that's precisely why we who do low-level programming like arenas so much.
Anyway, the point of the talk is this: the CPU/memory tradeoff is, of course, inherent to all approaches of memory management (though not necessarily in every particular use case), and so it's worth thinking about the fact that the minimal amount of RAM per core on most machines these days is high. This applies to everything. Then he explains that the tracing-moving collectors allow you to turn the tradeoff dial - within limits - within the same algorithm. Does it mean tracing-moving collection is always the best choice? No. But do approaches that strive to minimise footprint somehow evade the CPU/memory tradeoff? Absolutely not.
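For a feel of what "reuse memory very aggressively" looks like, CPython happens to use reference counting for acyclic objects, so the same immediacy can be observed there. This is only an analogy to Rc/Arc, it relies on CPython behaviour rather than a language guarantee, and the `Node` class is made up for the example.

```python
import weakref


class Node:
    """Hypothetical payload type; plain class instances support weak references."""


obj = Node()
probe = weakref.ref(obj)   # observes the object without keeping it alive
alias = obj                # a second strong reference

del obj
still_alive = probe() is not None        # the alias still holds a reference

del alias
reclaimed_immediately = probe() is None  # count hit zero, freed on the spot (CPython)
print(still_alive, reclaimed_immediately)
```

The object is reclaimed the moment the last reference goes away, so there is never any garbage, but the bookkeeping is paid on every reference created or dropped rather than being deferred.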
A lesson that a low-level programmer may take away from that could be something like, maybe I should rely on RC less and, instead, try to use larger arenas if I can.
> Which makes your entire thrust stupid. You depend upon the perfect market fallacy for your point there
Except I don't depend on it. I say it's evidence that cannot be ignored. The market can be wrong, but you cannot assume it is.
The talk you're so excited about actually shows this asymptote. In reality a GC doesn't actually want zero garbage because we're trading away RAM to get better performance. So they don't go there, but it ought to have pulled you up short when you thought you could apply this understanding to an entirely unrelated paradigm.
Hence the V2 comparison. So long as you're thinking about those solid fuel toy rockets the V2 makes no sense, such a thing can't possibly work. But of course the V2 wasn't a solid fuel rocket at all and it works just fine. Rust isn't a garbage collected language and it works just fine.
GC is very good for not caring about who owns anything. This can lead to amusing design goofs (e.g. the for-each bug in C# until C# 5) but it also frees programmers who might otherwise spend all their time worrying about ownership to do something useful, which is great. However, it really isn't the panacea you seem to have imagined, though at least it is more widely applicable than arenas.
> it's worth thinking about the fact that the minimal amount of RAM per core on most machines these days is high.
Like I said, this means you are less likely to value the non-GC approach when you're small. You can put twice as much RAM in the server for $50 so you do that, you do not hire a Rust programmer to make the software fit.
But at scale the other side matters - not the minimum but the maximum, and so whereas doubling from 16GB RAM to 32GB RAM was very cheap, doubling from 16 servers to 32 servers because they're full means paying twice as much, both as capital expenditure and in many cases operationally.
> I say it's evidence
I didn't see any evidence. Where is the evidence? All I saw was the usual perfect market stuff where you claim that if it worked then they'd have already completed it by some unspecified prior time, in contrast to the fact I mentioned that they've hired people to do it. I think facts are evidence and the perfect market fallacy is just a fallacy.
> The talk you're so excited about actually shows this asymptote.
Oh, I see the confusion. That asymptote is for the hypothetical case where the allocation rate grows to infinity (i.e. remains constant per core and we add more cores) while the heap remains constant. Yes, with an allocation rate growing to infinity, the cost of memory management (using any algorithm) also grows to infinity. That it's so obvious was his point showing why certain benchmarks don't make sense as they increase the allocation rate but keep the heap constant.
> So they don't go there, but it ought to have pulled you up short when you thought you could apply this understanding to an entirely unrelated paradigm.
I'm sorry, but I don't think you understand the theory of memory management. You obviously run into these problems even in C. If you haven't then you haven't been doing that kind of programming long enough. Some results are just true for any kind of memory management, and have nothing to do with a particular algorithm. It's like how in computational complexity theory, certain problems have a minimal cost regardless of the algorithm chosen.
> But at scale the other side matters - not the minimum but the maximum, and so whereas doubling from 16GB RAM to 32GB RAM was very cheap, doubling from 16 servers to 32 servers because they're full means paying twice as much, both as capital expenditure and in many cases operationally.
I've been working on servers in C++ for over 20 years, and I know the tradeoffs, and when doing this kind of programming seeing CPU exhausted before RAM is very common. I'm not saying there are never any other situations, but if you don't know how common this is, then it seems like you don't have much experience with low-level programming. Implying that the more common reason to need more servers is because what's exhausted first is the RAM is just not something you hear from people with experience in this industry. You think that the primary reason for horizontal scaling is dearth of RAM?? Seriously?! I remember that in the '90s or even early '00s we had some problems of not enough RAM on client workstations, but it's been a while.
In the talk he tells memory-management researchers about the economics in industry, as they reasonably might not be familiar with them, but decision-makers in industry - as a whole - are.
Now, don't get me wrong, low level languages do give you more control over resource tradeoffs. When we use those languages, sometimes we choose to sacrifice CPU for footprint and use a refcounting GC or a pool when it's appropriate, and sometimes we sacrifice footprint for less CPU usage and use an arena when it's appropriate. This control is the benefit of low level languages, but it also comes at a significant cost, which is why we use such languages primarily for software that doesn't deliver direct business value but is "pure overhead", like kernels, drivers, VMs, and browsers, or for software running on very constrained hardware.
> I didn't see any evidence.
The evidence is that in a highly competitive environment of great economic significance, where getting big payoffs is a way to get an edge over the competition, technologies that deliver high economic payoffs spread quickly. If they don't, it could be a case of some market failure, but then you'd have to explain why companies that can increase their profits significantly and/or lower their prices choose not to do so.
When you claim some technique would give a corporation a significant competitive edge, and yet most corporations don't take it (at least not for most projects), then that is evidence against that claim because usually companies are highly motivated to gain an advantage. I'm not saying it's a closed case, but it is evidence.
> When you claim some technique would give a corporation a significant competitive edge, and yet most corporations don't take it (at least not for most projects), then that is evidence against that claim because usually companies are highly motivated to gain an advantage.
Corporations will generally want to optimize for ease of development and general ecosystem maturity. Rust is at a clear disadvantage at least wrt. the latter - the first generally usable version of the language was only released in late 2018 - and other safe languages are generally GC-based (compare Java/C# with C++). It's quite normal that a lot of performance would be left on the table wrt. both CPU and memory footprint.
But C++ is over 40 years old, and Java et al. displaced it in like five minutes. And that the footprint savings aren't worth much is pretty much the point. If you get 1GB/core and you use less, then you can't run more programs on the machine. The machine is exhausted when the first of RAM/CPU is, not when both are.
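The "first resource exhausted" point can be sketched with a few lines of arithmetic. The machine shape and the per-instance figures below are hypothetical, chosen only to match the 1GB/core ratio mentioned above.

```python
import math


def instances_per_machine(cores, ram_gb, cpu_per_instance, ram_gb_per_instance):
    """Copies of a program that fit before the first resource is exhausted."""
    by_cpu = math.floor(cores / cpu_per_instance)
    by_ram = math.floor(ram_gb / ram_gb_per_instance)
    return min(by_cpu, by_ram)


# Hypothetical 64-core machine with 1 GB of RAM per core.
frugal = instances_per_machine(64, 64, cpu_per_instance=1.0, ram_gb_per_instance=0.25)
matched = instances_per_machine(64, 64, cpu_per_instance=1.0, ram_gb_per_instance=1.0)
print(frugal, matched)
```

Both variants fit 64 instances: cutting the footprint from 1 GB to 256 MB per instance buys nothing here, because the cores run out first.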
Nah. I'm an old man. I remember when Java 1.0 shipped. It got relatively little initial enterprise adoption considering it was from Sun who had a lot of enterprise contracts. Traction took years and was often aligned with adoption of Tim's crap hypermedia system, the "World Wide Web" which he'd created years prior but was just beginning to intrude into normal people's lives by the end of the 1990s.
A big factor is that Java was the entire ecosystem, you're getting a programming language (which is pretty good), a novel virtual machine (most of their ideas fell through but the basic thing is fine), dedicated hardware (mostly now forgotten), a component architecture, and a set of tools for them.
You're still on this 1GB/core thing which is the wrong end of the scale in two senses. Firstly, I've worked on systems where we'd want 1TB/core and so today that means you're buying a lot of CPU performance you don't need to get enough RAM because as you say, that machine is "exhausted" anyway.
But more importantly, the scale for big products isn't dictated by RAM or CPU; it's dictated by service provision, and at large scale that's just linear. Twice as much provision, twice the cost. Avoiding a GC can let you slash that cost. Cutting a $1M annual bill in half would justify hiring a Rust programmer, though likely not a whole team. Cutting a $1Bn annual bill in half - which is much more like what Microsoft are spending on O365 - is obviously worth hiring a team.
It's not instant. GC tuning is basically instant. RIIR might take three, five, even ten years. So don't expect results tomorrow afternoon.
Rust is as old now as Java was when JDK 6 came out. But it's not just Java. Look at Fortran, C, C++, JS, C#, PHP, Ruby, Go, and even the late-bloomer Python - no popular language had an adoption rate as low as Rust at its rather advanced age. The trend just isn't there. It may well be that low-level programming will slowly shift to Rust and Zig, but there is no indication that low-level programming as a whole isn't continuing its decline (in use, not importance).
> Avoiding a GC can let you slash that cost
But it doesn't, because RAM isn't the bottleneck in the vast majority of cases. It doesn't matter how linear costs are if RAM isn't the thing that's exhausted. That's why the use of manual memory management had been declining for decades.
At 1TB per core things still don't change, because GC no longer has a high footprint cost in the old gen. You may use 15x RAM for the first 50 MB, but the overhead for the second GB is very small, thanks to the generational hypothesis: the older a generation is, the less frequently objects are allocated into it. The cost of a moving-tracing GC is proportional to allocation-rate * live-set / heap-size per generation. When the live set is large, the allocation rate in that generation is low, which means that the heap size premium is also low.
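A quick worked example of that proportionality, as a sketch only. The constant of proportionality is dropped, so only the ratio between the two results means anything, and every rate and size below is an assumed figure, not a measurement.

```python
def relative_gc_cost(alloc_rate_mb_s, live_set_mb, heap_size_mb):
    """Relative cost per generation: allocation-rate * live-set / heap-size."""
    return alloc_rate_mb_s * live_set_mb / heap_size_mb

# Young generation (assumed): high allocation rate, 50 MB live set,
# heap sized at 15x the live set.
young = relative_gc_cost(alloc_rate_mb_s=1000, live_set_mb=50, heap_size_mb=15 * 50)

# Old generation (assumed): 1 TB live set, low allocation (promotion) rate,
# heap only 10% larger than the live set.
old = relative_gc_cost(alloc_rate_mb_s=1, live_set_mb=1_000_000, heap_size_mb=1_100_000)

print(f"young gen relative cost: {young:.2f}")
print(f"old gen relative cost:   {old:.2f}")
```

With these made-up inputs the old generation comes out far cheaper than the young one despite carrying only a 10% heap premium, which is the shape of the argument above: a low allocation rate offsets a large live set.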
> So don't expect results tomorrow afternoon.
Manual memory management is the older option, and it's been in decline for decades precisely because the savings in costs go in the other direction (not in all situations, but in most) due to the economics of RAM costs vs CPU costs. Without a marked change in the economics, the trend won't reverse even in another 50 years.
The one area where manual memory management still beats the efficiency of a modern tracing GC (although maybe not for long) is when there's a very regular memory usage pattern through the use of arenas, which is another reason why I find Zig particularly interesting - it's most powerful where modern GCs are weakest.
By design or happy accident, Zig is very focused on where the problems are: the biggest security issue for low-level languages is out-of-bounds access, and Zig focuses on that; the biggest shortcoming of modern tracing GCs is arena-like memory usage, and Zig focuses on that. When it comes to the importance of UAF, compilation times, and language complexity, I think the jury is still out, and Rust and Zig obviously make very different tradeoffs here. Zig's bottom-line impact, like that of Rust, may still be too low for widespread adoption, but at least I find it more interesting.
> As I understand it this one is why Microsoft are rewriting Office backend stuff in Rust after writing it originally in C#
The rate at which MS is doing that is nowhere near where it would be if there were some significant economic value. You can compare that to the rate of adoption of other programming languages or even techniques like unit testing or code review. With any new product, you can expect some noise and experimentation, but the adoption of products that offer a big economic value is usually very, very fast, even in programming.