Virtual threads are going to be great, but they're still limited (they still starve the carrier pool when used with 'synchronized' blocks), and they aren't the structured concurrency powerhouses that Kotlin coroutines are, but they're an invaluable tool whose adoption will only accelerate as the ecosystem moves to embrace them.
Expect a lot of libraries to start releasing versions that are Java 21 baseline because of this feature alone. We're in for a bit of dependency hell for a short while. Thankfully, devs have been exposed to a mostly-final Loom for a year, so my hope is that at least the big projects are well on their way to quick adoption.
Unlike the 8->11 migration, which largely brought pain, the 8->21 release brings with it a ton of value that I think will encourage most shops to actually pull the trigger and finally abandon 8.
Structured concurrency in JDK 21 is not only a powerful and flexible library feature, but one that is built deep into the runtime in a way that allows observability into the relationships among threads: https://openjdk.org/jeps/453
It's somewhat unfortunate that structured concurrency ended up being a preview feature in 21. I agree that it's a great addition but man it'd be nice if it made the LTS.
As it stands, probably won't be heavily used until Java 25.
Organisations that care about new features need to reconsider their stance on using Preview features and even more importantly sticking to versions for which the sales org offers an LTS service. The whole concept of LTS is designed for companies that are so uninterested in new features (often because their software is no longer heavily maintained) that they're willing to pay money not to receive them. There are a lot of such projects around, and the fact that non-legacy projects choose old versions and an LTS service shows that the ecosystem still hasn't adapted to the new release model.
While I tend to agree, it's just a losing battle (in my experience). The issue is that shockingly few people in organizations actually care about language improvements. Further, because orgs prioritize spitting out new features above all else, selling maintenance work like updating the JVM is seen more as pure waste than actual benefit to the company.
The one argument I've been able to make to get an update is "This has fallen out of support and will no longer get security updates". That seems to be the only motivator for my company to do updates.
That seems like a bad faith interpretation. Upgrading has both costs and risks. Even upgrades within the same major version can break things. LTS is about paying for stability, not a lack of features.
Well, yes, but in the past the versions that now get a new integer number (feature releases) were mandatory for everyone and there was no LTS at all. There were some differences, but not as big as many think. The biggest one was the psychological aspect of the name (7u4 or 8u20, which were not patch releases but big feature releases).
So why did we create the LTS service? 1. Because the new feature releases, while no more risky than the old ones (like 7u4 and 8u20), do require a little more work that companies don't want to put into legacy applications, and 2. Many companies indeed are willing to pay for more stability for their legacy apps.
So while it is absolutely true that some projects want better stability, this level of stability is new. Companies that religiously stick to old versions now didn't do that in the past. The simplest explanation is that the new release model isn't yet understood, not that thousands of companies changed their risk strategy.
Congrats to you and the team on this huge milestone!
Really looking forward to taking advantage of these things (transparently and automatically!) in ZIO/Scala... which I think shows the true power of the JVM-as-platform approach you're taking!
You could always set the backing thread pools for core.async and agents in Clojure. That gives you the ability to use virtual threads right now.
But in order to avoid thread pinning, there will need to be some code changes to convert some uses of synchronized to ReentrantLock. How fast that happens will depend upon the given library maintainer. Here's an issue for making some of these changes in Clojure's core library: https://clojure.atlassian.net/browse/CLJ-2771
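For anyone who hasn't done that conversion yet, it's usually mechanical; a minimal before/after sketch in Java (Cache and put are made-up names):

import java.util.concurrent.locks.ReentrantLock;

class Cache {
    private final ReentrantLock lock = new ReentrantLock();

    void put(String key, Object value) {
        // was: synchronized (this) { ... } -- which pins the virtual thread
        // to its carrier while the critical section blocks
        lock.lock();
        try {
            // ... critical section unchanged ...
        } finally {
            lock.unlock();
        }
    }
}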
I've tested Clojure's agents with the new virtual threads for my targeted use case; they're significantly faster than before - I can spin up tens of thousands of mostly idle agents and reach performance close enough to core.async for me.
> Expect a lot of libraries to start release versions that are java 21 baseline because of this feature alone.
Java has had multi-release jars since 9... that allows library authors to ship code that benefits from new features in newer versions of the JDK while still supporting older ones as well. Hopefully library authors can leverage that, though I'm aware something like virtual threads may be very difficult to design around for older versions.
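For reference, a multi-release jar is just the baseline classes plus versioned overrides selected at runtime (com/example/Worker is a made-up class):

my-lib.jar
    META-INF/MANIFEST.MF                           <- must contain "Multi-Release: true"
    com/example/Worker.class                       <- baseline bytecode for the oldest supported JDK
    META-INF/versions/21/com/example/Worker.class  <- variant loaded on JDK 21+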
mrjars are a terrible idea, because libraries are very rarely a single jar, and require dependencies. Often that dependency set is different depending on the runtime version because of backports of newer Java APIs. So either you don't care and make your consumers use proguard to remove the unnecessary backports, or you create a Maven package with variants for each runtime version, which 99% of your downstream will end up using anyway.
With the API being nearly the same, I keep just thinking that Virtual Threads are basically identical to Platform Threads except that they use far less memory (so you can have lots more of them).
Are there any other actual differences? Better performance?
The relationship between throughput, latency, and concurrency in servers is expressed via Little's law. If your server is written in the thread-per-request style -- the only style for which the platform offers built-in language, VM, and tooling support -- then the most important factor affecting maximum throughput is the number of threads you can have (until, of course, the hardware is fully utilised). Being able to support many threads is the most effective improvement to server throughput you can offer.
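To make that concrete: Little's law says L = λ × W, where λ is the request rate and W is the average time a request spends in the server. With made-up but plausible numbers, a server handling 10,000 requests/s at 100 ms average duration has L = 10,000 × 0.1 = 1,000 requests in flight at any moment, so in thread-per-request style it needs about 1,000 threads just to sustain that throughput.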
Thanks for the video. I feel like there's a bit of conflation between the terms "performance(latency)" and "throughput", but I see the point. I'd be interested to see that latency graph (Time marker 15:38) between platform and virtual threads in the case where the server doesn't manufacture a 100ms delay (say, in the case of a caching reverse-proxy).
Also - millions of Java programmers thank you for not going to async/await. What an evil source-code virus that is (among other things).
I tried to watch it at 1.25x speed as I normally do, but you already talk at 1.25x speed, so no need!
To understand what happens when the server doesn't perform IO, apply Little's formula to the CPU only. Clearly, the maximum concurrency would be equal to the number of cores, which means that in that situation there would be no benefit to more threads than cores. What you would see in the graph would be that the server fails once L is equal to the number of cores. The average ratio between IO and CPU time as portions of the average duration would give you an upper limit on how much more throughput you gain by having more threads. That's what I explain at 11:34.
Also, both throughput and latency are performance metrics.
I watched the video and thoroughly enjoyed it, thank you for sharing it! I have a question that is perhaps not entirely related to the video, but it touches the topic of context switches. I've read this post [1] by Chris Hegarty, which explains that when calling the traditionally blocking network I/O APIs in the Java stdlib from a virtual thread, it uses asynchronous/poll-based kernel syscalls (IOCP, kqueue, epoll on Windows, Mac and Linux respectively) which I assume is to avoid blocking the carrier threads. That post was written in 2021, does it still hold true today in Java 21?
Reading that, it also makes me wonder what happens for disk I/O. Many other runtimes, both "green thread" ones like Golang and asynchronous ones like libuv/tokio, use a blocking thread pool (static or elastic) to offload these kernel syscalls to because, from what I've read, those syscalls are not easily made non-blocking like e.g. epoll is. Do Java virtual threads do the same, or does disk I/O block the carrier threads? Out of curiosity, do the Java file APIs use io_uring on Linux if it is available? It is a fairly recently added kernel API for achieving truly non-blocking I/O, including disk I/O. It doesn't seem to bring much over epoll in terms of performance, but it has been a boon for disk I/O and in general can reduce context switches with the kernel by reducing the number of syscalls needed.
> That post was written in 2021, does it still hold true today in Java 21?
Yes.
> Do Java virtual threads do the same, or does disk I/O block the carrier threads? Out of curiosity, do the Java file APIs use io_uring on Linux if it is available?
We're working on using io_uring where available, especially for filesystem IO. For now, filesystem IO blocks OS threads but we temporarily compensate by increasing the size of the scheduler's worker thread pool.
In late 2021 I compared OS threads to io_uring for filesystem I/O, doing random-access reads from fast NVMe SSDs.
That measurement told me that it's not necessary to use io_uring for disk I/O performance for some workloads.
It found no improvement in performance from io_uring, compared with a dynamic thread pool which tries to maintain enough I/O-blocked threads to keep the various kernel and device queues busy enough.
This was a little surprising, because the read-syscall overhead when using threads was measurable. preadv2() was surprisingly much slower than pread(), so I used the latter. I used CLONE_IO and very small stacks for the I/O threads (less than a page; about 1kiB IIRC), but the performance was pretty good using only pthreads without those thread optimisations. Probably I had a good thread pool and queue logic, as it surprised me that the result was much faster than "fio" benchmark results had led me to expect.
In principle, io_uring should be a little more robust to different scenarios with competing processes, compared with blocking I/O threads, because it has access to kernel scheduling in a way that userspace does not. I also expect io_uring to get a little faster with time, compared with the kernel I tested on.
However, on Linux, OS threads* have been the fastest way to do filesystem and block-device I/O for a long time. (* except for CLONE_IO not being set by default, but that flag is ignored in most configurations in current kernels).
Interesting, didn't realize the kernel would let you do that. I guess it makes sense since it's up to user space to map pages for the stack. The kernel doesn't have much to do on clone except set the stack pointer.
That is an absolutely amazing video. From a brief, intuitive, and well-diagrammed explanation of non-trivial concepts of queuing theory, to practical examples, to connecting it all to real-world use cases and value, and all within a surprisingly short period of time, it is one of the most impressive technical videos I've ever seen. Thank you.
"With the currency being the same, I keep thinking a salary of $50k/year is the same as that of $500k/year. Are there any other actual differences?"
Just as with performance improvements [1][2][3][4], the actual impact on the user experience is non-linear and often hard to predict. In the case of virtual threads, you go from needing to consciously work around a limited amount of available threads to spawning one per request and moving on.
[2]: "These tests are fast enough that I can hit enter (my test-running keystroke) and have a response before I have time to think. It means that the flow of my thoughts never breaks." - https://news.ycombinator.com/item?id=7676948
[4]: "Go’s execution tracer has suffered from high overhead since its inception in 2014. Historically this has forced potential users to worry about up to 20% of CPU overhead when turning it on. Due to this, it's mostly been used in test environments or tricky situations rather than gaining adoption as a continuous profiling signal in production." - https://blog.felixge.de/waiting-for-go1-21-execution-tracing...
Do you have to baseline on Java 21 if you want to add support for virtual threads? Couldn't you continue using heavyweight threads on older versions of Java? My understanding is that both use the same Thread abstraction.
From an API perspective, you can always use reflection to cheat past the option to create virtual threads in pre-21 (without previews) Java bytecode, but you need to do more to your code than just flip the switch to support virtual threads.
A virtual thread pool is by definition unbounded. If you're binding data to a thread (e.g. thread-locals), you now have a seemingly unbounded set of threads, which is effectively a memory leak. I bumped into that one a few months ago with Netty, which has a per-thread cache for some things (thankfully you can turn off that cache). It was creating a significantly large waste of RAM that slowed down the application on its own.
The other big one is, as I mentioned, the synchronized limitation. If you naively assume that anything can run in a virtual thread without worries, you're opening yourself up to deadlocks, or at least significantly lower-performance code, if you're relying on libraries/code that synchronize using Java monitors.
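A minimal sketch of that failure mode (the byte buffer here is just a stand-in for whatever per-thread state a library caches):

// Each thread that touches this gets its own copy. A pool of 200 platform
// threads caps you at 200 copies; one virtual thread per request means one
// copy per in-flight request -- effectively unbounded.
static final ThreadLocal<byte[]> SCRATCH =
        ThreadLocal.withInitial(() -> new byte[8 * 1024]);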
There may be more examples of gotchas, but these two are the most notable examples I have right now.
I believe, e.g. ZIO 2.next is doing something like this, dynamically deciding whether running something async or just doing the blocking thing depending on the availability of VThreads... but of course that's Scala, so YMMV.
Without a way to trampoline computation (or transform code appropriately) it's probably impractical to do anything like that.
(And of course, still many caveats as the sibling post points out.)
Why haven't places updated already? It's not that much work to update. Where I work we always go to the new LTS version as soon as it's supported by gradle.
The bigger the project, the more painful the upgrade. Package systems are convenient to avoid reinventing the wheel, until you have to upgrade any piece of it. Then you're stuck trying to figure out which versions of each package go together.
If Package A won't run on JDK 17 your entire project is stuck on JDK 11. If Package B is upgraded but has conflicts with Package A, you have to dig through old versions until you find one that works -- and you don't get upgrades.
The more games somebody has played with reflection, undocumented features, deprecations, etc. the more likely you are to have a conflict. And since package managers encourage you to depend on somebody else's code, you end up depending on everybody else's code.
The smaller and greener the project is the more likely it is you can just pull the latest versions and be happy about it. A project that was written when Java 8 was current, and continued to develop, is going to be a nightmare.
"Oh look, I need to upgrade mockito and Spring. Oh, now I upgraded Spring I need to update the spring JPA plugin. Oh now I upgraded that I need to upgrade Hibernate. Oh now I need to upgrade the library built on it that that team over there maintains. Oh, they're not interested." etc. etc.
When using Spring Boot you usually update just one version and everything else is updated via BOM. There should be a really good reason to have fine-grained control over every single dependency.
So honestly I didn't realise that it's a POM extended to ecosystem libraries beyond Spring's own. However, it still doesn't solve the problem that e.g. the Hibernate version compatible with Spring Framework n+1 is not compatible with Spring Framework n, and now you're doing an "all or nothing" upgrade, which for a large app can be time-consuming.
The point of using a BOM is to avoid specifying dependency versions of individual components, so this problem is in reality non-existent. The only case when you would actually need a different version of Hibernate than the one in your BOM is some critical bug fix, which is very likely a patch version and is very unlikely to break compatibility with the rest of your setup.
True, Spring upgrades can be a pain in the ass. There is a trick to make it less painful though: use a Maven BOM for version management. As with any framework upgrade, it doesn't make the process entirely painless, but very much less painful.
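For anyone who hasn't used it, importing the Boot BOM looks roughly like this (the version number is just an example); after this you omit <version> on the managed dependencies:

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-dependencies</artifactId>
      <version>3.1.4</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>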
Overwhelmingly it is, until it isn't. There are tiny gotchas, especially if you play with some of the murkier aspects, such as reflection or class loading.
The more of someone else's code you use, the more likely one of them bumps into one of the gotchas. And that sets off a cascade of conflicting versions.
1) Dependencies need to be upgraded. For example, not all versions of Gradle support all Java versions, so you need to upgrade Gradle to upgrade Java.
2) Other things are deemed to have higher priority.
3) People are satisfied with existing features and don't want to spend energy upgrading to something that doesn't provide immediate value.
4) Folks aren't educated on what the benefit of switching would be, so why would it be prioritized? This is a case of "they don't know what they don't know".
I work on a team using Java 8 daily. It's fine. It's missing things I wish it had (null handling in switch statements, for example), but I don't care about that so much that I'm going to go through the pain of upgrading 7-9 services in the monorepo, plus their dependencies, and then test them all, just to be on a new version of Java.
1) is garbage. Since Gradle 6 you can run on Java 22 EA with no issues. Use toolchains, as they say in their docs.
2) No shit. What business user is ever, in their right mind, prioritising upgrading their language version? It's not up to them to push the upgrade. It's yours.
3) Of course they are. People don't desire what they don't know they're missing. Invest in people who are actually interested in improving their software.
4) The Java team has been pushing heavily via Twitter / YouTube / InfoQ / Hacker News / other OpenJDK providers all the new features for every single Java version through their 6-month release cycles. If your devs / your team don't know about it, then maybe, again, you're not encouraging people to want to improve on what they have, or take interest in the tech they work in.
I mean, that is fine; do I give a shit what Java version I'm using for my take-home salary? No... but I enjoy using the newest, most interesting and useful tools. And you best believe those people are more attractive to other companies than you, working on some 15-year-old Java 8 tech.
1) You're assuming everyone is on Gradle 6 or higher lol. I can assure you that is not true.
2) Sure, pushing and making the decision are not the same thing. I can complain and persuade as much as possible and it doesn't mean it's going to happen.
3) I agree that you want people who care about improving software. Upgrading language versions isn't always the best route to do that though.
4) I don't think everyone on the team is reading about the latest updates in the world of Java. I think a pretty small portion of engineers are keeping 100% up to date, following Twitter accounts for Java dev, watching YouTube videos on it, etc. All that is educational and that's great to know but for most people, it's not going to help them work better because they know about features they can't use.
5) Definitely sounds aggressive but okay. I haven't found a company yet that's complained about working in Java 8 versus 11/17. If a company is hiring for a role that uses Java, they're likely not limiting their candidates to those who've used their version of Java. It's a pretty standard language and if you know any other object-oriented language, you'll be fine.
You’re just delaying and making the upgrade worse when the time comes. It’s much easier to upgrade now to 11 and then 17 and then 21 rather than try to upgrade from 8 to 27 when 8 is finally EOL.
Whether you perceive there to be no immediate benefit (hint: there is, Java 8 is an antiquated runtime) or not, delaying upgrading until Java 8 EOL is a way larger risk than upgrading now.
I've never done a language upgrade. I don't know what it takes. I've heard 8->11 is painful and 11->17 is not. So doesn't that mean jumping from 8->17 directly is mostly the same as going from 8->11?
I'm not saying there is no benefit. I'm not saying there is no risk. I agree that going from 8->17 would be worse than 8->11->17.
My point is to list out reasons why a team may not be able to just spend a day upgrading (dependency issues) or why someone might not be given the time to do it.
As a heavy user of Groovy/Spock, though, I agree that upgrading Groovy itself can be challenging, unfortunately. Really depends though on how many edgy Groovy features you relied on :).
Not always for sure.
We started JDK upgrades with 9 and went +1 every half year, and Groovy was lagging behind on one of 10, 11, 12 or 13. It got so tiring that we had to let it go.
Fortunately our tests were mostly JUnit 5, so it wasn't much work.
If you're upgrading every minor Java version, then yeah, I agree Groovy and most other dependencies that may not work across Java version changes off the bat (Lombok, probably Spring and other heavy frameworks like Micronaut and Quarkus, build tools like Gradle... many more) are going to slow you down. You end up with a very simple project if you remove all of that, which is actually a good thing if you can afford doing it.
Those are not minor versions, that's the quite natural path and it was supported by every lib we used, except Groovy. And this is the encouraged path for JDK upgrades; people are lazy, but not us :)
Spring supports each new JDK release ("minor versions" as you called them, those between 11, 17 and 21) before it ships. The only exception was with JDK 13; there was about a 2-week slip there.
Lombok (I don't like it) supports every such version at release (not before, unfortunately).
Other libs didn't even error out (we keep them at newest versions possible, aside from Jakarta madness).
So from the major libs/frameworks, the only thing that slowed us down was groovy.
I see your point... I've never upgraded to the non-LTS Java versions (almost no one is doing it - as you can see in any survey about it - which in itself is enough for me not to be the "brave" one doing it!) so I don't know how much pain that would've been.
To me that just shows that Groovy could do with a bit more attention, as I find it to be a great language, especially with Spock... it has made our tests so much nicer. It does have some warts, but those are mostly a result of the lack of attention it receives, unfortunately, not some fundamental issues with the language.
It is a bit brave to upgrade every 6 releases or every 4; I prefer to have smaller issues more often than big ones rarely.
But I get that most of the time the issue is with dependencies that upgrade too lazily (that's why I don't like those that are bytecode magic and don't use ASM).
Kotlin, not Groovy, has been the culprit for slower Gradle support. I believe there was some split module pain in Groovy wrt Java 9, but it has been very smooth since then. The Kotlin compiler on the other hand is not very forgiving.
This is a moot point because the build execution and the project compile/run can be on different JDKs. It is a tiny amount of configuration to decouple them, e.g. to use an EA build.
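e.g. with toolchains, something like this in build.gradle lets Gradle itself stay on whatever JDK it supports while the project targets 21:

java {
    toolchain {
        // compile, test, and run against JDK 21, independent of the JDK running Gradle
        languageVersion = JavaLanguageVersion.of(21)
    }
}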
Yeah, you are right that decoupling build tool JDK and compile JDK is the way to go.
But Groovy does indeed not work, or gains support very late, for releases between 8, 11, 17 and 21 - so for anyone who wants to stay current (and not wait 3 or 2 years), using Groovy in your code will be a pain. That might also be true for other JVM languages, but I don't know; I haven't used them.
Oh interesting. Typically the gradle release notes, like 8.4-rc-1, will have comments that imply that Groovy build scripts compile but Kotlin ones lag.
"Gradle now supports using Java 21 for compiling, testing, and starting other Java programs. This can be accomplished using toolchains. Currently, you cannot run Gradle on Java 21 because Kotlin lacks support for JDK 21. However, support for running Gradle with Java 21 is expected in future versions."
This is a niche case, but I spent months trying to upgrade one of our services from one LTS version to the next (I forget which). We encountered a weird bug where services running on the latest JRE would mysteriously corrupt fields when deserializing thrift messages, but only after running for a little while.
After an enormously unpleasant debugging cycle, we realized that the JIT compiler was incorrectly eliminating a call to System::arraycopy, which meant that some fields were left uninitialized. But only when JIT-compiled; non-optimized code ran fine.
This left us with four possible upgrade paths:
* Upgrade thrift to a newer version and hope that JIT compilation works well on it. But this is a nightmare since A) thrift is no longer supported, and B) new versions of thrift are not backwards compatible so you have to bump a lot of dependent libraries and update code for a bunch of API changes (in a LARGE number of services in our monorepo...). With no guarantee that the new version would fix the problem.
* File a bug report and wait for a minor version fix to address the issue.
* Skip this LTS release and hope the JIT bug is fixed in the next one.
* Disable JIT compilation for the offending functions and hope the performance hit is negligible.
I ultimately left the company before the fix was made, but I think we were leaning towards the last option (hopefully filing a bug report, too...).
There's no way this is the normal reason companies don't bump JRE versions as soon as they come out, but it's happened at least once. :-)
In general there's probably some decent (if misguided) bias towards "things are working fine on the current version, why risk some unexpected issues if we upgrade?"
I encountered a weird bug with deserializing JSON in a JRuby app during an OpenJDK upgrade - it would sporadically throw a parse error for no apparent reason. I was upgrading to OpenJDK 15, but another user experienced the same regression with an LTS upgrade from 8 to 11.
The end result of my own investigation led to this quite satisfying thread on hotspot-compiler-dev, in which an engineer starts with my minimal reproduction of the problem and posts a workaround within 24 hours: https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2021...
There's also a tip there: try a fastdebug build and see if you can convert it into an assertion failure you can look up.
For an example, my team owns a dozen services and they have hundreds of direct and transitive dependencies. Of those, maybe a dozen or two need work to support the new version, but that's a dozen different teams that have to put the work on their roadmap and prioritize it. When the entitlement is "devs want to use shiny feature X with hard-to-quantify productivity benefit", it's difficult to prioritize. When there's an efficiency benefit then things move fast, because a 10% efficiency improvement means 10% lower server costs, and that's easy math.
The services I work on pump the entire business revenue from start to finish. A few nice-to-haves for devs aren't anywhere close in the risk calculation if something breaks.
Java getting better pattern matching is a great change. I'd really like more of the functional features to make it into Java.
I would love it if Java pattern matching could at least get to the level of Ruby pattern matching. Ruby pattern matching will allow you to deconstruct arrays and hashes to get pretty complicated patterns, which is really powerful. Right now it seems like Java might have that with a lambda in the pattern, but it's not going to be as elegant as Ruby, where:
case {name: 'John', friends: [{name: 'Jane'}, {name: 'Rajesh'}]}
in name:, friends: [{name: first_friend}, *]
  "matched: #{first_friend}"
else
  "not matched"
end
#=> "matched: Jane"
But the big change here is virtual threads which should be a game changer.
We recently added pattern matching to Dart [1], so I'm always keen to see how it compares to similar features in other languages. In case it's interesting, here's that Ruby example ported to Dart:
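var value = {
  'name': 'John',
  'friends': [{'name': 'Jane'}, {'name': 'Rajesh'}]
};

var result = switch (value) {
  {'name': _, 'friends': [{'name': var firstFriend}, ...]} => 'matched: $firstFriend',
  _ => 'not matched'
};
// => "matched: Jane"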
Pretty similar! The main differences are that Dart doesn't have symbols, so the keys are string literals instead. Also, variable bindings in patterns are explicit (using "var") here to disambiguate them from named constant patterns.
We require "var" before variable patterns because we also allow named constants in patterns (which match if the value is equal to the constant's value):
const pi = 3.14; // Close enough.
switch (value) {
  (pi, var pi) => ...
}
This case matches a record whose first field is equal to 3.14 and binds the second field to a new variable named "pi". Of course, in practice, you wouldn't actually shadow a constant like this, but we didn't want pattern syntax to require name resolution to be unambiguous, so in contexts where a constant pattern is allowed, we require you to write "var", "final", or a type to indicate when you want to declare a variable.
Swift's pattern syntax works pretty much the same way.
> > Dart doesn't have symbols
> That's weird, as I actually use sometimes `#sym` (which has type `Symbol`)??
Oh, right. I always forget about those. Yes, technically we have symbols, but they are virtually unused and are a mostly pointless wart on the language. It's not idiomatic to use them like it is in Ruby.
I don't find symbols to be pointless. They are useful as "interned strings" and that's exactly what I need sometimes. I could use `const myThing = "my thing";` for that purpose (but with symbols I don't need to declare it anywhere, just use it... for better or worse!), I suppose, but before `const` existed, I believe symbols were the only way to do that?
> They are useful as "interned strings" and that's exactly what I need sometimes.
I'm not sure exactly what you mean by "need" here, but as far as I know, Dart doesn't make any promises about the memory management or efficiency of either strings or symbols.
I had to double check this, because basically the only use of symbols in any language is to provide a constant value you can treat as an interned string (like Common Lisp keywords, for example). They would be entirely useless if that were not the case.
Luckily, the current Dart specification does guarantee this (section 17.8):
"Assume that i ∈ 1, 2, and that oi is the value of a constant expression which
is a symbol based on the string si. If s1 == s2 then o1 and o2 is the same object.
That is, symbol instances are canonicalized."
Apparently, there's even special treatment for "private symbols", which are only the same object "in the same library". TIL.
EDIT: there's even a whole sentence justifying the existence of symbols as being related to reflection, actually... they say Dart literal Strings are already "canonicalized", so canonicalization alone doesn't justify Symbols... hence you're right that String literals are just as good for the use-cases I had in mind. I guess I will use String literals from now on after all.
EDIT 2:
> I'm not sure exactly what you mean by "need"
Hopefully it's clear what I "needed" now... basically, interned Strings to avoid wastefully comparing bytes when a pointer comparison would suffice as all values are known at compile-time.
Pattern matching is a neat tool to keep in the toolbox. When it's the right tool for the job, it is really cool and is a lot cleaner than a bunch of conditional checks. However, I rarely reach for it. Maybe my use cases are unusual? I am genuinely curious how often other developers find pattern matching to be the best tool for the job.
Pattern matching is what makes sum types ergonomic enough to be used. Many a Java design doesn't use interface-based sum types because it's so cumbersome to use them. But when a language has pattern matching, suddenly designing with sum types in mind happens a lot, and therefore you see examples of good pattern matching everywhere.
When I teach Scala, a very high percentage of the teaching time ultimately comes down to re-introducing how to design business domains, because seasoned devs just reach for large classes with a million optional fields, which can represent not only valid system states but thousands of invalid ones.
In languages that have strong support for pattern matching, whether it be on values or types, I find myself reaching for it instead of conditionals. It's all about the explicitness for me. You have to list out all the cases you care about, so there's no room for ambiguity. Plus, the compiler will usually warn you if you've missed a case, which is like a built-in bug catcher. It's also great for working with immutable data, less state to worry about. And let's talk about readability; the code basically documents itself because you can see the shape of the data right in front of you. You can even destructure data on the fly, pulling out exactly what you need. If you're using a statically-typed language, pattern matching adds an extra layer of type safety. And, not to forget, it nudges you toward a more functional style of coding, which I find leads to cleaner, more modular code. So yeah, I reach for pattern matching quite a bit; it often feels like the right tool for the job.
When available, I pretty much always use pattern matching. It tends to shorten code while not reducing clarity (often increasing it) which means fewer opportunities for errors to creep in. Statically typed languages that can detect incomplete case handling also reduces the chances for some errors (as long as you don't make a catch-all case) but also helps when you change something so that a new case is needed. It also tends to shift the code to the left, reducing the indentation. So shorter, clearer, less unnecessary indentation. Generally a positive.
It probably depends on the language you're using. Pattern matching is awesome in Erlang and Elixir. In most other languages it ranges from "nice" to "bleh".
Patterns are somewhat nice to have, but for me they’re difficult to read, and not because my brain isn’t used to them. The simple `instanceof` identifier pattern is about all I’ll use _most_ of the time. Otherwise, yes, they are more concise, but lose too much information in the process.
I’d rather see a boatload of other features before patterns. I’ve been experimenting with project manifold[1]. _That_ is the path Java should be on. Just my take.
One example for you: anytime you needed to use the "Visitor pattern" to do a transformation from one representation to another - you don't need it now. Sealed classes and pattern matching will be more succinct and easier to reason about.
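A hypothetical sketch of what that replacement looks like in 21 (the names are made up):

sealed interface Shape permits Circle, Rectangle {}
record Circle(double radius) implements Shape {}
record Rectangle(double width, double height) implements Shape {}

static double area(Shape shape) {
    return switch (shape) {
        case Circle c    -> Math.PI * c.radius() * c.radius();
        case Rectangle r -> r.width() * r.height();
        // no default branch needed: the hierarchy is sealed, so the
        // compiler proves the switch exhaustive
    };
}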
I think you can replace almost any if/else with pattern matching. Pattern matching makes type checks easier, which, if you are heavily using types throughout your program, makes pattern matching even better.
I really like that Ruby throws NoMatchingPatternError if none of the patterns match. It's a bit like the much-acclaimed exhaustive pattern matching in static languages (though at runtime rather than compile-time, obviously) and better than just silently falling off the end, which IIRC is what Python's pattern matching does.
That's not particularly relevant to the nice pattern matching property I mentioned. If you need to manually write supplementary code to get the exhaustiveness safety then that's back into the realm of bog-standard defensive programming.
Here's what I mean. The Ruby will throw NoMatchingPatternError and the Python will silently do nothing.
x = [10, "figs"]
case x
in [n, "apples"]
:foo
in [n, "oranges"]
:bar
end
# ---
x = [10, "figs"]
match x:
case [n, "apples"]:
...
case [n, "oranges"]:
...
I know, that's why I mentioned the manual part. What I'm getting from this exchange is that Python is your team and no criticism can be allowed to stand.
I recently spotted a (new to me) foreach / else construct in a templating language (sorry, I forget which one); the else branch is invoked if the list is empty. Nice sugar for common outputs like "no items found".
I appreciate modest syntactic sugar.
For instance, my #1 sugar wish for Java's foreach is for it to do nothing when the list reference is null, versus tossing an NPE.
Eliminates an unnecessary null check and makes the world a little bit more null-safe.
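In the meantime, a tiny helper gets most of the way there (orEmpty is a made-up name):

import java.util.List;

static <T> Iterable<T> orEmpty(Iterable<T> items) {
    // hand the loop an empty list instead of null
    return items != null ? items : List.of();
}

// for (var item : orEmpty(possiblyNullList)) { ... }  // body simply never runs on null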
We have 2.1 million LOC in Java and we're moving to Java 21 (from 17) in two weeks when we branch for release.
We have hundreds of third-party dependencies across the code base, a lot of them big ones (Hibernate, Spring, a lot of Apache). We write a big web application and maintain a big legacy desktop application in Swing.
We run a dedicated nightly CI job that is on the latest Java release to get early warning for any incompatibilities. After the painful migration from 8 to 9 so many years ago it has been smooth sailing.
In all those version upgrades over all those years and dozens of on premise installations with big customers we have never had a regression or a problem that was caused by the runtime itself.
(1) It's a bit of a bad smell (which he points out) that records aren't being used much at all in the Java stdlib. I wrote something that built out stubs for the 17 and 18 stdlibs, and that stood out like a sore thumb. I do like using records though.
(2) I've looked at other ways to extend the collections API and related things, see
and I think the sequenced collections could have been done better (a sketch of what they add is below).
(3) Virtual Threads are kinda cool but overrated. Real Threads in Java are already one of the wonders of the web and perform really well for most applications. The cases where Virtual Threads are really a win will be unusual but probably important for somebody. It's a good thing it sticks to the threads API as well as it did because I know in the next five years I'm going to find some case where somebody used Virtual Threads because they thought it was cool and I'll have to switch to Real Threads but won't have a hard time doing so.
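On (2), for anyone who hasn't looked at JEP 431 yet, the gist of SequencedCollection (now implemented by List, Deque, LinkedHashSet, etc.):

import java.util.ArrayList;
import java.util.List;

List<String> tabs = new ArrayList<>(List.of("home", "search", "profile"));
tabs.getFirst();          // "home"
tabs.getLast();           // "profile"
tabs.addFirst("splash");  // [splash, home, search, profile]
tabs.reversed();          // a reversed *view* backed by the list, not a copy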
I think the biggest impact of virtual threads is that the ecosystem will abandon asynchronous APIs. No more futures, callbacks, servers where you have to make sure not to block the thread, reactive frameworks, etc. Just nice simple imperative blocking code. Nima is the first example I've seen:
We've had two production bugs in the last two weeks caused by handlers blocking the server thread in apps using an async web framework, which would simply not have happened with a synchronous server.
The stack for the VT requires a heap allocation [0], which, OK, is not a huge deal for most scenarios, but something to consider. Reactive programming will avoid that. For example, for a service that doesn't do much IO (like an in-memory pubsub thing or a CDN) you would still want to use reactive programming if you care about performance, since likely the code will be simple anyway.
But what’s more expensive, some more ram? Or the hours upon hours upon hours wasted in dev salaries trying to develop and debug reactive code?
Also is that VT allocation more than all of the extra allocations from reactive frameworks internally? Or all of the heap capturing lambdas that you pass to reactive libraries? Do you have a source comparing any of this?
Well I'm one person running several reactive services for fastcomments right now and I have no issues... so in my case the ram is more expensive :p but I am looking forward to benchmarks.
I'd definitely be interested to see some benchmarks of real-world code, once virtual threads and its attendant web frameworks have had a year or two to mature.
I guess I'm thinking of cases where I would invoke a virtual thread from a reactive framework per request in which case you need the callback regardless. I guess if you can do away with that then, maybe. I need to benchmark it.
I suspect if we had records from the start they'd be all over the stdlib, but because of backwards compatibility they'll likely only be considered for new APIs.
The problem with regular threads is (a) multi-kb memory stack per thread and (b) consuming a file handle.
Either of those severely limits the scalability of the most "natural" parallelism constructs in Java (perhaps generally). Whole classes of application can now just be built "naturally" where previously there were whole libraries to support it (actors, rxJava, etc etc).
It may take a while for people to change their habits, but this could be quite pervasive in how it changes programming in general in all JVM languages.
You could easily have a million threads if you use multi-kb stacks. Million times multi-kb means multi-gb, that's still 3-4 orders of magnitude less than big memory servers/VMs. (and 1 order of magnitude less than a normal laptop)
What do you mean by using a file handle, is this a Windows platform thing? On *ix, threads don't use up file descriptors (but you can still have a million fd's at least on linux for other stuff if you want).
Thanks - this caused me to dig into the specific scenario where creating threads was exhausting file handles in my experience and you are correct - consuming a file handle is indeed not intrinsic to creating a new thread in Linux. It's insanely easy for literally anything you do with the thread to consume a file handle, but of course, that applies to virtual threads as well. Thanks!
What do you base this on? The stacks and kernel bookkeeping shouldn't use nearly this much, at least on Linux. Keep in mind that thread stacks are lazily allocated virtual memory, so they won't use as much physical memory as the thread stack size setting suggests.
If these threads are handling TCP connections and L7 protocol processing on top, you're going to have nontrivial both kernel and userspace memory usage per connection too that may dwarf the thread overhead.
Here's a Linux kernel dev (Ingo Molnar) benchmarking Linux in 2002 and starting just shy of 400k threads in 4 GB: https://lkml.iu.edu/hypermail/linux/kernel/0209.2/1153.html - though on a 32-bit system lots of objects are 50% the size compared to current 64-bit. But it still gives you a ballpark.
If the code is simple, blocking code, then the number of threads required in the pool is the average total duration of a request times the fanout times the request rate. That number can easily reach many thousands and more.
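For example (numbers invented for illustration): at 500 requests/s, 400 ms average total duration, and a fanout of 5 concurrent downstream calls per request, that's 500 × 0.4 × 5 = 1,000 threads needed just to keep the pipeline full, before any headroom for spikes.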
The discussion was about a specific context: avoiding overhead from spawning millions of threads. In this case you shouldn't have any blocking code at all; every API should utilize epoll underneath, or something similar.
There are tons of variations, depending on your logic and API. The closest to virtual threads is ForkJoinPool and RecursiveTask, where you can have code that looks like regular blocking code:
var f = async_api_returns_future();
...
var res = f.join();
but join() won't block the OS/JVM thread; it will instead make it perform other tasks from the queue.
Or you can design an API which receives an ExecutorService as a parameter and runs the callback there, e.g.:
One way to see how different virtual threads are from our old mechanisms is to ask yourself how many IO operations you can have in flight. There are two options: either the operations are blocking, in which case the number will be equal to the (very limited) number of threads in all of your thread pools combined, or the operations are non-blocking, in which case thread context that is so necessary for troubleshooting and JFR profiles is lost (e.g. JFR can't know on behalf of whom is some IO operation performed because the "owner" of some operation -- in the design of the Java platform -- can only be a thread). Virtual threads allow you to have hundreds of thousands (or even millions) of I/O operations in flight (which you need for high throughput as a result of Little's law) while still preserving observable context.
BTW, as for fork-join's `join`, not only is it designed for pure computation only and cannot help with IO, but every `join` increases the depth of the stack, so it is fundamentally limited in how much it can help. FJ is designed for pure computation workloads, so in practice that's not a huge limitation, but virtual threads are designed for IO workloads.
I apologise for not going into more depth here, but as you can imagine, with a user base numbering in the many millions, we can only afford to put in depth explanations in texts and videos that gain a large audience, but once you've familiarised yourself with the material I'll gladly answer specific questions (and perhaps your questions will help us improve our material, too).
> I apologise for not going into more depth here, but as you can imagine, with a user base numbering in the many millions
my concern is that you somehow can find time to write long comments with lots of handwaving (our framework is designed for that, and their framework is not designed for that), but refuse to provide specific code pointers and examples in support of your opinion. For example, in this specific case, can you give an example of how green threads can be used with the current Java IO library, or the Java JDBC library?
If there's something unclear in that material, please ask. Also, there is no our framework vs. their framework here. I'm only discussing the JDK's own various thread pools vs. virtual threads. They were all designed and implemented by the JDK team.
BTW, I'm not trying to support any opinion. Pretty much all popular Java server frameworks are adopting virtual threads, and Java's virtual threads are poised to become the most popular lightweight user mode threads. We've already convinced everyone that needed convincing. I'm merely offering pointers in case you're interested to learn how to use virtual threads and understand how they offer high throughput and good observability at the same time (whereas before you could have one or the other). Of course, if you're satisfied with the throughput and observability you can get with our old mechanisms, you don't have to use virtual threads. We've not taken anything away.
try (var in = url.openStream()) {
    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
}
The material claims that this example will scale well with virtual threads. My understanding is that in.readAllBytes() will call the blocking OS socket API underneath, which will block an OS thread, so you would need many OS threads to scale. Is this understanding correct?
It is not. Blocking IO (with some exceptions mentioned in the JEP) will automatically be translated by the runtime into non-blocking IO when it occurs on virtual threads, and no OS threads will be blocked. The Java code will look blocking and that's what thread dumps and other Java observability mechanisms will show, but to the OS it will seem as if it's running non-blocking code.
You can't do that with thread pools. You could achieve that scalability with async code, but then observability tools will not be able to track the IO operations and who initiated them, but with virtual threads you'll see exactly what business operation is doing what IO and why.
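For example, the thread-per-task pattern looks like plain blocking code (the URLs here are placeholders):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.concurrent.Executors;

// One virtual thread per task; each blocking send() releases its carrier thread.
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    var client = HttpClient.newHttpClient();
    for (var url : List.of("https://example.com/a", "https://example.com/b")) {
        executor.submit(() -> client.send(
                HttpRequest.newBuilder(URI.create(url)).build(),
                HttpResponse.BodyHandlers.ofString()));
    }
} // close() waits for the submitted tasks to finish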
> will automatically be translated by the runtime into non-blocking IO when it occurs on virtual threads, and no OS threads will be blocked
it looks like that is true for the several APIs you implemented support for. What about other APIs? For example, a JDBC driver which wants to use a non-blocking DB driver. How do you use virtual threads with that?
JDBC drivers are implemented on top of JDK APIs and so will work the same way: their I/O would automatically be non-blocking when run on a virtual thread (modulo some quality-of-implementation issues around the use of synchronized that we're working on, which are mentioned in the material I linked to).
JDBC drivers that are implemented on top of their own native code are a different matter, but they are not common these days.
Then you either don't get the same scalability that virtual threads give you or you get it but with asynchronous code that requires not just more work but can't enjoy the same observability/debuggability on the Java platform.
Sure. Because handling server requests typically requires IO, if you wish not to block you need some way to sequence operations that is different from the ordinary sequential composition of the language (beforeIO(); blockingIO(); afterIO()). Similarly, other language constructs that build on top of basic sequential composition -- loops, exceptions, try/finally -- no longer work across the IO boundary. Instead you must reach for an asynchronous composition DSL (such as the one offered by CompletableFuture) which is not as composable as the basic language primitives.
Moreover, the platform now has no insight about your composition. Exceptions, which are designed to give context in the form of a thread stack trace, simply don't know about the context as it's not composed through the normal composition (in plain terms, stack traces in asynchronous code don't give you the operation's context). Debuggers cannot step through the asynchronous flow because the platform's built in debugging support works only by stepping through threads, and profilers are no longer able to assign IO to operations: a server that's under heavy load may show up as idle thread pools in a profiler because the platform cannot assign an asynchronous operation to some asynchronous pipeline such as CompletableFutures because these are not observable constructs of the Java platform.
Virtual threads give you the same scalability as asynchronous code but in a way that fits with the design of the Java platform. All language constructs work and compose well, debuggers step through code, and profilers can understand what's going on.
That's not to say that some other platform could not be designed around a different construct, but the Java platform -- from language, through libraries and bytecode, and all the way to the VM and its tooling interfaces -- was designed around the idea that sequential composition occurs by sequencing operations on a single thread. And virtual threads are just Java threads.
You can find more information, including some examples, in our virtual thread JEP [1] and adoption guide [2].
We did spend some time contemplating teaching the platform about non-thread-based, i.e. asynchronous sequential composition, in the end we realised that if it walks like a thread and quacks like a thread, we might as well call it a thread.
If you read the JEP and play around with virtual threads (e.g. do asynchronous IO with CompletableFuture or blocking IO in a virtual thread and see what their exception stack traces look like and what their JFR profile looks like) you'll quickly see that the capabilities they offer were simply not attainable by asynchronous code, which is why we've spent years to teach the JVM's innermost mechanisms to be able to observe virtual threads and expose them to observability tools the same way as platform threads are (and how I know those capabilities weren't available before).
We've written and presented a significant amount of published material about virtual threads so there's not much point in recreating it here, but if you're interested, all that material is out there.
I read the materials, and my opinion is that virtual threads are a hyped mess which adds very little benefit (or maybe none at all) in very few use cases, but will bring more complexity and fragmentation into the platform:
- 95% of Java business spaghetti code doesn't require such scalability and is fine with spawning 10k threads on modern hardware
- of the remaining 5% of cases, 80% can be covered by ExecutorService and ForkJoinPool
- in the 1% of cases that are left, the engineer made the wrong decision in choosing the JVM because of its many other performance issues
The fact that you can't bring a simple code example, and the quality of your previous comments, make me think that you don't necessarily understand what you are doing.
I don't know what relevant performance issues you're referring to, but if you want to learn more about concurrency and performance, I suggest you start with some of the basics of the theory behind concurrent servers: https://youtu.be/07V08SB1l8c
As I said, I've put examples and detailed explanations in a significant amount of material that's available online that will help you understand how and why user mode threads work and why we decided to add them to Java. While I can't teach concurrency and the design of the Java platform from the ground up (especially detailed mechanisms such as stack walking, JVM TI and JFR) on an individual basis on social media, I'd be happy to answer any specific questions you may have once you've familiarised yourself with the subject.
Are you aware you're talking to the guy who added virtual threads to the JVM? Disagree on design if you wish, but the idea he isn't an expert in these matters is a bit silly.
He is a person on salary at Oracle, which is not a top-tier tech company.
There are tons of virtual-thread-like frameworks implemented on the JVM and in other languages. Looking at his GitHub profile, he has 5 years of experience working on this thread stuff; before that he worked on some bloated J2EE stuff. None of this qualifies him for unconditional authority, so I judge him based on the expertise demonstrated in this discussion, which looks very weak.
Currently virtual threads aren't a good match if you have a CPU heavy workload. The scheduler isn't fair and if your code doesn't enter into any blocking code it won't be unmounted from the carrier thread.
Scala community always thinks they're the best tool. The size of the community is at best static though, Kotlin and re-energized Java took away most of the reasons for using it. I know in my company the teams that went the Scala route complain of huge compile times and really struggle to find people, I think we'll probably port back to Java.
It's great, but irrelevant, since Scala is already so far ahead. I will start to care if I am ever forced to do Java again. I love how much better Java is getting! Most of these things we have had in Scala for a long time already, and much better versions.
If you're viewing that website on a desktop, I strongly suggest removing max-width: 90ch from the body css. Instead of 50% white space, it goes full width and makes the table substantially more readable (particularly the code samples).
Hilariously enough I was initially confused by this comment because the webpage rendered so readably for me - the base CSS is actually quite reasonable and because I have JS disabled by default the page never re-rendered into the thinner mode.
It may be my specific setup. But on a 1440p display, 125% OS scale, I'm seeing more white left/right than actual content in the middle. It is also wrapping the code making it difficult to read.
Can anyone explain this comment: "In the past, a thread pool didn't just throttle the incoming requests but also the concurrent resources that your app consumed. If you now accept many more incoming requests, you may need other ways to manage resource consumption."
Yeah, if your server maxed out at 256 system threads, you didn't have to worry about the fact that 1024 simultaneous calls would crash your DB. But now you're not limited by system threads.
You can still use a connection pool + platform threads. Or an executor with virtual threads and semaphores or blocking queues (sketch below). It's mostly a concern for someone who implements a connection pool; for most devs it's gonna be the same config option of max connections that you need to pay attention to.
Any modern web app already has multiple instances of the app querying a db, so you have to keep a tally of total connection number either way.
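To make the semaphore option above concrete, a minimal sketch (the limit of 50 is arbitrary):

import java.util.concurrent.Semaphore;

// Thousands of virtual threads may be in flight, but at most 50 hit the DB at once.
Semaphore dbPermits = new Semaphore(50);

Runnable queryTask = () -> {
    dbPermits.acquireUninterruptibly();
    try {
        // ... execute the query (a JDBC connection pool imposes a similar cap) ...
    } finally {
        dbPermits.release();
    }
};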
That's all sequential code, it would be run inside a single "virtual thread". Note that the async code on the right is also sequential, just structured through an async API.
From my perspective they're not entirely equivalent. The async variant seems to be batching getImages and saveImages, while the sync variant gets and saves each image individually, sequentially.
They aren't perfectly equivalent because the virtual thread example uses a loop instead of the following (dropping the try/catch):
// client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
var response = client.send(request, HttpResponse.BodyHandlers.ofString());
// .thenApply(HttpResponse::body)
var body = response.body();
// .thenApply(this::getImageURLs)
var urls = getImageURLs(body);
// .thenCompose(this::getImages)
var images = getImages(urls);
// .thenAccept(this::saveImages)
saveImages(images);
And if it had been written this way it would have been clearer that they are, in fact, equivalent. But generally people don't write like this, they use looping constructs.
Regardless, the important bit is that the parallel/concurrent bit of the async one is that it is cast off into an async system. The following execution steps are, well, steps. Each executed in sequence. Just like the body of the virtual thread example would be executed, but without the cumbersome noise of thenApply and thenCompose and such.
Just spit balling but you should be able to use clojure core async channels and the blocking put/take/alts functions. Would probably take a small amount of work to expose those things to Java in an idiomatic way but should be doable. Please take all of that with a giant grain of salt though!
The size of the integer types are already fixed by the JVM specification (int is always 32 bits, etc.), and there are no unsigned integer types in Java except for char (a 16-bit unsigned integer type). Furthermore, Java does not support alias names for types. Hence it’s unclear what your question is aiming at.
AFAIK Java 8 added a few methods that help you handle integers as if they were unsigned, like `toUnsignedString`. I think it's enough for any exotic cases.
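Concretely, the Java 8 additions on the boxed types:

int x = -1;                               // bit pattern 0xFFFFFFFF
Integer.toUnsignedString(x);              // "4294967295"
Integer.toUnsignedLong(x);                // 4294967295L
Integer.compareUnsigned(x, 1);            // positive: -1 is the larger unsigned value
Integer.divideUnsigned(x, 2);             // 2147483647
Integer.parseUnsignedInt("4294967295");   // -1 (same bits, viewed as signed)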
It sounds like it has some neat new features. But I'll never know because I'm never again going to use another Oracle thing. There's not a thing they could make that's good enough for me to agree to one of their EULAs and install it. Their behavior in that area is just staggeringly bad.
Then use Java without agreeing to an Oracle EULA. You can get a GPLv2 open source build from https://jdk.java.net/21/
If you don't trust the Oracle based open source builds then just wait a bit for Microsoft, Redhat, and others to release their version 21 OpenJDK builds that will be found under https://adoptium.net/marketplace/
Most of these were likely introduced during new-feature development in recent releases. To suggest that this on its own somehow makes for a more stable JDK compared to some ancient, battle-tested version of the JDK is debatable.
I find it rather concerning that so many bugs exist to begin with. Why are these not caught sooner?
Has the whole world gone crazy? Am I the only one around here who gives a shit about quality? Mark it zero!
Being allergic to JIRA, my JIRA-fu is weak, so there's probably an easier/faster way to report bugs fixed in v21.
Anyway.
> Am I the only one around here who gives a shit about quality?
Ages ago, I was a QA/Test manager. So I appreciate your sentiment. But it seems to me that Oracle's being a FANTASTIC shepherd of Java. Definitely a huge upgrade, at the very least.
While you're right that the number of bugs is not very meaningful and most are probably work on brand-new features, bugs in old features are always first fixed in the current version, and then only a subset of them (usually a small subset) is backported to old releases, and regressions are not common.
As to why some bugs go unnoticed for long: if you look at the bug database for reports of bugs that have been in effect for a long while, you'll see that these are almost always rather extreme corner cases (or, more precisely, the more utilised a mechanism is, the more extreme its old bugs will be). That's because full coverage is simply infeasible for software of this size (~8MLOC); you see similar bug numbers for the Linux kernel. The largest software that can be shown to be free of bugs is currently on the order of 10KLOC, so if your software is much larger than that and isn't getting many bug reports, it's probably because it's not used that much.