
> Either of those severely limits the scalability

You can avoid both issues by using the 20-year-old ExecutorService.




If the code is simple, blocking code, then the number of threads required in the pool is the average total duration of a request times the fanout times the request rate. That number can easily reach many thousands and more.
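To make the sizing rule above concrete, here is a minimal sketch of the arithmetic. The class name, method name, and traffic numbers are made up for illustration; only the formula (request rate times average duration times fanout) comes from the comment.

```java
final class PoolSizing {

    /**
     * Threads needed so that every in-flight blocking call has its own thread:
     * request rate x average request duration x fanout (Little's law, L = lambda * W).
     */
    static long requiredThreads(double requestsPerSecond, double avgRequestSeconds, int fanout) {
        return (long) Math.ceil(requestsPerSecond * avgRequestSeconds * fanout);
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 2,000 req/s, 0.5 s per request, 3 downstream calls each.
        System.out.println(requiredThreads(2_000, 0.5, 3)); // prints 3000
    }
}
```

Even with these modest assumed numbers the pool already needs thousands of threads.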


Yes, you shouldn't put blocking code into an ExecutorService.


Wtf, where on Earth do you put blocking code then? Firing off some long-running task in a background thread through executors is a bog-standard use case.


The discussion was about a specific context: avoiding the overhead of spawning millions of threads. In that case you shouldn't have any blocking code at all; every API should use epoll or something similar underneath.


And where do you handle callbacks?


There are tons of variations, depending on your logic and API. The closest to virtual threads is ForkJoinPool and RecursiveTask, where you can have code that looks like regular blocking code:

var f = async_api_returns_future();

...

var res = f.join();

Here join() won't block the OS/JVM thread; it makes the thread perform other tasks in the queue.
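A minimal sketch of the fork/join pattern described here, using a CPU-only range sum (`RangeSum` and its threshold are hypothetical names and values, not from the thread). It only illustrates that a worker calling `join()` on a task it forked can run that task itself instead of parking; it does not involve any IO.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

final class RangeSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // illustrative cutoff for splitting
    private final long from; // inclusive
    private final long to;   // exclusive

    RangeSum(long from, long to) {
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (long i = from; i < to; i++) sum += i;
            return sum;
        }
        long mid = from + (to - from) / 2;
        RangeSum left = new RangeSum(from, mid);
        left.fork();                                  // queue the left half
        long right = new RangeSum(mid, to).compute(); // do the right half on this thread
        return left.join() + right;                   // reads like a blocking wait
    }

    /** Sums [from, to) on a dedicated pool with the given parallelism. */
    static long sum(long from, long to, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new RangeSum(from, to));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // With a single worker the joins still complete, because the worker
        // executes the forked subtasks itself rather than waiting on them.
        System.out.println(sum(1, 1_000_001, 1)); // prints 500000500000
    }
}
```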

Or you can design an API which receives an ExecutorService as a parameter and runs the callback there, e.g.:

async_call(Callable callback, ExecutorService threadPool);
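One possible shape for such an API, sketched with `CompletableFuture`. Everything here is an assumption for illustration: `asyncCall`, the `"payload"` result, and the pool name are hypothetical, and the already-completed future merely stands in for a completion that a real library would signal from epoll or similar.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

final class CallbackOnExecutor {

    /** Hypothetical async API: when the operation completes, run the callback on the caller's pool. */
    static <R> CompletableFuture<R> asyncCall(Function<String, R> callback, ExecutorService threadPool) {
        // Stand-in for a non-blocking operation whose completion arrives later.
        CompletableFuture<String> io = CompletableFuture.completedFuture("payload");
        return io.thenApplyAsync(callback, threadPool);
    }

    /** Runs one call and reports the payload plus the name of the thread that ran the callback. */
    static String demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> new Thread(r, "callback-pool"));
        try {
            return asyncCall(p -> p + "@" + Thread.currentThread().getName(), pool).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints payload@callback-pool
    }
}
```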


One way to see how different virtual threads are from our old mechanisms is to ask yourself how many IO operations you can have in flight. There are two options: either the operations are blocking, in which case the number will be equal to the (very limited) number of threads in all of your thread pools combined, or the operations are non-blocking, in which case thread context that is so necessary for troubleshooting and JFR profiles is lost (e.g. JFR can't know on behalf of whom is some IO operation performed because the "owner" of some operation -- in the design of the Java platform -- can only be a thread). Virtual threads allow you to have hundreds of thousands (or even millions) of I/O operations in flight (which you need for high throughput as a result of Little's law) while still preserving observable context.
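As a rough illustration of "many blocking operations in flight", the sketch below starts one virtual thread per task and blocks each of them in `Thread.sleep`, which stands in for a blocking IO call. It assumes JDK 21 or later; the class name, task count, and sleep duration are illustrative choices.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ManyBlockingOperations {

    /** Starts one virtual thread per task, blocks each for the given time, returns how many finished. */
    static int runSleepingTasks(int tasks, Duration blockFor) {
        List<Future<Integer>> results = new ArrayList<>();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                int id = i;
                results.add(executor.submit(() -> {
                    Thread.sleep(blockFor); // stand-in for a blocking IO call
                    return id;
                }));
            }
        } // close() waits for all submitted tasks to finish
        int completed = 0;
        try {
            for (Future<Integer> result : results) {
                result.get();
                completed++;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return completed;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runSleepingTasks(10_000, Duration.ofMillis(200));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks finished in " + elapsedMs + " ms");
    }
}
```

If the tasks ran on a small fixed pool instead, the total time would grow roughly with tasks divided by pool size, which is the limit the comment describes.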

BTW, as for fork-join's `join`, not only is it designed for pure computation and cannot help with IO, but every `join` increases the depth of the stack, so it is fundamentally limited in how much it can help. Since FJ targets pure computation workloads, in practice that's not a huge limitation, but virtual threads are designed for IO workloads.

I apologise for not going into more depth here, but as you can imagine, with a user base numbering in the many millions, we can only afford to put in depth explanations in texts and videos that gain a large audience, but once you've familiarised yourself with the material I'll gladly answer specific questions (and perhaps your questions will help us improve our material, too).


> I apologise for not going into more depth here, but as you can imagine, with a user base numbering in the many millions

My concern is that you somehow find time to write long comments with lots of hand-waving (our framework is designed for that, and their framework is not designed for that), but refuse to provide specific code pointers and examples in support of your opinion. For example, in this specific case, can you give an example of how green threads can be used with the current Java IO library, or a Java JDBC library?


Virtual threads are just Java threads, so any blocking code using any API will just work. If you're looking for introductory material on virtual threads with specific examples you can find some here: https://openjdk.org/jeps/444, https://docs.oracle.com/en/java/javase/21/core/virtual-threa..., https://youtu.be/l_Vlz0wkG58?si=MF-XJW7c8T5jGf5i&t=1914

If there's something unclear in that material, please ask. Also, there is no our framework vs. their framework here. I'm only discussing the JDK's own various thread pools vs. virtual threads. They were all designed and implemented by the JDK team.

BTW, I'm not trying to support any opinion. Pretty much all popular Java server frameworks are adopting virtual threads, and Java's virtual threads are poised to become the most popular lightweight user mode threads. We've already convinced everyone that needed convincing. I'm merely offering pointers in case you're interested to learn how to use virtual threads and understand how they offer high throughput and good observability at the same time (whereas before you could have one or the other). Of course, if you're satisfied with the throughput and observability you can get with our old mechanisms, you don't have to use virtual threads. We've not taken anything away.


There is this example in the material:

try (var in = url.openStream()) { return new String(in.readAllBytes(), StandardCharsets.UTF_8); }

The material claims that this example will scale well with virtual threads. My understanding is that in.readAllBytes() will call the OS's blocking socket API underneath, which will block an OS thread, so you would need many OS threads to scale. Is this understanding correct?


It is not. Blocking IO (with some exceptions mentioned in the JEP) will automatically be translated by the runtime into non-blocking IO when it occurs on virtual threads, and no OS threads will be blocked. The Java code will look blocking and that's what thread dumps and other Java observability mechanisms will show, but to the OS it will seem as if it's running non-blocking code.

You can have a million threads blocking on a million sockets (obviously without creating a million OS threads): https://github.com/ebarlas/project-loom-c5m

You can't do that with thread pools. You could achieve that scalability with async code, but then observability tools will not be able to track the IO operations and who initiated them, but with virtual threads you'll see exactly what business operation is doing what IO and why.
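A small, self-contained sketch of blocking socket code running on virtual threads, assuming JDK 21 or later. It is not taken from the linked project: the class name, the one-line echo protocol, and the client count are all made up for illustration. Every `accept()`, `readLine()`, and `println()` below is written as ordinary blocking code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BlockingSocketsOnVirtualThreads {

    private static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    private static PrintWriter writer(Socket s) throws IOException {
        return new PrintWriter(s.getOutputStream(), true, StandardCharsets.UTF_8);
    }

    /** Server side: read one line and send it back, using plain blocking calls. */
    static void echoOneLine(Socket socket) {
        try (socket; var in = reader(socket); var out = writer(socket)) {
            out.println(in.readLine());
        } catch (IOException e) {
            // Connection dropped; nothing to echo.
        }
    }

    /** Client side: send a line and block until the echo arrives. */
    static String roundTrip(int port, String message) throws IOException {
        try (var socket = new Socket(InetAddress.getLoopbackAddress(), port);
             var in = reader(socket); var out = writer(socket)) {
            out.println(message);
            return in.readLine();
        }
    }

    /** Runs the given number of concurrent clients; returns how many got the right echo. */
    static int run(int clients) {
        // The server socket is declared last so it is closed first, which ends the accept loop.
        try (var executor = Executors.newVirtualThreadPerTaskExecutor();
             var server = new ServerSocket(0, 256, InetAddress.getLoopbackAddress())) {
            executor.submit(() -> {
                try {
                    while (true) {
                        Socket s = server.accept();
                        executor.submit(() -> echoOneLine(s)); // one virtual thread per connection
                    }
                } catch (IOException closed) {
                    // Server socket closed: stop accepting.
                }
            });
            int port = server.getLocalPort();
            List<Future<String>> replies = new ArrayList<>();
            for (int i = 0; i < clients; i++) {
                String message = "hello " + i;
                replies.add(executor.submit(() -> roundTrip(port, message)));
            }
            int ok = 0;
            for (int i = 0; i < clients; i++) {
                if (("hello " + i).equals(replies.get(i).get())) ok++;
            }
            return ok;
        } catch (IOException | InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(100) + " of 100 clients echoed correctly");
    }
}
```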


> will automatically be translated by the runtime into non-blocking IO when it occurs on virtual threads, and no OS threads will be blocked

It looks like that is true for the several APIs you implemented support for. What about other APIs, for example some JDBC driver which wants to use a non-blocking DB driver? How do you use virtual threads with that?


JDBC drivers are implemented on top of JDK APIs and so will work the same way: their I/O would automatically be non-blocking when run on a virtual thread (modulo some quality-of-implementation issues around the use of synchronized that we're working on, which are mentioned in the material I linked to).

JDBC drivers that are implemented on top of their own native code are a different matter, but they are not common these days.


Then you either don't get the same scalability that virtual threads give you or you get it but with asynchronous code that requires not just more work but can't enjoy the same observability/debuggability on the Java platform.


Could you give an example of what exactly requires more work, and where virtual threads give more "observability"?


Sure. Because handling server requests typically requires IO, if you wish not to block you need some way to sequence operations that is different from the ordinary sequential composition of the language (beforeIO(); blockingIO(); afterIO()). Similarly, other language constructs that build on top of basic sequential composition -- loops, exceptions, try/finally -- no longer work across the IO boundary. Instead you must reach for an asynchronous composition DSL (such as the one offered by CompletableFuture) which is not as composable as the basic language primitives.
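To illustrate the difference in composition, here is the same hypothetical request handler written twice: once with ordinary sequential code (a loop and try/catch around a blocking call) and once with `CompletableFuture`. The method names, the simulated IO, and the retry policy are assumptions made up for this sketch.

```java
import java.util.Locale;
import java.util.concurrent.CompletableFuture;

final class CompositionStyles {

    /** Hypothetical stand-in for a blocking IO call; fails on an empty request. */
    static String blockingIo(String request) {
        if (request.isEmpty()) throw new IllegalArgumentException("empty request");
        return "response(" + request + ")";
    }

    /** Hypothetical async variant of the same call. */
    static CompletableFuture<String> asyncIo(String request) {
        return CompletableFuture.supplyAsync(() -> blockingIo(request));
    }

    /** Sequential style: the retry loop and try/catch are ordinary language constructs. */
    static String handleBlocking(String request) {
        String prepared = request.trim();                    // beforeIO
        for (int tries = 0; tries < 3; tries++) {
            try {
                String response = blockingIo(prepared);      // blockingIO
                return response.toUpperCase(Locale.ROOT);    // afterIO
            } catch (IllegalArgumentException e) {
                prepared = "fallback";                       // retry with a fallback request
            }
        }
        return "gave up";
    }

    /** Async style: the loop becomes recursion and try/catch becomes handle + thenCompose. */
    static CompletableFuture<String> handleAsync(String request) {
        return attemptAsync(request.trim(), 0);
    }

    private static CompletableFuture<String> attemptAsync(String prepared, int tries) {
        if (tries >= 3) return CompletableFuture.completedFuture("gave up");
        return asyncIo(prepared)
                .thenApply(response -> response.toUpperCase(Locale.ROOT))
                .<CompletableFuture<String>>handle((value, error) -> {
                    // Any failure lands here (wrapped in CompletionException), not in a catch block.
                    if (error == null) return CompletableFuture.completedFuture(value);
                    return attemptAsync("fallback", tries + 1);
                })
                .thenCompose(next -> next);
    }

    public static void main(String[] args) {
        System.out.println(handleBlocking(" ping "));        // prints RESPONSE(PING)
        System.out.println(handleAsync("   ").join());       // prints RESPONSE(FALLBACK)
    }
}
```

On a virtual thread, `handleBlocking` can be used as is; the async version needs the loop and the error handling re-expressed in the library's own combinators.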

Moreover, the platform now has no insight about your composition. Exceptions, which are designed to give context in the form of a thread stack trace, simply don't know about the context as it's not composed through the normal composition (in plain terms, stack traces in asynchronous code don't give you the operation's context). Debuggers cannot step through the asynchronous flow because the platform's built in debugging support works only by stepping through threads, and profilers are no longer able to assign IO to operations: a server that's under heavy load may show up as idle thread pools in a profiler because the platform cannot assign an asynchronous operation to some asynchronous pipeline such as CompletableFutures because these are not observable constructs of the Java platform.
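The stack-trace point can be checked with a small experiment. The sketch below (JDK 21 or later assumed; all names are hypothetical) throws the same exception from a simulated IO call, once inside a virtual thread running sequential code and once inside a `CompletableFuture` pipeline, and then looks for the request handler's frame in each trace.

```java
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.atomic.AtomicReference;

final class StackContextDemo {

    /** Hypothetical IO call that always fails. */
    static String failingIo() {
        throw new IllegalStateException("simulated IO failure");
    }

    /** Sequential handler: the failing call sits directly under this frame. */
    static String handleRequest() {
        return failingIo().trim();
    }

    /** Async handler: this method returns before the failing call ever runs. */
    static CompletableFuture<String> handleRequestAsync() {
        return CompletableFuture.supplyAsync(StackContextDemo::failingIo).thenApply(String::trim);
    }

    /** Runs the sequential handler on a virtual thread and returns the exception it threw. */
    static Throwable failureOnVirtualThread() {
        var captured = new AtomicReference<Throwable>();
        Thread thread = Thread.ofVirtual().start(() -> {
            try {
                handleRequest();
            } catch (RuntimeException e) {
                captured.set(e);
            }
        });
        try {
            thread.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return captured.get();
    }

    /** Runs the async handler and returns the underlying exception. */
    static Throwable failureInAsyncPipeline() {
        try {
            handleRequestAsync().join();
            throw new IllegalStateException("expected the pipeline to fail");
        } catch (CompletionException e) {
            return e.getCause();
        }
    }

    /** True if any frame of the throwable's stack trace belongs to the named method. */
    static boolean mentions(Throwable t, String method) {
        return Arrays.stream(t.getStackTrace()).anyMatch(f -> f.getMethodName().equals(method));
    }

    public static void main(String[] args) {
        System.out.println(mentions(failureOnVirtualThread(), "handleRequest"));
        System.out.println(mentions(failureInAsyncPipeline(), "handleRequestAsync"));
    }
}
```

On a JDK 21 runtime this should print `true` and then `false`: the virtual thread's trace includes `handleRequest`, while the async trace contains only the pool's own frames above `failingIo`.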

Virtual threads give you the same scalability as asynchronous code but in a way that fits with the design of the Java platform. All language constructs work and compose well, debuggers step through code, and profilers can understand what's going on.

That's not to say that some other platform could not be designed around a different construct, but the Java platform -- from language, through libraries and bytecode, and all the way to the VM and its tooling interfaces -- was designed around the idea that sequential composition occurs by sequencing operations on a single thread. And virtual threads are just Java threads.


I am not certain many of your assessments are correct. If you give a specific, simple code example, we can iterate on it.


You can find more information, including some examples, in our virtual thread JEP [1] and adoption guide [2].

We did spend some time contemplating teaching the platform about non-thread-based, i.e. asynchronous, sequential composition, but in the end we realised that if it walks like a thread and quacks like a thread, we might as well call it a thread.

If you read the JEP and play around with virtual threads (e.g. do asynchronous IO with CompletableFuture or blocking IO in a virtual thread and see what their exception stack traces look like and what their JFR profile looks like) you'll quickly see that the capabilities they offer were simply not attainable by asynchronous code, which is why we've spent years teaching the JVM's innermost mechanisms to observe virtual threads and expose them to observability tools the same way platform threads are (and how I know those capabilities weren't available before).

We've written and presented a significant amount of published material about virtual threads so there's not much point in recreating it here, but if you're interested, all that material is out there.

[1]: https://openjdk.org/jeps/444

[2]: https://docs.oracle.com/en/java/javase/21/core/virtual-threa...


I read the materials, and my opinion is that virtual threads are a hyped mess which adds very little benefit (or maybe none at all) in very few use cases, but will bring more complexity and fragmentation into the platform:

- 95% of Java business spaghetti code doesn't require such scalability and is fine with spawning 10k threads on modern hardware

- of the remaining 5% of cases, 80% can be covered by ExecutorService and ForkJoinPool

- in the 1% of cases that are left, the engineer made the wrong decision in choosing the JVM, given its many other performance issues

The fact that you can't provide a simple code example, and the quality of your previous comments, make me think that you don't necessarily understand what you are doing.


I don't know what relevant performance issues you're referring to, but if you want to learn more about concurrency and performance, I suggest you start with some of the basics of the theory behind concurrent servers: https://youtu.be/07V08SB1l8c

As I said, I've put examples and detailed explanations in a significant amount of material that's available online that will help you understand how and why user mode threads work and why we decided to add them to Java. While I can't teach concurrency and the design of the Java platform from the ground up (especially detailed mechanisms such as stack walking, JVM TI and JFR) on an individual basis on social media, I'd be happy to answer any specific questions you may have once you've familiarised yourself with the subject.


> While I can't teach concurrency and the design of the Java platform

My point is that you can't teach because you don't have the expertise and are getting lost on extremely basic things: https://news.ycombinator.com/item?id=37620046


Are you aware you're talking to the guy who added virtual threads to the JVM? Disagree on design if you wish, but the idea he isn't an expert in these matters is a bit silly.


He is a salaried employee at Oracle, which is not a top-tier tech company. Tons of virtual-thread-like frameworks have been implemented on the JVM and in other languages. Looking at his GitHub profile, he has 5 years of experience working on this thread stuff, and before that he worked on some bloated J2EE stuff. None of this qualifies him for unconditional authority, so I judge him based on the expertise demonstrated in this discussion, which looks very weak.


> 5 years experience in working on this thread stuff

He had worked on "thread stuff" before working on loom on the JVM. I did a search and that's 10 years ago.

https://web.archive.org/web/20130601144756/https://blog.para...


Which makes the quality of the discussion even more confusing.



