Love discussing work-stealing queues. My first comment is that once you have a private queue, you sort of lose the benefits of a work-stealing queue and are in the domain of work sharing. That might be fine in practice, but it has different properties.
The weirdest work-stealing queue I have read about is one that tries to model the x86 store buffer to avoid a StoreLoad fence by making sure enough stores happen between the push and the take. Not very practical, though.
Sadly, while Chase-Lev is great, there is currently a patent on the algorithm that won’t expire until 2025. Of course, the patent is held by none other than Sun/Oracle.
Once I discovered this, I had to switch over my implementation to something less… of a legal risk. What a shame.