These days, fast lock implementations use roughly the following idiom, or some idiom that is demonstrably no slower than it, even for short critical sections.
if (LIKELY(CAS(&lock, UNLOCKED, LOCKED))) return;  /* fast path: one CAS */
for (unsigned i = 0; i < 40; ++i) {
    /* spin for a bounded amount of time */
    if (LIKELY(CAS(&lock, UNLOCKED, LOCKED))) return;
    sched_yield();
}
... /* actual full lock algorithm goes here */
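For concreteness, here's a minimal C11 sketch of the same idiom, with CAS spelled out via stdatomic. slow_path_lock is a hypothetical stand-in for whatever full lock algorithm (e.g. a futex-based one) the real implementation uses:

#include <stdatomic.h>
#include <sched.h>

enum { UNLOCKED = 0, LOCKED = 1 };

extern void slow_path_lock(atomic_int *l);  /* hypothetical full lock algo */

static int try_lock(atomic_int *l) {
    int expected = UNLOCKED;
    return atomic_compare_exchange_strong(l, &expected, LOCKED);
}

void lock_acquire(atomic_int *l) {
    if (try_lock(l)) return;              /* fast path: single CAS */
    for (unsigned i = 0; i < 40; ++i) {   /* bounded spin */
        if (try_lock(l)) return;
        sched_yield();
    }
    slow_path_lock(l);                    /* fall back to the full algorithm */
}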
So, the reason to use spinlocks isn't that they're faster for short critical sections, but that they don't have to CAS on unlock (a plain store is enough) - and so they're faster, especially in the uncontended case (and in all other cases too).
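To make that unlock asymmetry concrete, here's a hedged sketch: the spinlock releases with a plain store, while a blocking lock must atomically read the old state (via an exchange or CAS) to learn whether anyone is parked and needs waking. The three-state encoding and wake_one_waiter (standing in for futex_wake or similar) are illustrative assumptions, not from the comment:

#include <stdatomic.h>

enum { UNLOCKED = 0, LOCKED = 1, LOCKED_WITH_WAITERS = 2 };

extern void wake_one_waiter(atomic_int *l);  /* hypothetical, e.g. futex_wake */

void spinlock_unlock(atomic_int *l) {
    /* Nobody is parked, so a release store is all it takes. */
    atomic_store_explicit(l, UNLOCKED, memory_order_release);
}

void blocking_lock_unlock(atomic_int *l) {
    /* Must atomically learn the old state to know whether to wake someone. */
    if (atomic_exchange_explicit(l, UNLOCKED, memory_order_release)
            == LOCKED_WITH_WAITERS)
        wake_one_waiter(l);
}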
In other words, the modern rule of thumb is something like: if you're going to grab the lock so frequently that the uncontended lock/unlock time shows up as a significant percentage of your execution time, then use a spinlock. That probably implies you're holding the lock for far less than a context switch takes - on the order of 1/100th or even 1/1000th of one, or something around there.
> if you're going to grab the lock so frequently that the uncontended lock/unlock time shows up as a significant percentage of your execution time, then use a spinlock.
Yeah, and maybe also consider changing your design because usually this isn't needed.
This is (in my experience) a byproduct of good design, so changing the design wouldn't be a great idea.
Every time I've seen this happen it's in code that scales really well to lots of CPUs while having a lot of shared state, and the way it gets there is that it's using very fine-grained locks combined with smart load balancing. The idea is that the load balancer makes it improbable-but-not-impossible that two processors would ever want to touch the same state. And to achieve that, locks are scattered throughout, usually protecting tiny critical sections and tiny amounts of state.
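A hedged sketch of what that fine-grained style can look like in practice is lock striping: each bucket of a shared table carries its own tiny lock, so two CPUs contend only if they land on the same bucket. The table layout and names here are illustrative, not from the comment:

#include <pthread.h>
#include <stdint.h>

#define NBUCKETS 1024

struct bucket {
    pthread_spinlock_t lock;  /* one tiny lock per tiny piece of state */
    uint64_t value;
};

static struct bucket table[NBUCKETS];

void table_init(void) {
    for (int i = 0; i < NBUCKETS; ++i)
        pthread_spin_init(&table[i].lock, PTHREAD_PROCESS_PRIVATE);
}

void bucket_add(uint64_t key, uint64_t delta) {
    struct bucket *b = &table[key % NBUCKETS];
    pthread_spin_lock(&b->lock);    /* critical section lasts nanoseconds */
    b->value += delta;
    pthread_spin_unlock(&b->lock);
}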
Whenever I've gotten into that situation, or found code already in it, the code performed better than it would have with coarser locks, and better than with lock-free algorithms (because those usually carry their own baggage and "tax"). It even performed better than the serial version of the code that had no locks at all.
So, the situation is: you've got code that performs great, but does in fact spend maybe ~2% of its time in the CAS to lock and unlock. So you can get a tiny bit faster if you use a spinlock, because then unlocking isn't a CAS, and you run maybe 1% faster.
Lots of code is 'lock free' and uses the same pattern (more or less): attempt the change optimistically; if it fails, spin and retry; if it keeps failing (massive contention), back off.
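A minimal sketch of that pattern, assuming a simple shared counter (the backoff threshold of 16 is an arbitrary illustrative choice):

#include <stdatomic.h>
#include <sched.h>

void lockfree_add(atomic_long *counter, long delta) {
    long old = atomic_load_explicit(counter, memory_order_relaxed);
    unsigned failures = 0;
    /* Optimistic attempt; on failure 'old' is refreshed and we retry. */
    while (!atomic_compare_exchange_weak_explicit(
               counter, &old, old + delta,
               memory_order_relaxed, memory_order_relaxed)) {
        if (++failures > 16)
            sched_yield();  /* crude backoff under massive contention */
    }
}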
Spin locks provide no forward-progress guarantee: e.g. when the thread holding the lock is scheduled out (or is servicing an interrupt), no other thread can make progress. Lock free allows progress - there is no exclusivity. Of course, if you're working in the kernel and control the scheduler, etc., the restriction doesn't apply in the same way.
Realistically though, a lot of lock-free code employs copy-on-write or RCU.
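A hedged sketch of the copy-on-write flavor: readers follow an atomic pointer with no lock; a writer copies the data, edits the copy, and publishes it with a CAS. The struct and names are illustrative, and reclamation of the old copy - the hard part that RCU actually solves - is deliberately omitted:

#include <stdatomic.h>
#include <stdlib.h>

struct config { int max_conns; int timeout_ms; };

/* Assumed to be initialized to a valid config at startup. */
static _Atomic(struct config *) current;

int read_timeout(void) {
    /* Readers take no lock: a single acquire load of the pointer. */
    return atomic_load_explicit(&current, memory_order_acquire)->timeout_ms;
}

void update_timeout(int timeout_ms) {
    struct config *fresh = malloc(sizeof *fresh);
    struct config *old = atomic_load_explicit(&current, memory_order_acquire);
    do {
        *fresh = *old;                  /* copy... */
        fresh->timeout_ms = timeout_ms; /* ...then modify the copy */
    } while (!atomic_compare_exchange_weak(&current, &old, fresh));
    /* 'old' may only be freed once all readers are done with it; that
       deferred reclamation is exactly what RCU provides (omitted here). */
}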
I have written my fair share of lock-free code (back when it was new), and I generally enjoy writing it to this day. Yet for most folks it's just too hard to reason about, understand, and maintain.
Just a note: lock free is a weaker requirement than wait-free. Lock free only guarantees that some thread makes progress; wait-free guarantees that every thread completes in a bounded number of steps.
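A concrete way to see the difference, assuming hardware with a native fetch-add instruction: the first increment completes in a bounded number of steps for every thread (wait-free), while the CAS loop only guarantees that some thread wins each round (lock-free):

#include <stdatomic.h>

void incr_wait_free(atomic_long *c) {
    /* One atomic instruction, bounded for every caller. */
    atomic_fetch_add_explicit(c, 1, memory_order_relaxed);
}

void incr_lock_free(atomic_long *c) {
    long old = atomic_load_explicit(c, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(
               c, &old, old + 1,
               memory_order_relaxed, memory_order_relaxed))
        ;  /* an individual thread may retry unboundedly under contention */
}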