Hacker News: lmb's comments

    Location: London, UK /  Munich, Germany
    Remote: No, hybrid only
    Willing to relocate: Maybe
    Technologies: Go, C, eBPF, Linux
    Résumé/CV: on request
    Email: jobs@lmb.io
Systems engineer with ten years of experience building performance-critical software. For the past couple of years I’ve been maintaining the most popular Go library for working with eBPF (used by DataDog, Microsoft, etc.) and have contributed to the Linux network stack, the Go standard library and other high-profile open source projects. Before that I built Quicksilver (globally distributed db) and Unimog (anycast L4 load balancer) at Cloudflare. In my spare time I’ve dabbled in embedded operating systems, virtual machine monitors and emulators.

I’m self-taught, learn by doing and enjoy going deep on a specific subject. After working as a fully remote open source maintainer I’m looking for a small team that values collaborating in person, in a strategic role owning a domain or problem space, with a view to leading a team.

I’m curious about problems that require exceptional performance and / or reliability in areas like power infrastructure, renewable energy, operating AI models at scale, non-LLM AI, hardware.


Counterpoint from Gleixner which is worth reading for us in the peanut gallery: https://lwn.net/ml/linux-kernel/87r11qp63n.ffs@tglx/


Missing Allen (2001) by my late father Christian Bauer

https://www.youtube.com/watch?v=kOXSI_vcqK8

Around the turn of the century, my dad's good friend and longtime collaborator Allen Ross vanishes from one day to the next, just after they have finished shooting a film about the Mississippi. Years later, my dad returns to the US to find out what happened to his friend.

It's his most personal film for sure, and I remember him going off to the US for weeks and faxing us letters to keep in touch. It's also the one that had him most scared, he took out life insurance before he left because of the people he was looking into.


Count min sketches are really neat. A colleague at my old company used them to implement DDoS mitigations in eBPF and wrote it up: https://blog.cloudflare.com/building-rakelimit/

The code is also open source, and I've improved on it a bit: https://github.com/lmb/socklimit Not production ready but a cool idea and implementation.


I was wondering if anyone could shed some light on the "These hash functions should be pairwise independent" part. I don't know what pairwise independent means. My background is mostly in web, so I am familiar with things like `sha256`, `sha1` and `md5`. Would you use these kinds of functions for a Count-min Sketch?

I saw that your socklimit project uses what looks like `fasthash64` and `hashlittle`. I'm not familiar with those. Any insight or recommended reading to understand these hash functions?

p.s. googling pairwise independent hash functions did get me some college class reading but doesn't mention any named hash functions developed out in the world.


Those hash functions are (or were) cryptographically secure. They aim to prevent many different kinds of attacks that may be launched on hashed data. There are many other noncryptographic ways to use hashes (like writing your own dictionary) that do not need the protections cryptographic hashes give you and therefore can trade those protections for performance.

Pairwise independence is a property of a family of hash functions: if you pick one function at random from the family, the hashes of any 2 distinct keys look like 2 independent, uniformly random values. There's a much more precise mathematical definition but that's the essence.
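
To make this a bit more concrete, here is a minimal sketch of one textbook construction, the Carter-Wegman family h(x) = ((a*x + b) mod p) mod m. It is not what socklimit or rakelimit use. The names `PRIME` and `make_hash`, the choice of prime and the integer keys are all assumptions for illustration. Modulo p the family is pairwise independent; the final reduction to m buckets makes it only approximately so.

```python
import random

# Assumption: keys are non-negative integers smaller than this Mersenne prime.
PRIME = (1 << 61) - 1


def make_hash(num_buckets, rng):
    """Draw one function from the family h(x) = ((a*x + b) mod PRIME) mod num_buckets."""
    a = rng.randrange(0, PRIME)  # random multiplier
    b = rng.randrange(0, PRIME)  # random offset

    def h(x):
        return ((a * x + b) % PRIME) % num_buckets

    return h


# Two independent draws from the family, e.g. one per sketch row.
rng = random.Random(42)
h1 = make_hash(1024, rng)
h2 = make_hash(1024, rng)
```

The randomness lives in the choice of `a` and `b`, not in the keys: once a function is drawn it is fully deterministic.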


You probably don't need the hash to be cryptographically secure. The simplest solution might be to use several uniquely keyed SipHash [1] instances: SipHash is quite cheap to compute, was designed for use in hash tables, and two SipHash instances with different keys behave as if they were pairwise independent, which roughly means there are no observable correlations between them.
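
A rough sketch of what "several uniquely keyed instances" could look like in a count-min sketch. Python's standard library does not expose a keyed SipHash, so this uses keyed BLAKE2b from `hashlib` as a stand-in; that swap, the class name `CountMinSketch`, the fixed per-row keys and the default sizes are all assumptions for illustration.

```python
import hashlib


class CountMinSketch:
    def __init__(self, depth=4, width=1024):
        self.width = width
        # One distinct key per row. Fixed keys keep the example deterministic;
        # a real deployment would presumably use random secret keys.
        self.keys = [row.to_bytes(8, "little") for row in range(depth)]
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key, item):
        # Keyed BLAKE2b stands in for keyed SipHash here.
        digest = hashlib.blake2b(item, key=key, digest_size=8).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment one counter per row, each picked by that row's keyed hash.
        for key, row in zip(self.keys, self.rows):
            row[self._index(key, item)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so take the minimum over rows.
        return min(
            row[self._index(key, item)]
            for key, row in zip(self.keys, self.rows)
        )


sketch = CountMinSketch()
for _ in range(5):
    sketch.add(b"10.0.0.1")
sketch.add(b"10.0.0.2")
```

The estimate never undercounts: `sketch.estimate(b"10.0.0.1")` is at least 5, and only exceeds it if another item collides with it in every row.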

[1] https://en.wikipedia.org/wiki/SipHash


I think we tried siphash at some point; unfortunately it's significantly slower than the two hash functions we ended up using.


We ended up with fasthash64 and lookup3 by looking for a fast hash that is easy to port to the restricted subset of C supported by eBPF with minimal changes. https://github.com/rurban/smhasher is a great resource for that.

I would probably choose different, more robust hash functions if I was targeting regular C.


We don't use kernel bypass anymore, it's all XDP: https://blog.cloudflare.com/l4drop-xdp-ebpf-based-ddos-mitig...


What does tail latency for the Zig pool look like? It seems like any Task that ends up in one of the overflow queues will stay there until some ring buffer is emptied.

Put another way, during a period of more push than pop some tasks may see a very long delay before being worked on?


I'd encourage you to try recording them yourself. The results can vary depending on your system, how many concurrent tasks it can run in parallel, whether scheduling resources are being used elsewhere, etc. The Zig code contains an example of using timers, and the spawning and joining are in the quickSort() function, so it should hopefully be easy to add the timing logic. I can answer questions regarding it if you hop on the IRC or Discord.

In regards to the overflow queue, yes, some pathological tasks may see long delays, but this is true for any mostly-FIFO or sharded queue scenario. Both Golang and Tokio overflow from their local buffer into a shared lock-protected linked list (Tokio is a bit more eager in this regard) so they can suffer similar fates. They actually do an optimization, which is to check the shared queue before the local buffer every few local scheduling ticks (% 61 or 64 for each task run iirc), to decrease the chance of local starvation. Could try adding that to Zig's thread pool after the timing logic and see if that helps tail latencies. I'm curious about the outcome either way, but I may not have time to work on that.
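
For illustration, the "check the shared queue first every few ticks" idea could look roughly like the following. This is a single-threaded toy, not the Go, Tokio or Zig implementation: the `Worker` class, the plain deques standing in for the ring buffer and the locked list, and the interval of 61 are all assumptions.

```python
from collections import deque

# Assumed interval, taken from the "% 61" mentioned above.
SHARED_CHECK_INTERVAL = 61


class Worker:
    def __init__(self, shared):
        self.local = deque()  # stand-in for the bounded local ring buffer
        self.shared = shared  # stand-in for the lock-protected overflow queue
        self.tick = 0

    def next_task(self):
        self.tick += 1
        # Every SHARED_CHECK_INTERVAL ticks, look at the shared queue first
        # so that overflowed tasks cannot wait behind local work forever.
        if self.tick % SHARED_CHECK_INTERVAL == 0 and self.shared:
            return self.shared.popleft()
        if self.local:
            return self.local.popleft()
        if self.shared:
            return self.shared.popleft()
        return None


shared = deque(["overflowed"])
worker = Worker(shared)
worker.local.extend(f"local-{i}" for i in range(100))
order = [worker.next_task() for _ in range(101)]
```

With 100 local tasks queued ahead of it, the overflowed task is picked up on the 61st tick rather than after all local work has drained.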


What does tooling / debugging for p4 look like? How do you inspect what happens inside a p4 dataplane?

Context: I work on an L4 load balancer written in eBPF / XDP. Debugging that is still too hard even though it's all software and open source.


There is a fair bit of co-evolution between the Linux BPF verifier and the BPF LLVM backend. This means that the verifier is biased to do well on bytecode that is commonly emitted by clang. I can imagine that other programming languages will run into the verifier rejecting their output if it differs too much from common clang output. (There is redbpf, which compiles Rust to BPF AFAIK, but it still uses LLVM. There is also a GCC BPF backend, but I don't have experience with that.) The good news is that Linux upstream is receptive to bug reports!


bpftrace is considering adding a new backend that wouldn't involve any LLVM[0].

Additionally, ply is another effort (which doesn't use LLVM) to build a DTrace-like frontend to BPF.[1]

[0] https://github.com/iovisor/bpftrace/issues/1845

[1] https://github.com/iovisor/ply


So it's more coupled than it appears. That makes sense. Thanks for the tip!


Seems like the right move from a volunteer-run project, but what will the future hold? Artificial scarcity is always a problem.

On another note, for just 20k$ I can offer you exclusive use of the xxgfzrf.dinglebop.me Public Suffix so that you can keep tracking your users. Please reach out to sales@example.com if you are interested.


It's interesting because being added to the PSL reduces your ability to track users. So yeah, I have a bridge to sell you, interested?


> being added to the PSL reduces your ability to track users

Not really, in fact it can increase your ability to track users if it's (ab)used in specific ways - see use case #2 and #3 here:

https://github.com/privacycg/private-click-measurement/issue...


There's an approval process to be added to the PSL so abuses would be quite surprising and easy to remove when discovered.


This entire discussion is literally about how that's not the case.


To make things worse, it's basically impossible to remove a domain from the PSL, as no one knows how software built against the PSL would handle it. A removal could break a tremendous amount of software that people rely on.


You're arguing that the majority of people use the term differently than you and are therefore wrong. Sounds like the definition of technobabble to me.


I think that people in tech have this extreme, albeit well-earned, revulsion to hype that makes them reject things based on terminology and community rather than actually understanding the thing.

I see it all the time. It's simply wrong to say that microservices is just "services". Many people being wrong changes nothing.

