Hacker News | javierhonduco's comments

These metrics can be used in performance reviews at Facebook.


Zig 0.13 is required according to https://ghostty.org/docs/install/build


This is pretty cool work.

Something that’s been on my mind recently is that there’s a need for a high-performance flame graph library for the web. Unfortunately, the most popular flame graph libraries/components, basically the React and d3 ones, work fine, but the authors don’t actively maintain them anymore and their performance with large profiles is quite poor.

Most people that care about performance either hard-fork the Firefox profiler / speedscope flame graph component or create their own.

Would be nice to have a reusable, high performance flame graph for web platforms.


Axum + minijinja is quite close to this I would say. Been using it for a little while and I am very happy so far.


Seconding the recommendation, it's really great when paired with HTMX on the frontend too.


Haven’t had the chance to play with WezTerm just yet but wanted to share that the author is incredibly smart, friendly, and humble.

Had the opportunity to work on a project together at work some years back and I can only aspire to be 1/10th as good of an engineer as him. A true hacker.


Or in other parts of the kernel. It's been the case on multiple occasions that buggy locking (or, more generally, a missing 'resource' release) has caused problems for perfectly safe BPF programs. For example, see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1033398 and the fix https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...


This is actually exactly the bug I was thinking of, so fair point! (I work at PS now and am aware you worked on debugging it a while back).


It is not; programs that are accepted are proven to terminate. Larger and more complex programs are accepted by BPF now, which might give the impression that it's Turing complete, when that is definitely not the case.


Personally I’m not a fan of Go’s default zero-initialisation. I’ve seen many bugs caused by adding a new field and forgetting to update constructors to initialise it to a “non-zero” value. I prefer Rust’s approach, where one has to be explicit.

That being said, it’s way less complex than C++’s rules, and that’s welcome.


I spent a year and a half writing Go code, and I found that it promised simplicity but there is an endless number of these kinds of issues where it boils down to "well, don't make that mistake".


It turns out that a lot of the complexity of modern programming languages comes from the language designers trying to make mistakes harder.

If you want to simplify by synthesising decades of accumulated knowledge into a coherent language, or by removing deprecated ideas (instead of the evolved spaghetti you get from decades of updating a language), then fine. If your approach to simplicity is to just not include the complexity, you will soon discover that the complexity was there for a reason.


The problem you are describing in Go is rarely a problem in C++. In my experience, a mature code base rarely has things with default constructors, so adding a new field will cause the compiler to complain that there's no default constructor for what you added, therefore avoiding this bug. Primitive types like `int` usually have a wrapper around them to clarify what kind of integer they are, and the same goes for standard library containers like vector.

However I can't help but think that maybe I'm just so fortunate to be able to work in a nice code base optimized for developer productivity like this. C++ is really a nice language for experts.


Why would you have a wrapper around every primitive/standard library type?


Type safety.

Compare `int albumId, songId;` versus `AlbumId albumId; SongId songId;`. The former two variables can be assigned to each other, causing potential bugs and confusion. The latter two cannot. Once you have a basic wrapper for integers, further wrappers are just a one-liner, so why not. And in practice, making the type more meaningful leads to shorter variable names, because the information is already expressed in the types.


FWIW there is a linter that enforces explicit struct field initialization.


Haven't written Go in a long time, but I do remember being bitten by this.


Yeah, this can be problematic: if you don’t have sum types, it’s hard to enforce correct typing while also having correct default/uninitialized values.


Wouldn’t it just be considered bad practice to add a field and not initialize it? That feels strongly like something a code review is intended to catch.


It’s easy to miss this in large codebases. Having to check every single struct initialisation whenever a field is added is not practical. Some folks have mentioned that linters exist to catch missing initialisation, but I would argue this shouldn’t require a third-party project that is completely opt-in to install and run.


All bugs are considered bad practice, yet they keep happening :P


You can always use exhaustruct (https://github.com/GaijinEntertainment/go-exhaustruct) to enforce that all fields are initialized.

If you care, the linter is there, so this is more of a skill issue.


For anybody interested in this topic, “Release it!” is a pretty good read. (Not affiliated in any way, just enjoyed reading it)

https://www.oreilly.com/library/view/release-it/978168050026...


I just finished this. To each their own of course, but I found the writing too padded and tonally off-putting at times. Some of the stories felt dated both from a technological stance and a cultural stance. I prefer Azure's Cloud Pattern docs myself (though "Release It!" was really good if you prefer a storytelling approach):

https://learn.microsoft.com/en-us/azure/architecture/pattern...


This looks incredibly comprehensive, thanks for sharing!

Should have added that I read this book in 2016, and the first edition is even older, so there’s naturally been lots of new (and exciting) developments in this area!


Overall, I am for frame pointers, but after some years working in this space, I thought I would share some thoughts:

* Many frame pointer unwinders don't account for a problem they have that DWARF unwind info doesn't have: the frame set-up is not atomic. It's done in two instructions, `push %rbp` and `mov %rsp, %rbp`, and if a snapshot is taken between the `push` and the `mov`, we'll miss the parent frame. I think this might be fixable by inspecting the code, but that would only be as good as a heuristic, as there could be other `push %rbp` instructions unrelated to the frame set-up. I would love to hear if there's a better approach!

* I developed the solution Brendan mentions which allows faster, in-kernel unwinding without frame pointers using BPF [0]. This doesn't use DWARF CFI (the unwind info) as-is but converts it into a random-access format that we can use in BPF. He mentions not supporting JVM languages, and while it's true that right now it only supports JIT sections that have frame pointers, I had planned to implement a full JVM interpreter unwinder. I have since left Polar Signals and priorities have shifted, but it's feasible to get a JVM unwinder to work in lockstep with the native unwinder.

* In an ideal world, enabling frame pointers should be decided on a case-by-case basis. Benchmarking is key, and the tradeoffs you make can change a lot depending on the industry you are in and what your software is doing. In the past I have seen large projects enable or disable frame pointers without an in-depth assessment of the losses/gains in performance and observability, and how they connect to business metrics. The Fedora folks have done a superb and rigorous job here.

* Related to the previous point, having a build system that lets you change this system-wide, including the libraries your software depends on, is awesome not only for testing these changes but also for putting them in production.

* Lastly, I am quite excited about SFrame that Indu is working on. It's going to solve a lot of the problems we are facing right now while letting users decide whether they use frame pointers. I can't wait for it, but I am afraid it might take several years until all the infrastructure is in place and everybody upgrades to it.

- [0]: https://web.archive.org/web/20231222054207/https://www.polar...


On the third point, you have to enable frame pointers across the whole Linux distro in order to get good flamegraphs. You have to do whole-system analysis to really understand what's going on. The way current binary Linux distros (like Fedora and Debian) work makes any alternative impossible.


It could be one instruction: ENTER N,0 (where N is the amount of stack space to reserve for locals)---this is the same as:

    PUSH EBP
    MOV  EBP,ESP
    SUB  ESP,N
(I don't recall if ENTER is x86-64 or not). But even with this, the frame setup isn't atomic with respect to CALL, and if the snapshot is taken after the CALL but before the ENTER, we still don't get the frame setup.

As for the reason why ENTER isn't used: it was deemed too slow. LEAVE (MOV ESP,EBP; POP EBP) is used as it's just as fast as, if not faster than, the sequence it replaces. If ENTER were just the PUSH/MOV/SUB sequence, it probably would be used, but it's that other operand (which is 0 above in my example) that kills it performance-wise (it's for nested functions to gain access to outer stack frames and is very expensive to use).


Great comments, thanks for sharing. The non-atomic frame setup is indeed problematic for CPU profilers, but it's not an issue for allocation profiling, off-CPU profiling, or other types of non-interrupt-driven profiling. But as you mentioned, there might be ways to solve that problem.


Great comment! Just want to add we are making good progress on the JVM unwinder!

