Just one data point, but if it's as nice to use as their open source tools and not outrageously expensive, I'd be a customer. Current offerings for private python package registries are kind of meh. Always wondered why github doesn't offer this.
I had a similar thought, but at the same time, if people were mandated to use Windows or MacOS then that would also pretty much lock you into their respective window managers. I guess it feels more restrictive partly because it's more common to pick and choose WMs on linux. (And partly because, yeah, seems like the setup goes way beyond just a distro+WM).
Also because it's a niche WM catering to pretty specific preferences. I used tiled window managers for a while but eventually I decided I preferred the no-customization, one-size-fits-all Gnome/KDE experience. A Hyprland config is going to fit like a tailor-made glove, but in this case it's a glove tailored for your CEO, not you.
I spun up Omarchy in a VM just to see what the fuss was about, and when I opened Neovim it booted into a plugin manager and started installing at least two dozen random plugins, including an extremely over-eager autocomplete config that filled my screen with snippet suggestions. I was instantly irritated.
The few examples they show do look pretty good for a wifi-based method, although who knows how cherry-picked they are. I wonder how much the "SLAM" part is contributing and how sensitive that is to the sensor quality on the phone. I would've assumed they'd be using vision, which seems to be the method of choice for other companies like Niantic. The ground-truth data part for vision would certainly be more onerous, though.
He explains it fairly well if you understand how you'd go from wifi accuracy to SLAM. The wifi was providing ~3 m accuracy and the SLAM got that down to ~1 m; how much each part contributes is right there in those two numbers. I'm sure the algorithms are complex, but he points out that the SLAM is corrected by the actual maps made with the self-service app. So it's fairly easy to understand: the map provides a probability space, the wifi puts you within 3 m, and the SLAM is used to fill in the blanks with help from the probability space.
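That's just my reading of it, but the fusion idea is easy to show as a toy 1-D Bayesian update (numbers and setup entirely made up for illustration, not their algorithm): the map acts as a prior that rules out infeasible positions, and the coarse wifi fix acts as a ~3 m likelihood.

```python
import numpy as np

x = np.arange(0.0, 30.5, 0.5)          # positions along a 30 m corridor, 0.5 m grid

prior = np.ones_like(x)                # map prior: you can't be inside a wall
prior[(x > 10) & (x < 12)] = 0.0       # pretend there's a wall between 10 m and 12 m
prior /= prior.sum()

def gaussian(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

wifi = gaussian(x, mu=11.0, sigma=3.0) # coarse wifi fix: ~3 m accuracy, centered at 11 m

posterior = prior * wifi               # Bayes: prior times likelihood
posterior /= posterior.sum()
best = x[np.argmax(posterior)]         # estimate snaps to a feasible spot next to the wall
```

The SLAM track would then play the role of a much tighter likelihood applied step after step, which is how you get from 3 m down to 1 m.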
This sounds pretty similar to CDE, which I see they cite in the paper. Back in the pre-docker days I remember using CDE a few times to package some C++ code to run on some servers that didn't have the libraries I needed. Pretty cool tool.
I was going to mention this sounds like the idea behind adversarial approaches, which I guess go all the way back to game theory and algorithms like minimax. They're definitely used in the control literature ("adversarial disturbances"). And of course GANs.
You can examine the code of zxing-cpp (which is fairly nice IMO) for a simple, "classical computer vision" approach to this. It's not the most robust implementation but it is pretty functional.
But in general, you can divide the problem more or less like this (not necessarily in this order)
1. find the rough spatial region of the barcode. Crop that out and only focus on this
2. Correct ("rectify") for any rotation or perspective skew of the barcode, turn it into a frontoparallel version of the barcode
3. Binarize the image from RGB or grayscale into pure black and white
4. Normalize the size so that each pixel is the smallest spatial unit of the barcode.
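As a toy sketch of steps 1, 3 and 4 on a synthetic image (skipping step 2, and assuming the module size in pixels is already known — a real decoder would have to estimate it, e.g. from the finder patterns):

```python
import numpy as np

def locate_and_normalize(gray, module_px):
    # Step 3: binarize with a crude global threshold (midpoint of the
    # intensity range; real code would use something like Otsu's method)
    thresh = (int(gray.min()) + int(gray.max())) // 2
    binary = gray < thresh          # True = dark

    # Step 1: rough spatial region = bounding box of the dark pixels
    # (works here because the synthetic image has a clean quiet zone)
    ys, xs = np.nonzero(binary)
    cropped = binary[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

    # Step 4: normalize so each output pixel is one module, by sampling
    # the center of every module-sized cell
    rows = np.arange(module_px // 2, cropped.shape[0], module_px)
    cols = np.arange(module_px // 2, cropped.shape[1], module_px)
    return cropped[np.ix_(rows, cols)]

# Synthetic input: a 4x4-module pattern scaled up 10x, on a white background
pattern = np.array([[1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [1, 1, 0, 0],
                    [0, 0, 1, 1]], dtype=bool)
img = np.full((80, 80), 255, dtype=np.uint8)
img[20:60, 20:60] = np.where(np.kron(pattern, np.ones((10, 10), dtype=bool)), 0, 255)

modules = locate_and_normalize(img, module_px=10)  # recovers the 4x4 pattern
```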
You can roughly divide barcode reading into a "frontend" and a "backend". The backend is the most well-understood (but not necessarily trivial) part: you take a binary image, with each pixel corresponding to one little square in the QR code, and decode its payload. It doesn't need computer vision. The "frontend" is the part that takes the raw image containing the barcode, tries to find the barcode, and converts what it finds into a nice, clean binary image for the backend. This is a computer vision problem and you can get arbitrarily fancy, up to and including the latest trends in ML vision models. However, that isn't necessary in most cases; after all, barcodes are designed to be easy for machines to read. With a large, sufficiently well-focused and well-exposed image of a barcode you can get away with simple classical computer vision algorithms like histogram-based binarization plus some heuristics to identify the spatial extent of the barcode (for example, most barcode symbologies mandate a "quiet zone" of blank space around the barcode and have start/stop markers; QR codes have those prominent concentric squares in the corners).
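To give a sense of how simple histogram-based binarization can be, here's a sketch of Otsu's method (pick the threshold maximizing between-class variance) in plain numpy — a toy, not production code:

```python
import numpy as np

def otsu_threshold(gray):
    # 256-bin histogram of a uint8 image
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    w0 = np.cumsum(hist)                      # pixel count at or below each threshold
    w1 = w0[-1] - w0                          # pixel count above each threshold
    cum = np.cumsum(hist * np.arange(256))    # cumulative intensity sum
    with np.errstate(divide="ignore", invalid="ignore"):
        mu0 = cum / w0                        # mean of the dark class
        mu1 = (cum[-1] - cum) / w1            # mean of the light class
        between = w0 * w1 * (mu0 - mu1) ** 2  # between-class variance
    between = np.nan_to_num(between)          # empty classes contribute nothing
    return int(np.argmax(between))

# Bimodal toy image: dark "bars" (~40) on a light background (~210), plus noise
rng = np.random.default_rng(0)
img = np.where(rng.random((64, 64)) < 0.3, 40, 210)
img = np.clip(img + rng.integers(-10, 10, img.shape), 0, 255).astype(np.uint8)

t = otsu_threshold(img)   # lands between the two intensity clusters
binary = img > t          # True = light background, False = dark bars
```

With a reasonably exposed barcode image the histogram really is this bimodal, which is why such a simple global threshold goes a long way.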
As for implementation, zxing-cpp [1] is still maintained, and pretty good as far as open source options go. At this point I'm not sure how much it shares with the original zxing, as it has undergone substantial development. It has Python bindings, which may be easier to use.
On mobile, Google ML Kit and Apple's Vision framework also have barcode reading APIs; not open source, but otherwise "free" as in beer.
I've found myself adopting this philosophy for a specific use case: monitoring ML training jobs. It's pretty common to see people output training metrics (loss, validation accuracy, etc.) every N batches, iterations or epochs. That makes sense for a lot of reasons, and it's simple to do. But when you're exploring models with wildly varying inference latencies, running on different hardware, with different batch sizes, or on differently sized datasets, a fixed N can end up reporting too infrequently to give you an idea of what's happening, or too frequently and just spamming the output.
Checkpointing the model every N iterations/epochs/batches has a similar problem - you may end up saving very few checkpoints and risk losing work or waste a lot of time/space with lots of checkpoints.
So I've often found myself implementing some kind of monitoring and checkpointing callbacks based on time, e.g., reporting every half an hour, checkpointing every two hours, etc.
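A minimal sketch of that kind of time-based callback (names are made up for illustration; the injectable clock is just there to make it testable):

```python
import time

class EveryNSeconds:
    """Fires at most once per `interval` seconds, no matter how often
    the training loop polls it."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock          # injectable for testing
        self.last = clock()

    def due(self):
        now = self.clock()
        if now - self.last >= self.interval:
            self.last = now
            return True
        return False

log_timer = EveryNSeconds(30 * 60)        # report every half hour
ckpt_timer = EveryNSeconds(2 * 60 * 60)   # checkpoint every two hours

# Inside the training loop it would look something like:
# for step, batch in enumerate(loader):
#     ...train on batch...
#     if log_timer.due():
#         print(f"step {step}: loss={loss:.4f}")
#     if ckpt_timer.due():
#         save_checkpoint(model)
```

The nice property is that the reporting cadence stays the same whether an iteration takes 10 ms or 10 s.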
I don't know if there's an earlier source, but I'm guessing MATLAB originally popularized the `imread` name, and that OpenCV (along with its Python wrapper) took it from there, same for SciPy. scikit-image then followed along, presumably.