Respectfully, it's not up to other people to disprove your toy theory. The question you're asking here can very easily be answered with a quick Google search.
The short answer is that they are _very_ different controls that look and operate in completely different ways and are located in different places, and it's completely unrealistic to think a pilot could have mistaken one for the other.
A hallmark of well-designed and well-written software is that it is easy to replace, whereas bug-ridden spaghetti-bowl monoliths stick around forever because nobody wants to touch them.
Just through pure Darwinism, bad software dominates the population :)
Most Python libraries like PIL rely on some flavor of the venerable libjpeg (e.g. libjpeg-turbo, mozjpeg), which makes it the de-facto standard for loading JPEGs.
A notable downside is that it's written in C, and more than a few exploits have been found in it over the years. While it's pretty damn mature by now and a huge amount of fuzzing has been done to tease out remaining issues, the fact remains that it's yet another attack vector that makes your resident security guy or gal raise a few eyebrows.
With zune-jpeg you get a fully memory-safe implementation that can act as a drop-in replacement, and when you turn on the (admittedly experimental) SIMD support for things like the IDCT and color-space conversion it's as fast as, or in places even faster than, libjpeg-turbo/mozjpeg, even on very large (e.g. 4k by 4k) images.
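If you're curious which libjpeg flavor your own Pillow build is actually sitting on, recent Pillow versions (8.0+) expose some introspection helpers in PIL.features. A quick sketch (check the docs for your Pillow version, the feature names below are from memory of the current API):

```python
# Quick check of which JPEG backend this Pillow build links against.
from PIL import features

print(features.check_codec("jpg"))              # True if JPEG support is compiled in
print(features.version_codec("jpg"))            # reported libjpeg/libjpeg-turbo version
print(features.check_feature("libjpeg_turbo"))  # True if the turbo fork is in use
```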
As always, it depends a lot on what you're doing, and a lot of people are using Python for AI.
One of the drawbacks of multi-processing versus multi-threading is that you cannot share memory (easily, cheaply) between processes. During model training, and even during inference, this becomes a problem.
For example, imagine a high-volume, low-latency, synchronous computer vision inference service. If you're handling each request in a different process, you're going to have to jump through a bunch of hoops to make it performant. You'll need to use shared memory to move data around, because images are large and sockets are slow. Another issue is that each process needs its own copy of the model in GPU memory, which is a problem in a world where GPU memory is at a premium. You could of course have a single process for the GPU part of the model and automatically batch inputs into it (and people do), but all of this is just to work around the lack of proper threading support in Python.
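To make the "jump through hoops" part concrete, here's a minimal sketch of the stdlib multiprocessing.shared_memory approach to moving an image between processes (the shapes and names are made up for illustration, not what any particular service does):

```python
import numpy as np
from multiprocessing import shared_memory

# Producer side: place a decoded image into a named shared-memory block
# so a worker process can read it without shipping bytes over a socket.
img = np.zeros((1080, 1920, 3), dtype=np.uint8)         # stand-in for a decoded frame
shm = shared_memory.SharedMemory(create=True, size=img.nbytes)
view = np.ndarray(img.shape, dtype=img.dtype, buffer=shm.buf)
view[:] = img                                            # one copy into shared memory

# Consumer side (normally in another process): attach by name and wrap as an array.
shm2 = shared_memory.SharedMemory(name=shm.name)
frame = np.ndarray((1080, 1920, 3), dtype=np.uint8, buffer=shm2.buf)
# ... run_inference(frame) would go here ...

# Cleanup: every process closes its handle, exactly one unlinks the block.
shm2.close()
shm.close()
shm.unlink()
```

And that's before you've dealt with synchronization, block lifetime, or the separate copy of the model each worker still holds in GPU memory.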
By the way, if anyone is struggling with these challenges today, I recommend taking a peek at NVIDIA's Triton Inference Server (https://github.com/triton-inference-server/server), which handles a lot of these details for you. It supports things like zero-copy sharing of tensors between parts of your model running in different processes/threads, and it does auto-batching across requests as well. Auto-batching especially gave us a big throughput increase with only a minor latency penalty!
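For anyone wondering what that looks like from the client side, a request against Triton's HTTP endpoint is roughly this (the model name, tensor names, and shape below are placeholders and depend on your model's config; it assumes a server already running on localhost:8000, and batching itself happens server-side when dynamic batching is enabled for the model):

```python
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Placeholder preprocessed input; names/shapes come from the model's config.pbtxt.
image = np.random.rand(1, 3, 224, 224).astype(np.float32)
inp = httpclient.InferInput("input__0", list(image.shape), "FP32")
inp.set_data_from_numpy(image)
out = httpclient.InferRequestedOutput("output__0")

# Concurrent calls like this get grouped into a single GPU batch by the server
# when dynamic batching is turned on for "my_vision_model".
result = client.infer(model_name="my_vision_model", inputs=[inp], outputs=[out])
print(result.as_numpy("output__0").shape)
```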
> For example, imagine a high volume, low latency, synchronous computer vision inference service.
I'm not in this space and this is probably too simplistic, but I would think using asyncio for all the IO (reading/decoding requests and preparing them for inference), coupled with asyncio.to_thread'd calls to do_inference_in_C_with_the_GIL_released(my_prepared_request), would get you nearly all of the performance benefit on current Python.
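Something like this is what I have in mind. The three helpers are obviously stand-ins for whatever preprocessing, GIL-releasing extension call, and response encoding you'd actually use; the point is just the asyncio.to_thread hand-off (Python 3.9+):

```python
import asyncio

def prepare_for_inference(raw: bytes) -> bytes:      # placeholder preprocessing
    return raw

def do_inference_in_C_with_the_GIL_released(x: bytes) -> bytes:  # placeholder for the real extension call
    return x

def serialize(result: bytes) -> bytes:               # placeholder response encoding
    return result

async def handle_request(reader: asyncio.StreamReader, writer: asyncio.StreamWriter):
    # IO-bound part stays on the event loop: read and decode the request.
    raw = await reader.read(-1)
    prepared = prepare_for_inference(raw)

    # CPU/GPU-bound part runs in a worker thread; as long as the extension
    # releases the GIL, other requests keep being served concurrently.
    result = await asyncio.to_thread(do_inference_in_C_with_the_GIL_released, prepared)

    writer.write(serialize(result))
    await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(handle_request, "0.0.0.0", 8080)
    async with server:
        await server.serve_forever()

if __name__ == "__main__":
    asyncio.run(main())
```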
> you could get yourself 25 top-notch engineers without AI PhD
Not in the US, though. According to levels.fyi, an SDE2 makes ~$275k/year at Uber. Hire 25 of those and you're already at $6.875MM. In reality you're going to have a mix of SDE1, SDE2, SDE3, and staff, so total salaries will be higher.
Then you gotta add taxes, office space, dental, medical, etc. You may as well double that number.
And that's just the cost of labor; you haven't spun up a single machine or sent a single byte across a wire.
> Then you gotta add taxes, office space, dental, medical, etc. You may as well double that number.
Economies of scale help a bit with this for larger companies, so it's probably not quite double for Uber, but yeah, not too far off as a general rule of thumb. A roughly 75% markup on the employee-facing total comp gets you fairly close to the company's actual cost for the employee.
"and have another mil left over for metal" was the part accounting for hardware, infrastructure, etc.
And you can fudge the employee salary a mil or two either way, but the point is that spending that much on a team to build something isn't infeasible or even unreasonable.
> I'm guessing they know a lot about their costs, and you know very little.
I'm curious what makes you believe the OP doesn't know about cost? They might be director-level at a large tech company with 20+ years experience for all you know...
> There's little value in insulting the team members like this.
I'd argue it's not insulting to question a claim (i.e. 'we saved $6MM') that is offered with little explanation.
I'd be curious as well to see a more complete cost-benefit analysis, and I'd be especially interested in labor cost.
We don't know how much time and head count Uber committed to this project, but I would be impressed if they were able to pull this off with fewer than 6-8 people. We can use that to get a very rough lower-bound cost estimate.
For example, AWS internally uses a rule of thumb where each developer should generate about $1MM in ARR (annual recurring revenue). So if you have 20 head count, your service should bring in about $20MM annually. If Uber pulled this off with a team of ~6 engineers, then by AWS logic they'd be roughly breaking even.
Another rule of thumb I sometimes see applied is 2x developer salary. For example, assume a 7-person team of 2xSDE1, 3xSDE2, 1xSDE3, and 1xSTAFF; according to levels.fyi that would be a total annual salary of $2.3MM. Double that and you get a $4.6MM/year cost footprint to justify, which is still less than the $6MM in savings.
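Spelling that arithmetic out, using only the rough levels.fyi-based totals above (nothing here comes from Uber, and I'm treating the $6MM as an annual figure):

```python
# Back-of-the-envelope annual cost-of-ownership check using the figures above.
team_total_comp = 2.3e6                  # 7-person team, rough levels.fyi totals
fully_loaded_cost = 2 * team_total_comp  # "2x salary" rule of thumb -> $4.6MM/year
claimed_savings = 6.0e6                  # the claimed $6MM savings, assumed annual

print(f"Fully loaded team cost: ${fully_loaded_cost / 1e6:.1f}MM/year")
print(f"Headroom vs. claimed savings: ${(claimed_savings - fully_loaded_cost) / 1e6:.1f}MM/year")
```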
Of course, this assumes only a small increase in headcount to operate this new, custom data store, and it does not factor in a potentially significant development and migration cost.
So unless my math is completely off, it sounds to me like the cost of development, migration, and ownership is not that far off from the cost of the status quo (i.e. DynamoDB).
The cost doesn't suddenly drop to zero once development is done. Typically a system of this complexity and scale requires constant maintenance. You'll need someone to be on-call (pager duty) to respond to alarms, you'll need to fix bugs, improve efficiency, apply security patches, tune alarms and metrics, etc.
In my experience you probably need a small team (6-8 people) to maintain something like this. Maybe you can consolidate some things (e.g. if your system has low on-call pressure, you may be able to merge rotations with other teams, etc.) but it doesn't go down to zero.
If you follow the various links on the Uber site, you can see that they have multiple applications sitting on top of the same database; see https://www.uber.com/blog/schemaless-sql-database/. It's not just one database design with one application on top...
That has been my experience, yes. You need one full-time manager, one full-time on-call/pager duty (usually a rotation), and then 4-6 people doing feature development, bug fixes, and operational work (e.g. applying security patches, tuning alarms, upgrading instance types, tuning auto-scaling groups, etc.).
Maybe you can do it a bit cheaper, e.g. with 4-6 people, but my point is that there's an on-going cost of ownership that any custom-built solution tends to incur.
Amortizing that cost over many customers is essentially the entire business model of AWS :)