Four obvious reasons: 1. They've been doing it for ages. They had cars on the st...

MrQuincle · 2025-01-05T15:14:42 1736090082

Any background info on the betting on cameras alone? It sounds as silly as betting on an artificial version of our proprioception to be implemented in cars to measure acceleration. I also don't think they went all the way regarding neuromorphic engineering with spiking neural nets and artificial retinas. It's just so random to me what was decided to be good enough for autonomous navigation.

Tesla went from very expensive cars down to cheaper ones. It would make so much more sense to do the same for perception. First go overboard with high-bandwidth input and lots of processing power, and optimize later.

tim333 · 2025-01-05T15:44:02 1736091842

The betting on cameras alone is basically an Elon Musk thing. His reasoning is that if humans can do it, AI should be able to do it. So far the software isn't really up to it, but time will tell. Some stuff - https://www.engineering.com/now-revealed-why-teslas-have-onl...

tzs · 2025-01-05T17:33:07 1736098387

I wonder how that approach handles dense fog?

I used to regularly have to make a left turn onto a rural highway on foggy mornings. Sometimes people drive faster than they should in fog. Sometimes fast enough that by the time they could see that I was in the intersection turning, they would be too close to stop.

Cars going fast enough to have that problem made enough sound that they could be heard quite a bit farther away than they could be seen. I'd open my windows at the intersection and listen until I couldn't hear any highway traffic. Then I'd know that any approaching cars are far enough away that I should have time to turn onto the highway and get up to speed before they arrive.

tim333 · 2025-01-06T03:23:02 1736133782

Yeah. Also I don't know how good the Tesla cameras are but my car has a reversing camera and it's ok for going back 2m at 2mph but kind of terrible compared to looking forward through the windscreen.

theptip · 2025-01-05T15:57:12 1736092632

Karpathy discussed this at length on the Lex Fridman podcast: https://lexfridman.com/andrej-karpathy/

IIRC it’s the section (1:23:25) – Camera vision

The TL;DR is that sensor fusion is really hard, and their bet was that keeping the training pipelines simpler would let them scale faster/easier, and human vision is the existence proof that it can be done without lidar.

minwcnt5 · 2025-01-05T18:16:12 1736100972

One of the big flaws in Karpathy's logic is that it implies human vision is acceptable and sufficient for an AV. The reality, as Cruise found out, seems to be that society will demand that AVs be much safer than humans.

Human vision is an existence proof for human-level performance without lidar, but Waymo is an existence proof for 10x human performance WITH lidar. Right now the latter is where the bar is, and it'll keep being raised. I don't think at this point one could get away with deploying AVs at scale that are significantly less safe than Waymo.

Also: if sensor fusion is so hard, why is Waymo able to solve it but not Tesla?

wenc · 2025-01-05T21:21:22 1736112082

> Also: if sensor fusion is so hard, why is Waymo able to solve it but not Tesla?

I think Karpathy's point is that Tesla wants to try to avoid the "entropy" that comes from adding a sensor (senior software engineers and higher understand this concept). Every sensor you add (and every version of it -- sensor hardware does get updated) requires recalibrating the software stack and the hardware design, which introduces points of failure every time you roll it out.

According to Karpathy, Tesla does use Lidar -- but only at training time, as a source of truth. Once the weights are learned, they operate without the Lidar.

Having a full sensor suite may work for Waymo at the current scale (limited cities), but scaling beyond that poses problems.

Whereas Tesla has to work with a different set of scaling economics -- that of a mass market vehicle already deployed globally.