
This whole question about the vision boils down to "humans don't need it so cars should not need it too". The problem with this statement is that humans do not have wheels to move around, they have legs, but wheels are ridiculously simple compared to 4 legs tapping 160km/h on a highway. Same for birds: they also do not need jet engines to fly around, but imagine an Airbus A380 flapping its wings, and what kind of complexity you would need to flap through the air at 800km/h.


More importantly, we have a tremendous data "engine" processing input from our senses. So assuming for a second that cameras match what our eyes can do, you still do not have a processing engine on the level of our brain to make sense of those inputs.


soooo... you're agreeing that non-vision isn't necessary since the control domain is so much simpler?

I personally think they should use as many data inputs as possible: radar, IR, LIDAR, mesh networks, fixed route information.

Where Tesla went particularly wrong IMO is ignoring some sort of route-based chunk information, which is how humans navigate. IIRC Elon said something to the effect of just having an algorithm to work everywhere.

Humans use the basic algorithm "stay in lane, drive forward" and then decorate with signs, knowledge of curves, locations of potholes, dangerous low-viz corners, likelihood of surprise stopped traffic, obscured driveways, general character of neighborhoods, road purpose. Weather. Windy sections, icy sections, light availability anomalies. What type of vehicle. Repair state of vehicle.

A general AI algorithm will never be able to properly account for flavors/tags/chunk info on routes. Especially since cloud precomputation is so available these days.

Anyway, while recognizing that Tesla's "Fully Self Driving" is not as advertised, and we are a ways from self driving for any statistical measure of superiority to a healthy aware adult, it is still damn impressive what FSD vids show.

Do AI driving systems try to make "subsystems" of AI networks to reduce inputs to various higher-level inputs, or do others just throw a ton of inputs at a big ass network and just let the entire system rise from the soup of information?


The Tesla AI day videos [1] go into some detail about this. They use multiple networks that are dedicated to specific tasks.

[1] https://www.youtube.com/watch?v=j0z4FweCy4M (2021), https://www.youtube.com/watch?v=ODSJsviD_SU (2022)


> Humans use the basic algorithm "stay in lane, drive forward"

If you've ever driven in Vietnam, that is so not true.


Hell, even in the northeast US (particularly the cities) this isn't true. Self-driving cars today seem to have a dogmatic focus on California-style driving.


But the question at hand is system control, not locomotion! You're not asking the automation to walk (well, I mean someday we will, but Teslas have wheels), nor the aircraft to flap. We want the automation to do what a human pilot would do. And that works with eyes.

No, I think this argument is largely correct. And frankly settled: anyone who's driven recent FSD beta versions knows very well that the cars "see just fine". They don't hit anything, they see and avoid obstacles. Frankly they're much more observant than humans are, my car will twitch when pedestrians turn as if they're going to enter the road (where human drivers mostly don't notice, and if they do they ignore it). What problems still exist are in planning: things like sign reading, lane selection, etc... still need some work. But collision avoidance just isn't an issue. It isn't. The LIDAR folks were wrong, basically.

(I will admit, though, that I'm a little sad about the removal of the ultrasound sensors. It's true the autonomy probably doesn't need them, but I really like having the chimes to guide parking and garage maneuvering.)


> No, I think this argument is largely correct. And frankly settled: anyone who's driven recent FSD beta versions knows very well that the cars "see just fine". They don't hit anything, they see and avoid obstacles.

Only if you ignore times where intervention stopped it from hitting something, times where it did actually hit something, massive amounts of jitter and popping in the visual output, phantom braking, etc.

Unless of course "recent" means n+1 where n is the version that crashed into something.

Collision with bollard in Feb 2022: https://www.youtube.com/watch?v=sbSDsbDQjSU

Attempts to plow through cyclist Feb 2022: https://www.youtube.com/watch?v=a5wkENwrp_k

Almost crashes into tram (can't gauge speed or direction?) Jun 2022: https://www.youtube.com/watch?v=yxX4tDkSc_g

Crashes into curb Aug 2022: https://youtube.com/shorts/8Mh1GjejdsI

Phantom brake Sep 2022: https://www.youtube.com/shorts/5v6j_oL7S-g

Almost colliding with bridge pillar 2 weeks ago: https://www.youtube.com/watch?v=5CMYkDWaqn0

Crashes into various objects in testing 2 weeks ago: https://www.youtube.com/watch?v=yyDxqEzV5Zc


> The LIDAR folks were wrong, basically

I think your mistake is thinking LiDAR exists to solve the happy day scenario. It doesn't.

Vision is sufficient for the majority of use cases. Where LiDAR comes into its own is in the edge cases because it almost guarantees accurate bounding box detection. Which is where vision is at its weakest.

So I want to know what FSD does when it sees a billboard of a person or when it is seeing a new object for the first time.


As long as the billboard is not moving into the street or standing in the middle of it, what would you expect it to do?


> The LIDAR folks were wrong, basically.

This is far, far from settled at this point.


No, it's over. Look, the LIDAR value proposition was necessarily "Yes, we're outrageously expensive and involve major tradeoffs in physical design of the vehicle, but vision can simply never do what we do". And... vision does. It does, every day. On hundreds of thousands of cars.

In point of fact FSD beta vehicles are out there every day in environments where LIDAR has never been deployed, nor likely ever tested. And we're not seeing clear "it can't do this" failures. Anywhere.

The closest you're going to get to evidence against vision are things like that "It Hit A Kid!" stunt from a few months back that turned out to be basically faked[1].

[1] At least the perps went silent and no one was ever able to reproduce. I mean the whole idea was ridiculous: my car twitches at pedestrians, including kids, including my kids, every day. It literally draws them on the screen.


> And we're not seeing clear "it can't do this" failures. Anywhere

Do you have evidence of the number of accidents, disengagements, etc. by region?

Because you're making awfully definitive statements about FSD safety.


Tesla doesn’t release any data. But there are community trackers [1] that put the disengagement rate in the single digit miles. In comparison, Waymo and Cruise had a rate of roughly 30,000 miles/disengagement according to CA DMV data. That’s how much worse Tesla FSD is.

[1] https://www.teslafsdtracker.com/


Do you actually have FSD Beta? I do. It's not anywhere near the level of polish that you seem to imply it is. It gets things wrong all the time. Turns are downright dangerous.


> The LIDAR folks were wrong, basically.

According to who? Tesla? Because Tesla has a vested interest in trying to prove that they're right even if they're obviously wrong. That's why they constantly try to downplay failures, software issues, device issues etc.

I'm very confused by the attempts to discredit the usefulness of LIDAR. It's another tool you can use to improve the accuracy of your model. Sure, you can use a screwdriver, flip it around and use it as a hammer. But if you need to deal with nails, it's better to grab a hammer instead.


The causality goes the other way. The LIDAR claim was that Tesla's vision approach couldn't work. It did.


Tesla’s vision approach doesn’t work. There is a reason it requires a driver to actively prevent crashes and has a disengagement rate in the single digit miles [1].

All over this thread you keep making grand statements that “it works”, which is just completely false. It’s simple — if it worked, there would be no driver.

[1] https://www.teslafsdtracker.com/


Who made this claim? You keep making these grand statements but without linking to people or providing any proof.

Like there are going to be some environments where Tesla's vision is going to struggle, that's just a fact because you're relying on a more linear set of data. That's why you incorporate as many data points as possible for reasons other commentators have brought up. And I'm confused by how you qualify it as 'working', given we've seen multiple issues which are directly related to their vision approach.


Tesla has not solved self driving, and by all accounts never will with their existing compute and sensor stack.


Let's do a thought experiment: if Waymo could have seen 10 years ago how well FSD perception works today, would they have invested so heavily into LiDAR? Maybe the answer is still yes, because with the low volumes of vehicles they have they can afford to put it in, but it's not clear cut. If you could show FSD to ML/CV engineers 10 years ago their minds would have been completely blown. My mind is still blown by how well it actually works.


> my car will twitch when pedestrians turn as if they're going to enter the road (where human drivers mostly don't notice, and if they do they ignore it)

As long as those pedestrians DO NOT actually enter the road after those turns, any "twitching" of your car in response is an ADDITIONAL SAFETY PROBLEM, because other drivers might notice the erratic movements of your car and do erratic things as well, which in the end might result in accidents that wouldn't have happened had your car not "twitched".

Especially "twitchy" AIs like that of your car might very well "re-twitch" on noticing your car doing small, but erratic and rapid changes in behavior, thereby initiating a "twitch escalation spiral".


Disregarding everything else about your post, which was better addressed by others, I'm amused that you think the FSD being twitchy reflects safety.


The thing is, they don't see as well as humans. They don't respond to changes in the environment until a car is actually in the middle of changing lanes.

It's like being driven around by a drunk person - the reaction happens loooooong after the action that causes it has started.


> We want the automation to do what a human pilot would do. And that works with eyes.

Humans can’t really turn senses off, so they have coffee when driving. Touch and hearing are quite important to “read the road”. Equilibrium too.


Humans work with much more than just eyes. We subconsciously move our heads in case of uncertainty in the stereo vision algorithm and have pretty good IMUs. And yet, everyone has the experience of wrongly focusing on a repeating vertical pattern (vertical blinds or coiled cord for example) and getting disoriented. And every experienced driver has experienced at least some of the following: a moment of glare from a wet road, driving into a sunset, snowy road with curves in flat lighting conditions, dirty windshield/headlights/backup camera.

All of those are challenging for humans and probably even more challenging for computer vision with cameras only. But except for the last point, all are obviously improved by lidar.


A good human driver also gauges the limits of their own experience and 'phase transitions' into a more cautious mode of driving.

Is that something the algos can do? Infer the familiarity of the situation?


The planning absolutely should take that into account and it does somewhat already. When it is really tight for example, it can slow down to a crawl.

It should do that in many cases, wet roads, busses with open doors, busses in general maybe, blind corners (does that already to some degree), many people nearby etc.


> You're not asking the automation to walk

Tesla should aim for parking first. Teslas do poorly at self parking:

https://www.youtube.com/watch?v=nsb2XBAIWyA


Exactly. The way biology solved something may not always be the best way to do it with technology, because the constraints are so different. And to be more blunt, I think none of the problems where technology surpassed human performance were solved by doing it the exact same way. From locomotion (legs vs. wheels) to playing chess (strategic intuition vs. billions of calculations).


> "humans don't need it so cars should not need it too"

I think of parking and I'm reminded of "the camry dent"

https://duckduckgo.com/?q=the+camry+dent&iax=images&ia=image...


Human binocular vision is what has been used to drive cars up until now, so it can be done (with a few thousand million years of iteration).

Ideally cars will be self-driving using only passive sensors - but I do think that Musk/Tesla completely missed the value of active sensors in training.


Pretty sure humans haven't been striving for drivers licenses for millions of years...

Tesla does use Lidar on a small number of test vehicles for assessing ground truth. However, they have built enough of a data pipeline and fleet data acquisition to use repeat clips to determine ground truth better than human labelers.


But the "system" is so adaptable, from bipedal locomotion, spotting predators or prey, and identifying unfit food to figuring out social hierarchy and reading human facial expressions, that driving a car is easy.


No, he precisely said that the difference Lidar made was tested, and the delta (difference made) was quite small; not enough to outweigh the downsides. Elon has noted that humans do well, and that's relevant, but that observation was also tested, re lidar.


>and the delta (difference made) was quite small

But why? Because LIDAR doesn't help much in general or because the Tesla engineers aren't good at using the sensor data?

Same with the manufacturing.

Sounds to me like Tesla can't handle complexity. And if they can't handle the complexity of manufacturing, they surely can't handle the complexity of full autonomous driving.


I don't think interpreting the lidar data and integrating it is a super-tough problem, it kinda comes in 3D, unlike stereoscopic vision. So I take it this means that the Lidar data rarely differed and it rarely mattered when it did.

Elon's companies have a long history of handling complexity very well (even, you know, actual rocket science) precisely because they relentlessly simplify everything they can. Raptor one is more complex than Raptor two, but I'll take the latter any day. Nobody else has a full-flow rocket engine. Many previous attempts were swallowed by the complexity of the task. Even Raptor one looks like a rat's nest - but unlike other attempts, it worked.

Tesla's manufacturing margins are far out in front of any other car company (see David Lee On Investing podcast.) Having simpler, larger parts made by much larger pressing machinery is a big part of why. Looks like they are (now) handling the complexity of manufacturing very well.


Basically everything he said as a justification (sourcing, firmware, etc.) applies to every sufficiently advanced part of the vehicle. By that logic, they should not be using touchscreens on the center console, etc.


If the safety delta was low after removing them, as 'tis with Lidar, absolutely.


It's also possible their LIDAR implementation was poor. Waymo uses LIDAR and has fewer incidents per mile.


Waymo has HD maps that require regular 'trawling', just like Google Street view. They are also very conservative in turns, generally avoiding unprotected lefts. They also are much less human-like.


Good point. I'm pretty sure it was poorer, Waymo really goes all-in on Lidar, they still top-mount it don't they? Waymo is also solving a much more constrained problem, with far fewer vehicles, so fewer accidents doesn't surprise me.


They are doing exactly what Tesla isn't willing to do and are being responsible in exactly the way Tesla isn't.


But it's an expensive dead-end for most purposes, esp re the insane mapping. Ok for city robo-taxis. Doesn't solve the real problem to be solved.


Cars drive in cities, last I checked.


I guess you're saying that "cities are mapped, dude." But not mapped in anything like the way Waymo maps for this purpose: if I heard right, yesterday, mapped to the centimeter and very frequently updated to deal with any changes of any kind!


Driving in cities is a real problem that will really need to be solved.


> This whole question about the vision boils down to

Is that really what the problem boils down to? Or how it was decided? Or are you just questioning a common meme that comes up in internet debates about car AI?



