Deep reflections on FSD


hobbit

Well-Known Member
A few days ago, Tesla apparently had a big gathering at their headquarters
called "autonomy day". It was intended to present the recent work leading
up to Hardware 3.0 and the updated software that runs on it, all having to
do with the endless journey toward full self-driving. There are a few
different YouTube videos covering it, most of them very long since they capture
the entire presentation portion of the conference. I watched a bunch of it the other
night, two or three hours worth, and got a better idea of what they're after
and how they're working toward it.

The hardware is a distinctly impressive effort -- big custom ASICs
optimized for the neural network they need to run the camera video through
for recognition, and over 20 times faster than the best Nvidia GPUs on
the market for the purpose. One of the lead fellows working on the
software side went into a deep description of the machine learning
process and training the neural net, and I had to do a bunch of ancillary
searching/reading to get a glimmer of what he was actually talking about.
It was all rather enlightening.

While it's a complex and noble effort, it inevitably lacks one key attribute
that HUMANS bring to the table: *intuition*. You can train by rote, using
thousands of pictures of cars, lanes, landscapes, obstacles, crashes
happening in realtime, water distortion and weird lighting effects at
night, or whatever, and that alone will never get you to level 5 autonomy.
There will always be *some* wacky situation that the best-trained neural net
will not know how to respond to, and usually in such situations there is
NOT enough time to alert a human driver and have them wake up and take over
in a competent and perceptive enough fashion. The big problem is, that 99.8
percent "FSD" efficacy that gets most people to work and back most of the
time without incident simply cannot assess things to the depth that an
alert and engaged human driver does, and breeds insidious complacency on
the part of the drivers that it's "good enough". When it abruptly turns
out that it's not good enough, those drivers are *not* ready to do their
job in time.

The introductory example that the software guy gave was a picture of an
iguana. An untrained neural network will just as easily decide that it's
a picture of a boat. By showing it hundreds, thousands more pictures of
iguanas in all different positions, with different lighting, coloration,
shadows, occlusion, etc -- gradually the net can build up a pretty solid
model of what an iguana looks like and maybe even how it's moving. We,
humans, can basically do that from ONE picture, and immediately extrapolate
what any iguana will look like after it's rotated, injured, greyscaled,
partially obscured by an object in front, or all of that -- without much
additional help or input. It's just something we do, applying our generalized
intuitive models for how objects move and appear in space. We do the same
for all the traffic dynamics we handle on a daily basis. We don't need to
have seen a thousand different pictures of a semi to have a good sense for
its volume and motion paths. We are able to spot a generic SUV sitting
unmoving near the end of a driveway like any vision system could, but the
fact that its reverse lights are on tells US to anticipate something very
different from a parked car, and it seems unlikely that Tesla's best efforts
would pick up on a subtlety like that until it actually started moving and
intersecting the forward path lines. Too late in some cases.
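
For the curious, that "thousands of pictures" training loop boils down to
something like the sketch below -- Python/PyTorch here, with a made-up folder
of labeled photos, and certainly not Tesla's actual pipeline. The augmentation
transforms are the programmatic version of "different lighting, rotation,
occlusion":

import torch
import torch.nn as nn
from torchvision import datasets, transforms, models

# Synthetically vary each training image: rotation, lighting/color
# changes, partial crops, and random erasing to mimic occlusion.
train_tf = transforms.Compose([
    transforms.RandomRotation(30),
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    transforms.RandomResizedCrop(224, scale=(0.6, 1.0)),
    transforms.ToTensor(),
    transforms.RandomErasing(p=0.3),
])

# "animal_photos/train" with per-class subfolders (iguana/, boat/, ...)
# is an invented example layout, not any real dataset.
train_set = datasets.ImageFolder("animal_photos/train", transform=train_tf)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

net = models.resnet18(num_classes=len(train_set.classes))
opt = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):                       # many passes over many pictures
    for images, labels in loader:
        opt.zero_grad()
        loss = loss_fn(net(images), labels)   # how wrong were the guesses?
        loss.backward()                       # compute the corrective gradients
        opt.step()                            # nudge the weights toward "less wrong"

Nothing in there "understands" what an iguana is; it just grinds the error
rate down over whatever examples it happens to be shown.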

Great example from earlier this week: an empty flatbed had just pulled into
the driveway of a shop, but needed to turn around. It started backing out,
into the road I was on approaching it. The shape that intruded into the
space of the road was completely unlike any car -- it was the narrow wedged
end of the flatbed and the supporting framework under it, visually connected
to the side of the road, sitting fairly low down, not as high as a vehicle
but not lying on the road surface either. I had the briefest moment of
"wtf is *that*" while approaching, and quickly figured it out. If I had
kept going in a straight line I would have slammed into it; but without even
thinking much about it I evaluated the existing data I had already collected
about traffic in the lane to my left, which was clear, and smoothly swerved
into the next lane and around the end of the flatbed [which had stopped, but
still hanging out into the road, intending to back out the rest of the way
and get himself straightened out].

My intuition figured out very quickly what the strange shape actually was
and where the rest of that vehicle was and even why it was there and what
its driver was trying to accomplish. I was also fully ready for him to
*not* stop moving farther into the road, with a clear mental picture of all
the surrounding space I had to utilize and/or how long I'd need to stop
if needed. The deepest-trained "passive" neural net under the Tesla model
would still have no idea, and even after such an "incident" got uploaded to
the mothership and supposedly evaluated by a human for further training, I
have my continuing doubts. FSD's path to production will be burdened by
countless "woulda/coulda/shoulda" excuses for avoidable incidents that
weren't avoided.

And I certainly wouldn't pay an extra two grand for the privilege of
beta-testing something whose faults and deficiencies I now understand
at a fairly comprehensive level. Just ... no. *My* neural net's traffic
training has been decades in its tuning, and I'll continue to rely on
that. It has the additional layers to evaluate "what does all this
actually *mean*", which by comparison is priceless. Machine learning
is still a long way from that point.

_H*
 
A few days ago, Tesla apparently had a big gathering at their headquarters
called "autonomy day".
I remember seeing the "autonomy day" from 10 months ago and haven't found anything more recent. Same set?

Minor correction: in October/November I paid $6k for Full Self Driving, beating out the first price increase to $7k by a week. With V2.5 hardware, all I got was Summon.

I am not part of the Tesla software team, so I'll share my expectation. I expect the neural network to include, but not be limited to, recognition of common shapes -- it should identify objects even without a specific label and avoid running into them. For example, if a bag of trash or other nondescript debris falls off a pickup truck, or a plastic bag blows across the road, I expect the network to keep the car from running into or over the immobile debris, yet not emergency-brake for the floating bag. So if the 'flying spaghetti monster' decides to stage a vision on the highway, the car will do what it can to avoid impact with anything solid, including the vehicles to the side and rear.
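
Something like this crude pseudologic is what I have in mind -- plain Python,
with the object fields, thresholds, and helper names all invented by me for
illustration, nothing from Tesla's actual stack:

def lane_clear(side, surroundings):
    # surroundings: e.g. {"left": True, "right": False} -- whether the
    # adjacent lane is free of other traffic.
    return surroundings.get(side, False)

def plan_reaction(obj, surroundings):
    """Return 'ignore', 'swerve', or 'brake' for one detected object."""
    if obj["time_to_collision_s"] > 3.0:       # not in our path yet (made-up threshold)
        return "ignore"
    if obj["estimated_mass_kg"] < 1.0 and obj["airborne"]:
        return "ignore"                        # drifting bag: not worth an emergency stop
    if lane_clear("left", surroundings) or lane_clear("right", surroundings):
        return "swerve"                        # solid object, room to go around
    return "brake"                             # solid object, nowhere to go

# The flatbed scenario from earlier in the thread would come out as:
print(plan_reaction({"time_to_collision_s": 2.0,
                     "estimated_mass_kg": 4000.0,
                     "airborne": False},
                    {"left": True, "right": False}))   # -> "swerve"

The real system of course has to learn something equivalent to those rules
rather than having them typed in, which is the whole point of the neural network.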

I am not expecting perfection, but 'good enough'. So one of the things I do is collect and document where Autopilot fails: not the 99% successes, but the edge cases that Full Self Driving should correct.

I still have V2.5 hardware, so my expectations are modest. Summon is fun with the dogs in the car; I tell others how smart my dogs are to fetch the car. But Summon has some strange behaviors, such as sometimes backing up instead of proceeding forward, or turning away. Curious puzzles to figure out. But Autopilot has already paid for itself.

On a visit to my Mom, we had a medical problem. Sad to say, I'd not packed my CPAP machine (sleep apnea), which meant I became subject to "micro sleeps." Driving home, I had five micro-sleep events, yet the car stayed in its lane and did not go into the ditch or the adjacent lane. My wife recognized what had happened, and as soon as I got to Decatur, a biology break, a walk-about, and a cup of coffee got us home.

So I'm pretty happy with what I've already got and look forward to what is to come. But I've also noticed a lot of resistance from 'traditionalists' who cite the User's Manual instead of exploring the boundaries. I have a little more curiosity.

Bob Wilson
 
There was an episode of NOVA a few months back that talked about some of these same challenges of training the neural net and identifying all of the different edge cases that you need to train for. And at the end, once you have trained for all of that, the question comes up: what else is there that you failed to train for?
 
Okay, my bad, it was 10 months ago. But apparently owners are going in
for hardware swaps more recently, maybe that's what threw me off.

Our intuition is what will most likely handle edge cases. We humans learn
differently too -- some are better at rote memorization, e.g. the people who
are good "test-takers" and can just spit back the same info or take first-order
responsive actions. But ask most of them to actually innovate or figure
things out from a sketchy set of inputs, and they're often lost. Others *want*
to attain deep understanding of principles and how they were applied,
but often spend too much time obsessing about the details.

What do you think you might have observed about the trailer at night
that the car didn't? What do you think Tesla's folks would have done with
those video segments to improve matters?

_H*
 
The Tesla has a Continental radar unit which, based on the specs, should have detected the trailers. But I found it only works on metal skirts; non-metallic skirts remain invisible. Yet if headed toward the rear bogies, it detects the trailer each time.

The radar specs suggest the trailer should have been detected by radar. This makes me wonder what role the radar actually plays.

Looking at the Continental specs suggests that with two radar units mounted 90 degrees to each other and 45 degrees relative to the horizon, one could assemble a usable radar map of the road ahead for obstacle avoidance.
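
Back-of-the-envelope, the fusion might look something like this in Python. The
(scan angle, range) output format, the mounting geometry, and the thresholds
are all my assumptions, not Continental's actual interface -- each unit is
treated as reporting returns in its own tilted scan plane, which get rotated
into car coordinates and dropped onto a forward occupancy grid:

import math

def returns_to_points(detections, roll_deg):
    """Convert one radar's (scan_angle_deg, range_m) returns into car-frame
    x (forward), y (left), z (up), given the unit's roll about the forward
    axis. A single-plane-scan stand-in, not Continental's real beam model."""
    roll = math.radians(roll_deg)
    points = []
    for angle_deg, rng in detections:
        a = math.radians(angle_deg)
        x = rng * math.cos(a)            # distance straight ahead
        lateral = rng * math.sin(a)      # offset within the tilted scan plane
        y = lateral * math.cos(roll)     # split that offset into left/right
        z = lateral * math.sin(roll)     # ... and up/down components
        points.append((x, y, z))
    return points

def occupancy_grid(points, cell_m=0.5, max_range_m=50.0, half_width_m=5.0):
    """Mark which half-meter cells ahead contain a return sitting between
    roughly ground level and overhead-sign height (z measured from the
    bumper-mounted radar)."""
    rows = int(max_range_m / cell_m)
    cols = int(2 * half_width_m / cell_m)
    grid = [[False] * cols for _ in range(rows)]
    for x, y, z in points:
        if 0 < x < max_range_m and abs(y) < half_width_m and -0.3 < z < 2.5:
            grid[int(x / cell_m)][int((y + half_width_m) / cell_m)] = True
    return grid

# Two units rolled +/-45 degrees about the forward axis, 90 degrees apart:
pts = returns_to_points([(5.0, 20.0), (-2.0, 35.0)], roll_deg=+45.0) \
    + returns_to_points([(1.0, 20.0)], roll_deg=-45.0)
grid = occupancy_grid(pts)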

If I had motivation, one behind each BMW kidney would be a neat solution.

BTW, you might look at “comma 2”, an open-source automated driving system. Neither of our cars is compliant, but you might have a candidate.

Bob Wilson
 
Continental must make a ton of stuff for the automotive industry...
including the cell-data modem I summarily ripped out of the Kona.

The training undoubtedly will not [yet] include features like pothole
detection/avoidance, how "fresh" a green light is, how likely that
stopped opposing car is to whip a left right in front of me using
its driver's facial expression as part of the input ... basic pre-spacing
for smoother merges [difficult in New England, at a minimum] ... those
subtle nuances that experienced and aware drivers process all the time,
often subconsciously but it does happen and often saves lives and property.

I'd probably have to wait a very long time for any car to include a fully
autonomous yuppie button to deal with the situation to the rear.

_H*
 
I would agree that Autopilot and HW3.0 FSD need training that emphasizes the limits. That is why I seek out the problem areas.

Sad to say, some readers roll up the User Manual restrictions and smack back in their posts. No contribution to our knowledge, yet that seems to be the limit of their understanding. <MEGA SIGH!>

Driving should be fun and I would never try to force my preferences on others. But I will share my lessons learned.

Bob Wilson
 