A few days ago, Tesla apparently had a big gathering at their headquarters called "autonomy day". It was intended to present the recent work leading up to Hardware 3.0 and the updated software that runs on it, all having to do with the endless journey toward full self-driving. There are a couple of different Youtubes covering it, most of them very long as it's the entire presentation portion of the conference. I watched a bunch of it the other night, two or three hours worth, and got a better idea of what they're after and how they're working toward it. The hardware is a distinctly impressive effort -- big custom ASICs optimized for the neural-network they need to run the camera video through for recognition, and over 20 times faster than the best Nvidia GPUs on the market for the purpose. One of the lead fellows working on the software side went into a deep description of the machine learning process and training the neural net, and I had to do a bunch of ancillary searching/reading to get a glimmer of what he was actually talking about. It was all rather enlightening. While it's a complex and noble effort, it inevitably lacks one key attribute that HUMANS bring to the table: *intuition*. You can train by rote, using thousands of pictures of cars, lanes, landscapes, obstacles, crashes happening in realtime, water distortion and weird lighting effects at night, or whatever, and that alone will never get you to level 5 autonomy. There will always be *some* wacky situation that the best-trained neural net will not know how to respond to, and usually in such situations there is NOT enough time to alert a human driver and have them wake up and take over in a competent and perceptive enough fashion. 
The big problem is, that 99.8 percent "FSD" efficacy that gets most people to work and back most of the time without incident simply cannot assess things to the depth that an alert and engaged human driver does, and breeds insidious complacency on the part of the drivers that it's "good enough". When it abruptly turns out that it's not good enough, those drivers are *not* ready to do their job in time. The introductory example that the software guy gave was a picture of an iguana. An untrained neural network will just as easily decide that it's a picture of a boat. By showing it hundreds, thousands more pictures of iguanas in all different positions, with different lighting, coloration, shadows, occlusion, etc -- gradually the net can build up a pretty solid model of what an iguana looks like and maybe even how it's moving. We, humans, can basically do that from ONE picture, and immediately extrapolate what any iguana will look like after it's rotated, injured, greyscaled, partially obscured by an object in front, or all of that -- without much additional help or input. It's just something we do, applying our generalized intuitive models for how objects move and appear in space. We do the same for all the traffic dynamics we handle on a daily basis. We don't need to have seen a thousand different pictures of a semi to have a good sense for its volume and motion paths. We are able to spot a generic SUV sitting unmoving near the end of a driveway like any vision system could, but the fact that its reverse lights are on tells US to anticipate something very different from a parked car, and it seems unlikely that Tesla's best efforts would pick up on a subtlety like that until it actually started moving and intersecting the forward path lines. Too late in some cases. Great example from earlier this week: an empty flatbed had just pulled into the driveway of a shop, but needed to turn around. It started backing out, into the road I was on approaching it. 
The shape that intruded into the space of the road was completely unlike any car -- it was the narrow wedged end of the flatbed and the supporting framework under it, visually connected to the side of the road, sitting fairly low down, not as high as a vehicle but not lying on the road surface either. I had the briefest moment of "wtf is *that*" while approaching, and quickly figured it out. If I had kept going in a straight line I would have slammed into it; but without even thinking much about it I evaluated the existing data I had already collected about traffic in the lane to my left, which was clear, and smoothly swerved into the next lane and around the end of the flatbed [which had stopped, but was still hanging out into the road, its driver intending to back out the rest of the way and get himself straightened out]. My intuition figured out very quickly what the strange shape actually was and where the rest of that vehicle was and even why it was there and what its driver was trying to accomplish. I was also fully ready for him to *not* stop moving farther into the road, with a clear mental picture of all the surrounding space I had to utilize and/or how long I'd need to stop if needed. The deepest-trained "passive" neural net under the Tesla model would still have no idea, and even after such an "incident" got uploaded to the mothership and supposedly evaluated by a human for further training, I have my continuing doubts. FSD's path to production will be burdened by countless "woulda/coulda/shoulda" excuses for avoidable incidents that weren't avoided. And I certainly wouldn't pay an extra two grand for the privilege of beta-testing something whose faults and deficiencies I now understand at a fairly comprehensive level. Just ... no. *My* neural net's traffic training has been decades in its tuning, and I'll continue to rely on that. It has the additional layers to evaluate "what does all this actually *mean*", which by comparison is priceless. 
Machine learning is still a long way from that point. _H*
I remember seeing the "autonomy day" from 10 months ago and haven't found anything more recent. Same set? Minor correction: in October/November, I paid $6k for Full Self Driving, beating out the first price increase to $7k by a week. With V2.5 hardware, all I got was summon. I am not part of the Tesla software team so I'll share my expectation. I expect the neural network to include, but not be limited to, recognition of common shapes. Rather, I expect it to identify objects even without a specific label and avoid running into them. For example, if a bag of trash or other nondescript debris falls off a pickup truck, or a plastic bag is blowing in the wind, I'm expecting the neural network to avoid running the car into or over immobile objects yet not emergency brake on the floating bag. So if the 'flying spaghetti monster' decides to stage a vision in the highway, the car will do what it can to avoid impact with anything solid including vehicles to the side and rear. I am not expecting perfection but 'good enough'. So one of the things I do is collect and document where Autopilot fails, not the 99% success, but the edge cases that Full Self Driving should correct: I still have V2.5 hardware so my expectations are modest. Summon is fun with the dogs in the car. I tell others about how smart my dogs are to fetch the car. But summon has some strange behaviors such as sometimes 'backing up' instead of proceeding forward or turning away. Curious puzzles to figure out. But Autopilot already paid for itself. On a visit to my Mom, we had a medical problem. Sad to say, I'd not packed my CPAP machine (sleep apnea) which means I became subject to "micro sleeps." Driving home, I had five micro sleep events yet the car stayed in its lane and did not go in the ditch or adjacent lane. My wife recognized what happened and as soon as I got to Decatur, a biology break, walk-about, and cup of coffee got us home. So I'm pretty happy with what I've already got and look forward to what is to come. 
But I've also noticed a lot of resistance from 'traditionalists' who cite the User's Manual instead of exploring the boundaries. I have a little more curiosity. Bob Wilson
There was an episode of NOVA a few months back that talked about some of these same challenges of training the neural net, and identifying all of the different edge cases that you need to train for. And at the end, once you have trained for that, the question comes up: what else is there that you failed to train for?
Okay, my bad, it was 10 months ago. But apparently owners are going in for hardware swaps more recently, maybe that's what threw me off. Our intuition is what will most likely handle edge cases. We humans learn differently too -- some are better at rote memorization, e.g. the people who are good "test-takers" and can just spit back the same info or take first-order responsive actions. But ask most of them to actually innovate or figure things out from a sketchy set of inputs, they're often lost. Others *want* to attain deep understanding of principles and how they were applied, but often spend too much time obsessing about the details. What do you think you might have observed about the trailer at night that the car didn't? What do you think Tesla's folks would have done with those video segments to improve matters? _H*
The Tesla has a Continental radar unit which, based on the specs, should have detected the trailers. But I found it only works on metal skirts. There are non-metallic skirts that remain invisible. Yet if headed to the rear bogies, it detects the trailer each time. The radar specs suggest the trailer should have been radar detected. This makes me wonder what role it plays. Looking at the Continental specs suggests that with two radar units mounted 90 degrees to each other and 45 degrees relative to the horizon, one could assemble a usable radar map of the road ahead for obstacle avoidance. If I had motivation, one behind each BMW kidney would be a neat solution. BTW, you might look at “comma 2”, an open source, automated driving system. Neither of our cars are compliant but you might have a candidate. Bob Wilson
Continental must make a ton of stuff for the automotive industry... including the cell-data modem I summarily ripped out of the Kona. The training undoubtedly will not [yet] include features like pothole detection/avoidance, how "fresh" a green light is, how likely that stopped opposing car is to whip a left right in front of me using its driver's facial expression as part of the input ... basic pre-spacing for smoother merges [difficult in New England, at a minimum] ... those subtle nuances that experienced and aware drivers process all the time, often subconsciously, but it does happen and often saves lives and property. I'd probably have to wait a very long time for any car to include a fully autonomous yuppie button to deal with the situation to the rear. _H*
I would agree that Autopilot and HW3.0 FSD need training that emphasizes the limits. That is why I seek out the problem areas. Sad to say, some readers roll up the User Manual restrictions and smack back in their posts. No contribution to our knowledge, yet that seems the limit of their understanding. <MEGA SIGH!> Driving should be fun and I would never try to force my preferences on others. But I will share my lessons learned. Bob Wilson