                    Cameras and Lidars are Not Enough
                       - what we can learn from Uber's recent fatal accident

Xi Chen and Xue Liu, 2018-03-22

So, what happened?

At about 10 pm on Sunday evening (March 18), an Uber self-driving SUV killed a woman on a street in Tempe, Arizona [news link here]. Tempe Police said that the vehicle was in autonomous mode at the time of the crash, and that it hit a woman who was walking outside of the crosswalk and who later died at a hospital. Sylvia Moir, the Tempe Police Chief, who viewed footage from two of the vehicle’s cameras, pointed out that the woman "came from the shadows right into the roadway" [news link here]. Both the Uber self-driving technology and the human safety driver behind the wheel failed to notice the pedestrian, and neither slowed the vehicle as it approached the victim. It is believed to be the first pedestrian death associated with self-driving technology.

However, this was not the first fatality involving self-driving vehicles. Almost two years ago, a Tesla driver was killed in Florida when his car struck a tractor-trailer. The Tesla Model S was "on a divided highway with Tesla Autopilot on when a tractor trailer drove across the highway perpendicular to the Model S. Neither Autopilot nor the driver noticed the white side of the tractor trailer against a brightly lit sky, so the brake was not applied" [news link here]. Consequently, the windshield of the Model S struck the bottom of the trailer, resulting in the death of the Tesla driver.

As Uber's tweet after the accident said, our hearts all go out to the victims' families. Yet, it is also important to understand why Uber, Tesla and some other major players' self-driving technologies failed in these accidents, so that we can advance the solutions with new technologies and help prevent such tragedies from happening again. 

Why is current self-driving technology not good enough?
    • How do self-driving systems sense their surroundings?

Self-driving systems used by Uber and Tesla (as well as Google) are mainly based on cameras and lidars.

As an example, let's dig into this blog to see how Uber's self-driving system works.

Uber’s SUVs are equipped with several different imaging systems, which handle both ordinary and emergency duty.

Fig. 1. Uber's self-driving system (by ATG, Uber).

    • Top-mounted lidar. The bucket-shaped item on top of these SUVs is a lidar, or light detection and ranging, system that produces a 3D image of the car’s surroundings multiple times per second. Using infrared laser pulses that bounce off objects and return to the sensor, lidar can detect static and moving objects in detail day or night.
    • Front-mounted radar. Radar, like lidar, sends out a signal and waits for it to bounce back, but it uses radio waves instead of light. This makes it more resistant to interference, since radio can pass through snow and fog, but also lowers its resolution and changes its range profile.
    • Short and long-range optical cameras. Lidar and radar are great for locating shapes, but they’re no good for reading signs, figuring out what color something is and so on. That’s a job for visible-light cameras with sophisticated computer vision algorithms running in real-time on their imagery. The cameras on the Uber SUV watch for telltale patterns that indicate braking vehicles (sudden red lights), traffic lights, crossing pedestrians and so on. Especially on the front end of the SUV, multiple angles and types of cameras are used to get a complete picture of the scene into which the SUV is driving. Such a camera subsystem is analogous to a human driver, who uses their eyes to observe the surrounding traffic conditions.
  • Why didn't Uber's self-driving SUV detect the bicyclist?

Cameras have inherent drawbacks

Clearly, the optical cameras of Uber's system failed at one of their missions, i.e., pedestrian detection. To investigate why, we reviewed the recently released video taken by the cameras mounted on the Uber SUV (YouTube link here).

As shown in the video, a section of the road was covered in shadows. The bicyclist started to cross from the left side of the street and entered the shadowed area. At first, the SUV's exterior camera showed nothing but total darkness around the shadowed area. The camera did not capture the bicyclist until the SUV got close enough for its headlamps to reach her. By that moment, however, the bicyclist was only several meters in front of the SUV, which was driving at around 40 mph. There was no time for the self-driving SUV or the backup driver to react, and the accident happened. After watching the video, we may have to admit the following: under such lighting conditions, the accident would still have been very likely even if the vehicle had been driven carefully by a human driver. (We note that some have argued that the lighting was not as poor as it appears in Uber's released video. What is certain, however, is that the shadowed area was darker than other sections of the street.)
Fig. 2. The crash scene.

This is just one tragic example where both machine and human vision failed. In the other fatal accident, involving Tesla's car, both the camera and the driver failed to detect a trailer "against a brightly lit sky at daytime". Unlike the dark environment the Uber SUV encountered, the camera of the Tesla car was exposed to very bright conditions. From these two accidents, we can summarize an inherent drawback of vision-based self-driving systems.

  • Drawback 1: Vision is unreliable under poor (e.g., too dark or too bright) lighting conditions.
Some more examples of compromised driving views under poor lighting conditions are illustrated in Fig. 3.
Fig. 3. Examples of compromised views.

Another situation is also frequently encountered when driving. When a car is crossing an intersection, the view of oncoming traffic may be partially obstructed. This obstruction may be caused by something natural such as a tree, hedge, or shrub, or something man-made such as a sign, fence, or wall. Drivers must exercise greater than normal caution to navigate such intersections safely, but unfortunately accidents still occur. Fig. 4 shows a few examples of such situations.
    Fig. 4. Examples of obstructed views.
Thus, we can summarize another inherent drawback of vision-based self-driving systems.
  • Drawback 2: Vision can be partially blocked by obstacles, leading to a total unawareness of the blocked area.
In a broader sense, the current vision-based systems fail because they are just trying to reproduce a human-like driver. Therefore, they cannot overcome the physical limits of a human driver (e.g., blurred or obstructed views).

The lidars are not designed for pedestrian detection
One may also wonder why the lidar failed to detect the bicyclist as well. It should work fine even in total darkness (think of bats). Unfortunately, it was not designed for pedestrian detection and did not recognize the bicyclist, due to several limitations.
  • Lidar is an acronym for "Light Detection And Ranging". It is a surveying method that measures distance to a target by illuminating the target with pulsed laser light and measuring the reflected pulses with a sensor. Commercial lidars are good at measuring distances and some rough shapes. However, they are not good at recognizing objects (e.g., telling a vehicle from a bicycle) in real-time. There are several reasons for this, including: i) they cannot capture valuable color information; ii) the resolution of lidars is quite limited, especially for distant objects (the laser beams will be too spread out to return a viable image); iii) the refresh rate is relatively low for high-speed scenarios.
Below is our hypothesis of what happened with the lidar system. (Of course, the real scenario should become clear once Uber analyzes the logs from its lidar system and releases a report.) When the bicyclist entered the lane in the opposite direction to the SUV, the lidar did detect something. However, due to the aforementioned limitations, it could not tell whether that was a bicyclist, a vehicle, a tree, a traffic sign or something else. An object in the opposite lane is perfectly normal when driving. For example, it could be another vehicle heading in the other direction, parallel to us. Hence, when an object is detected in the other lane, there is no need to slow down unless the cameras recognize a pedestrian or a traffic sign. Unfortunately, the cameras saw nothing in the shadow, as discussed before. Therefore, by the time the bicyclist entered the SUV's lane, she was already too close. The lidar detected something in front of the SUV, but it was too late.

How to avoid such accidents? Upgrade self-driving with V2X!

Clearly, just mimicking the human vision system is far from enough for self-driving. It won't allow us to completely avoid accidents like those Uber and Tesla have experienced. If we want a self-driving robot that is fundamentally superior to human drivers, we need to equip self-driving cars with abilities that humans do not have. For example, had the pedestrian been able to communicate her existence to the Uber SUV, or had the tractor trailer been able to report its dimensions to the Tesla car, the accidents could have been avoided, or at least made less severe, so that no lives were lost. But can we do this now in a cost-effective way?

The answer is affirmative!

Vehicle-to-Everything (V2X) technologies are designed exactly for this mission. As an example, DSRC (Dedicated Short-Range Communications) allows traffic participants to periodically (e.g., 10-20 messages per second) exchange their own dynamic information, including position, speed, acceleration and direction, as well as other traffic-related information. In this way, a driver or a self-driving vehicle can be aware of the status of other traffic participants, even if its vision is compromised by poor lighting conditions or blocked by obstacles. Moreover, this awareness can be extended up to several kilometers, as V2X messages can travel over wireless channels far beyond one's immediate view.

Let's revisit Uber's accident to see how V2X communications could have prevented it and saved a life.

Scenario 1: Vehicle-to-Pedestrian (V2P) Communications.

In this scenario (as shown in Fig. 5), a bicyclist B and a vehicle V1 are both equipped with embedded V2P radios. When the bicyclist tries to cross the street, the vehicle is able to predict that her movement will intersect its own future path. This is easily achieved on the vehicle's side, as the position, speed and direction of the bicyclist are delivered to the vehicle via V2P wireless channels. By analyzing the bicyclist's recent positions, velocities and directions, the vehicle can draw her movement trace and predict her future path by extending this trace. In this way, a potential collision can be detected and avoided in time.

Fig. 5. Scenario 1: V2P
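
The prediction step in Scenario 1 can be sketched in a few lines of Python. This is only a minimal illustration, not part of any V2X standard: it assumes both participants keep their current velocity (a straight-line extension of the trace), uses flat local coordinates in meters, and the names Track, closest_approach and collision_risk as well as the 2 m safety radius and 5 s look-ahead are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    """Latest state reported by one traffic participant (hypothetical structure)."""
    x: float   # position east, meters
    y: float   # position north, meters
    vx: float  # velocity east, m/s
    vy: float  # velocity north, m/s


def closest_approach(a: Track, b: Track, horizon_s: float = 5.0):
    """Return (time_s, distance_m) of the closest approach within the horizon,
    assuming both participants keep their current velocity."""
    rx, ry = b.x - a.x, b.y - a.y        # relative position
    vx, vy = b.vx - a.vx, b.vy - a.vy    # relative velocity
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        t = 0.0                          # no relative motion: distance stays constant
    else:
        t = -(rx * vx + ry * vy) / speed_sq
        t = max(0.0, min(horizon_s, t))  # only look ahead, and only up to the horizon
    return t, math.hypot(rx + vx * t, ry + vy * t)


def collision_risk(a: Track, b: Track, safety_radius_m: float = 2.0,
                   horizon_s: float = 5.0) -> bool:
    """True if the predicted paths come closer than the (assumed) safety radius."""
    _, distance = closest_approach(a, b, horizon_s)
    return distance < safety_radius_m


# Vehicle V1 heading north at 18 m/s (about 40 mph); bicyclist B crossing from the left.
v1 = Track(x=0.0, y=0.0, vx=0.0, vy=18.0)
b = Track(x=-6.0, y=54.0, vx=2.0, vy=0.0)
print(closest_approach(v1, b))   # paths meet about 3 s ahead
print(collision_risk(v1, b))     # True
```

With a warning roughly 3 seconds ahead, the vehicle in this example would have time to brake, whereas the camera in the Uber case only saw the bicyclist when she was a few meters away.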

Scenario 2: Vehicle-to-Vehicle (V2V) Communications.

One may argue that a bicyclist may not carry a relatively heavy V2P device (although V2X chips for smartphones and smartwatches are only millimeters in size; see Qualcomm's chip here). In this case, we can rely on V2V to combine the vision of multiple vehicles for a clearer view. In this scenario (as shown in Fig. 6), we consider a device-free bicyclist B (i.e., the bicyclist does not carry a V2P device) and two V2V-enabled vehicles, V1 and V2. Again, bicyclist B is crossing the street and may be hit by vehicle V1, whose camera system fails to recognize the bicyclist in the shadows. This time, the bicyclist is device-free and hence cannot inform V1 of her existence. Luckily, vehicle V2 can help. V2 is looking at the scene from another angle, and thus may have a better view than V1. Now, V2 can (actively) tell V1 what it sees, and V1 can slow down to prevent the potential collision.

Fig. 6. Scenario 2: V2V

Scenario 3: Vehicle-to-Infrastructure (V2I) Communications.

It is also possible that the scene is too dark for any camera to see bicyclist B clearly, or that V1 is the only car on the street. In this scenario (as shown in Fig. 7), we can leverage sensors deployed on roadside infrastructure to detect the existence of bicyclist B. The information is then broadcast to all the surrounding vehicles by the mounted V2I Road-Side Unit (RSU). In this way, vehicle V1 can again notice the bicyclist and slow down to avoid a collision.

Fig. 7. Scenario 3: V2I

The above scenarios are only the basic application scenarios of V2X, with 2 to 3 traffic participants. In practice, there may be thousands or even tens of thousands of participants, directly sharing hundreds of thousands of messages within a 1-km range. Moreover, important information will be extracted and delivered to the whole city-level transportation system through multi-hop or backhaul-supported data networks. This means that traffic can be scheduled and coordinated at different granularities, such as vehicle level, intersection level, street level, district level, and city level.

Dive into V2X
  • What is V2X?
Quoted from Wikipedia: "Vehicle-to-everything (V2X) communication is the passing of information from a vehicle to any entity that may affect the vehicle, and vice versa. It is a vehicular communication system that incorporates other more specific types of communication as V2I (Vehicle-to-Infrastructure), V2V (Vehicle-to-vehicle), V2P (Vehicle-to-Pedestrian), V2D (Vehicle-to-device) and V2G (Vehicle-to-grid)."
Quoted from U.S. Department of Transportation (USDOT): "V2V systems potentially address 79 percent of all vehicle target crashes, 81 percent of all light-vehicle target crashes, and 71 percent of all heavy-truck target crashes. V2I systems potentially deal with 26 percent all vehicle target crashes, 27 percent of all light-vehicle target crashes, and 15 percent of all heavy-truck target crashes. Combined V2V and V2I systems potentially address 81 percent all vehicle target crashes, 83 percent of all light-vehicle target crashes, and 72 percent of all heavy-truck target crashes. "
From us: V2X is a powerful platform to connect vehicles, drivers, self-driving robots, pedestrians, traffic infrastructure, roadside sensors, administrations and beyond. Radio waves can pass around or through many obstacles and are unaffected by lighting conditions, delivering valuable traffic information to and from every participant. These information flows form a giant data network that covers the whole transportation system. With a greatly extended awareness of their surroundings, humans and machines can react to traffic conditions much faster than ever before. More importantly, V2X enables traffic participants to communicate and coordinate with each other, allowing everyone to proactively contribute to safer and more efficient traffic.
  • V2X Standardization - DSRC vs LTE-V
Vehicles and machines are made by different manufacturers. In order to let them communicate in an orderly, efficient and fair manner, we need to regulate their transmission/reception behaviors with communication standards.
There are two major V2X standard groups, i.e., Dedicated Short-Range Communications (DSRC), developed by IEEE and supported by USDOT and major car manufacturers, and Long-Term Evolution-Vehicle (LTE-V), developed by the 3rd Generation Partnership Project (3GPP).
For DSRC, standardization started as early as 2004, when IEEE started to work on wireless access for vehicles under the umbrella of its IEEE 802.11 standards family for Wireless Local Area Networks (WLAN). Its initial standard for wireless communication for vehicles is known as IEEE 802.11p. Around 2007, when the IEEE 802.11p standard became stable, IEEE started to develop the 1609.x standards family, standardizing applications and a security framework, and soon after the Society of Automotive Engineers (SAE) started to specify standards for V2V communication applications. SAE used the term DSRC for this technology. (Cited from Wikipedia.)
LTE-V was announced recently by 3GPP, in 2017. It is a set of physical layer standards for V2I and V2V communication that uses a radio technology different from IEEE 802.11 WLAN. One of its advantages is that it can seamlessly connect with the current cellular network, and thus effectively integrate vehicles into the existing data networks.
We note that there are studies showing that DSRC is superior to LTE-V, and vice versa. Researchers and engineers from both camps insist that their technology is better than the other. We don't intend to add fuel to the fire. In our opinion, both DSRC and LTE-V are very promising technologies, with similar technical performance under a fair comparison. Both standards can coexist and complement each other. The final decision on which standard or technology to use is usually determined by something other than performance. In North America, DSRC is and will very likely remain dominant, since it has the support of USDOT and major automakers. In addition, extensive field tests have been conducted to validate DSRC. It would be really hard for LTE-V to catch up with the deployment speed of DSRC in North America. However, in other nations, both DSRC and LTE-V have a chance to win.

Fig. 8. DSRC vs LTE-V (A. Filippi et al., "Ready to roll: Why 802.11p beats LTE and 5G for V2X")

The table below compares DSRC with LTE-V in several important aspects.

Technology | Developed by | Supported by                                                                 | Standard                                | Latency | Bandwidth | Mass Deployment
DSRC       | IEEE         | USDOT; car manufacturers, e.g., GM, Ford, Toyota, Honda, Mercedes, BMW, ... | Completed now, validated by field tests | Lower   | High      | From 2021
LTE-V      | 3GPP         | Cellphone and chip manufacturers                                             | Since 2017, still ongoing               | Low     | Higher    | Unknown

In summary, to upgrade the current self-driving systems, and to boost the driving safety, we believe that DSRC is a better candidate in North America.
  • More about DSRC
DSRC is under active development and deployment in the United States and in many other countries. In February 2014, the USDOT first announced its commitment to the use of DSRC technology on new light-duty vehicles [ref. link here]. Since then, regulatory work has been actively advanced by the US Congress, USDOT, IEEE and major car manufacturers such as GM, Toyota, Ford, Honda, etc. Specifically, the National Highway Traffic Safety Administration (NHTSA) would require DSRC in cars for V2V safety. Cars must send and receive Basic Safety Messages (BSMs) via DSRC. This requirement is supported by most automakers, but is opposed by some Wi-Fi, cellular, and privacy stakeholders. A possible timeline for mass deployment is shown below:
Fig. 9. The possible timeline for mass DSRC deployment, by John Kenney, "A Status Update on US DSRC", ITS world congress 2017.
Some facts about DSRC: 
  • Dedicated bandwidth: 75 MHz of spectrum in the 5.9 GHz band allocated exclusively to DSRC for traffic safety. This is different from, for example, Wi-Fi, which shares the 2.4 GHz band with Zigbee, Bluetooth, etc.
  • Short Range Communications: target communication range 100 m - 1 km (relatively short compared to other communications, e.g., cellular and satellite communications)
  • The bandwidth is divided into 7 channels, as below. 
Fig. 10. DSRC channel division.
  • Each DSRC radio contends for channel access, following a CSMA protocol.
  • Basic Safety Messages (BSMs) are exchanged periodically (10-20 messages per second per vehicle) in Channel 172. Emergency alerts are transmitted on demand in Channel 184, with a higher priority.
  • Each BSM includes two parts. The mandatory part contains position, speed, direction, angle, acceleration, brake system status, and vehicle size. The optional part contains extended safety information such as ABS system status, path history and direction, sensor data, steering status, etc. Below is a detailed breakdown of the BSM format.
Fig. 11. DSRC BSM format (from SAE, "DSRC Implementation Guide")
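
To make the facts above more concrete, the channel plan and the two-part BSM layout can be sketched in Python. This is an illustrative sketch only: the class and field names (BsmCore, BsmExtension, BasicSafetyMessage, sender_id and so on) are hypothetical and do not follow the actual SAE encoding, and "angle" is assumed here to mean the steering angle. The center frequencies use the IEEE 802.11 channel numbering rule for the 5 GHz range (5000 MHz + 5 MHz x channel number); the seven 10 MHz channels occupy 70 MHz, and the remaining 5 MHz at the lower edge of the band is a guard band.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# The seven 10 MHz DSRC channels in the 5.9 GHz band.
DSRC_CHANNELS = (172, 174, 176, 178, 180, 182, 184)
BSM_CHANNEL = 172        # periodic Basic Safety Messages (per the list above)
EMERGENCY_CHANNEL = 184  # high-priority emergency alerts (per the list above)


def center_frequency_mhz(channel: int) -> int:
    """Center frequency from the 802.11 rule: 5000 MHz + 5 MHz * channel number."""
    if channel not in DSRC_CHANNELS:
        raise ValueError(f"{channel} is not a DSRC channel")
    return 5000 + 5 * channel


@dataclass
class BsmCore:
    """Mandatory part of a BSM (hypothetical field names and units)."""
    latitude_deg: float
    longitude_deg: float
    speed_mps: float
    heading_deg: float
    steering_angle_deg: float   # assumption: the "angle" in the text
    acceleration_mps2: float
    brake_applied: bool
    length_m: float
    width_m: float


@dataclass
class BsmExtension:
    """Optional part of a BSM (hypothetical field names)."""
    abs_active: Optional[bool] = None
    path_history: List[Tuple[float, float]] = field(default_factory=list)
    steering_status: Optional[str] = None


@dataclass
class BasicSafetyMessage:
    sender_id: str                            # hypothetical temporary identifier
    core: BsmCore                             # always present
    extension: Optional[BsmExtension] = None  # sent only when needed


msg = BasicSafetyMessage(
    sender_id="V1",
    core=BsmCore(33.43, -111.94, 18.0, 0.0, 0.0, 0.0, False, 4.9, 2.0),
)
print(center_frequency_mhz(BSM_CHANNEL))        # 5860
print(center_frequency_mhz(EMERGENCY_CHANNEL))  # 5920
```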

Enhancing V2X - Our OnCAR Approach
Next, we will discuss our OnCAR approach to improve the performance of V2X technologies, so that V2X can better serve self-driving and automated traffic management. Note that our OnCAR solution is compatible with DSRC, LTE-V and other V2X standards/technologies.
  • The Challenges
V2X operates in volatile and dynamic vehicular environments. Compared to traditional communication systems such as Wi-Fi and cellular communications, V2X faces some unique challenges.
  1. Highly dynamic environments. For example, the topology and density of traffic participants can vary dramatically as each vehicle quickly passes through different traffic areas. The majority of nearby traffic participants may change within seconds, e.g., when approaching a crowded crosswalk. The list of examples can be really long.
  2. Severe transmission collisions. During rush hours, the density of vehicles can be very high (e.g., at some intersections), leading to intensive channel contention.
  • Our Solution - OnCAR  (pdf)(ppt)
To address these challenges and guarantee the performance of V2X, we need an approach that adapts V2X radio behaviors to the ever-changing environment. The V2X radio behavior is controlled by many radio variables, such as data rate, transmission power, contention window size, target transmission range, etc. Also, there are multiple performance metrics to be guaranteed and optimized, which include but are not limited to goodput (how many packets are delivered successfully each second), packet delivery ratio, or PDR (the fraction of transmitted packets that are delivered successfully), end-to-end delay (how long each packet takes to reach its destination), etc. Moreover, all the parameter-metric pairs need to be considered and adjusted according to fast-varying environment features such as traffic density and signal-to-interference-plus-noise ratio (SINR, which indicates the quality of V2X channels).
To accomplish this complex yet important mission, we proposed an approach named OnCAR (click here for OnCAR paper, click here for OnCAR slides). OnCAR outperforms other state-of-the-art adaptation algorithms and approaches, because:
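
The three metrics defined above can be computed from a simple packet log, as in the following sketch. The PacketRecord structure and the link_metrics function are hypothetical names used only for illustration, and the log is assumed to cover a fixed observation window.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PacketRecord:
    sent_at_s: float
    received_at_s: Optional[float]  # None if the packet was lost


def link_metrics(records: List[PacketRecord],
                 window_s: float) -> Tuple[float, float, Optional[float]]:
    """Return (goodput in packets/s, PDR, mean end-to-end delay in s)."""
    delivered = [r for r in records if r.received_at_s is not None]
    goodput = len(delivered) / window_s
    pdr = len(delivered) / len(records) if records else 0.0
    if delivered:
        mean_delay = sum(r.received_at_s - r.sent_at_s for r in delivered) / len(delivered)
    else:
        mean_delay = None  # no delivered packet, so delay is undefined
    return goodput, pdr, mean_delay


# Four packets sent within a 1-second window; the third one is lost.
log = [
    PacketRecord(0.0, 0.01),
    PacketRecord(0.1, 0.12),
    PacketRecord(0.2, None),
    PacketRecord(0.3, 0.33),
]
goodput, pdr, delay = link_metrics(log, window_s=1.0)
print(goodput, pdr, delay)  # 3 packets/s, PDR 0.75, mean delay about 0.02 s
```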
  1. OnCAR is able to optimize multiple performance metrics by synchronously controlling multiple radio variables, using its advanced Multiple-Input Multiple-Output (MIMO) control model and systematic control design.
  2. OnCAR has an online learning ability, which allows it to react quickly to any kind of environment variation.
The architecture of OnCAR is illustrated as follows.
Fig. 12. OnCAR architecture

OnCAR consists of two MIMO control loops. The feedback loop is in charge of fine-granularity performance improvement. The feed-forward loop is used to speed up the control process. Concretely, the feed-forward loop takes environment variables such as traffic density as its reference and produces a baseline parameter vector for multiple radio variables. The feedback loop then evaluates a cost function formed from the changes (deltas) in the performance metrics, learns and updates the control strategy online, and produces a delta vector to refine the baseline parameter vector. The online adaptive controller, which does the learning task, is illustrated as follows.
Fig. 13. The online adaptive controller in OnCAR

This controller uses a learning technique called Recursive Least Squares (RLS) to estimate online the relationship between multiple V2X variables and multiple performance metrics. The learned relationship is then fed to a delta sub-controller, which produces the variable adjustments for optimal performance.
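
For readers unfamiliar with RLS, the following is a generic textbook sketch of the estimator, not the OnCAR implementation. It assumes a linear model in which one metric change y (e.g., a change in PDR) is approximated by a weighted sum of the changes x in the radio variables (e.g., changes in power and data rate); one estimator per metric would then cover the multi-metric case. The forgetting factor and the initial covariance are illustrative values.

```python
class RecursiveLeastSquares:
    """Scalar-output RLS with exponential forgetting for the model y = theta . x"""

    def __init__(self, n_inputs, forgetting=0.98, init_cov=1000.0):
        self.n = n_inputs
        self.lam = forgetting            # values below 1 discount old samples
        self.theta = [0.0] * n_inputs    # current parameter estimate
        # Large initial covariance: little confidence in the initial estimate.
        self.P = [[init_cov if i == j else 0.0 for j in range(n_inputs)]
                  for i in range(n_inputs)]

    def predict(self, x):
        return sum(t * xi for t, xi in zip(self.theta, x))

    def update(self, x, y):
        """Fold in one observation (x, y) and return the prediction error."""
        n = self.n
        Px = [sum(self.P[i][j] * x[j] for j in range(n)) for i in range(n)]
        denom = self.lam + sum(x[i] * Px[i] for i in range(n))
        gain = [v / denom for v in Px]
        error = y - self.predict(x)
        self.theta = [t + g * error for t, g in zip(self.theta, gain)]
        # P <- (P - gain * x^T P) / lam, where x^T P equals (P x)^T since P is symmetric.
        self.P = [[(self.P[i][j] - gain[i] * Px[j]) / self.lam for j in range(n)]
                  for i in range(n)]
        return error


# Recover a known linear relationship y = 2.0 * x0 - 0.5 * x1 from noiseless samples.
rls = RecursiveLeastSquares(n_inputs=2)
for k in range(30):
    x = [float(k % 5 - 2), float(k % 3 - 1)]
    rls.update(x, 2.0 * x[0] - 0.5 * x[1])
print(rls.theta)  # close to [2.0, -0.5]
```

Because each update only needs the latest sample and the stored covariance matrix, the estimate can be refreshed every control period, which is what makes this kind of learner suitable for fast-varying vehicular environments.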
  • Evaluation
OnCAR has been evaluated under a data-driven evaluation platform.
Evaluation Setup:
The overall structure of our evaluation platform is shown as follows. The protocols and our OnCAR solution are implemented in ns-2, using C++. The wireless signal propagation model is implemented in C++ based on a model proposed by L. Cheng et al. The evaluation platform takes traffic traces from government sources. These traces are fed into SUMO (Simulation of Urban Mobility) to generate per-vehicle movement traces for fine-granularity ns-2 evaluation. The vehicle speed ranges from 5 km/h to 100 km/h.
Fig. 14. The structure of our evaluation platform 

As examples, we adopt two traffic traces collected from government resources. One data set records the traffic density of Berkeley on Jan. 17, 2007. The other data set traces the traffic density of San Diego on Oct. 1st, 2014. The traffic density variations of these two data sets are presented as follows. 
Fig. 15. Traffic density variations of the two real-life traces.

Evaluation Results:
We compared our results to several variants of the state-of-the-art approaches.
  • TPA is an individual transmission power adaptation approach, developed based on state-of-the-art individual power adaptation approaches.
  • DRA is an individual data rate adaptation approach, implemented based on state-of-the-art individual data rate adaptation approaches.
  • JPRA is a joint and heuristic adaptation approach, which first determines the transmission power using TPA, and then selects a data rate based on DRA.
  • OnCAR is our proposed approach. For the sake of fairness, we only allow OnCAR to adjust transmission power and data rate.
Fig. 16 shows the evaluation results. Compared to the other approaches, OnCAR increases the reliability of V2X by at least 23.7% (using PDR as the metric), improves the efficiency of V2X by at least 30.1% (using goodput/throughput as the metric), and boosts the fairness among vehicles by at least 40.1% (using the coefficient of variation of PDR as the metric).

Fig. 16. Evaluation results.
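
For reference, the fairness metric mentioned above, the coefficient of variation (CoV) of the per-vehicle PDR, is the standard deviation divided by the mean; a lower value means that vehicles get more similar delivery ratios. A minimal sketch follows, assuming the population standard deviation is used (the choice between population and sample deviation is an assumption here) and using made-up PDR values.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(values) / mean


# Made-up per-vehicle PDR values for two hypothetical situations.
uneven = [0.5, 1.0]      # one vehicle is served much worse than the other
even = [0.9, 0.9, 0.9]   # all vehicles are served equally
print(coefficient_of_variation(uneven))  # about 0.333
print(coefficient_of_variation(even))    # 0.0
```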
From the results, we can conclude that OnCAR can significantly improve V2X performance in volatile vehicular environments. We can therefore use this OnCAR-enhanced V2X to upgrade self-driving to a much safer level.
  • OnCAR to Support Automated Transportation - VSmart Demos
We implemented a testbed called VSmart (link here) to evaluate and demonstrate V2X-assisted automated transportation. We recorded several demos to show OnCAR's ability to improve automated transportation applications. The advanced safety application scenarios being investigated include:

More information about our work and demos on V2X communications can be found here

Some take-home messages
  • Cameras and lidars are not enough to guarantee driving safety for self-driving cars.
  • V2X communications can upgrade the current self-driving systems for significantly enhanced safety.
  • Our OnCAR solution improves the baseline V2X with real-time adaptation and online learning abilities, and thus better supports self-driving systems and automated traffic.

Reference links


We would like to thank Xuepeng Xu, Qiao Xiang and Linghe Kong for their assistance in building the VSmart testbed and recording the demos.