Abstract
Driver attention prediction has recently become a focus of the safe-driving research community, as exemplified by the DR(eye)VE project and the newly emerged Berkeley DeepDrive Attention (BDD-A) database for critical situations. In safe driving, an essential task is to predict imminent accidents as early as possible. BDD-A recognized this problem and, because such scenes are rare, collected driver attention in the laboratory. Nevertheless, BDD-A focuses on critical situations that do not involve actual accidents, and it addresses only the driver attention prediction task without taking the further step toward accident prediction. In contrast, we explore the view from drivers' eyes to capture multiple kinds of accidents, and we construct a video benchmark that is larger and more diverse than any before, annotated simultaneously with driver attention and driving accidents (named DADA-2000). It contains 2,000 video clips with about 658,476 frames covering 54 kinds of accidents. The clips are crowd-sourced and captured in various scenes (highway, urban, rural, and tunnel), weather conditions (sunny, rainy, and snowy), and light conditions (daytime and nighttime). To represent driver attention, we collect fixation maps, saccade scan paths, and focusing times. Accidents are annotated with their categories, the accident window within each clip, and the spatial locations of the crash objects. Based on our analysis, we obtain a quantitative and positive answer to the question posed in this paper: can driving accidents be predicted by driver attention?
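The abstract implies that each clip carries two annotation streams: driver attention (fixation maps, scan paths, focusing times) and accident metadata (category, temporal window, crash-object boxes). As a reading aid only, here is a minimal Python sketch of how such a record could be organized; the release format is not specified in the abstract, so every class and field name below (ClipAnnotation, accident_window, etc.) is an assumption, not the dataset's actual schema.

    # Hypothetical sketch of one DADA-2000 annotation record; all names
    # here are illustrative assumptions, not the dataset's real schema.
    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class CrashObjectBox:
        frame_index: int                     # frame in which the box is annotated
        bbox: Tuple[int, int, int, int]      # (x, y, width, height) in pixels

    @dataclass
    class ClipAnnotation:
        clip_id: str                         # clip identifier, e.g. a path stem
        accident_category: int               # one of the 54 accident kinds
        accident_window: Tuple[int, int]     # (start_frame, end_frame) of the accident
        fixation_maps: List[str]             # per-frame attention-map file paths
        scan_path: List[Tuple[int, int]]     # saccade scan path as (x, y) fixations
        focusing_time_ms: List[float]        # dwell time per fixation, milliseconds
        crash_objects: List[CrashObjectBox] = field(default_factory=list)

    def frames_before_accident(ann: ClipAnnotation) -> int:
        """Lead time (in frames) a predictor has before the accident window opens."""
        return ann.accident_window[0]

A record like this makes the paper's core question operational: an attention-based predictor can be scored by how many frames before accident_window[0] it fires.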
URL
http://arxiv.org/abs/1904.12634